Can we replace curation with information extraction software?
Karp, Peter D
2016-01-01
Can we use programs for automated or semi-automated information extraction from scientific texts as practical alternatives to professional curation? I show that error rates of current information extraction programs are too high to replace professional curation today. Furthermore, current IEP programs extract single narrow slivers of information, such as individual protein interactions; they cannot extract the large breadth of information extracted by professional curators for databases such as EcoCyc. They also cannot arbitrate among conflicting statements in the literature as curators can. Therefore, funding agencies should not hobble the curation efforts of existing databases on the assumption that a problem that has stymied Artificial Intelligence researchers for more than 60 years will be solved tomorrow. Semi-automated extraction techniques appear to have significantly more potential based on a review of recent tools that enhance curator productivity. But a full cost-benefit analysis for these tools is lacking. Without such analysis it is possible to expend significant effort developing information-extraction tools that automate small parts of the overall curation workflow without achieving a significant decrease in curation costs. © The Author(s) 2016. Published by Oxford University Press.
Extraction of CT dose information from DICOM metadata: automated Matlab-based approach.
Dave, Jaydev K; Gingold, Eric L
2013-01-01
The purpose of this study was to extract exposure parameters and dose-relevant indexes of CT examinations from information embedded in DICOM metadata. DICOM dose report files were identified and retrieved from a PACS. An automated software program was used to extract from these files information from the structured elements in the DICOM metadata relevant to exposure. Extracting information from DICOM metadata eliminated potential errors inherent in techniques based on optical character recognition, yielding 100% accuracy.
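As an illustration of reading dose-relevant values directly from DICOM metadata, the following minimal sketch uses pydicom and assumes the values of interest appear as standard top-level data elements; a full radiation dose structured report would instead require walking the nested content sequence.

```python
# Minimal sketch: pull exposure-related values from a CT DICOM file with pydicom.
# Assumes the values of interest are present as standard data elements; a full
# RDSR parser would instead walk ds.ContentSequence recursively.
import pydicom

def extract_exposure_info(path):
    ds = pydicom.dcmread(path, stop_before_pixels=True)
    keywords = ["KVP", "XRayTubeCurrent", "ExposureTime", "CTDIvol"]
    return {kw: ds.get(kw) for kw in keywords if kw in ds}

if __name__ == "__main__":
    # "ct_series_0001.dcm" is a hypothetical file name
    print(extract_exposure_info("ct_series_0001.dcm"))
```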
Information Fusion for Feature Extraction and the Development of Geospatial Information
2004-07-01
...of automated processing. 2. Requirements for Geospatial Information: Accurate, timely geospatial information is critical for many military... this evaluation illustrates some of the difficulties in comparing manual and automated processing results (figure 5). The automated delineation of...
Optimization-based method for automated road network extraction
DOT National Transportation Integrated Search
2001-09-18
Automated road information extraction has significant applicability in transportation. It provides a means for creating, maintaining, and updating transportation network databases that are needed for purposes ranging from traffic management to au...
Automated extraction of radiation dose information for CT examinations.
Cook, Tessa S; Zimmerman, Stefan; Maidment, Andrew D A; Kim, Woojin; Boonn, William W
2010-11-01
Exposure to radiation as a result of medical imaging is currently in the spotlight, receiving attention from Congress as well as the lay press. Although scanner manufacturers are moving toward including effective dose information in the Digital Imaging and Communications in Medicine headers of imaging studies, there is a vast repository of retrospective CT data at every imaging center that stores dose information in an image-based dose sheet. As such, it is difficult for imaging centers to participate in the ACR's Dose Index Registry. The authors have designed an automated extraction system to query their PACS archive and parse CT examinations to extract the dose information stored in each dose sheet. First, an open-source optical character recognition program processes each dose sheet and converts the information to American Standard Code for Information Interchange (ASCII) text. Each text file is parsed, and radiation dose information is extracted and stored in a database which can be queried using an existing pathology and radiology enterprise search tool. Using this automated extraction pipeline, it is possible to perform dose analysis on the >800,000 CT examinations in the PACS archive and generate dose reports for all of these patients. It is also possible to more effectively educate technologists, radiologists, and referring physicians about exposure to radiation from CT by generating report cards for interpreted and performed studies. The automated extraction pipeline enables compliance with the ACR's reporting guidelines and greater awareness of radiation dose to patients, thus resulting in improved patient care and management. Copyright © 2010 American College of Radiology. Published by Elsevier Inc. All rights reserved.
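A hedged sketch of the OCR-and-parse step of such a pipeline, using pytesseract as a stand-in for the unnamed open-source OCR program and an assumed layout of the "Total DLP" field on the dose sheet:

```python
# Sketch of an OCR-and-parse step for an image-based CT dose sheet.
# pytesseract/Tesseract stands in for "an open-source optical character
# recognition program"; the regular expression is a simplified assumption
# about how "Total DLP" appears on the sheet.
import re
from PIL import Image
import pytesseract

DLP_PATTERN = re.compile(r"Total\s+DLP\D*([\d.]+)", re.IGNORECASE)

def parse_dose_sheet(image_path):
    text = pytesseract.image_to_string(Image.open(image_path))
    match = DLP_PATTERN.search(text)
    return {"raw_text": text,
            "total_dlp_mGy_cm": float(match.group(1)) if match else None}
```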
Evans, D. A.; Brownlow, N. D.; Hersh, W. R.; Campbell, E. M.
1996-01-01
We discuss the development and evaluation of an automated procedure for extracting drug-dosage information from clinical narratives. The process was developed rapidly using existing technology and resources, including categories of terms from UMLS96. Evaluations over a large training set and a smaller test set of medical records demonstrate an approximately 80% rate of exact and partial matches on target phrases, with few false positives and a modest rate of false negatives. The results suggest a strategy for automating general concept identification in electronic medical records. PMID:8947694
Jagannathan, V; Mullett, Charles J; Arbogast, James G; Halbritter, Kevin A; Yellapragada, Deepthi; Regulapati, Sushmitha; Bandaru, Pavani
2009-04-01
We assessed the current state of commercial natural language processing (NLP) engines for their ability to extract medication information from textual clinical documents. Two thousand de-identified discharge summaries and family practice notes were submitted to four commercial NLP engines with the request to extract all medication information. The four sets of returned results were combined to create a comparison standard which was validated against a manual, physician-derived gold standard created from a subset of 100 reports. Once validated, the individual vendor results for medication names, strengths, route, and frequency were compared against this automated standard with precision, recall, and F measures calculated. Compared with the manual, physician-derived gold standard, the automated standard was successful at accurately capturing medication names (F measure=93.2%), but performed less well with strength (85.3%) and route (80.3%), and relatively poorly with dosing frequency (48.3%). Moderate variability was seen in the strengths of the four vendors. The vendors performed better with the structured discharge summaries than with the clinic notes in an analysis comparing the two document types. Although automated extraction may serve as the foundation for a manual review process, it is not ready to automate medication lists without human intervention.
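For reference, the precision/recall/F-measure comparison against a gold standard can be expressed in a few lines; the (name, strength, route, frequency) tuple format below is illustrative, not the vendors' actual output format, and real evaluations typically also credit partial matches.

```python
# Illustrative scoring of extracted medication entries against a gold standard.
# Entries are compared as normalized (name, strength, route, frequency) tuples.
def precision_recall_f1(extracted, gold):
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

gold = {("metoprolol", "25 mg", "oral", "bid")}
extracted = {("metoprolol", "25 mg", "oral", "daily")}
print(precision_recall_f1(extracted, gold))  # (0.0, 0.0, 0.0) -- frequency mismatch
```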
Two-dimensional thermal video analysis of offshore bird and bat flight
Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.
2015-09-11
Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals. We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.
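The video peak store followed by background masking can be sketched as a per-pixel maximum over frames and a simple threshold; the perceptual-grouping step and the authors' actual parameters are omitted, and the threshold and background estimate below are assumptions.

```python
# Sketch of a video peak store followed by background masking, assuming an
# 8-bit thermal video file readable by OpenCV. Perceptual grouping of the
# resulting track pixels is omitted.
import cv2
import numpy as np

def peak_store_mask(video_path, threshold=30):
    cap = cv2.VideoCapture(video_path)
    peak = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        peak = gray if peak is None else np.maximum(peak, gray)
    cap.release()
    if peak is None:
        return None, None
    background = np.median(peak)          # crude background level
    mask = (peak - background) > threshold  # pixels where a warm target passed
    return peak, mask
```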
Using Process Redesign and Information Technology to Improve Procurement
1994-04-01
...contractor. Many large-volume contractors have automated order processing tied to accounting, manufacturing, and shipping subsystems. Currently... the contractor must receive the mailed order, analyze it, extract pertinent information, and enter that information into the automated order processing system. Almost all orders for small purchases are unilateral documents that do not require acceptance or acknowledgment by the contractor. For...
Automatic information extraction from unstructured mammography reports using distributed semantics.
Gupta, Anupama; Banerjee, Imon; Rubin, Daniel L
2018-02-01
To date, the methods developed for automated extraction of information from radiology reports are mainly rule-based or dictionary-based and therefore require substantial manual effort to build. Recent efforts to develop automated systems for entity detection have been undertaken, but little work has been done to automatically extract relations and their associated named entities from narrative radiology reports with accuracy comparable to rule-based methods. Our goal is to extract relations in an unsupervised way from radiology reports without specifying prior domain knowledge. We propose a hybrid approach for information extraction that combines a dependency-based parse tree with distributed semantics to generate structured information frames about particular findings/abnormalities from free-text mammography reports. The proposed IE system obtains an F1-score of 0.94 in terms of completeness of the content in the information frames, which outperforms a state-of-the-art rule-based system in this domain by a significant margin. The proposed system can be leveraged in a variety of applications, such as decision support and information retrieval, and may also easily scale to other radiology domains, since there is no need to tune the system with hand-crafted information extraction rules. Copyright © 2018 Elsevier Inc. All rights reserved.
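A simplified sketch of dependency-based frame filling with spaCy (a stand-in parser; the paper additionally uses distributed semantics), with illustrative finding terms and dependency labels:

```python
# Simplified dependency-based frame filling for a mammography sentence using
# spaCy. Finding terms and attribute labels here are illustrative, not the
# paper's actual vocabulary.
import spacy

nlp = spacy.load("en_core_web_sm")
FINDINGS = {"mass", "calcification", "asymmetry"}

def finding_frames(text):
    frames = []
    for token in nlp(text):
        if token.lemma_.lower() in FINDINGS:
            modifiers = [child.text for child in token.children
                         if child.dep_ in ("amod", "compound", "nummod")]
            frames.append({"finding": token.text, "modifiers": modifiers})
    return frames

print(finding_frames("There is an irregular spiculated mass in the left breast."))
```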
Modelling and representation issues in automated feature extraction from aerial and satellite images
NASA Astrophysics Data System (ADS)
Sowmya, Arcot; Trinder, John
New digital systems for the processing of photogrammetric and remote sensing images have led to new approaches to information extraction for mapping and Geographic Information System (GIS) applications, with the expectation that data can become more readily available at a lower cost and with greater currency. Demands for mapping and GIS data are increasing as well for environmental assessment and monitoring. Hence, researchers from the fields of photogrammetry and remote sensing, as well as computer vision and artificial intelligence, are bringing together their particular skills for automating these tasks of information extraction. The paper will review some of the approaches used in knowledge representation and modelling for machine vision, and give examples of their applications in research for image understanding of aerial and satellite imagery.
Automated extraction of family history information from clinical notes.
Bill, Robert; Pakhomov, Serguei; Chen, Elizabeth S; Winden, Tamara J; Carter, Elizabeth W; Melton, Genevieve B
2014-01-01
Despite increased functionality for obtaining family history in a structured format within electronic health record systems, clinical notes often still contain this information. We developed and evaluated an Unstructured Information Management Application (UIMA)-based natural language processing (NLP) module for automated extraction of family history information with functionality for identifying statements, observations (e.g., disease or procedure), relative or side of family with attributes (i.e., vital status, age of diagnosis, certainty, and negation), and predication ("indicator phrases"), the latter of which was used to establish relationships between observations and family member. The family history NLP system demonstrated F-scores of 66.9, 92.4, 82.9, 57.3, 97.7, and 61.9 for detection of family history statements, family member identification, observation identification, negation identification, vital status, and overall extraction of the predications between family members and observations, respectively. While the system performed well for detection of family history statements and predication constituents, further work is needed to improve extraction of certainty and temporal modifications.
First Steps to Automated Interior Reconstruction from Semantically Enriched Point Clouds and Imagery
NASA Astrophysics Data System (ADS)
Obrock, L. S.; Gülch, E.
2018-05-01
The automated generation of a BIM model from sensor data is a major challenge for the modeling of existing buildings. Currently the measurements and analyses are time-consuming, allow little automation, and require expensive equipment, and there is no automated acquisition of semantic information about objects in a building. We present first results of our approach, based on imagery and derived products, aiming at more automated modeling of interiors for a BIM building model. We examine the building parts and objects visible in the collected images using deep learning methods based on convolutional neural networks. For localization and classification of building parts we apply the FCN8s model for pixel-wise semantic segmentation. So far we reach a pixel accuracy of 77.2% and a mean Intersection over Union of 44.2%. We then use the network for further reasoning on the images of the interior room. We combine the segmented images with the original images and use photogrammetric methods to produce a three-dimensional point cloud, coding the extracted object types as colours of the 3D points so that the points can be uniquely classified in three-dimensional space. We also preliminarily investigate a simple extraction method for the colour and material of building parts. The combined images are shown to be well suited to extracting further semantic information for the BIM model. With the presented methods we see a sound basis for further automation of the acquisition and modeling of semantic and geometric information of interior rooms for a BIM model.
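The two reported segmentation metrics (pixel accuracy and mean Intersection over Union) can be computed from a confusion matrix; a small numpy sketch:

```python
# Computing pixel accuracy and mean Intersection-over-Union from predicted and
# reference label images with numpy.
import numpy as np

def segmentation_metrics(pred, truth, num_classes):
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (truth.ravel(), pred.ravel()), 1)   # confusion matrix
    pixel_acc = np.diag(cm).sum() / cm.sum()
    union = cm.sum(axis=0) + cm.sum(axis=1) - np.diag(cm)
    iou = np.diag(cm) / np.maximum(union, 1)
    return pixel_acc, iou.mean()
```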
NASA Astrophysics Data System (ADS)
Wang, Ke; Guo, Ping; Luo, A.-Li
2017-03-01
Spectral feature extraction is a crucial procedure in automated spectral analysis. This procedure starts from the spectral data and produces informative and non-redundant features, facilitating the subsequent automated processing and analysis with machine-learning and data-mining techniques. In this paper, we present a new automated feature extraction method for astronomical spectra, with application in spectral classification and defective spectra recovery. The basic idea of our approach is to train a deep neural network to extract features of spectra with different levels of abstraction in different layers. The deep neural network is trained with a fast layer-wise learning algorithm in an analytical way without any iterative optimization procedure. We evaluate the performance of the proposed scheme on real-world spectral data. The results demonstrate that our method is superior regarding its comprehensive performance, and the computational cost is significantly lower than that for other methods. The proposed method can be regarded as a new valid alternative general-purpose feature extraction method for various tasks in spectral data analysis.
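The paper's analytic layer-wise algorithm is not reproduced here, but the idea of training a layer without iterative optimization can be sketched as a fixed random nonlinear projection followed by a closed-form least-squares readout; all sizes and the activation choice are assumptions.

```python
# Minimal sketch of one analytically trained layer: a fixed random nonlinear
# projection of the spectra followed by a closed-form least-squares readout.
# This only illustrates "training without iterative optimization"; the paper's
# actual layer-wise algorithm is not reproduced here.
import numpy as np

def analytic_layer(X, Y, n_hidden=256, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # extracted features
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)  # analytic readout weights
    return H, beta
```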
NASA Astrophysics Data System (ADS)
Dogon-yaro, M. A.; Kumar, P.; Rahman, A. Abdul; Buyuksalih, G.
2016-10-01
Timely and accurate acquisition of information on the condition and structural changes of urban trees serves as a tool for decision makers to better appreciate urban ecosystems and their numerous values, which are critical to building strategies for sustainable development. The conventional techniques used for extracting tree features include ground surveying and interpretation of aerial photography. However, these techniques are associated with constraints such as labour-intensive field work, high cost, and the influence of weather conditions and topographical cover, which can be overcome by means of integrated airborne LiDAR and very-high-resolution digital image datasets. This study presents a semi-automated approach for extracting urban trees from integrated airborne LiDAR and multispectral digital image datasets over the city of Istanbul, Turkey. The scheme includes detection and extraction of shadow-free vegetation features based on the spectral properties of the digital images, using shadow-index and NDVI techniques, and automated extraction of 3D information about vegetation features from the integrated processing of the shadow-free vegetation image and the LiDAR point cloud. The developed algorithms show promising results as an automated and cost-effective approach to estimating and delineating 3D information about urban trees. The research also shows that integrated datasets are a suitable technology and a viable source of information for city managers to use in urban tree management.
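The NDVI-based vegetation masking step can be sketched with numpy; the band inputs and threshold are assumptions, and the shadow-index and LiDAR-fusion steps described above are omitted.

```python
# NDVI-based vegetation masking from multispectral bands held as numpy arrays
# (red and near-infrared); the shadow-index step and the fusion with LiDAR
# point clouds are omitted.
import numpy as np

def vegetation_mask(red, nir, ndvi_threshold=0.3):
    red = red.astype(np.float64)
    nir = nir.astype(np.float64)
    ndvi = (nir - red) / np.maximum(nir + red, 1e-6)
    return ndvi, ndvi > ndvi_threshold
```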
Extracting Information from Narratives: An Application to Aviation Safety Reports
DOE Office of Scientific and Technical Information (OSTI.GOV)
Posse, Christian; Matzke, Brett D.; Anderson, Catherine M.
2005-05-12
Aviation safety reports are the best available source of information about why a flight incident happened. However, their stream-of-consciousness style makes it difficult to automate the information extraction task. We propose an approach and infrastructure based on a common pattern specification language to capture relevant information via normalized template expression matching in context. Template expression matching handles variants of multi-word expressions. Normalization improves the likelihood of correct hits by standardizing and cleaning the vocabulary used in narratives. Checking for the presence of negative modifiers in the proximity of a potential hit reduces the chance of false hits. We present this approach in the context of a specific application, the extraction of human performance factors from NASA ASRS reports. While knowledge infusion from experts plays a critical role during the learning phase, early results show that in production mode the automated process provides information that is consistent with analyses by human subjects.
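A hedged sketch of normalized pattern matching with a negation-proximity check, loosely following the description above; the actual common pattern specification language is not reproduced, and the patterns and negation cues are illustrative.

```python
# Illustrative normalized pattern matching with a negation-proximity check.
# Patterns and cue words are examples only.
import re

NEGATIONS = {"no", "not", "denied", "without"}

def find_factor(narrative, pattern, window=3):
    text = re.sub(r"\s+", " ", narrative.lower())   # simple normalization
    hits = []
    for match in re.finditer(pattern, text):
        preceding = text[: match.start()].split()[-window:]
        negated = any(tok in NEGATIONS for tok in preceding)
        hits.append({"span": match.group(0), "negated": negated})
    return hits

print(find_factor("Crew was not fatigued during descent.", r"fatigu\w*"))
```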
An Automated High-Throughput System to Fractionate Plant Natural Products for Drug Discovery
Tu, Ying; Jeffries, Cynthia; Ruan, Hong; Nelson, Cynthia; Smithson, David; Shelat, Anang A.; Brown, Kristin M.; Li, Xing-Cong; Hester, John P.; Smillie, Troy; Khan, Ikhlas A.; Walker, Larry; Guy, Kip; Yan, Bing
2010-01-01
The development of an automated, high-throughput fractionation procedure to prepare and analyze natural product libraries for drug discovery screening is described. Natural products obtained from plant materials worldwide were extracted and first prefractionated on polyamide solid-phase extraction cartridges to remove polyphenols, followed by high-throughput automated fractionation, drying, weighing, and reformatting for screening and storage. The analysis of fractions with UPLC coupled with MS, PDA and ELSD detectors provides information that facilitates characterization of compounds in active fractions. Screening of a portion of fractions yielded multiple assay-specific hits in several high-throughput cellular screening assays. This procedure modernizes the traditional natural product fractionation paradigm by seamlessly integrating automation, informatics, and multimodal analytical interrogation capabilities. PMID:20232897
Building an automated SOAP classifier for emergency department reports.
Mowery, Danielle; Wiebe, Janyce; Visweswaran, Shyam; Harkema, Henk; Chapman, Wendy W
2012-02-01
Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F(1) scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks. Copyright © 2011. Published by Elsevier Inc.
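A minimal sentence-level SOAP classifier can be sketched with scikit-learn using bag-of-words features only (the study evaluates a richer linguistic feature set); the training sentences and labels below are illustrative.

```python
# Minimal sentence-level SOAP classifier sketch with scikit-learn, using
# bag-of-words features only. Training sentences and labels are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

sentences = [
    "Patient reports chest pain since this morning.",   # subjective
    "Blood pressure 128/82, heart rate 90.",             # objective
    "Likely musculoskeletal pain, rule out ACS.",         # assessment
    "Obtain ECG and troponin, follow up in clinic.",      # plan
]
labels = ["S", "O", "A", "P"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(sentences, labels)
print(clf.predict(["Order chest radiograph and reassess."]))
```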
CMS-2 Reverse Engineering and ENCORE/MODEL Integration
1992-05-01
Automated extraction of design information from an existing software system written in CMS-2 can be used to document that system as-built. The... extracted information is provided by a commercially available CASE tool. Information describing software system design is automatically extracted... the displays in Figures 1, 2, and 3.
MARS: bringing the automation of small-molecule bioanalytical sample preparations to a new frontier.
Li, Ming; Chou, Judy; Jing, Jing; Xu, Hui; Costa, Aldo; Caputo, Robin; Mikkilineni, Rajesh; Flannelly-King, Shane; Rohde, Ellen; Gan, Lawrence; Klunk, Lewis; Yang, Liyu
2012-06-01
In recent years, there has been a growing interest in automating small-molecule bioanalytical sample preparations specifically using the Hamilton MicroLab(®) STAR liquid-handling platform. In the most extensive work reported thus far, multiple small-molecule sample preparation assay types (protein precipitation extraction, SPE and liquid-liquid extraction) have been integrated into a suite that is composed of graphical user interfaces and Hamilton scripts. Using that suite, bioanalytical scientists have been able to automate various sample preparation methods to a great extent. However, there are still areas that could benefit from further automation, specifically, the full integration of analytical standard and QC sample preparation with study sample extraction in one continuous run, real-time 2D barcode scanning on the Hamilton deck and direct Laboratory Information Management System database connectivity. We developed a new small-molecule sample-preparation automation system that improves in all of the aforementioned areas. The improved system presented herein further streamlines the bioanalytical workflow, simplifies batch run design, reduces analyst intervention and eliminates sample-handling error.
Automation for System Safety Analysis
NASA Technical Reports Server (NTRS)
Malin, Jane T.; Fleming, Land; Throop, David; Thronesbery, Carroll; Flores, Joshua; Bennett, Ted; Wennberg, Paul
2009-01-01
This presentation describes work to integrate a set of tools to support early model-based analysis of failures and hazards due to system-software interactions. The tools perform and assist analysts in the following tasks: 1) extract model parts from text for architecture and safety/hazard models; 2) combine the parts with library information to develop the models for visualization and analysis; 3) perform graph analysis and simulation to identify and evaluate possible paths from hazard sources to vulnerable entities and functions, in nominal and anomalous system-software configurations and scenarios; and 4) identify resulting candidate scenarios for software integration testing. There has been significant technical progress in model extraction from Orion program text sources, architecture model derivation (components and connections) and documentation of extraction sources. Models have been derived from Internal Interface Requirements Documents (IIRDs) and FMEA documents. Linguistic text processing is used to extract model parts and relationships, and the Aerospace Ontology also aids automated model development from the extracted information. Visualizations of these models assist analysts in requirements overview and in checking consistency and completeness.
Quantitative Indicators for Behaviour Drift Detection from Home Automation Data.
Veronese, Fabio; Masciadri, Andrea; Comai, Sara; Matteucci, Matteo; Salice, Fabio
2017-01-01
The diffusion of Smart Homes provides an opportunity to implement elderly monitoring, extending seniors' independence and avoiding unnecessary assistance costs. Information concerning inhabitant behaviour is contained in home automation data and can be extracted by means of quantitative indicators. The application of this approach shows that it can reveal behaviour changes.
Information Graphic Classification, Decomposition and Alternative Representation
ERIC Educational Resources Information Center
Gao, Jinglun
2012-01-01
This thesis work is mainly focused on two problems related to improving accessibility of information graphics for visually impaired users. The first problem is automated analysis of information graphics for information extraction and the second problem is multi-modal representations for accessibility. Information graphics are graphical…
Automated Image Registration Using Morphological Region of Interest Feature Extraction
NASA Technical Reports Server (NTRS)
Plaza, Antonio; LeMoigne, Jacqueline; Netanyahu, Nathan S.
2005-01-01
With the recent explosion in the amount of remotely sensed imagery and the corresponding interest in temporal change detection and modeling, image registration has become increasingly important as a necessary first step in the integration of multi-temporal and multi-sensor data for applications such as the analysis of seasonal and annual global climate changes, as well as land use/cover changes. The task of image registration can be divided into two major components: (1) the extraction of control points or features from images; and (2) the search among the extracted features for the matching pairs that represent the same feature in the images to be matched. Manual control feature extraction can be subjective and extremely time consuming, and often results in few usable points. Automated feature extraction is a solution to this problem, where desired target features are invariant, and represent evenly distributed landmarks such as edges, corners and line intersections. In this paper, we develop a novel automated registration approach based on the following steps. First, a mathematical morphology (MM)-based method is used to obtain a scale-orientation morphological profile at each image pixel. Next, a spectral dissimilarity metric such as the spectral information divergence is applied for automated extraction of landmark chips, followed by an initial approximate matching. This initial condition is then refined using a hierarchical robust feature matching (RFM) procedure. Experimental results reveal that the proposed registration technique offers a robust solution in the presence of seasonal changes and other interfering factors. Keywords-Automated image registration, multi-temporal imagery, mathematical morphology, robust feature matching.
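The spectral information divergence used for landmark chip extraction can be written directly from its definition; a small numpy sketch over two non-negative pixel spectra:

```python
# Spectral information divergence (SID) between two pixel spectra, the
# dissimilarity measure mentioned above; spectra are non-negative vectors.
import numpy as np

def spectral_information_divergence(x, y, eps=1e-12):
    p = x / (x.sum() + eps) + eps
    q = y / (y.sum() + eps) + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```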
Bhattacharya, Pratik; Van Stavern, Renee; Madhavan, Ramesh
2010-12-01
Use of resident case logs has been considered by the Residency Review Committee for Neurology of the Accreditation Council for Graduate Medical Education (ACGME). This study explores the effectiveness of a data-mining program for creating resident logs and compares the results to a manual data-entry system. Other potential applications of data mining to enhancing resident education are also explored. Patient notes dictated by residents were extracted from the Hospital Information System and analyzed using an unstructured mining program. History, examination, and ICD codes were obtained and compared to the existing manual log. The automated data were gathered for a 30-day period and compared to manual case logs. The automated method extracted all resident dictations with the dates of encounter and transcription. The automated data-miner processed information from all 19 residents, while only 4 residents logged manually. The manual method identified only broad categories of diseases; the major categories were stroke or vascular disorder 53 (27.6%), epilepsy 28 (14.7%), and pain syndromes 26 (13.5%). In the automated method, epilepsy 114 (21.1%), cerebral atherosclerosis 114 (21.1%), and headache 105 (19.4%) were the most frequent primary diagnoses, and headache 89 (16.5%), seizures 94 (17.4%), and low back pain 47 (9%) were the most common chief complaints. More detailed patient information such as tobacco use 227 (42%), alcohol use 205 (38%), and drug use 38 (7%) was extracted by the data-mining method. Manual case logs are time-consuming, provide limited information, and may be unpopular with residents. Data mining is a time-effective tool that may aid in the assessment of resident experience or the ACGME core competencies or in resident clinical research. More study of this method in larger numbers of residency programs is needed.
An information extraction framework for cohort identification using electronic health records.
Liu, Hongfang; Bielinski, Suzette J; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B; Jonnalagadda, Siddhartha R; Ravikumar, K E; Wu, Stephen T; Kullo, Iftikhar J; Chute, Christopher G
2013-01-01
Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework.
Oldham, Athenia L; Drilling, Heather S; Stamps, Blake W; Stevenson, Bradley S; Duncan, Kathleen E
2012-11-20
The analysis of microbial assemblages in industrial, marine, and medical systems can inform decisions regarding quality control or mitigation. Modern molecular approaches to detect, characterize, and quantify microorganisms provide rapid and thorough measures unbiased by the need for cultivation. The requirement of timely extraction of high quality nucleic acids for molecular analysis is faced with specific challenges when used to study the influence of microorganisms on oil production. Production facilities are often ill equipped for nucleic acid extraction techniques, making the preservation and transportation of samples off-site a priority. As a potential solution, the possibility of extracting nucleic acids on-site using automated platforms was tested. The performance of two such platforms, the Fujifilm QuickGene-Mini80™ and Promega Maxwell®16 was compared to a widely used manual extraction kit, MOBIO PowerBiofilm™ DNA Isolation Kit, in terms of ease of operation, DNA quality, and microbial community composition. Three pipeline biofilm samples were chosen for these comparisons; two contained crude oil and corrosion products and the third transported seawater. Overall, the two more automated extraction platforms produced higher DNA yields than the manual approach. DNA quality was evaluated for amplification by quantitative PCR (qPCR) and end-point PCR to generate 454 pyrosequencing libraries for 16S rRNA microbial community analysis. Microbial community structure, as assessed by DGGE analysis and pyrosequencing, was comparable among the three extraction methods. Therefore, the use of automated extraction platforms should enhance the feasibility of rapidly evaluating microbial biofouling at remote locations or those with limited resources.
Schulze, H Georg; Turner, Robin F B
2015-06-01
High-throughput information extraction from large numbers of Raman spectra is becoming an increasingly taxing problem due to the proliferation of new applications enabled using advances in instrumentation. Fortunately, in many of these applications, the entire process can be automated, yielding reproducibly good results with significant time and cost savings. Information extraction consists of two stages, preprocessing and analysis. We focus here on the preprocessing stage, which typically involves several steps, such as calibration, background subtraction, baseline flattening, artifact removal, smoothing, and so on, before the resulting spectra can be further analyzed. Because the results of some of these steps can affect the performance of subsequent ones, attention must be given to the sequencing of steps, the compatibility of these sequences, and the propensity of each step to generate spectral distortions. We outline here important considerations to effect full automation of Raman spectral preprocessing: what is considered full automation; putative general principles to effect full automation; the proper sequencing of processing and analysis steps; conflicts and circularities arising from sequencing; and the need for, and approaches to, preprocessing quality control. These considerations are discussed and illustrated with biological and biomedical examples reflecting both successful and faulty preprocessing.
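One possible ordering of two common preprocessing steps (baseline flattening, then smoothing), sketched with numpy/scipy under assumed parameters, illustrates the kind of sequencing decision discussed above.

```python
# One possible preprocessing sequence for a Raman spectrum: polynomial baseline
# flattening followed by Savitzky-Golay smoothing. Order and parameters are
# illustrative; the text stresses that sequencing choices affect later steps.
import numpy as np
from scipy.signal import savgol_filter

def preprocess(wavenumbers, intensities, baseline_degree=3, window=11, polyorder=3):
    coeffs = np.polyfit(wavenumbers, intensities, baseline_degree)
    baseline = np.polyval(coeffs, wavenumbers)
    corrected = intensities - baseline                   # baseline flattening first
    return savgol_filter(corrected, window, polyorder)   # then smoothing
```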
Karayiannis, Nicolaos B; Sami, Abdul; Frost, James D; Wise, Merrill S; Mizrahi, Eli M
2005-04-01
This paper presents an automated procedure developed to extract quantitative information from video recordings of neonatal seizures in the form of motor activity signals. This procedure relies on optical flow computation to select anatomical sites located on the infants' body parts. Motor activity signals are extracted by tracking selected anatomical sites during the seizure using adaptive block matching. A block of pixels is tracked throughout a sequence of frames by searching for the most similar block of pixels in subsequent frames; this search is facilitated by employing various update strategies to account for the changing appearance of the block. The proposed procedure is used to extract temporal motor activity signals from video recordings of neonatal seizures and other events not associated with seizures.
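Block tracking by template matching between consecutive frames with a simple template update is a simplified stand-in for the adaptive block matching described above; the block size and search radius are assumptions.

```python
# Block tracking by template matching with a simple template update. Frames are
# assumed to be uint8 grayscale numpy arrays and the block stays inside the image.
import cv2
import numpy as np

def track_block(frames, x, y, size=16, search=24):
    template = frames[0][y:y + size, x:x + size].copy()
    path = [(x, y)]
    for frame in frames[1:]:
        y0, y1 = max(0, y - search), y + size + search
        x0, x1 = max(0, x - search), x + size + search
        window = frame[y0:y1, x0:x1]
        res = cv2.matchTemplate(window, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, max_loc = cv2.minMaxLoc(res)
        x, y = x0 + max_loc[0], y0 + max_loc[1]
        template = frame[y:y + size, x:x + size].copy()  # update block appearance
        path.append((x, y))
    return path
```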
Automated classification of optical coherence tomography images of human atrial tissue
NASA Astrophysics Data System (ADS)
Gan, Yu; Tsay, David; Amir, Syed B.; Marboe, Charles C.; Hendon, Christine P.
2016-10-01
Tissue composition of the atria plays a critical role in the pathology of cardiovascular disease, tissue remodeling, and arrhythmogenic substrates. Optical coherence tomography (OCT) has the ability to capture the tissue composition information of the human atria. In this study, we developed a region-based automated method to classify tissue compositions within human atria samples within OCT images. We segmented regional information without prior information about the tissue architecture and subsequently extracted features within each segmented region. A relevance vector machine model was used to perform automated classification. Segmentation of human atrial ex vivo datasets was correlated with trichrome histology and our classification algorithm had an average accuracy of 80.41% for identifying adipose, myocardium, fibrotic myocardium, and collagen tissue compositions.
Longitudinal Analysis of New Information Types in Clinical Notes
Zhang, Rui; Pakhomov, Serguei; Melton, Genevieve B.
2014-01-01
It is increasingly recognized that redundant information in clinical notes within electronic health record (EHR) systems is ubiquitous, significant, and may negatively impact the secondary use of these notes for research and patient care. We investigated several automated methods to identify redundant versus relevant new information in clinical reports. These methods may provide a valuable approach to extract clinically pertinent information and further improve the accuracy of clinical information extraction systems. In this study, we used UMLS semantic types to extract several types of new information, including problems, medications, and laboratory information. Automatically identified new information highly correlated with manual reference standard annotations. Methods to identify different types of new information can potentially help to build up more robust information extraction systems for clinical researchers as well as aid clinicians and researchers in navigating clinical notes more effectively and quickly identify information pertaining to changes in health states. PMID:25717418
Fernandez-Ricaud, Luciano; Kourtchenko, Olga; Zackrisson, Martin; Warringer, Jonas; Blomberg, Anders
2016-06-23
Phenomics is a field in functional genomics that records variation in organismal phenotypes in the genetic, epigenetic or environmental context at a massive scale. For microbes, the key phenotype is the growth in population size because it contains information that is directly linked to fitness. Due to technical innovations and extensive automation our capacity to record complex and dynamic microbial growth data is rapidly outpacing our capacity to dissect and visualize this data and extract the fitness components it contains, hampering progress in all fields of microbiology. To automate visualization, analysis and exploration of complex and highly resolved microbial growth data as well as standardized extraction of the fitness components it contains, we developed the software PRECOG (PREsentation and Characterization Of Growth-data). PRECOG allows the user to quality control, interact with and evaluate microbial growth data with ease, speed and accuracy, also in cases of non-standard growth dynamics. Quality indices filter high- from low-quality growth experiments, reducing false positives. The pre-processing filters in PRECOG are computationally inexpensive and yet functionally comparable to more complex neural network procedures. We provide examples where data calibration, project design and feature extraction methodologies have a clear impact on the estimated growth traits, emphasising the need for proper standardization in data analysis. PRECOG is a tool that streamlines growth data pre-processing, phenotypic trait extraction, visualization, distribution and the creation of vast and informative phenomics databases.
Wozniak, Aniela; Geoffroy, Enrique; Miranda, Carolina; Castillo, Claudia; Sanhueza, Francia; García, Patricia
2016-11-01
The choice of nucleic acids (NAs) extraction method for molecular diagnosis in microbiology is of major importance because of the low microbial load, different nature of microorganisms, and clinical specimens. The NA yield of different extraction methods has been mostly studied using spiked samples. However, information from real human clinical specimens is scarce. The purpose of this study was to compare the performance of a manual low-cost extraction method (Qiagen kit or salting-out extraction method) with the automated high-cost MagNAPure Compact method. According to cycle threshold values for different pathogens, MagNAPure is as efficient as Qiagen for NA extraction from noncomplex clinical specimens (nasopharyngeal swab, skin swab, plasma, respiratory specimens). In contrast, according to cycle threshold values for RNAseP, MagNAPure method may not be an appropriate method for NA extraction from blood. We believe that MagNAPure versatility reduced risk of cross-contamination and reduced hands-on time compensates its high cost. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Chen, Andrew A.; Meng, Frank; Morioka, Craig A.; Churchill, Bernard M.; Kangarloo, Hooshang
2005-04-01
Managing pediatric patients with neurogenic bladder (NGB) involves regular laboratory, imaging, and physiologic testing. Using input from domain experts and current literature, we identified specific data points from these tests to develop the concept of an electronic disease vector for NGB. An information extraction engine was used to extract the desired data elements from free-text and semi-structured documents retrieved from the patient's medical record. Finally, a Java-based presentation engine created graphical visualizations of the extracted data. After precision, recall, and timing evaluation, we conclude that these tools may enable clinically useful, automatically generated, and diagnosis-specific visualizations of patient data, potentially improving compliance and ultimately, outcomes.
Classification of the Gabon SAR Mosaic Using a Wavelet Based Rule Classifier
NASA Technical Reports Server (NTRS)
Simard, Marc; Saatchi, Sasan; DeGrandi, Gianfranco
2000-01-01
A method is developed for semi-automated classification of SAR images of the tropical forest. Information is extracted using the wavelet transform (WT). The transform allows for extraction of structural information in the image as a function of scale. In order to classify the SAR image, a decision tree classifier is used. Pruning is used to optimize classification rate versus tree size. The results give explicit insight into the type of information useful for a given class.
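Wavelet-energy features fed to a pruned decision tree approximate the described approach; a sketch with PyWavelets and scikit-learn, where the wavelet choice, decomposition level, and pruning parameter are assumptions.

```python
# Wavelet-energy features per image patch (PyWavelets) fed to a decision tree
# with cost-complexity pruning (scikit-learn), approximating scale-dependent
# texture features and a pruned decision-tree classifier.
import numpy as np
import pywt
from sklearn.tree import DecisionTreeClassifier

def wavelet_energy_features(patch, wavelet="haar", level=2):
    coeffs = pywt.wavedec2(patch, wavelet, level=level)
    feats = [np.mean(np.square(coeffs[0]))]                        # approximation energy
    for detail in coeffs[1:]:
        feats.extend(np.mean(np.square(band)) for band in detail)  # detail energies
    return np.array(feats)

def train_classifier(patches, labels, ccp_alpha=0.01):
    X = np.array([wavelet_energy_features(p) for p in patches])
    return DecisionTreeClassifier(ccp_alpha=ccp_alpha).fit(X, labels)
```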
2015-01-01
Biological assays formatted as microarrays have become a critical tool for the generation of the comprehensive data sets required for systems-level understanding of biological processes. Manual annotation of data extracted from images of microarrays, however, remains a significant bottleneck, particularly for protein microarrays due to the sensitivity of this technology to weak artifact signal. In order to automate the extraction and curation of data from protein microarrays, we describe an algorithm called Crossword that logically combines information from multiple approaches to fully automate microarray segmentation. Automated artifact removal is also accomplished by segregating structured pixels from the background noise using iterative clustering and pixel connectivity. Correlation of the location of structured pixels across image channels is used to identify and remove artifact pixels from the image prior to data extraction. This component improves the accuracy of data sets while reducing the requirement for time-consuming visual inspection of the data. Crossword enables a fully automated protocol that is robust to significant spatial and intensity aberrations. Overall, the average amount of user intervention is reduced by an order of magnitude and the data quality is increased through artifact removal and reduced user variability. The increase in throughput should aid the further implementation of microarray technologies in clinical studies. PMID:24417579
Deep SOMs for automated feature extraction and classification from big data streaming
NASA Astrophysics Data System (ADS)
Sakkari, Mohamed; Ejbali, Ridha; Zaied, Mourad
2017-03-01
In this paper, we propose a deep self-organizing map model (Deep-SOMs) for automated feature extraction and learning from streaming big data, benefiting from the Spark framework for real-time stream handling and highly parallel data processing. The deep SOM architecture is based on the notion of abstraction (patterns are automatically extracted from the raw data, from less to more abstract). The proposed model consists of three hidden self-organizing layers, an input layer, and an output layer. Each layer is made up of a multitude of SOMs, each map focusing only on a local sub-region of the input image. Each layer then transforms the local information into more global information in the higher layer. The proposed Deep-SOMs model is unique in terms of its layer architecture, SOM sampling method, and learning. During the learning stage we use a set of unsupervised SOMs for feature extraction. We validate the effectiveness of our approach on large data sets such as the Leukemia and SRBCT datasets. Comparative results show that the Deep-SOMs model performs better than many existing algorithms for image classification.
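The building block of such an architecture, a single self-organizing map, can be sketched in plain numpy; the grid size and learning schedule are illustrative, and the Spark streaming integration is omitted.

```python
# Minimal numpy self-organizing map, the building block of the layered
# architecture described above.
import numpy as np

def train_som(X, rows=8, cols=8, epochs=20, lr=0.5, sigma=2.0, seed=0):
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows * cols, X.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        for x in X:
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best matching unit
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2 * (sigma * decay + 1e-3) ** 2))   # neighbourhood
            weights += (lr * decay) * h[:, None] * (x - weights)
    return weights
```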
Yu, Sheng; Liao, Katherine P; Shaw, Stanley Y; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Cai, Tianxi
2015-09-01
Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy. Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype. The authors applied our method to develop algorithms to identify patients with rheumatoid arthritis and coronary artery disease cases among those with rheumatoid arthritis from a large multi-institutional EHR. The area under the receiver operating characteristic curves (AUC) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared to the AUCs of 0.938 and 0.929 by models trained with expert-curated features. Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable. The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
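The final classification step, a penalized logistic regression over NLP concept counts plus codified features, can be sketched with scikit-learn; the feature matrices are assumed to be prepared upstream, and the penalty choice and scaling are assumptions.

```python
# Sketch of the final phenotype classifier: an L1-penalized logistic regression
# over concatenated NLP concept counts and codified features (both assumed to be
# numeric matrices prepared upstream).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def train_phenotype_model(nlp_counts, codified, labels, C=0.1):
    X = np.hstack([np.log1p(nlp_counts), codified])   # log-scaled counts + codes
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    auc = cross_val_score(model, X, labels, cv=5, scoring="roc_auc").mean()
    return model.fit(X, labels), auc
```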
Automated Extraction of Substance Use Information from Clinical Texts.
Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B
2015-01-01
Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.
D'Avolio, Leonard W; Nguyen, Thien M; Goryachev, Sergey; Fiore, Louis D
2011-01-01
Despite at least 40 years of promising empirical performance, very few clinical natural language processing (NLP) or information extraction systems currently contribute to medical science or care. The authors address this gap by reducing the need for custom software and rules development with a graphical user interface-driven, highly generalizable approach to concept-level retrieval. A 'learn by example' approach combines features derived from open-source NLP pipelines with open-source machine learning classifiers to automatically and iteratively evaluate top-performing configurations. The Fourth i2b2/VA Shared Task Challenge's concept extraction task provided the data sets and metrics used to evaluate performance. Top F-measure scores for each of the tasks were medical problems (0.83), treatments (0.82), and tests (0.83). Recall lagged precision in all experiments. Precision was near or above 0.90 in all tasks. With no customization for the tasks and less than 5 min of end-user time to configure and launch each experiment, the average F-measure was 0.83, one point behind the mean F-measure of the 22 entrants in the competition. Strong precision scores indicate the potential of applying the approach for more specific clinical information extraction tasks. There was not one best configuration, supporting an iterative approach to model creation. Acceptable levels of performance can be achieved using fully automated and generalizable approaches to concept-level information extraction. The described implementation and related documentation are available for download.
Automated road network extraction from high spatial resolution multi-spectral imagery
NASA Astrophysics Data System (ADS)
Zhang, Qiaoping
For the last three decades, the Geomatics Engineering and Computer Science communities have considered automated road network extraction from remotely-sensed imagery to be a challenging and important research topic. The main objective of this research is to investigate the theory and methodology of automated feature extraction for image-based road database creation, refinement or updating, and to develop a series of algorithms for road network extraction from high resolution multi-spectral imagery. The proposed framework for road network extraction from multi-spectral imagery begins with an image segmentation using the k-means algorithm. This step mainly concerns the exploitation of the spectral information for feature extraction. The road cluster is automatically identified using a fuzzy classifier based on a set of predefined road surface membership functions. These membership functions are established based on the general spectral signature of road pavement materials and the corresponding normalized digital numbers on each multi-spectral band. Shape descriptors of the Angular Texture Signature are defined and used to reduce the misclassifications between roads and other spectrally similar objects (e.g., crop fields, parking lots, and buildings). An iterative and localized Radon transform is developed for the extraction of road centerlines from the classified images. The purpose of the transform is to accurately and completely detect the road centerlines. It is able to find short, long, and even curvilinear lines. The input image is partitioned into a set of subset images called road component images. An iterative Radon transform is locally applied to each road component image. At each iteration, road centerline segments are detected based on an accurate estimation of the line parameters and line widths. Three localization approaches are implemented and compared using qualitative and quantitative methods. Finally, the road centerline segments are grouped into a road network. The extracted road network is evaluated against a reference dataset using a line segment matching algorithm. The entire process is unsupervised and fully automated. Based on extensive experimentation on a variety of remotely-sensed multi-spectral images, the proposed methodology achieves moderate success in automating road network extraction from high spatial resolution multi-spectral imagery.
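The first step of the framework above (k-means clustering of multi-spectral pixels, followed by picking out a road-like cluster) can be sketched as follows; the synthetic four-band image and the spectral-flatness rule are placeholders, not the thesis's actual membership functions:

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    image = rng.random((100, 100, 4))              # hypothetical 4-band image
    pixels = image.reshape(-1, image.shape[-1])    # one spectral vector per pixel

    kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(pixels)
    labels = kmeans.labels_.reshape(image.shape[:2])

    # Toy stand-in for the fuzzy road-surface membership test: pick the cluster
    # whose mean spectrum is flattest (pavement tends to be spectrally flat).
    flatness = kmeans.cluster_centers_.std(axis=1)
    road_cluster = int(np.argmin(flatness))
    road_mask = labels == road_cluster
    print("candidate road pixels:", int(road_mask.sum()))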
Automated extraction of radiation dose information from CT dose report images.
Li, Xinhua; Zhang, Da; Liu, Bob
2011-06-01
The purpose of this article is to describe the development of an automated tool for retrieving texts from CT dose report images. Optical character recognition was adopted to perform text recognition on CT dose report images. The developed tool is able to automate the process of analyzing multiple CT examinations, including text recognition, parsing, error correction, and exporting data to spreadsheets. The results were precise for total dose-length product (DLP) and were about 95% accurate for CT dose index and DLP of scanned series.
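A hedged sketch of this kind of OCR-plus-parsing step, using pytesseract as a generic OCR engine (the article does not name its OCR tool) and a toy regular expression for the total DLP; the file name is hypothetical and the Tesseract engine is assumed to be installed:

    import re
    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(Image.open("dose_report.png"))   # hypothetical dose sheet

    # Match lines such as "Total DLP (mGy-cm): 834.5".
    dlp_pattern = re.compile(r"Total\s+DLP.*?([\d.]+)", re.IGNORECASE)
    match = dlp_pattern.search(text)
    if match:
        print("Total DLP:", float(match.group(1)))
    else:
        print("No DLP value recognized; manual review needed.")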
Toward Machine Understanding of Information Quality.
ERIC Educational Resources Information Center
Tang, Rong; Ng, K. B.; Strzalkowski, Tomek; Kantor, Paul B.
2003-01-01
Reports preliminary results of a study to develop and automate new metrics for assessment of information quality in text documents, particularly in news. Through focus group studies, quality judgment experiments, and textual feature extraction and analysis, nine quality aspects were generated and applied in human assessments. Experiments were…
Rapid System to Quantitatively Characterize the Airborne Microbial Community
NASA Technical Reports Server (NTRS)
Macnaughton, Sarah J.
1998-01-01
Bioaerosols have been linked to a wide range of different allergies and respiratory illnesses. Currently, microorganism culture is the most commonly used method for exposure assessment. Such culture techniques, however, generally fail to detect between 90 and 99% of the actual viable biomass. Consequently, an unbiased technique for detecting airborne microorganisms is essential. In this Phase II proposal, a portable air sampling device has been developed for the collection of airborne microbial biomass from indoor (and outdoor) environments. Methods were evaluated for extracting and identifying lipids that provide information on indoor air microbial biomass, and automation of these procedures was investigated. Also, techniques to automate the extraction of DNA were explored.
Automated extraction of Biomarker information from pathology reports.
Lee, Jeongeun; Song, Hyun-Je; Yoon, Eunsil; Park, Seong-Bae; Park, Sung-Hye; Seo, Jeong-Wook; Park, Peom; Choi, Jinwook
2018-05-21
Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a "slide paragraph" unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital. High F-scores were obtained for parsing biomarker name and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, which represents a richer result than biomarker profiles obtained based on the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms determined by expanded dictionary look-up and text similarity-based search. Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting.
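The 'slide paragraph' parsing step rests on a context-free grammar for immunohistochemistry findings; the toy grammar below (using NLTK) only illustrates the idea of parsing marker/result pairs and is not the grammar used by the authors:

    import nltk

    # Toy grammar for findings such as "CD20 positive" or "Ki-67 10 %".
    grammar = nltk.CFG.fromstring("""
      FINDING -> MARKER RESULT
      MARKER  -> 'CD20' | 'CD3' | 'Ki-67' | 'EGFR'
      RESULT  -> 'positive' | 'negative' | PERCENT
      PERCENT -> NUM '%'
      NUM     -> '10' | '30' | '80'
    """)
    parser = nltk.ChartParser(grammar)

    def parse_finding(tokens):
        """Return the first parse tree for a tokenized IHC finding, or None."""
        for tree in parser.parse(tokens):
            return tree
        return None

    print(parse_finding(["CD20", "positive"]))
    print(parse_finding(["Ki-67", "10", "%"]))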
Botsis, Taxiarchis; Foster, Matthew; Arya, Nina; Kreimeyer, Kory; Pandey, Abhishek; Arya, Deepa
2017-04-26
To evaluate the feasibility of automated dose and adverse event information retrieval in supporting the identification of safety patterns. We extracted all rabbit Anti-Thymocyte Globulin (rATG) reports submitted to the United States Food and Drug Administration Adverse Event Reporting System (FAERS) from the product's initial licensure on April 16, 1984, through February 8, 2016. We processed the narratives using the Medication Extraction (MedEx) and the Event-based Text-mining of Health Electronic Records (ETHER) systems and retrieved the appropriate medication, clinical, and temporal information. When necessary, the extracted information was manually curated. This process resulted in a high quality dataset that was analyzed with the Pattern-based and Advanced Network Analyzer for Clinical Evaluation and Assessment (PANACEA) to explore the association of rATG dosing with post-transplant lymphoproliferative disorder (PTLD). Although manual curation was necessary to improve the data quality, MedEx and ETHER supported the extraction of the appropriate information. We created a final dataset of 1,380 cases with complete information for rATG dosing and date of administration. Analysis in PANACEA found that PTLD was associated with cumulative doses of rATG >8 mg/kg, even in periods where most of the submissions to FAERS reported low doses of rATG. We demonstrated the feasibility of investigating a dose-related safety pattern for a particular product in FAERS using a set of automated tools.
ALE: automated label extraction from GEO metadata.
Giles, Cory B; Brown, Chase A; Ripperger, Michael; Dennis, Zane; Roopnarinesingh, Xiavan; Porter, Hunter; Perz, Aleksandra; Wren, Jonathan D
2017-12-28
NCBI's Gene Expression Omnibus (GEO) is a rich community resource containing millions of gene expression experiments from human, mouse, rat, and other model organisms. However, information about each experiment (metadata) is in the format of an open-ended, non-standardized textual description provided by the depositor. Thus, classification of experiments for meta-analysis by factors such as gender, age of the sample donor, and tissue of origin is not feasible without assigning labels to the experiments. Automated approaches are preferable for this, primarily because of the size and volume of the data to be processed, but also because they ensure standardization and consistency. While some of these labels can be extracted directly from the textual metadata, many of the data available do not contain explicit text informing the researcher about the age and gender of the subjects in the study. To bridge this gap, machine-learning methods can be trained to use the gene expression patterns associated with the text-derived labels to refine label-prediction confidence. Our analysis shows only 26% of metadata text contains information about gender and 21% about age. In order to ameliorate the lack of available labels for these data sets, we first extract labels from the textual metadata for each GEO RNA dataset and evaluate the performance against a gold standard of manually curated labels. We then use machine-learning methods to predict labels based upon gene expression of the samples and compare this to the text-based method. Here we present an automated method to extract labels for age, gender, and tissue from textual metadata and GEO data using both a heuristic approach as well as machine learning. We show the two methods together improve accuracy of label assignment to GEO samples.
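The text-based half of this pipeline can be illustrated with simple heuristics; the patterns and example strings below are toy stand-ins for the published rule set:

    import re

    AGE = re.compile(r"\bage[:\s]+(\d{1,3})", re.IGNORECASE)
    GENDER = re.compile(r"\b(male|female)\b", re.IGNORECASE)

    def extract_labels(metadata_text):
        """Heuristic age/gender label extraction from GEO sample metadata text."""
        age = AGE.search(metadata_text)
        gender = GENDER.search(metadata_text)
        return {"age": int(age.group(1)) if age else None,
                "gender": gender.group(1).lower() if gender else None}

    print(extract_labels("Homo sapiens, liver biopsy, age: 54, female donor"))
    print(extract_labels("mouse cortex, P7"))   # nothing recoverable from the text alone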
NASA Astrophysics Data System (ADS)
Lancaster, N.; LeBlanc, D.; Bebis, G.; Nicolescu, M.
2015-12-01
Dune-field patterns are believed to behave as self-organizing systems, but what causes the patterns to form is still poorly understood. The most obvious (and in many cases the most significant) aspect of a dune system is the pattern of dune crest lines. Extracting meaningful features such as crest length, orientation, spacing, bifurcations, and merging of crests from image data can reveal important information about the specific dune-field morphological properties, development, and response to changes in boundary conditions, but manual methods are labor-intensive and time-consuming. We are developing the capability to recognize and characterize patterns of sand dunes on planetary surfaces. Our goal is to develop a robust methodology and the necessary algorithms for automated or semi-automated extraction of dune morphometric information from image data. Our main approach uses image processing methods to extract gradient information from satellite images of dune fields. Typically, the gradients have a dominant magnitude and orientation. In many cases, the images have two major dominant gradient orientations, for the sunny and shaded sides of the dunes. A histogram of the gradient orientations is used to determine the dominant orientation. A threshold is applied to the image based on gradient orientations which agree with the dominant orientation. The contours of the binary image can then be used to determine the dune crest-lines, based on pixel intensity values. Once the crest-lines have been extracted, the morphological properties can be computed. We have tested our approach on a variety of images of linear and crescentic (transverse) dunes and compared dune detection algorithms with manually-digitized dune crest lines, achieving true positive values of 0.57-0.99 and false positive values of 0.30-0.67, indicating that our approach is generally robust.
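A NumPy sketch of the gradient-orientation step described above, run on a synthetic ripple pattern; the threshold values and the synthetic image are placeholders for real satellite imagery:

    import numpy as np

    y, x = np.mgrid[0:200, 0:200]
    image = np.sin(0.2 * (x + 0.5 * y))            # synthetic dune-like ripples

    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx)) % 180

    # Magnitude-weighted histogram of orientations gives the dominant trend;
    # pixels near that orientation become candidate crest-line pixels.
    hist, edges = np.histogram(orientation, bins=36, range=(0, 180), weights=magnitude)
    dominant = edges[np.argmax(hist)]
    candidates = (np.abs(orientation - dominant) < 10) & (magnitude > magnitude.mean())
    print("dominant orientation (deg):", float(dominant))
    print("candidate crest pixels:", int(candidates.sum()))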
Automated extraction of chemical structure information from digital raster images
Park, Jungkap; Rosania, Gus R; Shedden, Kerby A; Nguyen, Mandee; Lyu, Naesung; Saitou, Kazuhiro
2009-01-01
Background: To search for chemical structures in research articles, diagrams or text representing molecules need to be translated to a standard chemical file format compatible with cheminformatic search engines. Nevertheless, chemical information contained in research articles is often referenced as analog diagrams of chemical structures embedded in digital raster images. To automate analog-to-digital conversion of chemical structure diagrams in scientific research articles, several software systems have been developed, but their algorithmic performance and utility in cheminformatic research have not been investigated. Results: This paper aims to provide critical reviews of these systems and also to report our recent development of ChemReader, a fully automated tool for extracting chemical structure diagrams in research articles and converting them into standard, searchable chemical file formats. Basic algorithms for recognizing lines and letters representing bonds and atoms in chemical structure diagrams can be independently run in sequence from a graphical user interface, and the algorithm parameters can be readily changed, to facilitate additional development specifically tailored to a chemical database annotation scheme. Our results indicate that ChemReader outperforms existing software programs such as OSRA, Kekule, and CLiDE on several sets of sample images from diverse sources in terms of the rate of correct outputs and the accuracy of extracting molecular substructure patterns. Conclusion: The availability of ChemReader as a cheminformatic tool for extracting chemical structure information from digital raster images allows research and development groups to enrich their chemical structure databases by annotating the entries with published research articles. Based on its stable performance and high accuracy, ChemReader may be sufficiently accurate for annotating the chemical database with links to scientific research articles. PMID:19196483
ERIC Educational Resources Information Center
Zhang, Rui
2013-01-01
The widespread adoption of Electronic Health Record (EHR) has resulted in rapid text proliferation within clinical care. Clinicians' use of copying and pasting functions in EHR systems further compounds this by creating a large amount of redundant clinical information in clinical documents. A mixture of redundant information (especially outdated…
Kulstein, Galina; Marienfeld, Ralf; Miltner, Erich; Wiegand, Peter
2016-10-01
In the last years, microRNA (miRNA) analysis came into focus in the field of forensic genetics. Yet, no standardized and recommendable protocols for co-isolation of miRNA and DNA from forensic relevant samples have been developed so far. Hence, this study evaluated the performance of an automated Maxwell® 16 System-based strategy (Promega) for co-extraction of DNA and miRNA from forensically relevant (blood and saliva) samples compared to (semi-)manual extraction methods. Three procedures were compared on the basis of recovered quantity of DNA and miRNA (as determined by real-time PCR and Bioanalyzer), miRNA profiling (shown by Cq values and extraction efficiency), STR profiles, duration, contamination risk and handling. All in all, the results highlight that the automated co-extraction procedure yielded the highest miRNA and DNA amounts from saliva and blood samples compared to both (semi-)manual protocols. Also, for aged and genuine samples of forensically relevant traces the miRNA and DNA yields were sufficient for subsequent downstream analysis. Furthermore, the strategy allows miRNA extraction only in cases where it is relevant to obtain additional information about the sample type. Besides, this system enables flexible sample throughput and labor-saving sample processing with reduced risk of cross-contamination. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
[DNA Extraction from Old Bones by AutoMate Express™ System].
Li, B; Lü, Z
2017-08-01
To establish a method for extracting DNA from old bones with the AutoMate Express™ system. Bones were ground into powder with a freeze-mill. After extraction by AutoMate Express™, DNA was amplified and genotyped with the Identifiler® Plus and MiniFiler™ kits. DNA was extracted in 3 hours by the AutoMate Express™ system from 10 old bone samples that had been kept in different environments, with postmortem intervals from 10 to 20 years. Complete STR typing results were obtained from 8 samples. The AutoMate Express™ system can quickly and efficiently extract DNA from old bones and can be applied in forensic practice. Copyright© by the Editorial Department of Journal of Forensic Medicine
NASA Technical Reports Server (NTRS)
Pokras, V. M.; Yevdokimov, V. P.; Maslov, V. D.
1978-01-01
The structure and potential of the information reference system OZhUR designed for the automated data processing systems of scientific space vehicles (SV) are considered. The system OZhUR ensures control of the extraction phase of processing with respect to a concrete SV and the exchange of data between phases. The practical application of the system OZhUR is exemplified in the construction of a data processing system for satellites of the Cosmos series. As a result of automating the operations of exchange and control, the volume of manual preparation of data is significantly reduced, and there is no longer any need for individual logs which fix the status of data processing. The system OZhUR is included in the automated data processing system Nauka, which is realized in language PL-1 in a binary one-address system one-state (BOS OS) electronic computer.
Whitter, P D; Cary, P L; Leaton, J I; Johnson, J E
1999-01-01
An automated extraction scheme for the analysis of 11-nor-delta9-tetrahydrocannabinol-9-carboxylic acid using the Hamilton Microlab 2200, which was modified for gravity-flow solid-phase extraction, has been evaluated. The Hamilton was fitted with a six-head probe, a modular valve positioner, and a peristaltic pump. The automated method significantly increased sample throughput, improved assay consistency, and reduced the time spent performing the extraction. Extraction recovery for the automated method was > 90%. The limit of detection, limit of quantitation, and upper limit of linearity were equivalent to the manual method: 1.5, 3.0, and 300 ng/mL, respectively. Precision at the 15-ng/mL cut-off was as follows: mean = 14.4, standard deviation = 0.5, coefficient of variation = 3.5%. Comparison of 38 patient samples, extracted by the manual and automated extraction methods, demonstrated the following correlation statistics: r = .991, slope 1.029, and y-intercept -2.895. Carryover was < 0.3% at 1000 ng/mL. Aliquoting/extraction time for the automated method (48 urine samples) was 50 min, and the manual procedure required approximately 2.5 h. The automated aliquoting/extraction method on the Hamilton Microlab 2200 and its use in forensic applications are reviewed.
NASA Astrophysics Data System (ADS)
Dubey, Kavita; Srivastava, Vishal; Singh Mehta, Dalip
2018-04-01
Early identification of fungal infection on the human scalp is crucial for avoiding hair loss. The diagnosis of fungal infection on the human scalp is based on a visual assessment by trained experts or doctors. Optical coherence tomography (OCT) has the ability to capture fungal infection information from the human scalp with a high resolution. In this study, we present a fully automated, non-contact, non-invasive optical method for rapid detection of fungal infections based on features extracted from A-line and B-scan images of OCT. A multilevel ensemble machine model is designed to perform automated classification and outperforms the best single classifier trained on the features extracted from OCT images. In this study, 60 samples (30 fungal, 30 normal) were imaged by OCT and eight features were extracted. The classification algorithm had an average sensitivity, specificity and accuracy of 92.30, 90.90 and 91.66%, respectively, for identifying fungal and normal human scalps. This classification performance makes the proposed model readily applicable to classifying human scalp images.
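A minimal sketch of an ensemble classifier over a small feature table, loosely following the 'multilevel ensemble' idea above; the synthetic eight-feature data set stands in for the real OCT-derived features, and the specific base learners are illustrative choices:

    import numpy as np
    from sklearn.ensemble import VotingClassifier, RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    X = rng.normal(size=(60, 8))                   # 60 samples x 8 OCT-derived features (toy)
    y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)  # toy fungal/normal labels

    ensemble = VotingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                    ("svc", SVC(probability=True))],
        voting="soft",
    )
    print("cross-validated accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())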
Alert management for home healthcare based on home automation analysis.
Truong, T T; de Lamotte, F; Diguet, J-Ph; Said-Hocine, F
2010-01-01
Rising healthcare costs for elderly and disabled people can be controlled by offering people autonomy at home by means of information technology. In this paper, we present an original and sensorless alert management solution which performs multimedia and home automation service discrimination and extracts highly regular home activities as sensors for alert management. Results on simulation data, based on a real context, allow us to evaluate our approach before application to real data.
[DNA extraction from bones and teeth using AutoMate Express forensic DNA extraction system].
Gao, Lin-Lin; Xu, Nian-Lai; Xie, Wei; Ding, Shao-Cheng; Wang, Dong-Jing; Ma, Li-Qin; Li, You-Ying
2013-04-01
To explore a new method for extracting DNA from bones and teeth automatically. Samples of 33 bones and 15 teeth were acquired by the freeze-mill method and the manual method, respectively. DNA was extracted and quantified from the triturated samples by the AutoMate Express forensic DNA extraction system. DNA extraction from bones and teeth was completed in 3 hours using the AutoMate Express forensic DNA extraction system. There was no statistical difference between the two methods in the DNA concentration of bones. Both bones and teeth gave good STR typing results with the freeze-mill method, and the DNA concentration of teeth was higher than that obtained by the manual method. The AutoMate Express forensic DNA extraction system is a new method to extract DNA from bones and teeth, which can be applied in forensic practice.
NASA Technical Reports Server (NTRS)
Imhoff, M. L.; Vermillion, C. H.; Khan, F. A.
1984-01-01
An investigation to examine the utility of spaceborne radar image data to malaria vector control programs is described. Specific tasks involve an analysis of radar illumination geometry vs information content, the synergy of radar and multispectral data mergers, and automated information extraction techniques.
Garvin, Jennifer H; DuVall, Scott L; South, Brett R; Bray, Bruce E; Bolton, Daniel; Heavirland, Julia; Pickard, Steve; Heidenreich, Paul; Shen, Shuying; Weir, Charlene; Samore, Matthew; Goldstein, Mary K
2012-01-01
Left ventricular ejection fraction (EF) is a key component of heart failure quality measures used within the Department of Veteran Affairs (VA). Our goals were to build a natural language processing system to extract the EF from free-text echocardiogram reports to automate measurement reporting and to validate the accuracy of the system using a comparison reference standard developed through human review. This project was a Translational Use Case Project within the VA Consortium for Healthcare Informatics. We created a set of regular expressions and rules to capture the EF using a random sample of 765 echocardiograms from seven VA medical centers. The documents were randomly assigned to two sets: a set of 275 used for training and a second set of 490 used for testing and validation. To establish the reference standard, two independent reviewers annotated all documents in both sets; a third reviewer adjudicated disagreements. System test results for document-level classification of EF of <40% had a sensitivity (recall) of 98.41%, a specificity of 100%, a positive predictive value (precision) of 100%, and an F measure of 99.2%. System test results at the concept level had a sensitivity of 88.9% (95% CI 87.7% to 90.0%), a positive predictive value of 95% (95% CI 94.2% to 95.9%), and an F measure of 91.9% (95% CI 91.2% to 92.7%). An EF value of <40% can be accurately identified in VA echocardiogram reports. An automated information extraction system can be used to accurately extract EF for quality measurement.
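Rule-based EF extraction of this kind can be illustrated with a short regular expression; the pattern below is a simplified example, not the validated VA rule set:

    import re

    EF_PATTERN = re.compile(
        r"(?:LVEF|EF|ejection fraction)[^0-9%]{0,20}(\d{1,2})(?:\s*[-to]{1,3}\s*(\d{1,2}))?\s*%",
        re.IGNORECASE,
    )

    def extract_ef(report_text):
        """Return EF values (midpoint of ranges) mentioned in an echo report."""
        values = []
        for low, high in EF_PATTERN.findall(report_text):
            values.append((int(low) + int(high)) / 2 if high else int(low))
        return values

    text = "The left ventricular ejection fraction is estimated at 30-35%. Prior EF 55 %."
    print(extract_ef(text))                        # [32.5, 55]
    print([v < 40 for v in extract_ef(text)])      # document-level flag for EF < 40%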
Design and Development of the Terrain Information Extraction System
1990-09-04
system successfully demonstrated relief measurement and orthophoto production, automated feature extraction has remained "the major problem of today’s... the hierarchical relaxation correlation method developed by Helava Associates, Inc. and digital orthophoto production. To achieve this high accuracy... image memory transfer rates will be achieved by using data blocks or "image tiles." Further, an image fringe loading module will be implemented which...
Liu, Xin; Yetik, Imam Samil
2011-06-01
Multiparametric magnetic resonance imaging (MRI) has been shown to have higher localization accuracy than transrectal ultrasound (TRUS) for prostate cancer. Therefore, automated cancer segmentation using multiparametric MRI is receiving growing interest, since MRI can provide both morphological and functional images for tissue of interest. However, all automated methods to this date are applicable to a single zone of the prostate, and the peripheral zone (PZ) of the prostate needs to be extracted manually, which is a tedious and time-consuming job. In this paper, our goal is to remove the need for PZ extraction by incorporating the spatial and geometric information of prostate tumors with multiparametric MRI derived from T2-weighted MRI, diffusion-weighted imaging (DWI) and dynamic contrast enhanced MRI (DCE-MRI). To this end, we propose a new method to incorporate the spatial information of the cancer. This is done by introducing a new feature called the location map. This new feature is constructed by applying a nonlinear transformation to the spatial position coordinates of each pixel, so that the location map implicitly represents the geometric position of each pixel with respect to the prostate region. Then, this new feature is combined with multiparametric MR images to perform tumor localization. The proposed algorithm is applied to multiparametric prostate MRI data obtained from 20 patients with biopsy-confirmed prostate cancer. The proposed method, which does not need PZ masks, was found to have a prostate cancer detection specificity of 0.84, sensitivity of 0.80 and a Dice coefficient value of 0.42. We have found that fusing the spatial information allows us to obtain tumor outlines without the need for PZ extraction, with considerable success (performance better than or similar to methods that require manual PZ extraction). Our experimental results quantitatively demonstrate the effectiveness of the proposed method, showing slightly better or similar localization performance compared to methods which require PZ masks.
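The 'location map' idea above (a nonlinear transform of pixel coordinates appended to the multiparametric MR feature vector) can be sketched as follows; the radial-decay transform and array sizes are illustrative assumptions, not the authors' exact function:

    import numpy as np

    def location_map(height, width, center=None, scale=40.0):
        """Map encoding each pixel's position relative to an assumed gland center."""
        if center is None:
            center = (height / 2.0, width / 2.0)
        rows, cols = np.mgrid[0:height, 0:width]
        r = np.hypot(rows - center[0], cols - center[1])
        return np.exp(-r / scale)                  # 1 at the center, decaying outward

    h, w = 64, 64
    mri_features = np.random.default_rng(3).random((h, w, 3))   # T2 / DWI / DCE stand-ins
    features = np.dstack([mri_features, location_map(h, w)])    # append the spatial feature
    print(features.shape)                          # (64, 64, 4): per-pixel feature vectors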
Misra, Dharitri; Chen, Siyuan; Thoma, George R
2009-01-01
One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system.
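The SVM part of such a layout/metadata pipeline can be sketched with a line classifier; the tiny training set is invented for illustration, and the HMM sequence model and rule-based pattern search are not shown:

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    lines = ["Notice of Judgment No. 1234", "Issued March 1952", "Adulteration of canned tomatoes",
             "Notice of Judgment No. 998", "Issued July 1950", "Misbranding of vitamin tablets"]
    labels = ["id", "date", "title", "id", "date", "title"]

    model = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)), LinearSVC())
    model.fit(lines, labels)
    print(model.predict(["Notice of Judgment No. 4711", "Issued May 1948"]))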
Kesner, Adam Leon; Kuntner, Claudia
2010-10-01
Respiratory gating in PET is an approach used to minimize the negative effects of respiratory motion on spatial resolution. It is based on an initial determination of a patient's respiratory movements during a scan, typically using hardware based systems. In recent years, several fully automated data-based algorithms have been presented for extracting a respiratory signal directly from PET data, providing a very practical strategy for implementing gating in the clinic. In this work, a new method is presented for extracting a respiratory signal from raw PET sinogram data and compared to previously presented automated techniques. The acquisition of respiratory signal from PET data in the newly proposed method is based on rebinning the sinogram data into smaller data structures and then analyzing the time activity behavior in the elements of these structures. From this analysis, a 1D respiratory trace is produced, analogous to a hardware derived respiratory trace. To assess the accuracy of this fully automated method, respiratory signal was extracted from a collection of 22 clinical FDG-PET scans using this method, and compared to signal derived from several other software based methods as well as a signal derived from a hardware system. The method presented required approximately 9 min of processing time for each 10 min scan (using a single 2.67 GHz processor), which in theory can be accomplished while the scan is being acquired, therefore allowing real-time respiratory signal acquisition. Using the mean correlation between the software based and hardware based respiratory traces, the optimal parameters were determined for the presented algorithm. The mean/median/range of correlations for the set of scans when using the optimal parameters was found to be 0.58/0.68/0.07-0.86. The speed of this method was within the range of real time, while the accuracy surpassed the most accurate of the previously presented algorithms. PET data inherently contains information about patient motion, information that is not currently being utilized. We have shown that a respiratory signal can be extracted from raw PET data potentially in real time and in a fully automated manner. This signal correlates well with hardware based signal for a large percentage of scans, and avoids the efforts and complications associated with hardware. The proposed method to extract a respiratory signal can be implemented on existing scanners and, if properly integrated, can be applied without changes to routine clinical procedures.
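A rough sketch of the data-driven idea above: rebin the dynamic sinogram into coarse blocks, build a time-activity curve per block, and summarize the curves into a one-dimensional respiratory trace. The array shapes, the simulated breathing signal, and the PCA summary step are assumptions, not the published algorithm:

    import numpy as np

    rng = np.random.default_rng(4)
    frames, bins, angles = 300, 128, 96                           # hypothetical dynamic sinogram
    breathing = np.sin(2 * np.pi * 0.25 * np.arange(frames))      # ~15 breaths/min at 1 s frames
    sino = rng.poisson(5, size=(frames, bins, angles)).astype(float)
    sino[:, 40:60, :] += 3 * (1 + breathing)[:, None, None]       # motion-modulated region

    # Rebin into coarse blocks and take each block's total counts per frame.
    blocks = sino.reshape(frames, 16, 8, 12, 8).sum(axis=(2, 4))  # (frames, 16, 12)
    curves = blocks.reshape(frames, -1)
    curves = curves - curves.mean(axis=0)

    # First principal component of the block time-activity curves as the trace.
    _, _, vt = np.linalg.svd(curves, full_matrices=False)
    trace = curves @ vt[0]                                        # sign is arbitrary
    print("|corr| with simulated breathing:", abs(float(np.corrcoef(trace, breathing)[0, 1])))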
Building Facade Reconstruction by Fusing Terrestrial Laser Points and Images
Pu, Shi; Vosselman, George
2009-01-01
Laser data and optical data have a complementary nature for three dimensional feature extraction. Efficient integration of the two data sources will lead to a more reliable and automated extraction of three dimensional features. This paper presents a semiautomatic building facade reconstruction approach, which efficiently combines information from terrestrial laser point clouds and close range images. A building facade's general structure is discovered and established using the planar features from laser data. Then strong lines in images are extracted using the Canny edge detector and the Hough transform, and compared with current model edges for necessary improvement. Finally, textures with optimal visibility are selected and applied according to accurate image orientations. Solutions to several challenging problems throughout the collaborative reconstruction, such as referencing between laser points and multiple images and automated texturing, are described. The limitations and remaining work of this approach are also discussed. PMID:22408539
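The image-side line extraction step can be sketched with OpenCV's Canny detector and probabilistic Hough transform; the file name and thresholds are placeholders:

    import cv2
    import numpy as np

    gray = cv2.imread("facade.jpg", cv2.IMREAD_GRAYSCALE)         # hypothetical facade photo
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=40, maxLineGap=5)
    print("strong line segments found:", 0 if lines is None else len(lines))
    # Each entry in `lines` is (x1, y1, x2, y2); these segments would then be
    # compared against model edges derived from the laser-point planes.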
Research in satellite-aided crop inventory and monitoring
NASA Technical Reports Server (NTRS)
Erickson, J. D.; Dragg, J. L.; Bizzell, R. M.; Trichel, M. C. (Principal Investigator)
1982-01-01
Automated information extraction procedures for analysis of multitemporal LANDSAT data in non-U.S. crop inventory and monitoring are reviewed. Experiments to develop and evaluate crop area estimation technologies for spring small grains, summer crops, corn, and soybeans are discussed.
Witt, Sebastian; Neumann, Jan; Zierdt, Holger; Gébel, Gabriella; Röscheisen, Christiane
2012-09-01
Automated systems have been increasingly utilized for DNA extraction by many forensic laboratories to handle growing numbers of forensic casework samples while minimizing the risk of human errors and assuring high reproducibility. The step towards automation, however, is not easy: The automated extraction method has to be very versatile to reliably prepare high yields of pure genomic DNA from a broad variety of sample types on different carrier materials. To prevent possible cross-contamination of samples or the loss of DNA, the components of the kit have to be designed in a way that allows for the automated handling of the samples with no manual intervention necessary. DNA extraction using paramagnetic particles coated with a DNA-binding surface is predestined for an automated approach. For this study, we tested different DNA extraction kits using DNA-binding paramagnetic particles with regard to DNA yield and handling by a Freedom EVO(®)150 extraction robot (Tecan) equipped with a Te-MagS magnetic separator. Among others, the extraction kits tested were the ChargeSwitch(®)Forensic DNA Purification Kit (Invitrogen), the PrepFiler™Automated Forensic DNA Extraction Kit (Applied Biosystems) and NucleoMag™96 Trace (Macherey-Nagel). After an extensive test phase, we established a novel magnetic bead extraction method based upon the NucleoMag™ extraction kit (Macherey-Nagel). The new method is readily automatable and produces high yields of DNA from different sample types (blood, saliva, sperm, contact stains) on various substrates (filter paper, swabs, cigarette butts) with no evidence of a loss of magnetic beads or sample cross-contamination. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Quiñones, Karin D; Su, Hua; Marshall, Byron; Eggers, Shauna; Chen, Hsinchun
2007-09-01
Explosive growth in biomedical research has made automated information extraction, knowledge integration, and visualization increasingly important and critically needed. The Arizona BioPathway (ABP) system extracts and displays biological regulatory pathway information from the abstracts of journal articles. This study uses relations extracted from more than 200 PubMed abstracts presented in a tabular and graphical user interface with built-in search and aggregation functionality. This paper presents a task-centered assessment of the usefulness and usability of the ABP system focusing on its relation aggregation and visualization functionalities. Results suggest that our graph-based visualization is more efficient in supporting pathway analysis tasks and is perceived as more useful and easier to use as compared to a text-based literature-viewing method. Relation aggregation significantly contributes to knowledge-acquisition efficiency. Together, the graphic and tabular views in the ABP Visualizer provide a flexible and effective interface for pathway relation browsing and analysis. Our study contributes to pathway-related research and biological information extraction by assessing the value of a multiview, relation-based interface that supports user-controlled exploration of pathway information across multiple granularities.
Song, Yang; Cai, Weidong; Feng, David Dagan; Chen, Mei
2013-01-01
Automated segmentation of cell nuclei in microscopic images is critical to high-throughput analysis of the ever-increasing amount of data. Although cell nuclei are generally visually distinguishable for humans, automated segmentation faces challenges when there is significant intensity inhomogeneity among cell nuclei or in the background. In this paper, we propose an effective method for automated cell nucleus segmentation using a three-step approach. It first obtains an initial segmentation by extracting salient regions in the image, then reduces false positives using inter-region feature discrimination, and finally refines the boundary of the cell nuclei using intra-region contrast information. This method has been evaluated on two publicly available datasets of fluorescence microscopic images with 4009 cells, and has achieved superior performance compared to popular state-of-the-art methods using established metrics.
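A generic three-stage sketch (initial segmentation, size-based false-positive removal, boundary refinement) using scikit-image; it only illustrates the flow of such a method and does not reproduce the paper's specific saliency and inter-region features:

    import numpy as np
    from skimage import filters, measure, morphology, segmentation

    rng = np.random.default_rng(5)
    image = rng.random((128, 128))
    image[30:50, 30:50] += 1.0          # synthetic bright "nuclei"
    image[80:95, 70:85] += 1.0

    # Step 1: initial segmentation of salient (bright) regions.
    mask = image > filters.threshold_otsu(image)
    # Step 2: discard implausibly small regions as false positives.
    mask = morphology.remove_small_objects(mask, min_size=50)
    # Step 3: refine boundaries using local contrast via a gradient watershed.
    markers = measure.label(mask)
    refined = segmentation.watershed(filters.sobel(image), markers, mask=mask)
    print("nuclei found:", int(refined.max()))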
Automated, per pixel Cloud Detection from High-Resolution VNIR Data
NASA Technical Reports Server (NTRS)
Varlyguin, Dmitry L.
2007-01-01
CASA is a fully automated software program for the per-pixel detection of clouds and cloud shadows from medium- (e.g., Landsat, SPOT, AWiFS) and high- (e.g., IKONOS, QuickBird, OrbView) resolution imagery without the use of thermal data. CASA is an object-based feature extraction program which utilizes a complex combination of spectral, spatial, and contextual information available in the imagery and the hierarchical self-learning logic for accurate detection of clouds and their shadows.
Support patient search on pathology reports with interactive online learning based data extraction.
Zheng, Shuai; Lu, James J; Appin, Christina; Brat, Daniel; Wang, Fusheng
2015-01-01
Structural reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entries, information in pathology reports remains primarily in narrative free text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often constructed with a fixed training dataset and not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable semi-automated data extraction system, which can adjust and self-improve by learning from a user's interaction with minimal human effort. We have developed an online machine learning based information extraction system called IDEAL-X. With its graphical user interface, the system's data extraction engine automatically annotates values for users to review upon loading each report text. The system analyzes users' corrections regarding these annotations with online machine learning, and incrementally enhances and refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be applied to conveniently define queries based on extracted structured data. We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographic data, diagnosis, genetic marker, and procedure. The system achieves F-1 scores of around 95% for the majority of tests. Extracting data from pathology reports could enable more accurate knowledge to support biomedical research and clinical diagnosis. IDEAL-X provides a bridge that takes advantage of online machine learning based data extraction and the knowledge from human feedback. By combining iterative online learning and adaptive controlled vocabularies, IDEAL-X can deliver highly adaptive and accurate data extraction to support patient search.
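A conceptual sketch of the online-learning loop behind an IDEAL-X-style tool: the model proposes annotations, the curator corrects them, and each corrected example immediately updates the model. The class names, snippets, and simulated corrections are invented for illustration:

    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.linear_model import SGDClassifier

    classes = ["diagnosis", "procedure", "genetic_marker"]
    vec = HashingVectorizer(n_features=2**16, alternate_sign=False)
    model = SGDClassifier()

    user_corrections = {                            # simulated curator feedback
        "glioblastoma, WHO grade IV": "diagnosis",
        "IDH1 R132H mutant": "genetic_marker",
        "gross total resection": "procedure",
    }
    trained = False
    for snippet, corrected in user_corrections.items():
        if trained:
            proposal = model.predict(vec.transform([snippet]))[0]
            print(f"model proposed {proposal!r}; curator corrected to {corrected!r}")
        model.partial_fit(vec.transform([snippet]), [corrected], classes=classes)
        trained = True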
SAR matrices: automated extraction of information-rich SAR tables from large compound data sets.
Wassermann, Anne Mai; Haebel, Peter; Weskamp, Nils; Bajorath, Jürgen
2012-07-23
We introduce the SAR matrix data structure that is designed to elucidate SAR patterns produced by groups of structurally related active compounds, which are extracted from large data sets. SAR matrices are systematically generated and sorted on the basis of SAR information content. Matrix generation is computationally efficient and enables processing of large compound sets. The matrix format is reminiscent of SAR tables, and SAR patterns revealed by different categories of matrices are easily interpretable. The structural organization underlying matrix formation is more flexible than standard R-group decomposition schemes. Hence, the resulting matrices capture SAR information in a comprehensive manner.
Systematic review automation technologies
2014-01-01
Systematic reviews, a cornerstone of evidence-based medicine, are not produced quickly enough to support clinical practice. The cost of production, availability of the requisite expertise and timeliness are often quoted as major contributors for the delay. This detailed survey of the state of the art of information systems designed to support or automate individual tasks in the systematic review, and in particular systematic reviews of randomized controlled clinical trials, reveals a trend toward convergence among several parallel research projects. We surveyed literature describing informatics systems that support or automate the processes of systematic review or each of the tasks of the systematic review. Several projects focus on automating, simplifying and/or streamlining specific tasks of the systematic review. Some tasks are already fully automated while others are still largely manual. In this review, we describe each task and the effect that its automation would have on the entire systematic review process, summarize the existing information system support for each task, and highlight where further research is needed for realizing automation for the task. Integration of the systems that automate systematic review tasks may lead to a revised systematic review workflow. We envisage the optimized workflow will lead to a system in which each systematic review is described as a computer program that automatically retrieves relevant trials, appraises them, extracts and synthesizes data, evaluates the risk of bias, performs meta-analysis calculations, and produces a report in real time. PMID:25005128
Meystre, Stéphane M; Ferrández, Óscar; Friedlin, F Jeffrey; South, Brett R; Shen, Shuying; Samore, Matthew H
2014-08-01
As more and more electronic clinical information is becoming easier to access for secondary uses such as clinical research, approaches that enable faster and more collaborative research while protecting patient privacy and confidentiality are becoming more important. Clinical text de-identification offers such advantages but is typically a tedious manual process. Automated Natural Language Processing (NLP) methods can alleviate this process, but their impact on subsequent uses of the automatically de-identified clinical narratives has only barely been investigated. In the context of a larger project to develop and investigate automated text de-identification for Veterans Health Administration (VHA) clinical notes, we studied the impact of automated text de-identification on clinical information in a stepwise manner. Our approach started with a high-level assessment of clinical notes informativeness and formatting, and ended with a detailed study of the overlap of select clinical information types and Protected Health Information (PHI). To investigate the informativeness (i.e., document type information, select clinical data types, and interpretation or conclusion) of VHA clinical notes, we used five different existing text de-identification systems. The informativeness was only minimally altered by these systems while formatting was only modified by one system. To examine the impact of de-identification on clinical information extraction, we compared counts of SNOMED-CT concepts found by an open source information extraction application in the original (i.e., not de-identified) version of a corpus of VHA clinical notes, and in the same corpus after de-identification. Only about 1.2-3% fewer SNOMED-CT concepts were found in de-identified versions of our corpus, and many of these concepts were PHI that was erroneously identified as clinical information. To study this impact in more detail and assess how generalizable our findings were, we examined the overlap between select clinical information annotated in the 2010 i2b2 NLP challenge corpus and automatic PHI annotations from our best-of-breed VHA clinical text de-identification system (nicknamed 'BoB'). Overall, only 0.81% of the clinical information exactly overlapped with PHI, and 1.78% partly overlapped. We conclude that automated text de-identification's impact on clinical information is small, but not negligible, and that improved disambiguation of clinical acronyms and eponyms could significantly reduce this impact. Copyright © 2014 Elsevier Inc. All rights reserved.
DuVall, Scott L; South, Brett R; Bray, Bruce E; Bolton, Daniel; Heavirland, Julia; Pickard, Steve; Heidenreich, Paul; Shen, Shuying; Weir, Charlene; Samore, Matthew; Goldstein, Mary K
2012-01-01
Objectives: Left ventricular ejection fraction (EF) is a key component of heart failure quality measures used within the Department of Veteran Affairs (VA). Our goals were to build a natural language processing system to extract the EF from free-text echocardiogram reports to automate measurement reporting and to validate the accuracy of the system using a comparison reference standard developed through human review. This project was a Translational Use Case Project within the VA Consortium for Healthcare Informatics. Materials and methods: We created a set of regular expressions and rules to capture the EF using a random sample of 765 echocardiograms from seven VA medical centers. The documents were randomly assigned to two sets: a set of 275 used for training and a second set of 490 used for testing and validation. To establish the reference standard, two independent reviewers annotated all documents in both sets; a third reviewer adjudicated disagreements. Results: System test results for document-level classification of EF of <40% had a sensitivity (recall) of 98.41%, a specificity of 100%, a positive predictive value (precision) of 100%, and an F measure of 99.2%. System test results at the concept level had a sensitivity of 88.9% (95% CI 87.7% to 90.0%), a positive predictive value of 95% (95% CI 94.2% to 95.9%), and an F measure of 91.9% (95% CI 91.2% to 92.7%). Discussion: An EF value of <40% can be accurately identified in VA echocardiogram reports. Conclusions: An automated information extraction system can be used to accurately extract EF for quality measurement. PMID:22437073
Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov
Xu, Jun; Lee, Hee-Jin; Zeng, Jia; Wu, Yonghui; Zhang, Yaoyun; Huang, Liang-Chin; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann M; Cohen, Trevor; Meric-Bernstam, Funda; Bernstam, Elmer V
2016-01-01
Objective: Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. Methods: We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. Results and Discussion: The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy. PMID:27013523
Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov.
Xu, Jun; Lee, Hee-Jin; Zeng, Jia; Wu, Yonghui; Zhang, Yaoyun; Huang, Liang-Chin; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann M; Cohen, Trevor; Meric-Bernstam, Funda; Bernstam, Elmer V; Xu, Hua
2016-07-01
Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
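A toy pattern for gene-alteration pairs in eligibility-criteria text (e.g., "BRAF V600E", "EGFR exon 19 deletion"); it is a simplified illustration of the extraction component described above, not the validated system:

    import re

    PAIR = re.compile(
        r"\b([A-Z][A-Z0-9]{1,7})\s+"                 # candidate gene symbol
        r"((?:[A-Z]\d+[A-Z*])|(?:exon\s+\d+\s+\w+)|amplification|mutation|fusion)",
        re.IGNORECASE,
    )

    criteria = ("Patients must have BRAF V600E mutation or EGFR exon 19 deletion; "
                "HER2 amplification allowed.")
    print(PAIR.findall(criteria))
    # [('BRAF', 'V600E'), ('EGFR', 'exon 19 deletion'), ('HER2', 'amplification')]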
Misra, Dharitri; Chen, Siyuan; Thoma, George R.
2010-01-01
One of the most expensive aspects of archiving digital documents is the manual acquisition of context-sensitive metadata useful for the subsequent discovery of, and access to, the archived items. For certain types of textual documents, such as journal articles, pamphlets, official government records, etc., where the metadata is contained within the body of the documents, a cost effective method is to identify and extract the metadata in an automated way, applying machine learning and string pattern search techniques. At the U. S. National Library of Medicine (NLM) we have developed an automated metadata extraction (AME) system that employs layout classification and recognition models with a metadata pattern search model for a text corpus with structured or semi-structured information. A combination of Support Vector Machine and Hidden Markov Model is used to create the layout recognition models from a training set of the corpus, following which a rule-based metadata search model is used to extract the embedded metadata by analyzing the string patterns within and surrounding each field in the recognized layouts. In this paper, we describe the design of our AME system, with focus on the metadata search model. We present the extraction results for a historic collection from the Food and Drug Administration, and outline how the system may be adapted for similar collections. Finally, we discuss some ongoing enhancements to our AME system. PMID:21179386
Automated Ontology Generation Using Spatial Reasoning
NASA Astrophysics Data System (ADS)
Coalter, Alton; Leopold, Jennifer L.
Recently there has been much interest in using ontologies to facilitate knowledge representation, integration, and reasoning. Correspondingly, the extent of the information embodied by an ontology is increasing beyond the conventional is_a and part_of relationships. To address these requirements, a vast amount of digitally available information may need to be considered when building ontologies, prompting a desire for software tools to automate at least part of the process. The main efforts in this direction have involved textual information retrieval and extraction methods. For some domains extension of the basic relationships could be enhanced further by the analysis of 2D and/or 3D images. For this type of media, image processing algorithms are more appropriate than textual analysis methods. Herein we present an algorithm that, given a collection of 3D image files, utilizes Qualitative Spatial Reasoning (QSR) to automate the creation of an ontology for the objects represented by the images, relating the objects in terms of is_a and part_of relationships and also through unambiguous Region Connection Calculus (RCC) relations.
AAlAbdulsalam, Abdulrahman K.; Garvin, Jennifer H.; Redd, Andrew; Carter, Marjorie E.; Sweeny, Carol; Meystre, Stephane M.
2018-01-01
Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M), known as the TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%–98.4% and classification sensitivity: 83.5%–87%). PMID:29888032
AAlAbdulsalam, Abdulrahman K; Garvin, Jennifer H; Redd, Andrew; Carter, Marjorie E; Sweeny, Carol; Meystre, Stephane M
2018-01-01
Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M), known as the TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%-98.4% and classification sensitivity: 83.5%-87%).
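An illustrative regular expression for TNM mentions such as "pT2N0M0" or "cT1c N1 M0"; it is a simplified stand-in for the NLP system described above, not the registry-validated extractor:

    import re

    TNM = re.compile(
        r"\b([pcyr]{0,2})T(\d[a-d]?|is|x)\s*N(\d[a-c]?|x)\s*M(\d|x)\b",
        re.IGNORECASE,
    )

    note = ("Final pathology: invasive ductal carcinoma, pT2N0M0. "
            "Clinical stage was cT1c N1 M0.")
    for prefix, t, n, m in TNM.findall(note):
        print({"prefix": prefix, "T": t, "N": n, "M": m})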
DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures
Yin, Xu-Cheng; Yang, Chun; Pei, Wei-Yi; Man, Haixia; Zhang, Jun; Learned-Miller, Erik; Yu, Hong
2015-01-01
Hundreds of millions of figures are available in biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information. A high-quality ground truth standard can greatly facilitate the development of an automated system. This article describes DeTEXT: A database for evaluating text extraction from biomedical literature figures. It is the first publicly available, human-annotated, high quality, and large-scale figure-text dataset with 288 full-text articles, 500 biomedical figures, and 9308 text regions. This article describes how figures were selected from open-access full-text biomedical articles and how annotation guidelines and annotation tools were developed. We also discuss the inter-annotator agreement and the reliability of the annotations. We summarize the statistics of the DeTEXT data and make available evaluation protocols for DeTEXT. Finally we lay out challenges we observed in the automated detection and recognition of figure text and discuss research directions in this area. DeTEXT is publicly available for downloading at http://prir.ustb.edu.cn/DeTEXT/. PMID:25951377
Nagy, Paul G; Warnock, Max J; Daly, Mark; Toland, Christopher; Meenan, Christopher D; Mezrich, Reuben S
2009-11-01
Radiology departments today are faced with many challenges to improve operational efficiency, performance, and quality. Many organizations rely on antiquated, paper-based methods to review their historical performance and understand their operations. With increased workloads, geographically dispersed image acquisition and reading sites, and rapidly changing technologies, this approach is increasingly untenable. A Web-based dashboard was constructed to automate the extraction, processing, and display of indicators and thereby provide useful and current data for twice-monthly departmental operational meetings. The feasibility of extracting specific metrics from clinical information systems was evaluated as part of a longer-term effort to build a radiology business intelligence architecture. Operational data were extracted from clinical information systems and stored in a centralized data warehouse. Higher-level analytics were performed on the centralized data, a process that generated indicators in a dynamic Web-based graphical environment that proved valuable in discussion and root cause analysis. Results aggregated over a 24-month period since implementation suggest that this operational business intelligence reporting system has provided significant data for driving more effective management decisions to improve productivity, performance, and quality of service in the department.
Automated feature extraction and classification from image sources
1995-01-01
The U.S. Department of the Interior, U.S. Geological Survey (USGS), and Unisys Corporation have completed a cooperative research and development agreement (CRADA) to explore automated feature extraction and classification from image sources. The CRADA helped the USGS define the spectral and spatial resolution characteristics of airborne and satellite imaging sensors necessary to meet base cartographic and land use and land cover feature classification requirements and help develop future automated geographic and cartographic data production capabilities. The USGS is seeking a new commercial partner to continue automated feature extraction and classification research and development.
A COMPARISON OF AUTOMATED AND TRADITIONAL METHODS FOR THE EXTRACTION OF ARSENICALS FROM FISH
An automated extractor employing accelerated solvent extraction (ASE) has been compared with a traditional sonication method of extraction for the extraction of arsenicals from fish tissue. Four different species of fish and a standard reference material, DORM-2, were subjected t...
FEX: A Knowledge-Based System For Planimetric Feature Extraction
NASA Astrophysics Data System (ADS)
Zelek, John S.
1988-10-01
Topographical planimetric features include natural surfaces (rivers, lakes) and man-made surfaces (roads, railways, bridges). In conventional planimetric feature extraction, a photointerpreter manually interprets and extracts features from imagery on a stereoplotter. Visual planimetric feature extraction is a very labour intensive operation. The advantages of automating feature extraction include: time and labour savings; accuracy improvements; and planimetric data consistency. FEX (Feature EXtraction) combines techniques from image processing, remote sensing and artificial intelligence for automatic feature extraction. The feature extraction process co-ordinates the information and knowledge in a hierarchical data structure. The system simulates the reasoning of a photointerpreter in determining the planimetric features. Present efforts have concentrated on the extraction of road-like features in SPOT imagery. Keywords: Remote Sensing, Artificial Intelligence (AI), SPOT, image understanding, knowledge base, apars.
Ambert, Kyle H; Cohen, Aaron M
2009-01-01
OBJECTIVE: Free-text clinical reports serve as an important part of patient care management and clinical documentation of patient disease and treatment status. Free-text notes are commonplace in medical practice, but remain an under-used source of information for clinical and epidemiological research, as well as personalized medicine. The authors explore the challenges associated with automatically extracting information from clinical reports using their submission to the Integrating Informatics with Biology and the Bedside (i2b2) 2008 Natural Language Processing Obesity Challenge Task. DESIGN: A text mining system for classifying patient comorbidity status, based on the information contained in clinical reports. The authors' approach incorporates a variety of automated techniques, including hot-spot filtering, negated concept identification, zero-vector filtering, weighting by inverse class-frequency, and error-correcting output codes with linear support vector machines. MEASUREMENTS: Performance was evaluated in terms of the macroaveraged F1 measure. RESULTS: The automated system performed well against manual expert rule-based systems, finishing fifth in the Challenge's intuitive task and 13th in the textual task. CONCLUSIONS: The system demonstrates that effective comorbidity status classification by an automated system is possible.
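The following toy sketch illustrates two of the techniques named above, hot-spot filtering followed by a linear support vector machine, using scikit-learn. The trigger terms, documents, and labels are invented; this is not the authors' system.

```python
# Hot-spot filtering keeps only text near trigger terms, then a linear SVM
# classifies the filtered notes (hypothetical data and keywords).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

HOT_SPOTS = ("obese", "obesity", "bmi")   # hypothetical trigger terms

def hot_spot_filter(doc, window=60):
    """Keep only text near trigger terms to reduce irrelevant content."""
    lower = doc.lower()
    spans = []
    for term in HOT_SPOTS:
        start = lower.find(term)
        while start != -1:
            spans.append(doc[max(0, start - window):start + window])
            start = lower.find(term, start + 1)
    return " ".join(spans) if spans else doc

docs = ["Patient is obese with BMI 41, on metformin.",
        "No evidence of obesity; BMI 23.",
        "Morbid obesity documented in problem list.",
        "Normal weight, BMI within normal limits."]
labels = ["present", "absent", "present", "absent"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit([hot_spot_filter(d) for d in docs], labels)
print(clf.predict([hot_spot_filter("BMI 39, obese, counseled on diet.")]))
```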
Mathieson, William; Guljar, Nafia; Sanchez, Ignacio; Sroya, Manveer; Thomas, Gerry A
2018-05-03
DNA extracted from formalin-fixed, paraffin-embedded (FFPE) tissue blocks is amenable to analytical techniques, including sequencing. DNA extraction protocols are typically long and complex, often involving an overnight proteinase K digest. Automated platforms that shorten and simplify the process are therefore an attractive proposition for users wanting a faster turn-around or to process large numbers of biospecimens. It is, however, unclear whether automated extraction systems return poorer DNA yields or quality than manual extractions performed by experienced technicians. We extracted DNA from 42 FFPE clinical tissue biospecimens using the QiaCube (Qiagen) and ExScale (ExScale Biospecimen Solutions) automated platforms, comparing DNA yields and integrities with those from manual extractions. The QIAamp DNA FFPE Spin Column Kit was used for manual and QiaCube DNA extractions and the ExScale extractions were performed using two of the manufacturer's magnetic bead kits: one extracting DNA only and the other simultaneously extracting DNA and RNA. In all automated extraction methods, DNA yields and integrities (assayed using DNA Integrity Numbers from a 4200 TapeStation and the qPCR-based Illumina FFPE QC Assay) were poorer than in the manual method, with the QiaCube system performing better than the ExScale system. However, ExScale was fastest, offered the highest reproducibility when extracting DNA only, and required the least intervention or technician experience. Thus, the extraction methods have different strengths and weaknesses, would appeal to different users with different requirements, and therefore, we cannot recommend one method over another.
Automated rule-base creation via CLIPS-Induce
NASA Technical Reports Server (NTRS)
Murphy, Patrick M.
1994-01-01
Many CLIPS rule-bases contain one or more rule groups that perform classification. In this paper we describe CLIPS-Induce, an automated system for the creation of a CLIPS classification rule-base from a set of test cases. CLIPS-Induce consists of two components, a decision tree induction component and a CLIPS production extraction component. ID3, a popular decision tree induction algorithm, is used to induce a decision tree from the test cases. CLIPS production extraction is accomplished through a top-down traversal of the decision tree. Nodes of the tree are used to construct query rules, and branches of the tree are used to construct classification rules. The learned CLIPS productions may easily be incorporated into a large CLIPS system that performs tasks such as accessing a database or displaying information.
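A small illustrative sketch of the tree-to-rules idea: induce a decision tree and walk it top-down, turning each root-to-leaf path into an if-then rule. CLIPS-Induce emits CLIPS productions; this sketch only prints rule text and uses scikit-learn's CART trees on the Iris data as a stand-in for ID3.

```python
# Induce a shallow decision tree, then traverse it top-down and turn each
# root-to-leaf path into a printed if-then rule.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)
tree = clf.tree_

def emit_rules(node=0, conditions=()):
    if tree.children_left[node] == -1:          # leaf node
        klass = data.target_names[tree.value[node][0].argmax()]
        print("IF " + " AND ".join(conditions) + f" THEN class={klass}")
        return
    name = data.feature_names[tree.feature[node]]
    thr = tree.threshold[node]
    emit_rules(tree.children_left[node], conditions + (f"{name} <= {thr:.2f}",))
    emit_rules(tree.children_right[node], conditions + (f"{name} > {thr:.2f}",))

emit_rules()
```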
Automated detection and classification of dice
NASA Astrophysics Data System (ADS)
Correia, Bento A. B.; Silva, Jeronimo A.; Carvalho, Fernando D.; Guilherme, Rui; Rodrigues, Fernando C.; de Silva Ferreira, Antonio M.
1995-03-01
This paper describes a typical machine vision system in an unusual application, the automated visual inspection of a Casino's playing tables. The SORTE computer vision system was developed at INETI under a contract with the Portuguese Gaming Inspection Authorities IGJ. It aims to automate the tasks of detection and classification of the dice's scores on the playing tables of the game `Banca Francesa' (which means French Banking) in Casinos. The system is based on the on-line analysis of the images captured by a monochrome CCD camera placed over the playing tables, in order to extract relevant information concerning the score indicated by the dice. Image processing algorithms for real time automatic throwing detection and dice classification were developed and implemented.
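A toy sketch of the underlying image-processing idea: threshold a frame and count pip-like blobs with connected-component analysis. It uses OpenCV with a synthetic image in place of the CCD camera frame and is not the SORTE system's actual algorithm.

```python
# Count pip-like blobs on a synthetic die face using connected components.
import cv2
import numpy as np

frame = np.zeros((120, 120), dtype=np.uint8)     # dark die face
for cx, cy in [(30, 30), (60, 60), (90, 90)]:    # three pips
    cv2.circle(frame, (cx, cy), 8, 255, thickness=-1)

_, binary = cv2.threshold(frame, 127, 255, cv2.THRESH_BINARY)
n_labels, _, stats, _ = cv2.connectedComponentsWithStats(binary)

# Label 0 is the background; filter remaining components by area so that
# noise specks are not counted as pips.
pips = [i for i in range(1, n_labels) if stats[i, cv2.CC_STAT_AREA] > 20]
print("dice score:", len(pips))                  # -> 3
```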
Information Extraction for System-Software Safety Analysis: Calendar Year 2008 Year-End Report
NASA Technical Reports Server (NTRS)
Malin, Jane T.
2009-01-01
This annual report describes work to integrate a set of tools to support early model-based analysis of failures and hazards due to system-software interactions. The tools perform and assist analysts in the following tasks: 1) extract model parts from text for architecture and safety/hazard models; 2) combine the parts with library information to develop the models for visualization and analysis; 3) perform graph analysis and simulation to identify and evaluate possible paths from hazard sources to vulnerable entities and functions, in nominal and anomalous system-software configurations and scenarios; and 4) identify resulting candidate scenarios for software integration testing. There has been significant technical progress in model extraction from Orion program text sources, architecture model derivation (components and connections) and documentation of extraction sources. Models have been derived from Internal Interface Requirements Documents (IIRDs) and FMEA documents. Linguistic text processing is used to extract model parts and relationships, and the Aerospace Ontology also aids automated model development from the extracted information. Visualizations of these models assist analysts in requirements overview and in checking consistency and completeness.
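A hedged sketch of the graph-analysis step: represent extracted components and connections as a directed graph and enumerate paths from hazard sources to vulnerable functions. The node names are invented and networkx stands in for whatever graph tooling the project actually used.

```python
# Enumerate possible propagation paths from hazard sources to vulnerable
# functions in a directed component graph (invented example nodes).
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("battery_overheat", "power_bus"),      # hazard source -> component
    ("power_bus", "flight_computer"),
    ("power_bus", "cabin_fan"),
    ("flight_computer", "guidance_function"),
])

hazard_sources = ["battery_overheat"]
vulnerable = ["guidance_function"]

for src in hazard_sources:
    for tgt in vulnerable:
        for path in nx.all_simple_paths(g, src, tgt):
            print(" -> ".join(path))
```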
Roelofs, Erik; Persoon, Lucas; Nijsten, Sebastiaan; Wiessler, Wolfgang; Dekker, André; Lambin, Philippe
2013-07-01
Collecting trial data in a medical environment is at present mostly performed manually and therefore time-consuming, prone to errors and often incomplete with the complex data considered. Faster and more accurate methods are needed to improve the data quality and to shorten data collection times where information is often scattered over multiple data sources. The purpose of this study is to investigate the possible benefit of modern data warehouse technology in the radiation oncology field. In this study, a Computer Aided Theragnostics (CAT) data warehouse combined with automated tools for feature extraction was benchmarked against the regular manual data-collection processes. Two sets of clinical parameters were compiled for non-small cell lung cancer (NSCLC) and rectal cancer, using 27 patients per disease. Data collection times and inconsistencies were compared between the manual and the automated extraction method. The average time per case to collect the NSCLC data manually was 10.4 ± 2.1 min and 4.3 ± 1.1 min when using the automated method (p<0.001). For rectal cancer, these times were 13.5 ± 4.1 and 6.8 ± 2.4 min, respectively (p<0.001). In 3.2% of the data collected for NSCLC and 5.3% for rectal cancer, there was a discrepancy between the manual and automated method. Aggregating multiple data sources in a data warehouse combined with tools for extraction of relevant parameters is beneficial for data collection times and offers the ability to improve data quality. The initial investments in digitizing the data are expected to be compensated due to the flexibility of the data analysis. Furthermore, successive investigations can easily select trial candidates and extract new parameters from the existing databases. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
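As a worked example of the kind of paired comparison reported above, the snippet below runs a paired t-test on invented per-case collection times; the numbers are illustrative only and are not the study data.

```python
# Paired t-test comparing manual vs automated collection times per case
# (illustrative numbers, not the study's measurements).
from scipy import stats

manual    = [10.1, 12.3, 9.8, 11.0, 10.7, 8.9, 13.2, 10.4]   # minutes/case
automated = [ 4.2,  5.1, 3.9,  4.6,  4.4, 3.8,  5.3,  4.1]

t, p = stats.ttest_rel(manual, automated)
print(f"mean manual={sum(manual)/len(manual):.1f} min, "
      f"mean automated={sum(automated)/len(automated):.1f} min, p={p:.3g}")
```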
NASA Astrophysics Data System (ADS)
Dogon-Yaro, M. A.; Kumar, P.; Rahman, A. Abdul; Buyuksalih, G.
2016-09-01
Mapping of trees plays an important role in modern urban spatial data management, as many benefits and applications derive from these detailed, up-to-date data sources. Timely and accurate acquisition of information on the condition of urban trees serves as a tool for decision makers to better appreciate urban ecosystems and their numerous values, which are critical to building strategies for sustainable development. The conventional techniques used for extracting trees include ground surveying and interpretation of aerial photography. However, these techniques are associated with constraints such as labour-intensive field work and high financial requirements, which can be overcome by means of integrated LiDAR and digital image datasets. Compared to the predominant studies on tree extraction, mainly in purely forested areas, this study concentrates on urban areas, which have a high structural complexity with a multitude of different objects. This paper presents a workflow for a semi-automated approach to extracting urban trees from integrated processing of airborne LiDAR point cloud and multispectral digital image datasets over the city of Istanbul, Turkey. The paper reveals that the integrated datasets are a suitable technology and a viable source of information for urban tree management. In conclusion, the extracted information provides a snapshot of the location, composition and extent of trees in the study area, useful to city planners and other decision makers for understanding how much canopy cover exists, identifying new planting, removal, or reforestation opportunities, and determining which locations have the greatest need or potential to maximize the return on investment. It can also help track trends or changes to the urban trees over time and inform future management decisions.
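One plausible way to fuse the two data sources, sketched below under stated assumptions: an NDVI mask from the multispectral bands combined with a height threshold on a LiDAR-derived normalised DSM. The arrays and thresholds are placeholders, not the workflow actually used in the paper.

```python
# Combine a vegetation index mask with a LiDAR height mask to flag
# candidate tree pixels (synthetic arrays stand in for real rasters).
import numpy as np

red  = np.random.rand(100, 100).astype(np.float32)       # stand-in red band
nir  = np.random.rand(100, 100).astype(np.float32)       # stand-in NIR band
ndsm = np.random.rand(100, 100).astype(np.float32) * 20  # height above ground, m

ndvi = (nir - red) / (nir + red + 1e-6)
tree_mask = (ndvi > 0.3) & (ndsm > 2.0)   # vegetated AND taller than 2 m
print("candidate tree pixels:", int(tree_mask.sum()))
```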
AUTOMATED SOLID PHASE EXTRACTION GC/MS FOR ANALYSIS OF SEMIVOLATILES IN WATER AND SEDIMENTS
Data is presented on the development of a new automated system combining solid phase extraction (SPE) with GC/MS spectrometry for the single-run analysis of water samples containing a broad range of organic compounds. The system uses commercially available automated in-line sampl...
NASA Astrophysics Data System (ADS)
Mori, Shintaro; Hara, Takeshi; Tagami, Motoki; Muramatsu, Chicako; Kaneda, Takashi; Katsumata, Akitoshi; Fujita, Hiroshi
2013-02-01
Inflammation in the paranasal sinus sometimes becomes chronic and requires long-term treatment. The finding is important for early treatment, but general dentists may not recognize it because they focus on the teeth. The purpose of this study was to develop a computer-aided detection (CAD) system for inflammation in the paranasal sinus on dental panoramic radiographs (DPRs) by using the mandible contour, and to demonstrate the potential usefulness of the CAD system by means of receiver operating characteristic analysis. The detection scheme consists of 3 steps: 1) contour extraction of the mandible, 2) contralateral subtraction, and 3) automated detection. The Canny operator and an active contour model were applied to extract the edge in the first step. In the subtraction step, the right region of the extracted contour image was flipped for comparison with the left region. Mutual information between the two selected regions was used to estimate the shift parameters for image registration, and the subtraction images were generated based on these parameters. Rectangular regions of the left and right paranasal sinus on the subtraction image were determined based on the size of the mandible, and the abnormal side was determined by taking the difference between the averages of the two regions. Thirteen readers interpreted all cases, first without and then with the automated results. The average AUC of all readers increased from 0.69 to 0.73, a statistically significant improvement (p=0.032), when the automated detection results were provided. In conclusion, the automated detection method based on the contralateral subtraction technique improves readers' interpretation performance for inflammation in the paranasal sinus on DPRs.
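A simplified sketch of the contralateral-subtraction idea: mirror the image about its vertical midline and subtract, so left/right asymmetries stand out. The mutual-information registration step used in the paper is omitted, and the image is synthetic.

```python
# Flip-and-subtract to expose left/right asymmetry; the denser (more
# opaque) side of the subtraction image is flagged (synthetic image).
import numpy as np

panoramic = np.random.rand(256, 512).astype(np.float32)  # stand-in DPR image
mirrored = panoramic[:, ::-1]                             # left-right flip
difference = panoramic - mirrored

# A positive mean in the left half of the subtraction image means the left
# side is denser than its mirrored counterpart, and vice versa.
left_mean = float(difference[:, :256].mean())
abnormal_side = "left" if left_mean > 0 else "right"
print("suspected abnormal side:", abnormal_side)
```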
Imitating manual curation of text-mined facts in biomedicine.
Rodriguez-Esteban, Raul; Iossifov, Ivan; Rzhetsky, Andrey
2006-09-08
Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts--to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.
Automated software system for checking the structure and format of ACM SIG documents
NASA Astrophysics Data System (ADS)
Mirza, Arsalan Rahman; Sah, Melike
2017-04-01
Microsoft (MS) Office Word is one of the most commonly used software tools for creating documents. MS Word 2007 and above uses XML to represent the structure of MS Word documents. Metadata about the documents is automatically created using Office Open XML (OOXML) syntax. We develop a new framework, called ADFCS (Automated Document Format Checking System), that takes advantage of the OOXML metadata in order to extract semantic information from MS Office Word documents. In particular, we develop a new ontology for Association for Computing Machinery (ACM) Special Interest Group (SIG) documents, representing the structure and format of these documents using OWL (Web Ontology Language). Then, the metadata is extracted automatically in RDF (Resource Description Framework) according to this ontology using the developed software. Finally, we generate extensive rules in order to infer whether the documents are formatted according to ACM SIG standards. This paper introduces the ACM SIG ontology, the metadata extraction process, the inference engine, the ADFCS online user interface, the system evaluation, and the user study evaluations.
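The sketch below shows the general mechanism the abstract relies on: a .docx file is a ZIP archive whose word/document.xml lists paragraphs and their style names, which is the kind of metadata a format checker can reason over. The file path is a placeholder and this is not the ADFCS implementation.

```python
# Read paragraph style names and text straight from the OOXML package.
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def paragraph_styles(docx_path):
    """Yield (style_name, text) for each paragraph in the document."""
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    for para in root.iter(f"{W}p"):
        style = para.find(f"{W}pPr/{W}pStyle")
        name = style.get(f"{W}val") if style is not None else "Normal"
        text = "".join(t.text or "" for t in para.iter(f"{W}t"))
        yield name, text

for style, text in paragraph_styles("paper.docx"):   # placeholder path
    print(style, "|", text[:60])
```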
Two Different Approaches to Automated Mark Up of Emotions in Text
NASA Astrophysics Data System (ADS)
Francisco, Virginia; Hervás, Raquel; Gervás, Pablo
This paper presents two different approaches to automated marking up of texts with emotional labels. For the first approach, a corpus of example texts previously annotated by human evaluators is mined for an initial assignment of emotional features to words. This results in a List of Emotional Words (LEW), which becomes a useful resource for later automated mark-up. The mark-up algorithm in this first approach closely mirrors the steps taken during feature extraction, employing for the actual assignment of emotional features a combination of the LEW resource and WordNet for knowledge-based expansion of words not occurring in LEW. The algorithm for automated mark-up is tested against new text samples to assess its coverage. The second approach marks up texts during their generation. We have a knowledge base which contains the necessary information for marking up the text; this information is related to actions and characters. The algorithm in this case employs the information in the knowledge base and decides the correct emotion for every sentence. The algorithm for automated mark-up is tested against four different texts. The results of the two approaches are compared and discussed with respect to three main issues: the relative adequacy of each of the representations used, the correctness and coverage of the proposed algorithms, and additional techniques and solutions that may be employed to improve the results.
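A toy sketch of lexicon-driven mark-up in the spirit of the first approach: look words up in a small emotional-word list and label the sentence with the most frequent emotion found. The lexicon entries are invented and the WordNet expansion step is omitted.

```python
# Label a sentence with the dominant emotion found via lexicon lookup.
from collections import Counter

LEW = {"happy": "joy", "delighted": "joy", "afraid": "fear",
       "terrified": "fear", "weeping": "sadness", "lost": "sadness"}

def mark_up(sentence):
    hits = [LEW[w] for w in sentence.lower().split() if w in LEW]
    emotion = Counter(hits).most_common(1)[0][0] if hits else "neutral"
    return f'<emotion label="{emotion}">{sentence}</emotion>'

print(mark_up("The princess was terrified and afraid of the dragon"))
```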
Considering context: reliable entity networks through contextual relationship extraction
NASA Astrophysics Data System (ADS)
David, Peter; Hawes, Timothy; Hansen, Nichole; Nolan, James J.
2016-05-01
Existing information extraction techniques can only partially address the problem of exploiting unreadably large amounts of text. When discussion of events and relationships is limited to simple, past-tense, factual descriptions of events, current NLP-based systems can identify events and relationships and extract a limited amount of additional information. But the simple subset of available information that existing tools can extract from text is only useful to a small set of users and problems. Automated systems need to find and separate information based on what is threatened or planned to occur, has occurred in the past, or could potentially occur. We address the problem of advanced event and relationship extraction with our event and relationship attribute recognition system, which labels generic, planned, recurring, and potential events. The approach is based on a combination of new machine learning methods, novel linguistic features, and crowd-sourced labeling. The attribute labeler closes the gap between structured event and relationship models and the complicated and nuanced language that people use to describe them. Our operational-quality event and relationship attribute labeler enables Warfighters and analysts to more thoroughly exploit information in unstructured text. This is made possible through 1) More precise event and relationship interpretation, 2) More detailed information about extracted events and relationships, and 3) More reliable and informative entity networks that acknowledge the different attributes of entity-entity relationships.
Structuring and extracting knowledge for the support of hypothesis generation in molecular biology
Roos, Marco; Marshall, M Scott; Gibson, Andrew P; Schuemie, Martijn; Meij, Edgar; Katrenko, Sophia; van Hage, Willem Robert; Krommydas, Konstantinos; Adriaans, Pieter W
2009-01-01
Background Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes. Results We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation. PMID:19796406
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ievlev, Anton V.; Belianinov, Alexei; Jesse, Stephen
Time of flight secondary ion mass spectrometry (ToF SIMS) is one of the most powerful characterization tools allowing imaging of the chemical properties of various systems and materials. It allows precise studies of the chemical composition with sub-100-nm lateral and nanometer depth spatial resolution. However, comprehensive interpretation of ToF SIMS results is challenging because of the data volume and its multidimensionality. Furthermore, investigation of samples with pronounced topographical features is complicated by the spectral shift. In this work we developed an approach for comprehensive ToF SIMS data interpretation based on data analytics and automated extraction of the sample topography from the time-of-flight shift. We further applied this approach to investigate the correlation between biological function and chemical composition in Arabidopsis roots.
[Establishment of Automation System for Detection of Alcohol in Blood].
Tian, L L; Shen, Lei; Xue, J F; Liu, M M; Liang, L J
2017-02-01
To establish an automation system for the detection of alcohol content in blood. The determination was performed by an automated workstation coupled with extraction-headspace gas chromatography (HS-GC). Blood collection under negative pressure, the sealing time of the headspace vial, and the sample needle were checked and optimized when setting up the automation system, and automatic sampling was compared with manual sampling. The quantitative data obtained by the automated extraction-HS-GC workstation for alcohol were stable, with relative differences between two parallel samples of less than 5%. The automated extraction was superior to the manual extraction. A good linear relationship was obtained over the alcohol concentration range of 0.1-3.0 mg/mL (r ≥ 0.999) with good repeatability. The method is simple and quick, with a more standardized experimental process and accurate experimental data. It eliminates operator-introduced error and has good repeatability, and it can be applied to the qualitative and quantitative detection of alcohol in blood. Copyright© by the Editorial Department of Journal of Forensic Medicine
Automated detection and location of indications in eddy current signals
Brudnoy, David M.; Oppenlander, Jane E.; Levy, Arthur J.
2000-01-01
A computer-implemented information extraction process that locates and identifies eddy current signal features in digital point-ordered signals representing data from inspection of test materials, by enhancing the signal features relative to signal noise, detecting features of the signals, verifying the locations of signal features that can be known in advance, and outputting information about the identity and location of all detected signal features.
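A generic, hedged sketch of indication detection in a point-ordered signal: smooth to suppress noise, then report peaks above a threshold. The signal is synthetic and the method is a common SciPy recipe, not the patented process itself.

```python
# Smooth a noisy point-ordered signal and report candidate indications.
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(0)
signal = rng.normal(0, 0.2, 500)          # baseline noise
signal[118:123] += 3.0                    # two injected indications
signal[338:343] += 2.5

smoothed = np.convolve(signal, np.ones(5) / 5, mode="same")
peaks, props = find_peaks(smoothed, height=1.0, distance=10)
for idx, h in zip(peaks, props["peak_heights"]):
    print(f"indication near sample {idx}, amplitude {h:.2f}")
```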
NASA Astrophysics Data System (ADS)
Skersys, Tomas; Butleris, Rimantas; Kapocius, Kestutis
2013-10-01
Approaches for the analysis and specification of business vocabularies and rules are very relevant topics in both the Business Process Management and Information Systems Development disciplines. However, in the common practice of Information Systems Development, business modeling activities are still of a mostly empirical nature. In this paper, the basic aspects of an approach for the semi-automated extraction of business vocabularies from business process models are presented. The approach is based on the novel business modeling-level OMG standards "Business Process Model and Notation" (BPMN) and "Semantics of Business Vocabulary and Business Rules" (SBVR), thus contributing to OMG's vision of Model-Driven Architecture (MDA) and to model-driven development in general.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kertesz, Vilmos; Van Berkel, Gary J
A fully automated liquid extraction-based surface sampling system utilizing a commercially available autosampler coupled to high performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS) detection is reported. Discrete spots selected for droplet-based sampling and automated sample queue generation for both the autosampler and MS were enabled by using in-house developed software. In addition, co-registration of spatially resolved sampling position and HPLC-MS information to generate heatmaps of compounds monitored for subsequent data analysis was also available in the software. The system was evaluated with whole-body thin tissue sections from propranolol dosed rat. The hands-free operation of the system was demonstrated by creating heatmaps of the parent drug and its hydroxypropranolol glucuronide metabolites with 1 mm resolution in the areas of interest. The sample throughput was approximately 5 min/sample, defined by the time needed for chromatographic separation. The spatial distributions of both the drug and its metabolites were consistent with previous studies employing other liquid extraction-based surface sampling methodologies.
Deep Learning for Automated Extraction of Primary Sites From Cancer Pathology Reports.
Qiu, John X; Yoon, Hong-Jun; Fearn, Paul A; Tourassi, Georgia D
2018-01-01
Pathology reports are a primary source of information for cancer registries which process high volumes of free-text reports annually. Information extraction and coding is a manual, labor-intensive process. In this study, we investigated deep learning and a convolutional neural network (CNN) for extracting ICD-O-3 topographic codes from a corpus of breast and lung cancer pathology reports. We performed two experiments, using a CNN and a more conventional term frequency vector approach, to assess the effects of class prevalence and inter-class transfer learning. The experiments were based on a set of 942 pathology reports with human expert annotations as the gold standard. CNN performance was compared against a more conventional term frequency vector space approach. We observed that the deep learning models consistently outperformed the conventional approaches in the class prevalence experiment, resulting in micro- and macro-F score increases of up to 0.132 and 0.226, respectively, when class labels were well populated. Specifically, the best performing CNN achieved a micro-F score of 0.722 over 12 ICD-O-3 topography codes. Transfer learning provided a consistent but modest performance boost for the deep learning methods, but trends were contingent on the CNN method and cancer site. These encouraging results demonstrate the potential of deep learning for automated abstraction of pathology reports.
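A minimal sketch of a 1D convolutional text classifier of the kind investigated above, written with TensorFlow/Keras. The tiny corpus, labels, and architecture are invented and make no claim to match the paper's models or data.

```python
# Tiny 1D-CNN text classifier: breast (0) vs lung (1) primary site.
import numpy as np
import tensorflow as tf

reports = ["infiltrating carcinoma of the left breast upper outer quadrant",
           "adenocarcinoma involving the right upper lobe of the lung",
           "ductal carcinoma in situ of the breast",
           "squamous cell carcinoma of the lung with pleural involvement"]
labels = np.array([0, 1, 0, 1])

vectorize = tf.keras.layers.TextVectorization(output_sequence_length=12)
vectorize.adapt(reports)
x = vectorize(tf.constant(reports))        # (4, 12) integer token ids

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(vectorize.get_vocabulary()), output_dim=16),
    tf.keras.layers.Conv1D(32, 3, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x, labels, epochs=10, verbose=0)

query = vectorize(tf.constant(["lobular carcinoma of the breast"]))
print(model.predict(query))                # class probabilities [breast, lung]
```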
Radiomics: Extracting more information from medical images using advanced feature analysis
Lambin, Philippe; Rios-Velazquez, Emmanuel; Leijenaar, Ralph; Carvalho, Sara; van Stiphout, Ruud G.P.M.; Granton, Patrick; Zegers, Catharina M.L.; Gillies, Robert; Boellard, Ronald; Dekker, André; Aerts, Hugo J.W.L.
2015-01-01
Solid cancers are spatially and temporally heterogeneous. This limits the use of invasive biopsy-based molecular assays but gives huge potential for medical imaging, which has the ability to capture intra-tumoural heterogeneity in a non-invasive way. During the past decades, medical imaging innovations with new hardware, new imaging agents and standardised protocols have allowed the field to move towards quantitative imaging. Therefore, the development of automated and reproducible analysis methodologies to extract more information from image-based features is also a requirement. Radiomics – the high-throughput extraction of large amounts of image features from radiographic images – addresses this problem and is one of the approaches that hold great promise but need further validation in multi-centric settings and in the laboratory. PMID:22257792
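As a small illustration of what "image features" can mean in this context, the sketch below computes a handful of first-order features over a synthetic region of interest; real radiomics platforms extract hundreds of shape, intensity, and texture features.

```python
# First-order intensity features over a synthetic tumour ROI.
import numpy as np
from scipy import stats

image = np.random.rand(64, 64) * 100      # stand-in for a CT/PET slice
mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 25:45] = True                 # stand-in tumour ROI

roi = image[mask]
counts, _ = np.histogram(roi, bins=32)
p = counts / counts.sum()                 # intensity histogram probabilities
features = {
    "mean": roi.mean(),
    "std": roi.std(),
    "skewness": stats.skew(roi),
    "kurtosis": stats.kurtosis(roi),
    "entropy": float(-np.sum(p[p > 0] * np.log2(p[p > 0]))),
}
print(features)
```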
From data to information and knowledge for geospatial applications
NASA Astrophysics Data System (ADS)
Schenk, T.; Csatho, B.; Yoon, T.
2006-12-01
An ever-increasing number of airborne and spaceborne data-acquisition missions with various sensors produce a glut of data. Sensory data rarely contain information in an explicit form such that an application can directly use it. The processing and analysis of data constitute a real bottleneck; therefore, automating the processes of gaining useful information and knowledge from the raw data is of paramount interest. This presentation is concerned with the transition from data to information and knowledge. By data we refer to the sensor output, and we note that data very rarely provide direct answers for applications. For example, a pixel in a digital image or a laser point from a LIDAR system (data) has no direct relationship with elevation changes of topographic surfaces or the velocity of a glacier (information, knowledge). We propose to employ the computer vision paradigm to extract information and knowledge as it pertains to a wide range of geoscience applications. After introducing the paradigm we describe the major steps to be undertaken for extracting information and knowledge from sensory input data. Features play an important role in this process. Thus we focus on extracting features and their perceptual organization into higher-order constructs. We demonstrate these concepts with imaging data and laser point clouds. The second part of the presentation addresses the problem of combining data obtained by different sensors. An absolute prerequisite for successful fusion is to establish a common reference frame. We elaborate on the concept of sensor-invariant features that allow the registration of such disparate data sets as aerial/satellite imagery, 3D laser point clouds, and multi/hyperspectral imagery. Fusion takes place on the data level (sensor registration) and on the information level. We show how fusion increases the degree of automation for reconstructing topographic surfaces. Moreover, fused information gained from the three sensors results in a more abstract surface representation with a rich set of explicit surface information that can be readily used by an analyst for applications such as change detection.
The Adam and Eve Robot Scientists for the Automated Discovery of Scientific Knowledge
NASA Astrophysics Data System (ADS)
King, Ross
A Robot Scientist is a physically implemented robotic system that applies techniques from artificial intelligence to execute cycles of automated scientific experimentation. A Robot Scientist can automatically execute cycles of hypothesis formation, selection of efficient experiments to discriminate between hypotheses, execution of experiments using laboratory automation equipment, and analysis of results. The motivation for developing Robot Scientists is to better understand science, and to make scientific research more efficient. The Robot Scientist `Adam' was the first machine to autonomously discover scientific knowledge: to both form and experimentally confirm novel hypotheses. Adam worked in the domain of yeast functional genomics. The Robot Scientist `Eve' was originally developed to automate early-stage drug development, with specific application to neglected tropical diseases such as malaria, African sleeping sickness, etc. We are now adapting Eve to work on cancer. We are also teaching Eve to autonomously extract information from the scientific literature.
We've Got Plenty of Data, Now How Can We Use It?
ERIC Educational Resources Information Center
Weiler, Jeffrey K.; Mears, Robert L.
1999-01-01
To mine a large store of school data, a new technology (variously termed data warehousing, data marts, online analytical processing, and executive information systems) is emerging. Data warehousing helps school districts extract and restructure desired data from automated systems and create new databases designed to enhance analytical and…
NASA Astrophysics Data System (ADS)
Albrecht, F.; Hölbling, D.; Friedl, B.
2017-09-01
Landslide mapping benefits from the ever-increasing availability of Earth Observation (EO) data resulting from programmes like the Copernicus Sentinel missions and from improved infrastructure for data access. However, there is a need for improved automated landslide information extraction from EO data, while the dominant method is still manual delineation. Object-based image analysis (OBIA) provides the means for the fast and efficient extraction of landslide information. To prove its quality, automated results are often compared to manually delineated landslide maps. Although there is awareness of the uncertainties inherent in manual delineations, there is a lack of understanding of how they affect the levels of agreement in a direct comparison of OBIA-derived landslide maps and manually derived landslide maps. In order to provide an improved reference, we present a fuzzy approach for the manual delineation of landslides on optical satellite images, thereby making the inherent uncertainties of the delineation explicit. The fuzzy manual delineation and the OBIA classification are compared by accuracy metrics accepted in the remote sensing community. We have tested this approach on high-resolution (HR) satellite images of three large landslides in Austria and Italy. We were able to show that the deviation of the OBIA result from the manual delineation can mainly be attributed to the uncertainty inherent in the manual delineation process, a relevant issue for the design of validation processes for OBIA-derived landslide maps.
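A small sketch of how a crisp OBIA mask might be scored against a fuzzy manual reference: pixels are weighted by their membership value rather than counted as simply right or wrong. The arrays and the fuzzy IoU formulation are illustrative assumptions, not the metrics used in the paper.

```python
# Fuzzy analogue of intersection-over-union between a crisp OBIA mask and
# a fuzzy manual reference (synthetic arrays).
import numpy as np

fuzzy_reference = np.clip(np.random.rand(200, 200), 0, 1)  # membership in [0, 1]
obia_mask = fuzzy_reference > 0.4                          # stand-in OBIA result

# Intersection: memberships inside the OBIA polygon; union: element-wise
# maximum of the fuzzy reference and the crisp mask.
intersection = fuzzy_reference[obia_mask].sum()
union = np.maximum(fuzzy_reference, obia_mask.astype(float)).sum()
print("fuzzy IoU:", round(float(intersection / union), 3))
```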
Gene/protein name recognition based on support vector machine using dictionary as features.
Mitsumori, Tomohiro; Fation, Sevrani; Murata, Masaki; Doi, Kouichi; Doi, Hirohumi
2005-01-01
Automated information extraction from biomedical literature is important because a vast amount of biomedical literature has been published. Recognition of the biomedical named entities is the first step in information extraction. We developed an automated recognition system based on the SVM algorithm and evaluated it in Task 1.A of BioCreAtIvE, a competition for automated gene/protein name recognition. In the work presented here, our recognition system uses the feature set of the word, the part-of-speech (POS), the orthography, the prefix, the suffix, and the preceding class. We call these features "internal resource features", i.e., features that can be found in the training data. Additionally, we consider the features of matching against dictionaries to be external resource features. We investigated and evaluated the effect of these features as well as the effect of tuning the parameters of the SVM algorithm. We found that the dictionary matching features contributed slightly to the improvement in the performance of the f-score. We attribute this to the possibility that the dictionary matching features might overlap with other features in the current multiple feature setting. During SVM learning, each feature alone had a marginally positive effect on system performance. This supports the fact that the SVM algorithm is robust on the high dimensionality of the feature vector space and means that feature selection is not required.
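The sketch below mirrors the feature design discussed above: each token gets internal features (orthography, prefix, suffix) plus an external dictionary-match feature, and a linear SVM is trained on them with scikit-learn. The tokens, labels, and mini-dictionary are invented.

```python
# Token-level features (internal + dictionary match) fed to a linear SVM.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

GENE_DICT = {"p53", "brca1", "nf-kappab"}          # external resource (toy)

def token_features(tok):
    return {
        "lower": tok.lower(),
        "prefix3": tok[:3].lower(),
        "suffix3": tok[-3:].lower(),
        "has_digit": any(c.isdigit() for c in tok),
        "is_capitalized": tok[:1].isupper(),
        "in_dictionary": tok.lower() in GENE_DICT,  # dictionary-match feature
    }

tokens = ["p53", "binds", "BRCA1", "protein", "the", "NF-KappaB"]
labels = ["GENE", "O", "GENE", "O", "O", "GENE"]

clf = make_pipeline(DictVectorizer(), LinearSVC())
clf.fit([token_features(t) for t in tokens], labels)
print(clf.predict([token_features("p21"), token_features("binds")]))
```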
Stangegaard, Michael; Hjort, Benjamin B; Hansen, Thomas N; Hoflund, Anders; Mogensen, Helle S; Hansen, Anders J; Morling, Niels
2013-05-01
The presence of PCR inhibitors in extracted DNA may interfere with the subsequent quantification and short tandem repeat (STR) reactions used in forensic genetic DNA typing. DNA extraction from fabric for forensic genetic purposes may be challenging due to the occasional presence of PCR inhibitors that may be co-extracted with the DNA. Using 120 forensic trace evidence samples consisting of various types of fabric, we compared three automated DNA extraction methods based on magnetic beads (PrepFiler Express Forensic DNA Extraction Kit on an AutoMate Express, QIAsymphony DNA Investigator kit either with the sample pre-treatment recommended by Qiagen or an in-house optimized sample pre-treatment on a QIAsymphony SP) and one manual method (Chelex) with the aim of reducing the amount of PCR inhibitors in the DNA extracts and increasing the proportion of reportable STR-profiles. A total of 480 samples were processed. The highest DNA recovery was obtained with the PrepFiler Express kit on an AutoMate Express while the lowest DNA recovery was obtained using a QIAsymphony SP with the sample pre-treatment recommended by Qiagen. Extraction using a QIAsymphony SP with the sample pre-treatment recommended by Qiagen resulted in the lowest percentage of PCR inhibition (0%) while extraction using manual Chelex resulted in the highest percentage of PCR inhibition (51%). The largest number of reportable STR-profiles was obtained with DNA from samples extracted with the PrepFiler Express kit (75%) while the lowest number was obtained with DNA from samples extracted using a QIAsymphony SP with the sample pre-treatment recommended by Qiagen (41%). Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Text mining and its potential applications in systems biology.
Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi
2006-12-01
With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.
PKDE4J: Entity and relation extraction for public knowledge discovery.
Song, Min; Kim, Won Chul; Lee, Dahee; Heo, Go Eun; Kang, Keun Young
2015-10-01
Due to an enormous number of scientific publications that cannot be handled manually, there is a rising interest in text-mining techniques for automated information extraction, especially in the biomedical field. Such techniques provide effective means of information search, knowledge discovery, and hypothesis generation. Most previous studies have primarily focused on the design and performance improvement of either named entity recognition or relation extraction. In this paper, we present PKDE4J, a comprehensive text-mining system that integrates dictionary-based entity extraction and rule-based relation extraction in a highly flexible and extensible framework. Starting with the Stanford CoreNLP, we developed the system to cope with multiple types of entities and relations. The system also has fairly good performance in terms of accuracy as well as the ability to configure text-processing components. We demonstrated its competitive performance by evaluating it on many corpora and found that it surpasses existing systems with average F-measures of 85% for entity extraction and 81% for relation extraction. Copyright © 2015 Elsevier Inc. All rights reserved.
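A very small illustration of the two stages named above: dictionary-based entity extraction followed by a rule that links two entities when an interaction trigger word occurs between them. The dictionaries and the rule are invented and are not PKDE4J's actual resources.

```python
# Dictionary-based entity extraction plus a simple trigger-word relation rule.
import re

ENTITIES = {"aspirin": "Drug", "cyclooxygenase": "Protein", "ibuprofen": "Drug"}
TRIGGERS = {"inhibits", "activates", "binds"}

def extract(sentence):
    words = re.findall(r"[A-Za-z-]+", sentence.lower())
    mentions = [(i, w, ENTITIES[w]) for i, w in enumerate(words) if w in ENTITIES]
    relations = []
    for (i, w1, t1) in mentions:
        for (j, w2, t2) in mentions:
            if i < j and TRIGGERS & set(words[i + 1:j]):
                relations.append((w1, "interacts_with", w2))
    return mentions, relations

print(extract("Aspirin irreversibly inhibits cyclooxygenase in platelets."))
```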
NASA Astrophysics Data System (ADS)
Kemper, Thomas; Gueguen, Lionel; Soille, Pierre
2012-06-01
The enumeration of the population remains a critical task in the management of refugee/IDP camps. Analysis of very high spatial resolution satellite data proved to be an efficient and secure approach for the estimation of dwellings and the monitoring of the camp over time. In this paper we propose a new methodology for the automated extraction of features, based on differential morphological decomposition segmentation for feature extraction and on interactive training sample selection from the max-tree and min-tree structures. This feature extraction methodology is tested on a WorldView-2 scene of an IDP camp in Darfur, Sudan. Special emphasis is given to the additionally available bands of the WorldView-2 sensor. The results obtained show that the interactive image information tool performs very well by tuning the feature extraction to the local conditions. The analysis of different spectral subsets shows that it is possible to obtain good results already with an RGB combination, but by increasing the number of spectral bands the detection of dwellings becomes more accurate. The best results were obtained using all eight bands of the WorldView-2 satellite.
Bacterial and fungal DNA extraction from blood samples: automated protocols.
Lorenz, Michael G; Disqué, Claudia; Mühl, Helge
2015-01-01
Automation in DNA isolation is a necessity for routine practice employing molecular diagnosis of infectious agents. To this end, the development of automated systems for the molecular diagnosis of microorganisms directly in blood samples is at its beginning. Important characteristics of systems demanded for routine use include high recovery of microbial DNA, DNA-free containment for the reduction of DNA contamination from exogenous sources, DNA-free reagents and consumables, ideally a walkaway system, and economical pricing of the equipment and consumables. Such full automation of DNA extraction, evaluated and in use for sepsis diagnostics, is not yet available. Here, we present protocols for the semiautomated isolation of microbial DNA from blood culture and from low- and high-volume blood samples. The protocols include a manual pretreatment step followed by automated extraction and purification of microbial DNA.
Generating disease-pertinent treatment vocabularies from MEDLINE citations.
Wang, Liqin; Del Fiol, Guilherme; Bray, Bruce E; Haug, Peter J
2017-01-01
Healthcare communities have identified a significant need for disease-specific information. Disease-specific ontologies are useful in assisting the retrieval of disease-relevant information from various sources. However, building these ontologies is labor intensive. Our goal is to develop a system for the automated generation of disease-pertinent concepts from a popular knowledge resource for the building of disease-specific ontologies. A pipeline system was developed with an initial focus on generating disease-specific treatment vocabularies. It comprised the components of disease-specific citation retrieval, predication extraction, treatment predication extraction, treatment concept extraction, and relevance ranking. A semantic schema was developed to support the extraction of treatment predications and concepts. Four ranking approaches (i.e., occurrence, interest, degree centrality, and weighted degree centrality) were proposed to measure the relevance of treatment concepts to the disease of interest. We measured the performance of the four ranking approaches in terms of the mean precision at the top 100 concepts for five diseases, as well as the precision-recall curves against two reference vocabularies. The performance of the system was also compared to two baseline approaches. The pipeline system achieved a mean precision of 0.80 for the top 100 concepts with the ranking by interest. There was no significant difference among the four ranking approaches (p=0.53). However, the pipeline-based system had significantly better performance than the two baselines. The pipeline system can be useful for the automated generation of disease-relevant treatment concepts from the biomedical literature. Copyright © 2016 Elsevier Inc. All rights reserved.
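A sketch of one of the rankings mentioned above: build a small graph from disease-treatment predications and order treatment concepts by degree centrality using networkx. The predications are invented examples.

```python
# Rank treatment concepts by degree centrality in a predication graph.
import networkx as nx

predications = [                      # (subject, predicate, object)
    ("metoprolol", "TREATS", "heart failure"),
    ("metoprolol", "TREATS", "hypertension"),
    ("enalapril", "TREATS", "heart failure"),
    ("digoxin", "TREATS", "heart failure"),
    ("digoxin", "INTERACTS_WITH", "amiodarone"),
]

g = nx.Graph()
for subj, _, obj in predications:
    g.add_edge(subj, obj)

centrality = nx.degree_centrality(g)
treatments = {s for s, _, _ in predications}
ranked = sorted(treatments, key=lambda t: centrality[t], reverse=True)
print(ranked)      # treatment concepts ordered by degree centrality
```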
Automated detection of qualitative spatio-temporal features in electrocardiac activation maps.
Ironi, Liliana; Tentoni, Stefania
2007-02-01
This paper describes a piece of work aiming at the realization of a tool for the automated interpretation of electrocardiac maps. Such maps can capture a number of electrical conduction pathologies, such as arrhythmia, that can be missed by the analysis of traditional electrocardiograms. However, their introduction into clinical practice is still far away, as their interpretation requires skills that belong to very few experts. An automated interpretation tool would therefore bridge the gap between the established research outcome and clinical practice, with a consequent great impact on health care. Qualitative spatial reasoning can play a crucial role in the identification of spatio-temporal patterns and salient features that characterize the heart's electrical activity. We adopted the spatial aggregation (SA) conceptual framework and an interplay of numerical and qualitative information to extract features from epicardial maps, and to make them available for reasoning tasks. Our focus is on epicardial activation isochrone maps as they are a synthetic representation of spatio-temporal aspects of the propagation of the electrical excitation. We provide a computational SA-based methodology to extract, from 3D epicardial data gathered over time, (1) the excitation wavefront structure, and (2) the salient features that characterize wavefront propagation and visually correspond to specific geometric objects. The proposed methodology provides a robust and efficient way to identify salient pieces of information in activation time maps. The hierarchical structure of the abstracted geometric objects, crucial in capturing the prominent information, facilitates the definition of general rules necessary to infer the correlation between pathophysiological patterns and wavefront structure and propagation.
Automated Fluid Feature Extraction from Transient Simulations
NASA Technical Reports Server (NTRS)
Haimes, Robert
2000-01-01
In the past, feature extraction and identification were interesting concepts, but not required in understanding the physics of a steady flow field. This is because the results of the more traditional tools like iso-surfaces, cuts and streamlines, were more interactive and easily abstracted so they could be represented to the investigator. These tools worked and properly conveyed the collected information at the expense of a great deal of interaction. For unsteady flow-fields, the investigator does not have the luxury of spending time scanning only one 'snap-shot' of the simulation. Automated assistance is required in pointing out areas of potential interest contained within the flow. This must not require a heavy compute burden (the visualization should not significantly slow down the solution procedure for co-processing environments like pV3). And methods must be developed to abstract the feature and display it in a manner that physically makes sense.
A Risk Assessment System with Automatic Extraction of Event Types
NASA Astrophysics Data System (ADS)
Capet, Philippe; Delavallade, Thomas; Nakamura, Takuya; Sandor, Agnes; Tarsitano, Cedric; Voyatzi, Stavroula
In this article we describe the joint effort of experts in linguistics, information extraction and risk assessment to integrate EventSpotter, an automatic event extraction engine, into ADAC, an automated early warning system. By detecting as early as possible weak signals of emerging risks ADAC provides a dynamic synthetic picture of situations involving risk. The ADAC system calculates risk on the basis of fuzzy logic rules operated on a template graph whose leaves are event types. EventSpotter is based on a general purpose natural language dependency parser, XIP, enhanced with domain-specific lexical resources (Lexicon-Grammar). Its role is to automatically feed the leaves with input data.
McEntire, Robin; Szalkowski, Debbie; Butler, James; Kuo, Michelle S; Chang, Meiping; Chang, Man; Freeman, Darren; McQuay, Sarah; Patel, Jagruti; McGlashen, Michael; Cornell, Wendy D; Xu, Jinghai James
2016-05-01
External content sources such as MEDLINE®, National Institutes of Health (NIH) grants, and conference websites provide access to the latest breaking biomedical information, which can inform pharmaceutical and biotechnology company pipeline decisions. The value of these sites for industry, however, is limited by reliance on the public internet, limited synonym support, the rarity of batch searching capability, and the disconnected nature of the sites. Fortunately, many sites now offer their content for download, and we have developed an automated internal workflow that uses text mining and tailored ontologies for programmatic search and knowledge extraction. We believe such an efficient and secure approach provides a competitive advantage to companies needing access to the latest information for a range of use cases and complements manually curated commercial sources. Copyright © 2016. Published by Elsevier Ltd.
Data is presented on the development of a new automated system combining solid phase extraction (SPE) with GC/MS spectrometry for the single-run analysis of water samples containing a broad range of organic compounds. The system uses commercially available automated in-line 10-m...
Finding Relevant Data in a Sea of Languages
2016-04-26
full machine-translated text, unbiased word clouds, query-biased word clouds, and query-biased sentence... and information retrieval to automate language processing tasks so that the limited number of linguists available for analyzing text and spoken... the crime (stock market). The Cross-LAnguage Search Engine (CLASE) has already preprocessed the documents, extracting text to identify the language
Regan, John Frederick
2014-09-09
Removable cartridges are used on automated flow-through systems for the purpose of extracting and purifying genetic material from complex matrices. Different types of cartridges are paired with specific automated protocols to concentrate, extract, and purify pathogenic or human genetic material. Their flow-through nature allows large quantities of sample to be processed. Matrices may be filtered using size exclusion and/or affinity filters to concentrate the pathogen of interest. Lysed material is ultimately passed through a filter to remove the insoluble material before the soluble genetic material is delivered past a silica-like membrane that binds the genetic material, where it is washed, dried, and eluted. Cartridges are inserted into the housing areas of flow-through automated instruments, which are equipped with sensors to ensure proper placement and usage of the cartridges. Properly inserted cartridges create fluid- and air-tight seals with the flow lines of an automated instrument.
Integrating the Allen Brain Institute Cell Types Database into Automated Neuroscience Workflow.
Stockton, David B; Santamaria, Fidel
2017-10-01
We developed software tools to download, extract features, and organize the Cell Types Database from the Allen Brain Institute (ABI) in order to integrate its whole cell patch clamp characterization data into the automated modeling/data analysis cycle. To expand the potential user base we employed both Python and MATLAB. The basic set of tools downloads selected raw data and extracts cell, sweep, and spike features, using ABI's feature extraction code. To facilitate data manipulation we added a tool to build a local specialized database of raw data plus extracted features. Finally, to maximize automation, we extended our NeuroManager workflow automation suite to include these tools plus a separate investigation database. The extended suite allows the user to integrate ABI experimental and modeling data into an automated workflow deployed on heterogeneous computer infrastructures, from local servers, to high performance computing environments, to the cloud. Since our approach is focused on workflow procedures our tools can be modified to interact with the increasing number of neuroscience databases being developed to cover all scales and properties of the nervous system.
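The download-and-extract step the authors describe can be approximated with the AllenSDK's CellTypesCache interface; the sketch below is a minimal illustration (not the authors' NeuroManager tools), and the manifest path and the specific feature key shown are assumptions.

```python
# Minimal sketch (not the authors' NeuroManager suite): pull Allen Cell Types
# metadata and precomputed electrophysiology features, then one raw sweep.
from allensdk.core.cell_types_cache import CellTypesCache

# The manifest path is an illustrative choice; data are downloaded on demand.
ctc = CellTypesCache(manifest_file="cell_types/manifest.json")

cells = ctc.get_cells()                    # metadata for characterized cells
ephys_features = ctc.get_ephys_features()  # precomputed sweep/spike features

# Join features to cells by specimen id for storage in a local database.
features_by_cell = {f["specimen_id"]: f for f in ephys_features}
for cell in cells[:5]:
    feats = features_by_cell.get(cell["id"], {})
    print(cell["id"], feats.get("vrest"))  # "vrest" is an assumed feature key

# Raw sweeps for one cell, fetched as an NWB data set.
data_set = ctc.get_ephys_data(cells[0]["id"])
sweep = data_set.get_sweep(data_set.get_sweep_numbers()[0])
print(sweep["sampling_rate"], len(sweep["response"]))
```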
Automated video feature extraction : workshop summary report October 10-11 2012.
DOT National Transportation Integrated Search
2012-12-01
This report summarizes a 2-day workshop on automated video feature extraction. Discussion focused on the Naturalistic Driving Study, funded by the second Strategic Highway Research Program, and also involved the companion roadway inventory dataset...
Automated extraction and semantic analysis of mutation impacts from the biomedical literature
2012-01-01
Background Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Accessing the mutational information and their impacts on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually curating the rich and fast growing repository of biomedical literature is expensive and time-consuming. As a solution, text mining approaches have increasingly been deployed in the biomedical domain. While the detection of single-point mutations is well covered by existing systems, challenges still exist in grounding impacts to their respective mutations and recognizing the affected protein properties, in particular kinetic and stability properties together with physical quantities. Results We present an ontology model for mutation impacts, together with a comprehensive text mining system for extracting and analysing mutation impact information from full-text articles. Organisms, as sources of proteins, are extracted to help disambiguation of genes and proteins. Our system then detects mutation series to correctly ground detected impacts using novel heuristics. It also extracts the affected protein properties, in particular kinetic and stability properties, as well as the magnitude of the effects and validates these relations against the domain ontology. The output of our system can be provided in various formats, in particular by populating an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on our manually annotated corpora. In the impact detection task, our system achieves a precision of 70.4%-71.1%, a recall of 71.3%-71.5%, and grounds the detected impacts with an accuracy of 76.5%-77%. The developed system, including resources, evaluation data and end-user and developer documentation is freely available under an open source license at http://www.semanticsoftware.info/open-mutation-miner. Conclusion We present Open Mutation Miner (OMM), the first comprehensive, fully open-source approach to automatically extract impacts and related relevant information from the biomedical literature. We assessed the performance of our work on manually annotated corpora and the results show the reliability of our approach. The representation of the extracted information into a structured format facilitates knowledge management and aids in database curation and correction. Furthermore, access to the analysis results is provided through multiple interfaces, including web services for automated data integration and desktop-based solutions for end user interactions. PMID:22759648
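The mutation-mention detection that systems such as Open Mutation Miner build on can be illustrated with a simple pattern over the common wild-type/position/mutant notation; the regex and example sentence below are illustrative assumptions, not the published grammar.

```python
import re

# Illustrative pattern for protein point mutations written as "A123T" or in
# three-letter form such as "Ala123Thr"; real systems use richer grammars.
AA1 = "ACDEFGHIKLMNPQRSTVWY"
AA3 = "(?:Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val)"
MUTATION = re.compile(rf"\b(?:[{AA1}]\d+[{AA1}]|{AA3}\d+{AA3})\b")

sentence = ("The A123T and Gly45Asp substitutions reduced catalytic "
            "efficiency by two orders of magnitude.")
print(MUTATION.findall(sentence))  # ['A123T', 'Gly45Asp']
```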
Wilson, Richard A.; Chapman, Wendy W.; DeFries, Shawn J.; Becich, Michael J.; Chapman, Brian E.
2010-01-01
Background: Clinical records are often unstructured, free-text documents that create information extraction challenges and costs. Healthcare delivery and research organizations, such as the National Mesothelioma Virtual Bank, require the aggregation of both structured and unstructured data types. Natural language processing offers techniques for automatically extracting information from unstructured, free-text documents. Methods: Five hundred and eight history and physical reports from mesothelioma patients were split into development (208) and test sets (300). A reference standard was developed and each report was annotated by experts with regard to the patient's personal history of ancillary cancer and family history of any cancer. The Hx application was developed to process reports, extract relevant features, perform reference resolution and classify them with regard to cancer history. Two methods, Dynamic-Window and ConText, for extracting information were evaluated. Hx's classification responses using each of the two methods were measured against the reference standard. The average Cohen's weighted kappa served as the human benchmark in evaluating the system. Results: Hx had a high overall accuracy with each method, scoring 96.2%. F-measures using the Dynamic-Window and ConText methods were 91.8% and 91.6%, which were comparable to the human benchmark of 92.8%. For the personal history classification, Dynamic-Window scored highest with 89.2%, and for the family history classification, ConText scored highest with 97.6%; both methods were comparable to the human benchmarks of 88.3% and 97.2%, respectively. Conclusion: We evaluated an automated application's performance in classifying a mesothelioma patient's personal and family history of cancer from clinical reports. To do so, the Hx application must process reports, identify cancer concepts, distinguish the known mesothelioma from ancillary cancers, recognize negation, perform reference resolution and determine the experiencer. Results indicated that both information extraction methods tested were dependent on the domain-specific lexicon and negation extraction. We showed that the more general method, ConText, performed as well as our task-specific method. Although Dynamic-Window could be modified to retrieve other concepts, ConText is more robust and performs better on inconclusive concepts. Hx could greatly improve and expedite the process of extracting data from free-text, clinical records for a variety of research or healthcare delivery organizations. PMID:21031012
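As a rough illustration of the trigger-and-scope idea behind ConText-style methods (not the published implementation), the sketch below marks cancer concepts as negated or attributed to a family member when they fall within a fixed token window after a trigger term; the trigger lists and window size are simplifying assumptions.

```python
import re

# Toy ConText-style tagging: assign negation/experiencer modifiers to concepts
# that appear within a few tokens after a trigger term. Trigger lists and the
# window size are illustrative assumptions, not the published algorithm.
NEGATION_TRIGGERS = {"no", "denies", "without"}
FAMILY_TRIGGERS = {"mother", "father", "sister", "brother", "family"}
CONCEPTS = {"cancer", "mesothelioma", "carcinoma"}
WINDOW = 6  # how many preceding tokens a trigger's scope covers

def tag_concepts(text):
    tokens = re.findall(r"[a-z]+", text.lower())
    results = []
    for i, tok in enumerate(tokens):
        if tok in CONCEPTS:
            scope = tokens[max(0, i - WINDOW):i]
            results.append({
                "concept": tok,
                "negated": any(t in NEGATION_TRIGGERS for t in scope),
                "experiencer": "family" if any(t in FAMILY_TRIGGERS for t in scope) else "patient",
            })
    return results

print(tag_concepts("Patient denies history of carcinoma. Mother had breast cancer."))
```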
Deep Learning for Automated Extraction of Primary Sites from Cancer Pathology Reports
Qiu, John; Yoon, Hong-Jun; Fearn, Paul A.; ...
2017-05-03
Pathology reports are a primary source of information for cancer registries, which process high volumes of free-text reports annually. Information extraction and coding is a manual, labor-intensive process. In this study we investigated deep learning, specifically a convolutional neural network (CNN), for extracting ICD-O-3 topographic codes from a corpus of breast and lung cancer pathology reports. We performed two experiments, using a CNN and a more conventional term frequency vector approach, to assess the effects of class prevalence and inter-class transfer learning. The experiments were based on a set of 942 pathology reports with human expert annotations as the gold standard. CNN performance was compared against a more conventional term frequency vector space approach. We observed that the deep learning models consistently outperformed the conventional approaches in the class prevalence experiment, resulting in micro- and macro-F score increases of up to 0.132 and 0.226, respectively, when class labels were well populated. Specifically, the best performing CNN achieved a micro-F score of 0.722 over 12 ICD-O-3 topography codes. Transfer learning provided a consistent but modest performance boost for the deep learning methods, but trends were contingent on CNN method and cancer site. Finally, these encouraging results demonstrate the potential of deep learning for automated abstraction of pathology reports.
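The conventional term-frequency baseline that the CNN is compared against, together with the micro/macro F-scores used for evaluation, can be reproduced in outline with scikit-learn; the toy reports, codes, and classifier choice below are illustrative assumptions, and the CNN itself is not shown.

```python
# Sketch of a term-frequency baseline for topography-code classification,
# reporting the micro/macro F-scores used in such studies. Data are toy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

reports = ["infiltrating ductal carcinoma of the left breast",
           "adenocarcinoma involving the right upper lobe of lung",
           "lobular carcinoma, breast, upper outer quadrant",
           "squamous cell carcinoma of the lung, left lower lobe"]
codes = ["C50.9", "C34.1", "C50.4", "C34.3"]  # illustrative ICD-O-3 topography codes

vec = TfidfVectorizer(ngram_range=(1, 2))
X = vec.fit_transform(reports)
clf = LogisticRegression(max_iter=1000).fit(X, codes)

pred = clf.predict(X)  # toy in-sample prediction; real work uses held-out reports
print("micro-F:", f1_score(codes, pred, average="micro"))
print("macro-F:", f1_score(codes, pred, average="macro"))
```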
Cook, Tessa S; Zimmerman, Stefan L; Steingall, Scott R; Maidment, Andrew D A; Kim, Woojin; Boonn, William W
2011-01-01
There is growing interest in the ability to monitor, track, and report exposure to radiation from medical imaging. Historically, however, dose information has been stored on an image-based dose sheet, an arrangement that precludes widespread indexing. Although scanner manufacturers are beginning to include dose-related parameters in the Digital Imaging and Communications in Medicine (DICOM) headers of imaging studies, there remains a vast repository of retrospective computed tomographic (CT) data with image-based dose sheets. Consequently, it is difficult for imaging centers to monitor their dose estimates or participate in the American College of Radiology (ACR) Dose Index Registry. An automated extraction software pipeline known as Radiation Dose Intelligent Analytics for CT Examinations (RADIANCE) has been designed that quickly and accurately parses CT dose sheets to extract and archive dose-related parameters. Optical character recognition of information in the dose sheet leads to creation of a text file, which along with the DICOM study header is parsed to extract dose-related data. The data are then stored in a relational database that can be queried for dose monitoring and report creation. RADIANCE allows efficient dose analysis of CT examinations and more effective education of technologists, radiologists, and referring physicians regarding patient exposure to radiation at CT. RADIANCE also allows compliance with the ACR's dose reporting guidelines and greater awareness of patient radiation dose, ultimately resulting in improved patient care and treatment.
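A stripped-down version of such a pipeline (not the RADIANCE code itself) can be assembled from an OCR engine, a few regular expressions, and a small relational table; the field patterns, file names, and schema below are illustrative assumptions that would need tuning to a scanner's actual dose-sheet layout.

```python
# Sketch of a dose-sheet pipeline: OCR the dose-sheet image, pull a few
# dose-related fields with regular expressions, and store them in SQLite.
# Regex patterns, file names, and the table schema are illustrative only.
import re
import sqlite3

import pytesseract
from PIL import Image

def parse_dose_sheet(image_path):
    text = pytesseract.image_to_string(Image.open(image_path))
    ctdi = re.search(r"CTDIvol\D*([\d.]+)", text, re.IGNORECASE)
    dlp = re.search(r"DLP\D*([\d.]+)", text, re.IGNORECASE)
    return {
        "ctdivol_mgy": float(ctdi.group(1)) if ctdi else None,
        "dlp_mgy_cm": float(dlp.group(1)) if dlp else None,
    }

conn = sqlite3.connect("ct_dose.db")
conn.execute("CREATE TABLE IF NOT EXISTS dose (study TEXT, ctdivol_mgy REAL, dlp_mgy_cm REAL)")
values = parse_dose_sheet("dose_sheet.png")  # hypothetical exported dose-sheet image
conn.execute("INSERT INTO dose VALUES (?, ?, ?)",
             ("study-001", values["ctdivol_mgy"], values["dlp_mgy_cm"]))
conn.commit()
```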
Sahore, Vishal; Sonker, Mukul; Nielsen, Anna V; Knob, Radim; Kumar, Suresh; Woolley, Adam T
2018-01-01
We have developed multichannel integrated microfluidic devices for automated preconcentration, labeling, purification, and separation of preterm birth (PTB) biomarkers. We fabricated multilayer poly(dimethylsiloxane)-cyclic olefin copolymer (PDMS-COC) devices that perform solid-phase extraction (SPE) and microchip electrophoresis (μCE) for automated PTB biomarker analysis. The PDMS control layer had a peristaltic pump and pneumatic valves for flow control, while the PDMS fluidic layer had five input reservoirs connected to microchannels and a μCE system. The COC layers had a reversed-phase octyl methacrylate porous polymer monolith for SPE and fluorescent labeling of PTB biomarkers. We determined μCE conditions for two PTB biomarkers, ferritin (Fer) and corticotropin-releasing factor (CRF). We used these integrated microfluidic devices to preconcentrate and purify off-chip-labeled Fer and CRF in an automated fashion. Finally, we performed a fully automated on-chip analysis of unlabeled PTB biomarkers, involving SPE, labeling, and μCE separation with 1 h total analysis time. These integrated systems have strong potential to be combined with upstream immunoaffinity extraction, offering a compact sample-to-answer biomarker analysis platform. Graphical abstract Pressure-actuated integrated microfluidic devices have been developed for automated solid-phase extraction, fluorescent labeling, and microchip electrophoresis of preterm birth biomarkers.
USDA-ARS?s Scientific Manuscript database
This study demonstrated the application of an automated high-throughput mini-cartridge solid-phase extraction (mini-SPE) cleanup for the rapid low-pressure gas chromatography – tandem mass spectrometry (LPGC-MS/MS) analysis of pesticides and environmental contaminants in QuEChERS extracts of foods. ...
Information extraction for enhanced access to disease outbreak reports.
Grishman, Ralph; Huttunen, Silja; Yangarber, Roman
2002-08-01
Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.
Inferring the most probable maps of underground utilities using Bayesian mapping model
NASA Astrophysics Data System (ADS)
Bilal, Muhammad; Khan, Wasiq; Muggleton, Jennifer; Rustighi, Emiliano; Jenks, Hugo; Pennock, Steve R.; Atkins, Phil R.; Cohn, Anthony
2018-03-01
Mapping the Underworld (MTU), a major initiative in the UK, is focused on addressing the social, environmental and economic consequences arising from the inability to locate buried underground utilities (such as pipes and cables) by developing a multi-sensor mobile device. The aim of the MTU device is to locate different types of buried assets in real time using automated data processing techniques and statutory records. The statutory records, even though typically inaccurate and incomplete, provide useful prior information on what is buried under the ground and where. However, integrating information from multiple sensors (raw data) with these qualitative maps, and visualizing the result, is challenging and requires robust machine learning/data fusion approaches. In this paper, an approach for the automated creation of revised maps was developed as a Bayesian mapping model that integrates the knowledge extracted from raw sensor data with the available statutory records. The statutory records were combined with the hypotheses derived from the sensors to form an initial estimate of what might be found underground and roughly where. The maps were (re)constructed using automated image segmentation techniques for hypothesis extraction and Bayesian classification techniques for segment-manhole connections. The model, consisting of an image segmentation algorithm and various Bayesian classification techniques (segment recognition and the expectation maximization (EM) algorithm), provided robust performance on various simulated as well as real sites in terms of predicting linear/non-linear segments and constructing refined 2D/3D maps.
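The way a statutory record acts as a prior that sensor evidence then updates can be illustrated with a single application of Bayes' rule; the probabilities below are invented for illustration and are not taken from the MTU trials.

```python
# Toy Bayesian update: a statutory record gives a prior that a pipe crosses a
# given location; a sensor detection updates it. All numbers are illustrative.
def posterior(prior, p_detect_given_pipe, p_detect_given_no_pipe):
    """P(pipe | detection) via Bayes' rule."""
    evidence = (p_detect_given_pipe * prior
                + p_detect_given_no_pipe * (1.0 - prior))
    return p_detect_given_pipe * prior / evidence

prior_from_records = 0.6   # statutory map says a pipe is probably here (assumed)
p_hit_if_pipe = 0.9        # sensor sensitivity (assumed)
p_hit_if_no_pipe = 0.2     # sensor false-alarm rate (assumed)

print(posterior(prior_from_records, p_hit_if_pipe, p_hit_if_no_pipe))  # ~0.87
```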
Smart Extraction and Analysis System for Clinical Research.
Afzal, Muhammad; Hussain, Maqbool; Khan, Wajahat Ali; Ali, Taqdir; Jamshed, Arif; Lee, Sungyoung
2017-05-01
With the increasing use of electronic health records (EHRs), there is a growing need to expand the utilization of EHR data to support clinical research. The key challenge in achieving this goal is the unavailability of smart systems and methods to overcome the issue of data preparation, structuring, and sharing for smooth clinical research. We developed a robust analysis system called the smart extraction and analysis system (SEAS) that consists of two subsystems: (1) the information extraction system (IES), for extracting information from clinical documents, and (2) the survival analysis system (SAS), for a descriptive and predictive analysis to compile the survival statistics and predict the future chance of survivability. The IES subsystem is based on a novel permutation-based pattern recognition method that extracts information from unstructured clinical documents. Similarly, the SAS subsystem is based on a classification and regression tree (CART)-based prediction model for survival analysis. SEAS is evaluated and validated on a real-world case study of head and neck cancer. The overall information extraction accuracy of the system for semistructured text is recorded at 99%, while that for unstructured text is 97%. Furthermore, the automated, unstructured information extraction has reduced the average time spent on manual data entry by 75%, without compromising the accuracy of the system. Moreover, around 88% of patients are found in a terminal or dead state for the highest clinical stage of disease (level IV). Similarly, there is an ∼36% probability of a patient being alive if at least one of the lifestyle risk factors was positive. We presented our work on the development of SEAS to replace costly and time-consuming manual methods with smart automatic extraction of information and survival prediction methods. SEAS has reduced the time and energy of human resources spent unnecessarily on manual tasks.
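The CART-based survival component can be sketched with scikit-learn's decision tree; the feature names, encodings, and toy patient records below are assumptions for illustration, not the SEAS model or its data.

```python
# Sketch of a CART-style survivability classifier in the spirit of the SAS
# subsystem. Features, encodings, and records are illustrative placeholders.
from sklearn.tree import DecisionTreeClassifier, export_text

# [clinical_stage (1-4), smoking (0/1), alcohol (0/1)] -> alive at follow-up?
X = [[1, 0, 0], [2, 1, 0], [3, 1, 1], [4, 1, 1], [2, 0, 0], [4, 0, 1]]
y = [1, 1, 0, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["stage", "smoking", "alcohol"]))
print(tree.predict_proba([[4, 1, 0]]))  # predicted survival probability for a new patient
```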
Norouzzadeh, Mohammad Sadegh; Nguyen, Anh; Kosmala, Margaret; Swanson, Alexandra; Palmer, Meredith S; Packer, Craig; Clune, Jeff
2018-06-19
Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology, and animal behavior into "big data" sciences. Motion-sensor "camera traps" enable collecting wildlife pictures inexpensively, unobtrusively, and frequently. However, extracting information from these pictures remains an expensive, time-consuming, manual task. We demonstrate that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. We train deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2 million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with >93.8% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if our system classifies only images it is confident about, our system can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers, saving >8.4 y (i.e., >17,000 h at 40 h/wk) of human labeling effort on this 3.2 million-image dataset. Those efficiency gains highlight the importance of using deep neural networks to automate data extraction from camera-trap images, reducing a roadblock for this widely used technology. Our results suggest that deep learning could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild. Copyright © 2018 the Author(s). Published by PNAS.
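The confidence-based triage the authors describe, automating only images the network is sure about and sending the rest to volunteers, reduces to a simple thresholding rule; the probabilities and threshold below are made-up placeholders, not values from the Snapshot Serengeti experiments.

```python
# Toy confidence triage: auto-label images whose top class probability clears
# a threshold, and route the rest to human volunteers. Values are invented.
import numpy as np

top_probs = np.array([0.99, 0.72, 0.95, 0.41, 0.97, 0.88])  # per-image max class probability
threshold = 0.9

auto_mask = top_probs >= threshold
print("automated fraction:", auto_mask.mean())            # share of images auto-labeled
print("sent to volunteers:", np.flatnonzero(~auto_mask))   # indices needing human review
```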
Blastocyst microinjection automation.
Mattos, Leonardo S; Grant, Edward; Thresher, Randy; Kluckman, Kimberly
2009-09-01
Blastocyst microinjections are routinely involved in the process of creating genetically modified mice for biomedical research, but their efficiency is highly dependent on the skills of the operators. As a consequence, much time and resources are required for training microinjection personnel. This situation has been aggravated by the rapid growth of genetic research, which has increased the demand for mutant animals. Therefore, increased productivity and efficiency in this area are highly desired. Here, we pursue these goals through the automation of a previously developed teleoperated blastocyst microinjection system. This included the design of a new system setup to facilitate automation, the definition of rules for automatic microinjections, the implementation of video processing algorithms to extract feedback information from microscope images, and the creation of control algorithms for process automation. Experimentation conducted with this new system and operator assistance during the cells delivery phase demonstrated a 75% microinjection success rate. In addition, implantation of the successfully injected blastocysts resulted in a 53% birth rate and a 20% yield of chimeras. These results proved that the developed system was capable of automatic blastocyst penetration and retraction, demonstrating the success of major steps toward full process automation.
Three treatment media, used for the removal of arsenic from drinking water, were sequentially extracted using 10mM MgCl2 (pH 8), 10mM NaH2PO4 (pH 7) followed by 10mM (NH4)2C2O4 (pH 3). The media were extracted using an on-line automated continuous extraction system which allowed...
Evaluation of four automated protocols for extraction of DNA from FTA cards.
Stangegaard, Michael; Børsting, Claus; Ferrero-Miliani, Laura; Frank-Hansen, Rune; Poulsen, Lena; Hansen, Anders J; Morling, Niels
2013-10-01
Extraction of DNA using magnetic bead-based techniques on automated DNA extraction instruments provides a fast, reliable, and reproducible method for DNA extraction from various matrices. Here, we have compared the yield and quality of DNA extracted from FTA cards using four automated extraction protocols on three different instruments. The extraction processes were repeated up to six times with the same pieces of FTA cards. The sample material on the FTA cards was either blood or buccal cells. With the QIAamp DNA Investigator and QIAsymphony DNA Investigator kits, it was possible to extract DNA from the FTA cards in all six rounds of extractions in sufficient amount and quality to obtain complete short tandem repeat (STR) profiles on a QIAcube and a QIAsymphony SP. With the PrepFiler Express kit, almost all the extractable DNA was extracted in the first two rounds of extractions. Furthermore, we demonstrated that it was possible to successfully extract sufficient DNA for STR profiling from previously processed FTA card pieces that had been stored at 4 °C for up to 1 year. This showed that rare or precious FTA card samples may be saved for future analyses even though some DNA was already extracted from the FTA cards.
Multilingual Information Retrieval in Thoracic Radiology: Feasibility Study
Castilla, André Coutinho; Furuie, Sérgio Shiguemi; Mendonça, Eneida A.
2014-01-01
Most of the essential information contained in the Electronic Medical Record is stored as text, which imposes several difficulties on automated data extraction and retrieval. Natural language processing is an approach that can unlock clinical information from free texts. The proposed methodology uses the specialized natural language processor MEDLEE, developed for the English language. To use this processor on Portuguese medical texts, chest x-ray reports were machine translated into English. The result of serially coupling MT and NLP is tagged text, which needs further investigation to extract clinical findings. The objective of this experiment was to investigate normal reports and reports with device descriptions on a set of 165 chest x-ray reports. We obtained a sensitivity and specificity of 1 and 0.71 for the first condition, and 0.97 and 0.97 for the second, respectively. The reference was formed by the opinion of two radiologists. The results of this experiment indicate the viability of extracting clinical findings from chest x-ray reports by coupling MT and NLP. PMID:17911745
Automated systems to identify relevant documents in product risk management
2012-01-01
Background: Product risk management involves critical assessment of the risks and benefits of health products circulating in the market. One of the important sources of safety information is the primary literature, especially for newer products with which regulatory authorities have relatively little experience. Although the primary literature provides vast and diverse information, only a small proportion of it is useful for product risk assessment work. Hence, the aim of this study is to explore the possibility of using text mining to automate the identification of useful articles, which would reduce the time taken for literature searches and hence improve work efficiency. In this study, term-frequency inverse document-frequency values were computed for predictors extracted from the titles and abstracts of articles related to three tumour necrosis factor-alpha blockers. A general automated system was developed using only general predictors and was tested for its generalizability using articles related to four other drug classes. Several specific automated systems were developed using both general and specific predictors and training sets of different sizes in order to determine the minimum number of articles required for developing such systems. Results: The general automated system had an area under the curve value of 0.731 and was able to rank 34.6% and 46.2% of the total number of 'useful' articles among the first 10% and 20% of the articles presented to the evaluators when tested on the generalizability set. However, its use may be limited by the subjective definition of useful articles. For the specific automated systems, it was found that only 20 articles were required to develop a specific automated system with a prediction performance (AUC 0.748) that was better than that of the general automated system. Conclusions: Specific automated systems can be developed rapidly and avoid problems caused by the subjective definition of useful articles. Thus the efficiency of product risk management can be improved with the use of specific automated systems. PMID:22380483
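The core of such a specific automated system, TF-IDF features from titles and abstracts feeding a probabilistic ranker whose AUC is then checked, can be sketched as follows; the toy abstracts, relevance labels, and classifier choice are illustrative assumptions, not the study's actual system.

```python
# Sketch of a 'specific automated system': TF-IDF over titles/abstracts,
# a probabilistic ranker, and AUC as the evaluation metric. Data are toy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

abstracts = [
    "serious infection risk during anti-TNF therapy in rheumatoid arthritis",
    "crystal structure of a bacterial efflux pump",
    "post-marketing reports of hepatotoxicity with a TNF-alpha blocker",
    "survey of undergraduate pharmacology teaching methods",
]
useful = [1, 0, 1, 0]  # evaluator judgment: relevant to product risk assessment?

vec = TfidfVectorizer()
X = vec.fit_transform(abstracts)
model = LogisticRegression().fit(X, useful)

scores = model.predict_proba(X)[:, 1]      # in-sample scores, for illustration only
print("AUC:", roc_auc_score(useful, scores))
print("review order:", scores.argsort()[::-1])  # present highest-scoring articles first
```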
Automatic Requirements Specification Extraction from Natural Language (ARSENAL)
2014-10-01
...designers, implementers) involved in the design of software systems. However, natural language descriptions can be informal, incomplete, imprecise... communication of technical descriptions between the various stakeholders (e.g., customers, designers, implementers) involved in the design of software systems... the accuracy of the natural language processing stage, the degree of automation, and robustness to noise.
Investigation of automated feature extraction using multiple data sources
NASA Astrophysics Data System (ADS)
Harvey, Neal R.; Perkins, Simon J.; Pope, Paul A.; Theiler, James P.; David, Nancy A.; Porter, Reid B.
2003-04-01
An increasing number and variety of platforms are now capable of collecting remote sensing data over a particular scene. For many applications, the information available from any individual sensor may be incomplete, inconsistent or imprecise. However, other sources may provide complementary and/or additional data. Thus, for an application such as image feature extraction or classification, it may be that fusing the multiple data sources can lead to more consistent and reliable results. Unfortunately, with the increased complexity of the fused data, the search space of feature-extraction or classification algorithms also greatly increases. With a single data source, the determination of a suitable algorithm may be a significant challenge for an image analyst. With the fused data, the search for suitable algorithms can go far beyond the capabilities of a human in a realistic time frame, and becomes the realm of machine learning, where the computational power of modern computers can be harnessed to the task at hand. We describe experiments in which we investigate the ability of a suite of automated feature extraction tools developed at Los Alamos National Laboratory to make use of multiple data sources for various feature extraction tasks. We compare and contrast this software's capabilities on (1) individual data sets from different data sources, (2) fused data sets from multiple data sources, and (3) fusion of results from multiple individual data sources.
Context-based automated defect classification system using multiple morphological masks
Gleason, Shaun S.; Hunt, Martin A.; Sari-Sarraf, Hamed
2002-01-01
Automatic detection of defects during the fabrication of semiconductor wafers is largely automated, but the classification of those defects is still performed manually by technicians. This invention includes novel digital image analysis techniques that generate unique feature vector descriptions of semiconductor defects as well as classifiers that use these descriptions to automatically categorize the defects into one of a set of pre-defined classes. Feature extraction techniques based on multiple-focus images, multiple-defect mask images, and segmented semiconductor wafer images are used to create unique feature-based descriptions of the semiconductor defects. These feature-based defect descriptions are subsequently classified by a defect classifier into categories that depend on defect characteristics and defect contextual information, that is, the semiconductor process layer(s) with which the defect comes in contact. At the heart of the system is a knowledge database that stores and distributes historical semiconductor wafer and defect data to guide the feature extraction and classification processes. In summary, this invention takes as its input a set of images containing semiconductor defect information, and generates as its output a classification for the defect that describes not only the defect itself, but also the location of that defect with respect to the semiconductor process layers.
Sigoillot, Frederic D; Huckins, Jeremy F; Li, Fuhai; Zhou, Xiaobo; Wong, Stephen T C; King, Randall W
2011-01-01
Automated time-lapse microscopy can visualize proliferation of large numbers of individual cells, enabling accurate measurement of the frequency of cell division and the duration of interphase and mitosis. However, extraction of quantitative information by manual inspection of time-lapse movies is too time-consuming to be useful for analysis of large experiments. Here we present an automated time-series approach that can measure changes in the duration of mitosis and interphase in individual cells expressing fluorescent histone 2B. The approach requires analysis of only 2 features, nuclear area and average intensity. Compared to supervised learning approaches, this method reduces processing time and does not require generation of training data sets. We demonstrate that this method is as sensitive as manual analysis in identifying small changes in interphase or mitotic duration induced by drug or siRNA treatment. This approach should facilitate automated analysis of high-throughput time-lapse data sets to identify small molecules or gene products that influence timing of cell division.
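The two measurements the method relies on, nuclear area and average fluorescence intensity, can be extracted per nucleus with scikit-image; the synthetic image below stands in for a real H2B-GFP frame, and the threshold used for segmentation is an illustrative choice.

```python
# Per-nucleus area and mean intensity, the two features used by the approach,
# measured with scikit-image on a synthetic stand-in for an H2B-GFP frame.
import numpy as np
from skimage import measure

frame = np.zeros((64, 64), dtype=float)
frame[10:20, 10:22] = 0.8      # "interphase" nucleus: larger, dimmer
frame[40:46, 40:45] = 1.5      # "mitotic" nucleus: smaller, brighter (condensed)

mask = frame > 0.1                                   # toy segmentation by thresholding
labels = measure.label(mask)
for region in measure.regionprops(labels, intensity_image=frame):
    print(region.label, region.area, round(region.mean_intensity, 2))
```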
Stangegaard, Michael; Frøslev, Tobias G; Frank-Hansen, Rune; Hansen, Anders J; Morling, Niels
2011-04-01
We have implemented and validated automated protocols for DNA extraction and PCR setup using a Tecan Freedom EVO liquid handler mounted with the Te-MagS magnetic separation device (Tecan, Männedorf, Switzerland). The protocols were validated for accredited forensic genetic work according to ISO 17025 using the Qiagen MagAttract DNA Mini M48 kit (Qiagen GmbH, Hilden, Germany) from fresh whole blood and blood from deceased individuals. The workflow was simplified by returning the DNA extracts to the original tubes minimizing the risk of misplacing samples. The tubes that originally contained the samples were washed with MilliQ water before the return of the DNA extracts. The PCR was setup in 96-well microtiter plates. The methods were validated for the kits: AmpFℓSTR Identifiler, SGM Plus and Yfiler (Applied Biosystems, Foster City, CA), GenePrint FFFL and PowerPlex Y (Promega, Madison, WI). The automated protocols allowed for extraction and addition of PCR master mix of 96 samples within 3.5h. In conclusion, we demonstrated that (1) DNA extraction with magnetic beads and (2) PCR setup for accredited, forensic genetic short tandem repeat typing can be implemented on a simple automated liquid handler leading to the reduction of manual work, and increased quality and throughput. Copyright © 2011 Society for Laboratory Automation and Screening. Published by Elsevier Inc. All rights reserved.
Automated solid-phase extraction and liquid chromatography for assay of cyclosporine in whole blood.
Kabra, P M; Wall, J H; Dimson, P
1987-12-01
In this rapid, precise, accurate, cost-effective, automated liquid-chromatographic procedure for determining cyclosporine in whole blood, the cyclosporine is extracted from 0.5 mL of whole blood together with 300 micrograms of cyclosporin D per liter, added as internal standard, by using an Advanced Automated Sample Processing unit. The on-line solid-phase extraction is performed on an octasilane sorbent cartridge, which is interfaced with a RP-8 guard column and an octyl analytical column, packed with 5-microns packing material. Both columns are eluted with a mobile phase containing acetonitrile/methanol/water (53/20/27 by vol) at a flow rate of 1.5 mL/min and column temperature of 70 degrees C. Absolute recovery of cyclosporine exceeded 85% and the standard curve was linear to 5000 micrograms/L. Within-run and day-to-day CVs were less than 8%. Correlation between automated and manual Bond-Elut extraction methods was excellent (r = 0.987). None of 18 drugs and four steroids tested interfered.
Supporting the Growing Needs of the GIS Industry
NASA Technical Reports Server (NTRS)
2003-01-01
Visual Learning Systems, Inc. (VLS), of Missoula, Montana, has developed a commercial software application called Feature Analyst. Feature Analyst was conceived under a Small Business Innovation Research (SBIR) contract with NASA's Stennis Space Center, and through the Montana State University TechLink Center, an organization funded by NASA and the U.S. Department of Defense to link regional companies with Federal laboratories for joint research and technology transfer. The software provides a paradigm shift to automated feature extraction, as it utilizes spectral, spatial, temporal, and ancillary information to model the feature extraction process; presents the ability to remove clutter; incorporates advanced machine learning techniques to supply unparalleled levels of accuracy; and includes an exceedingly simple interface for feature extraction.
A Hybrid Human-Computer Approach to the Extraction of Scientific Facts from the Literature.
Tchoua, Roselyne B; Chard, Kyle; Audus, Debra; Qin, Jian; de Pablo, Juan; Foster, Ian
2016-01-01
A wealth of valuable data is locked within the millions of research articles published each year. Reading and extracting pertinent information from those articles has become an unmanageable task for scientists. This problem hinders scientific progress by making it hard to build on results buried in literature. Moreover, these data are loosely structured, encoded in manuscripts of various formats, embedded in different content types, and are, in general, not machine accessible. We present a hybrid human-computer solution for semi-automatically extracting scientific facts from literature. This solution combines an automated discovery, download, and extraction phase with a semi-expert crowd assembled from students to extract specific scientific facts. To evaluate our approach we apply it to a challenging molecular engineering scenario, extraction of a polymer property: the Flory-Huggins interaction parameter. We demonstrate useful contributions to a comprehensive database of polymer properties.
Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection
NASA Astrophysics Data System (ADS)
Safi’ie, M. A.; Utami, E.; Fatta, H. A.
2018-03-01
Universitas Sebelas Maret has a teaching staff of more than 1500 people, and one of its tasks is to carry out research. On the other hand, funding support for research and community service is limited, so submitted research and community service (P2M) proposals need to be evaluated for selection. At the selection stage, research proposal documents are collected as unstructured data, and the volume of stored data is very large. Extracting the information contained in these documents requires text mining technology, which is applied to gain knowledge from the documents by automating information extraction. In this article we apply Latent Dirichlet Allocation (LDA) to the documents as a feature extraction model, obtaining terms that represent each document. We then use the k-Nearest Neighbour (kNN) algorithm to classify the documents based on those terms.
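A minimal version of the described pipeline, LDA topic proportions as features followed by a kNN classifier, can be written with scikit-learn; the proposal snippets, labels, and hyperparameter values are placeholders, not the study's corpus or settings.

```python
# Sketch of the LDA-feature + kNN-classification pipeline described above.
# Proposal texts, labels, and hyperparameters are illustrative placeholders.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import KNeighborsClassifier

proposals = [
    "deep learning model for medical image segmentation",
    "community empowerment program for waste management",
    "convolutional networks for tumor detection in radiology",
    "training villagers in composting and recycling practices",
]
labels = ["accepted", "rejected", "accepted", "rejected"]  # toy selection outcomes

vec = CountVectorizer()
counts = vec.fit_transform(proposals)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(counts)              # document-topic proportions as features

knn = KNeighborsClassifier(n_neighbors=1).fit(topics, labels)
new_topics = lda.transform(vec.transform(["neural network for x-ray image analysis"]))
print(knn.predict(new_topics))
```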
Atkinson, Jonathan A; Lobet, Guillaume; Noll, Manuel; Meyer, Patrick E; Griffiths, Marcus; Wells, Darren M
2017-10-01
Genetic analyses of plant root systems require large datasets of extracted architectural traits. To quantify such traits from images of root systems, researchers often have to choose between automated tools (that are prone to error and extract only a limited number of architectural traits) or semi-automated ones (that are highly time consuming). We trained a Random Forest algorithm to infer architectural traits from automatically extracted image descriptors. The training was performed on a subset of the dataset, then applied to its entirety. This strategy allowed us to (i) decrease the image analysis time by 73% and (ii) extract meaningful architectural traits based on image descriptors. We also show that these traits are sufficient to identify the quantitative trait loci that had previously been discovered using a semi-automated method. We have shown that combining semi-automated image analysis with machine learning algorithms has the power to increase the throughput of large-scale root studies. We expect that such an approach will enable the quantification of more complex root systems for genetic studies. We also believe that our approach could be extended to other areas of plant phenotyping. © The Authors 2017. Published by Oxford University Press.
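The central idea, learning a mapping from automatically extracted image descriptors to measured architectural traits on a training subset and applying it to the rest, can be sketched with a Random Forest regressor; the descriptor values and trait measurements below are synthetic placeholders.

```python
# Sketch of the descriptor-to-trait mapping: train a Random Forest on a
# hand-measured subset, then predict traits for the remainder. Data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
descriptors = rng.random((60, 8))        # automatically extracted image descriptors
# Synthetic "trait" (e.g. total root length, arbitrary units) derived from the descriptors.
traits = descriptors[:, :2].sum(axis=1) + 0.05 * rng.standard_normal(60)

train, test = slice(0, 40), slice(40, 60)   # measured subset vs. automatically predicted rest
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(descriptors[train], traits[train])

print("predicted traits:", model.predict(descriptors[test])[:3])
print("R^2 on held-out roots:", model.score(descriptors[test], traits[test]))
```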
Recognition techniques for extracting information from semistructured documents
NASA Astrophysics Data System (ADS)
Della Ventura, Anna; Gagliardi, Isabella; Zonta, Bruna
2000-12-01
Archives of optical documents are increasingly widely employed, with demand driven in part by new norms sanctioning the legal value of digital documents, provided they are stored on supports that are physically unalterable. On the supply side there is now a vast and technologically advanced market, where optical memories have solved the problem of the duration and permanence of data at costs comparable to those of magnetic memories. The remaining bottleneck in these systems is indexing. The indexing of documents with a variable structure, while still not completely automated, can be machine supported to a large degree, with evident advantages both in the organization of the work and in extracting information, providing data that is much more detailed and potentially significant for the user. We present here a system for the automatic registration of correspondence to and from a public office. The system is based on a general methodology for the extraction, indexing, archiving, and retrieval of significant information from semi-structured documents. This information, in our prototype application, is distributed among the database fields of sender, addressee, subject, date, and body of the document.
Pediatric Brain Extraction Using Learning-based Meta-algorithm
Shi, Feng; Wang, Li; Dai, Yakang; Gilmore, John H.; Lin, Weili; Shen, Dinggang
2012-01-01
Magnetic resonance imaging of the pediatric brain provides valuable information for early brain development studies. Automated brain extraction is challenging due to the small brain size and the dynamic change of tissue contrast in the developing brain. In this paper, we propose a novel Learning Algorithm for Brain Extraction and Labeling (LABEL), designed specifically for pediatric MR brain images. The idea is to perform multiple complementary brain extractions on a given testing image by using a meta-algorithm, including BET and BSE, where the parameters of each run of the meta-algorithm are effectively learned from the training data. Also, representative subjects are selected as exemplars and used to guide brain extraction of new subjects in different age groups. We further develop a level-set based fusion method to combine multiple brain extractions together with a closed smooth surface for obtaining the final extraction. The proposed method has been extensively evaluated in subjects of three representative age groups: neonate (less than 2 months), infant (1–2 years), and child (5–18 years). Experimental results show that, with 45 subjects for training (15 neonates, 15 infants, and 15 children), the proposed method can produce more accurate brain extraction results on 246 testing subjects (75 neonates, 126 infants, and 45 children), i.e., an average Jaccard Index of 0.953, compared to those by BET (0.918), BSE (0.902), ROBEX (0.901), GCUT (0.856), and other fusion methods such as Majority Voting (0.919) and STAPLE (0.941). Along with the largely improved computational efficiency, the proposed method demonstrates its ability to perform automated brain extraction for pediatric MR images across a large age range. PMID:22634859
Bachman, John A; Gyori, Benjamin M; Sorger, Peter K
2018-06-28
For automated reading of scientific publications to extract useful information about molecular mechanisms, it is critical that genes, proteins and other entities be correctly associated with uniform identifiers, a process known as named entity linking or "grounding." Correct grounding is essential for resolving relationships among mined information, curated interaction databases, and biological datasets. The accuracy of this process is largely dependent on the availability of machine-readable resources associating synonyms and abbreviations commonly found in biomedical literature with uniform identifiers. In a task involving automated reading of ∼215,000 articles using the REACH event extraction software, we found that grounding was disproportionately inaccurate for multi-protein families (e.g., "AKT") and complexes with multiple subunits (e.g., "NF-κB"). To address this problem we constructed FamPlex, a manually curated resource defining protein families and complexes as they are commonly encountered in biomedical text. In FamPlex the gene-level constituents of families and complexes are defined in a flexible format allowing for multi-level, hierarchical membership. To create FamPlex, text strings corresponding to entities were identified empirically from literature and linked manually to uniform identifiers; these identifiers were also mapped to equivalent entries in multiple related databases. FamPlex also includes curated prefix and suffix patterns that improve named entity recognition and event extraction. Evaluation of REACH extractions on a test corpus of ∼54,000 articles showed that FamPlex significantly increased grounding accuracy for families and complexes (from 15 to 71%). The hierarchical organization of entities in FamPlex also made it possible to integrate otherwise unconnected mechanistic information across families, subfamilies, and individual proteins. Applications of FamPlex to the TRIPS/DRUM reading system and the Biocreative VI Bioentity Normalization Task dataset demonstrated the utility of FamPlex in other settings. FamPlex is an effective resource for improving named entity recognition, grounding, and relationship resolution in automated reading of biomedical text. The content in FamPlex is available in both tabular and Open Biomedical Ontology formats at https://github.com/sorgerlab/famplex under the Creative Commons CC0 license and has been integrated into the TRIPS/DRUM and REACH reading systems.
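The role such a resource plays during grounding can be illustrated with a small synonym-to-identifier lookup combined with family membership; the entries and identifiers shown are a tiny illustrative subset invented for the example, not the actual FamPlex content.

```python
# Toy grounding step: map text mentions to (namespace, identifier) pairs,
# including family-level entries with gene-level members. Entries and
# identifiers are illustrative only, not the actual FamPlex resource.
GROUNDING_MAP = {
    "AKT": ("FPLX", "AKT"),        # family-level entry (assumed example)
    "AKT1": ("HGNC", "391"),       # gene-level entry (assumed example)
    "NF-kB": ("FPLX", "NFkappaB"), # complex-level entry (assumed example)
}
FAMILY_MEMBERS = {
    ("FPLX", "AKT"): [("HGNC", "391"), ("HGNC", "392"), ("HGNC", "393")],
}

def ground(mention):
    """Return a (namespace, identifier) pair for a text mention, if known."""
    return GROUNDING_MAP.get(mention.strip())

for mention in ["AKT", "AKT1", "NF-kB"]:
    db_ref = ground(mention)
    print(mention, "->", db_ref, "members:", FAMILY_MEMBERS.get(db_ref, []))
```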
Deans, Katherine J; Minneci, Peter C; Nacion, Kristine M; Leonhart, Karen; Cooper, Jennifer N; Scholle, Sarah Hudson; Kelleher, Kelly J
2018-02-22
Preventive quality measures for the foster care population are largely untested. The objective of this study was to identify healthcare quality measures for young children and adolescents in foster care and to test whether the data required to calculate these measures can be feasibly extracted and interpreted within electronic health records or within the Statewide Automated Child Welfare Information System. The AAP Recommendations for Preventive Pediatric Health Care served as the guideline for determining quality measures. Quality measures related to well child visits, developmental screenings, immunizations, trauma-related care, BMI measurements, sexually transmitted infections and depression were defined. Retrospective chart reviews were performed on a cohort of children in foster care from a single large pediatric institution and the related county. Data available in the Ohio Statewide Automated Child Welfare Information System were compared to the same population studied in the electronic health record review. Quality measures were calculated as observed (received) to expected (recommended) ratios (O/E ratios) to describe the actual quantity of recommended health care that was received by individual children. Electronic health records and the Statewide Automated Child Welfare Information System data frequently lacked important information on foster care youth essential for calculating the measures. Although electronic health records were rich in encounter-specific clinical data, they often lacked custodial information such as the dates of entry into and exit from foster care. In contrast, the Statewide Automated Child Welfare Information System included robust data on custodial arrangements, but lacked detailed medical information. Despite these gaps, several quality measures were devised that attempted to accommodate these limitations. In this feasibility testing, neither the electronic health records at a single institution nor the county-level Statewide Automated Child Welfare Information System was able to independently serve as a reliable source of data for healthcare quality measures for foster care youth. However, the ability to leverage both sources by matching them at an individual level may provide the complement of data necessary to assess the quality of healthcare.
Valero, E; Sanz, J; Martínez-Castro, I
2001-06-01
Direct thermal desorption (DTD) has been used as a technique for extracting volatile components of cheese as a preliminary step to their gas chromatographic (GC) analysis. In this study, it is applied to different cheese varieties: Camembert, blue, Chaumes, and La Serena. Volatiles are also extracted using other techniques such as simultaneous distillation-extraction and dynamic headspace. Separation and identification of the cheese components are carried out by GC-mass spectrometry. Approximately 100 compounds are detected in the examined cheeses. The described results show that DTD is fast, simple, and easy to automate; requires only a small amount of sample (approximately 50 mg); and affords quantitative information about the main groups of compounds present in cheeses.
Tchoua, Roselyne B; Qin, Jian; Audus, Debra J; Chard, Kyle; Foster, Ian T; de Pablo, Juan
2016-09-13
Structured databases of chemical and physical properties play a central role in the everyday research activities of scientists and engineers. In materials science, researchers and engineers turn to these databases to quickly query, compare, and aggregate various properties, thereby allowing for the development or application of new materials. The vast majority of these databases have been generated manually, through decades of labor-intensive harvesting of information from the literature; yet, while there are many examples of commonly used databases, a significant number of important properties remain locked within the tables, figures, and text of publications. The question addressed in our work is whether, and to what extent, the process of data collection can be automated. Students of the physical sciences and engineering are often confronted with the challenge of finding and applying property data from the literature, and a central aspect of their education is to develop the critical skills needed to identify such data and discern their meaning or validity. To address shortcomings associated with automated information extraction, while simultaneously preparing the next generation of scientists for their future endeavors, we developed a novel course-based approach in which students develop skills in polymer chemistry and physics and apply their knowledge by assisting with the semi-automated creation of a thermodynamic property database.
Suominen, Hanna; Johnson, Maree; Zhou, Liyuan; Sanchez, Paula; Sirel, Raul; Basilakis, Jim; Hanlen, Leif; Estival, Dominique; Dawson, Linda; Kelly, Barbara
2015-01-01
Objective: We study the use of speech recognition and information extraction to generate drafts of Australian nursing-handover documents. Methods: Speech recognition correctness and clinicians' preferences were evaluated using 15 recorder–microphone combinations, six documents, three speakers, Dragon Medical 11, and five survey/interview participants. Information extraction correctness evaluation used 260 documents, six-class classification for each word, two annotators, and the CRF++ conditional random field toolkit. Results: A noise-cancelling lapel microphone with a digital voice recorder gave the best correctness (79%). This microphone was also the option preferred by all but one participant. Although the participants liked the small size of this recorder, their preference was for tablets that can also be used for document proofing and sign-off, among other tasks. Accented speech was harder to recognize than native speech, and a male speaker was recognized better than a female speaker. Information extraction was excellent in filtering out irrelevant text (85% F1) and identifying text relevant to two classes (87% and 70% F1). Similarly to the annotators' disagreements, there was confusion between the remaining three classes, which explains the modest 62% macro-averaged F1. Discussion: We present evidence for the feasibility of speech recognition and information extraction to support clinicians in entering text and to unlock its content for computerized decision-making and surveillance in healthcare. Conclusions: The benefits of this automation include storing all information; making the drafts available and accessible almost instantly to everyone with authorized access; and avoiding the information loss, delays, and misinterpretations inherent in using a ward clerk or transcription services. PMID:25336589
Zheng, Shuai; Jabbour, Salma K; O'Reilly, Shannon E; Lu, James J; Dong, Lihua; Ding, Lijuan; Xiao, Ying; Yue, Ning; Wang, Fusheng; Zou, Wei
2018-02-01
In outcome studies of oncology patients undergoing radiation, researchers extract valuable information from medical records generated before, during, and after radiotherapy visits, such as survival data, toxicities, and complications. Clinical studies rely heavily on these data to correlate the treatment regimen with the prognosis to develop evidence-based radiation therapy paradigms. These data are available mainly in forms of narrative texts or table formats with heterogeneous vocabularies. Manual extraction of the related information from these data can be time consuming and labor intensive, which is not ideal for large studies. The objective of this study was to adapt the interactive information extraction platform Information and Data Extraction using Adaptive Learning (IDEAL-X) to extract treatment and prognosis data for patients with locally advanced or inoperable non-small cell lung cancer (NSCLC). We transformed patient treatment and prognosis documents into normalized structured forms using the IDEAL-X system for easy data navigation. The adaptive learning and user-customized controlled toxicity vocabularies were applied to extract categorized treatment and prognosis data, so as to generate structured output. In total, we extracted data from 261 treatment and prognosis documents relating to 50 patients, with overall precision and recall more than 93% and 83%, respectively. For toxicity information extractions, which are important to study patient posttreatment side effects and quality of life, the precision and recall achieved 95.7% and 94.5% respectively. The IDEAL-X system is capable of extracting study data regarding NSCLC chemoradiation patients with significant accuracy and effectiveness, and therefore can be used in large-scale radiotherapy clinical data studies. ©Shuai Zheng, Salma K Jabbour, Shannon E O'Reilly, James J Lu, Lihua Dong, Lijuan Ding, Ying Xiao, Ning Yue, Fusheng Wang, Wei Zou. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 01.02.2018.
NASA Astrophysics Data System (ADS)
Koga, Kusuto; Hayashi, Yuichiro; Hirose, Tomoaki; Oda, Masahiro; Kitasaka, Takayuki; Igami, Tsuyoshi; Nagino, Masato; Mori, Kensaku
2014-03-01
In this paper, we propose an automated biliary tract extraction method from abdominal CT volumes. The biliary tract is the path by which bile is transported from the liver to the duodenum. No method has been reported for the automated extraction of the biliary tract from common contrast-enhanced CT volumes. Our method consists of three steps: (1) extraction of extrahepatic bile duct (EHBD) candidate regions, (2) extraction of intrahepatic bile duct (IHBD) candidate regions, and (3) combination of these candidate regions. The IHBD has a linear structure, and its intensities are low in CT volumes. We use a dark linear structure enhancement (DLSE) filter, based on a local intensity structure analysis using the eigenvalues of the Hessian matrix, for IHBD candidate region extraction. The EHBD region is extracted using a thresholding process and a connected component analysis. In the combination process, we connect the IHBD candidate regions to each EHBD candidate region and select a bile duct region from the connected candidate regions. We applied the proposed method to 22 CT volumes. The average Dice coefficient of the extraction results was 66.7%.
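A minimal sketch of Hessian-eigenvalue-based enhancement of dark, line-like structures, using scikit-image's Sato tubeness filter as a stand-in for the DLSE filter described above; this is not the authors' implementation, and the synthetic image and scale range are assumptions.

```python
# Enhance dark line-like structures (e.g. intrahepatic bile ducts) in a 2D slice
# with a Hessian-eigenvalue filter; skimage.filters.sato supports black_ridges.
import numpy as np
from skimage.filters import sato

# Synthetic CT-like slice: bright background with a dark, thin curved line.
img = np.full((128, 128), 200.0)
rows = np.arange(20, 110)
cols = (64 + 20 * np.sin(rows / 15.0)).astype(int)
img[rows, cols] = 60.0  # dark linear structure

# black_ridges=True enhances structures darker than their surroundings;
# sigmas controls the range of line widths (in pixels) that respond.
response = sato(img, sigmas=range(1, 4), black_ridges=True)
candidates = response > 0.5 * response.max()  # simple threshold on the response
print("candidate dark-line pixels:", int(candidates.sum()))
```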
Automated extraction of pleural effusion in three-dimensional thoracic CT images
NASA Astrophysics Data System (ADS)
Kido, Shoji; Tsunomori, Akinori
2009-02-01
For the diagnosis of pulmonary diseases, it is important to quantitatively measure the volume of accumulating pleural effusion in three-dimensional thoracic CT images. However, correct automated extraction of pleural effusion is difficult. A conventional extraction algorithm using a gray-level threshold cannot separate pleural effusion from the thoracic wall or mediastinum correctly, because the density of pleural effusion in CT images is similar to that of the thoracic wall and mediastinum. We have therefore developed an automated method that extracts pleural effusion by extracting the lung area together with the effusion. Our method uses a template of the lung, obtained from a normal lung, for the segmentation of lungs with pleural effusions. The registration process consists of two steps. The first step is a global matching between normal and abnormal lungs based on organs such as the bronchi, bones (ribs, sternum and vertebrae) and the upper surface of the liver, which are extracted using a region-growing algorithm. The second step is a local matching between the normal and abnormal lungs, which are deformed by the parameters obtained from the global matching. Finally, we segment the lung with pleural effusion by use of the template deformed by the two sets of parameters obtained from the global and local matching steps. We compared our method with a conventional gray-level-threshold extraction method and two published methods. The extraction rates of pleural effusion obtained with our method were much higher than those obtained with the other methods. Automated extraction of pleural effusion by extracting the lung area together with the effusion is promising for the diagnosis of pulmonary diseases, because it provides a quantitative volume of the accumulating effusion.
A fully automated liquid–liquid extraction system utilizing interface detection
Maslana, Eugene; Schmitt, Robert; Pan, Jeffrey
2000-01-01
The development of the Abbott Liquid-Liquid Extraction Station was a result of the need for an automated system to perform aqueous extraction on large sets of newly synthesized organic compounds used for drug discovery. The system utilizes a cylindrical laboratory robot to shuttle sample vials between two loading racks, two identical extraction stations, and a centrifuge. Extraction is performed by detecting the phase interface (by difference in refractive index) of the moving column of fluid drawn from the bottom of each vial containing a biphasic mixture. The integration of interface detection with fluid extraction maximizes sample throughput. Abbott-developed electronics process the detector signals. Sample mixing is performed by high-speed solvent injection. Centrifuging the samples reduces interface emulsions. Operating software permits the user to program wash protocols with any one of six solvents per wash cycle, with as many cycle repeats as necessary. Station capacity is eighty 15-ml vials. This system has proven successful with a broad spectrum of both ethyl acetate- and methylene chloride-based chemistries. The development and characterization of this automated extraction system are presented. PMID:18924693
Information extraction from multi-institutional radiology reports.
Hassanpour, Saeed; Langlotz, Curtis P
2016-01-01
The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. We also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value<0.05). Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval. Copyright © 2015 Elsevier B.V. All rights reserved.
Development of Automated Tracking System with Active Cameras for Figure Skating
NASA Astrophysics Data System (ADS)
Haraguchi, Tomohiko; Taki, Tsuyoshi; Hasegawa, Junichi
This paper presents a system based on the control of PTZ cameras for automated real-time tracking of individual figure skaters moving on an ice rink. In the video images of figure skating, irregular trajectories, various postures, rapid movements, and various costume colors are included. Therefore, it is difficult to determine some features useful for image tracking. On the other hand, an ice rink has a limited area and uniform high intensity, and skating is always performed on ice. In the proposed system, an ice rink region is first extracted from a video image by the region growing method, and then, a skater region is extracted using the rink shape information. In the camera control process, each camera is automatically panned and/or tilted so that the skater region is as close to the center of the image as possible; further, the camera is zoomed to maintain the skater image at an appropriate scale. The results of experiments performed for 10 training scenes show that the skater extraction rate is approximately 98%. Thus, it was concluded that tracking with camera control was successful for almost all the cases considered in the study.
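A minimal sketch of the pan/tilt control idea described above: steer the camera so that the centroid of the extracted skater region approaches the image centre. The PTZCamera interface and its methods are entirely hypothetical stand-ins for whatever camera-control API is actually available; only the proportional-control logic is illustrated.

```python
# Proportional pan/tilt correction to keep a tracked region centred in the image.
# PTZCamera and its pan_relative/tilt_relative methods are hypothetical.
from dataclasses import dataclass

@dataclass
class PTZCamera:  # hypothetical camera interface
    def pan_relative(self, degrees: float) -> None: print(f"pan {degrees:+.2f} deg")
    def tilt_relative(self, degrees: float) -> None: print(f"tilt {degrees:+.2f} deg")

def centre_on_target(cam, centroid, image_size, gain_deg_per_px=0.05, deadband_px=10):
    cx, cy = centroid                      # centroid of the extracted skater region
    w, h = image_size
    dx, dy = cx - w / 2.0, cy - h / 2.0    # pixel offset from the image centre
    if abs(dx) > deadband_px:
        cam.pan_relative(gain_deg_per_px * dx)
    if abs(dy) > deadband_px:
        cam.tilt_relative(-gain_deg_per_px * dy)  # image y grows downward

centre_on_target(PTZCamera(), centroid=(520, 180), image_size=(640, 480))
```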
CRIE: An automated analyzer for Chinese texts.
Sung, Yao-Ting; Chang, Tao-Hsing; Lin, Wei-Chun; Hsieh, Kuan-Sheng; Chang, Kuo-En
2016-12-01
Textual analysis has been applied to various fields, such as discourse analysis, corpus studies, text leveling, and automated essay evaluation. Several tools have been developed for analyzing texts written in alphabetic languages such as English and Spanish. However, currently there is no tool available for analyzing Chinese-language texts. This article introduces a tool for the automated analysis of simplified and traditional Chinese texts, called the Chinese Readability Index Explorer (CRIE). Composed of four subsystems and incorporating 82 multilevel linguistic features, CRIE is able to conduct the major tasks of segmentation, syntactic parsing, and feature extraction. Furthermore, the integration of linguistic features with machine learning models enables CRIE to provide leveling and diagnostic information for texts in language arts, texts for learning Chinese as a foreign language, and texts with domain knowledge. The usage and validation of the functions provided by CRIE are also introduced.
aMCfast: automation of fast NLO computations for PDF fits
NASA Astrophysics Data System (ADS)
Bertone, Valerio; Frederix, Rikkert; Frixione, Stefano; Rojo, Juan; Sutton, Mark
2014-08-01
We present the interface between MadGraph5_aMC@NLO, a self-contained program that calculates cross sections up to next-to-leading order accuracy in an automated manner, and APPLgrid, a code that parametrises such cross sections in the form of look-up tables which can be used for the fast computations needed in the context of PDF fits. The main characteristic of this interface, which we dub aMCfast, is its being fully automated as well, which removes the need to extract manually the process-specific information for additional physics processes, as is the case with other matrix-element calculators, and renders it straightforward to include any new process in the PDF fits. We demonstrate this by studying several cases which are easily measured at the LHC, have a good constraining power on PDFs, and some of which were previously unavailable in the form of a fast interface.
An XML-based system for the flexible classification and retrieval of clinical practice guidelines.
Ganslandt, T.; Mueller, M. L.; Krieglstein, C. F.; Senninger, N.; Prokosch, H. U.
2002-01-01
Beneficial effects of clinical practice guidelines (CPGs) have not yet reached expectations due to limited routine adoption. Electronic distribution and reminder systems have the potential to overcome implementation barriers. Existing electronic CPG repositories like the National Guideline Clearinghouse (NGC) provide individual access but lack standardized computer-readable interfaces necessary for automated guideline retrieval. The aim of this paper was to facilitate automated context-based selection and presentation of CPGs. Using attributes from the NGC classification scheme, an XML-based metadata repository was successfully implemented, providing document storage, classification and retrieval functionality. Semi-automated extraction of attributes was implemented for the import of XML guideline documents using XPath. A hospital information system interface was exemplarily implemented for diagnosis-based guideline invocation. Limitations of the implemented system are discussed and possible future work is outlined. Integration of standardized computer-readable search interfaces into existing CPG repositories is proposed. PMID:12463831
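A minimal sketch of the semi-automated, XPath-based attribute extraction idea for XML guideline documents, using Python's standard xml.etree.ElementTree rather than the system described above; the element names and the sample document are invented for illustration.

```python
# Pull classification attributes out of an XML guideline document with XPath-style
# queries; tag names (<guideline>, <title>, <diagnosis code=...>) are hypothetical.
import xml.etree.ElementTree as ET

xml_doc = """
<guideline id="g42">
  <title>Management of community-acquired pneumonia</title>
  <classification>
    <diagnosis code="J18.9">Pneumonia, unspecified organism</diagnosis>
    <specialty>Pulmonology</specialty>
  </classification>
</guideline>
"""

root = ET.fromstring(xml_doc)
metadata = {
    "id": root.get("id"),
    "title": root.findtext("title"),
    "diagnosis_codes": [d.get("code") for d in root.findall(".//diagnosis")],
    "specialties": [s.text for s in root.findall(".//specialty")],
}
print(metadata)  # ready to be stored in a metadata repository for context-based retrieval
```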
Spectral Analysis of Breast Cancer on Tissue Microarrays: Seeing Beyond Morphology
2005-04-01
Automated synthetic scene generation
NASA Astrophysics Data System (ADS)
Givens, Ryan N.
Physics-based simulations generate synthetic imagery to help organizations anticipate system performance of proposed remote sensing systems. However, manually constructing synthetic scenes which are sophisticated enough to capture the complexity of real-world sites can take days to months depending on the size of the site and desired fidelity of the scene. This research, sponsored by the Air Force Research Laboratory's Sensors Directorate, successfully developed an automated approach to fuse high-resolution RGB imagery, lidar data, and hyperspectral imagery and then extract the necessary scene components. The method greatly reduces the time and money required to generate realistic synthetic scenes and developed new approaches to improve material identification using information from all three of the input datasets.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Cáceres, Jesús; Somolinos, Roberto; Pascual, Mario; Martínez, Ignacio; Salvador, Carlos H; Monteagudo, José Luis
2013-01-01
The objective of this paper is to introduce a new language called ccML, designed to provide convenient pragmatic information to applications using the ISO/EN13606 reference model (RM), such as electronic health record (EHR) extract editors. EHR extracts are presently built using the syntactic and semantic information provided in the RM and constrained by archetypes. The extra ccML information enables automated editing of the medico-legal context information, which accounts for over 70% of the total in an extract, without modifying the RM information. ccML is defined using a W3C XML schema file. Valid ccML files complement the RM with additional pragmatics information. The ccML language grammar is defined using formal language theory as a single-type tree grammar. The new language is tested using an EHR extracts editor application as a proof-of-concept system. Seven ccML PVCodes (predefined value codes) are introduced in this grammar to cope with different realistic EHR editing situations. These seven PVCodes have different interpretation strategies, from direct look-up in the ccML file itself to more complex searches in archetypes or system precomputation. The possibility of declaring generic types in ccML gives rise to ambiguity during interpretation. The criterion used to overcome ambiguity is that specificity should prevail over generality; the opposite would make the individual specific element declarations useless. A new mark-up language, ccML, is introduced that opens up the possibility of providing applications using the ISO/EN13606 RM with the necessary pragmatics information to be practical and realistic.
NASA Astrophysics Data System (ADS)
Rossetti, Cecilia; Świtnicka-Plak, Magdalena A.; Grønhaug Halvorsen, Trine; Cormack, Peter A. G.; Sellergren, Börje; Reubsaet, Léon
2017-03-01
Robust biomarker quantification is essential for the accurate diagnosis of diseases and is of great value in cancer management. In this paper, an innovative diagnostic platform is presented which provides automated molecularly imprinted solid-phase extraction (MISPE) followed by liquid chromatography-mass spectrometry (LC-MS) for biomarker determination using ProGastrin Releasing Peptide (ProGRP), a highly sensitive biomarker for Small Cell Lung Cancer, as a model. Molecularly imprinted polymer microspheres were synthesized by precipitation polymerization and analytical optimization of the most promising material led to the development of an automated quantification method for ProGRP. The method enabled analysis of patient serum samples with elevated ProGRP levels. Particularly low sample volumes were permitted using the automated extraction within a method which was time-efficient, thereby demonstrating the potential of such a strategy in a clinical setting.
Burns, Gully A P C; Dasigi, Pradeep; de Waard, Anita; Hovy, Eduard H
2016-01-01
Automated machine-reading biocuration systems typically use sentence-by-sentence information extraction to construct meaning representations for use by curators. This does not directly reflect the typical discourse structure used by scientists to construct an argument from the experimental data available within an article, and is therefore less likely to correspond to representations typically used in biomedical informatics systems (let alone to the mental models that scientists have). In this study, we develop Natural Language Processing methods to locate, extract, and classify the individual passages of text from articles' Results sections that refer to experimental data. In our domain of interest (molecular biology studies of cancer signal transduction pathways), individual articles may contain as many as 30 small-scale individual experiments describing a variety of findings, upon which authors base their overall research conclusions. Our system automatically classifies discourse segments in these texts into seven categories (fact, hypothesis, problem, goal, method, result, implication) with an F-score of 0.68. These segments describe the essential building blocks of scientific discourse to (i) provide context for each experiment, (ii) report experimental details and (iii) explain the data's meaning in context. We evaluate our system on text passages from articles that were curated in molecular biology databases (the Pathway Logic Datum repository, the Molecular Interaction MINT and INTACT databases) linking individual experiments in articles to the type of assay used (coprecipitation, phosphorylation, translocation, etc.). We use supervised machine learning techniques on text passages containing unambiguous references to experiments to obtain baseline F1 scores of 0.59 for MINT, 0.71 for INTACT and 0.63 for Pathway Logic. Although preliminary, these results support the notion that targeting information extraction methods to experimental results could provide accurate, automated methods for biocuration. We also suggest the need for finer-grained curation of the experimental methods used when constructing molecular biology databases. © The Author(s) 2016. Published by Oxford University Press.
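A minimal sketch of a seven-way discourse-segment classifier of the kind evaluated above, using a TF-IDF bag-of-words with logistic regression from scikit-learn; the training snippets are toy examples, and this is not the authors' actual feature set or model.

```python
# Toy seven-class classifier (fact, hypothesis, problem, goal, method, result,
# implication) over text passages; labels and example sentences are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

passages = [
    "RAS proteins are small GTPases.",                        # fact
    "We hypothesised that ERK activation requires MEK.",      # hypothesis
    "It remains unclear how the signal is terminated.",       # problem
    "Our goal was to map the phosphorylation sites.",         # goal
    "Cells were lysed and subjected to coprecipitation.",     # method
    "Band intensity increased threefold after stimulation.",  # result
    "These data imply a feedback loop within the pathway.",   # implication
]
labels = ["fact", "hypothesis", "problem", "goal", "method", "result", "implication"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(passages, labels)
print(clf.predict(["Lysates were immunoblotted with anti-ERK antibody."]))
```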
[Research applications in digital radiology. Big data and co].
Müller, H; Hanbury, A
2016-02-01
Medical imaging produces increasingly complex images (e.g. thinner slices and higher resolution) with more protocols, so that image reading has also become much more complex. More information needs to be processed, and the number of radiologists available for these tasks has usually not increased to the same extent. The objective of this article is to present current research results from projects on the use of image data for clinical decision support. An infrastructure that allows large volumes of data to be accessed is presented. In this way the best performing tools can be identified without the medical data having to leave secure servers. The text presents the results of the EU-funded VISCERAL and Khresmoi projects, which allow the analysis of previous cases from institutional archives to support decision-making and process automation. The results also represent a secure evaluation environment for medical image analysis. This allows data extracted from past cases to be used to meet information needs that arise when diagnosing new cases. The presented research prototypes allow direct extraction of knowledge from the visual data of the images and its use for decision support or process automation. Real clinical use has not yet been tested, but several subjective user tests showed the effectiveness and efficiency of the process. The future of radiology will clearly depend on better use of the important knowledge in clinical image archives to automate processes and aid decision-making via big data analysis. This can help concentrate the work of radiologists on the most important parts of diagnostics.
A solvent-extraction module for cyclotron production of high-purity technetium-99m.
Martini, Petra; Boschi, Alessandra; Cicoria, Gianfranco; Uccelli, Licia; Pasquali, Micòl; Duatti, Adriano; Pupillo, Gaia; Marengo, Mario; Loriggiola, Massimo; Esposito, Juan
2016-12-01
The design and fabrication of a fully-automated, remotely controlled module for the extraction and purification of technetium-99m (Tc-99m), produced by proton bombardment of enriched Mo-100 molybdenum metallic targets in a low-energy medical cyclotron, is here described. After dissolution of the irradiated solid target in hydrogen peroxide, Tc-99m was obtained under the chemical form of pertechnetate (99mTcO4-), in high radionuclidic and radiochemical purity, by solvent extraction with methyl ethyl ketone (MEK). The extraction process was accomplished inside a glass column-shaped vial especially designed to allow for an easy automation of the whole procedure. Recovery yields were always >90% of the loaded activity. The final sodium pertechnetate (Na99mTcO4) saline solution, purified using the automated module here described, is within the Pharmacopoeia quality control parameters and is therefore a valid alternative to generator-produced 99mTc. The resulting automated module is cost-effective and easily replicable for in-house production of high-purity Tc-99m by cyclotrons. Copyright © 2016 Elsevier Ltd. All rights reserved.
Replacement Sequence of Events Generator
NASA Technical Reports Server (NTRS)
Fisher, Forest; Gladden, Roy; Wenkert, Daniel; Khanampompan, Teerapat
2008-01-01
The soeWINDOW program automates the generation of an ITAR (International Traffic in Arms Regulations)-compliant sub-RSOE (Replacement Sequence of Events) by extracting a specified temporal window from an RSOE while maintaining page header information. RSOEs contain a significant amount of information that is not ITAR-compliant, yet that foreign partners need to see for command details to their instrument, as well as the surrounding commands that provide context for validation. soeWINDOW can serve as an example of how command support products can be made ITAR-compliant for future missions. This software is a Perl script intended for use in the mission operations UNIX environment. It is designed for use to support the MRO (Mars Reconnaissance Orbiter) instrument team. The tool also provides automated DOM (Distributed Object Manager) storage into the special ITAR-okay DOM collection, and can be used for creating focused RSOEs for product review by any of the MRO teams.
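A minimal sketch of the core idea of extracting a temporal window from a sequence of events while keeping header information. This is a hypothetical Python illustration, not the soeWINDOW Perl script, and the simple timestamped record format shown is invented.

```python
# Extract events whose timestamps fall inside a requested window, keeping the
# header lines intact; the "HEADER"/timestamped-line format is invented.
from datetime import datetime

FMT = "%Y-%jT%H:%M:%S"  # year, day-of-year, time

def extract_window(lines, start, end):
    header = [ln for ln in lines if ln.startswith("HEADER")]
    events = []
    for ln in lines:
        if ln.startswith("HEADER"):
            continue
        t = datetime.strptime(ln.split()[0], FMT)
        if start <= t <= end:
            events.append(ln)
    return header + events

rsoe = [
    "HEADER MRO RSOE page 1 of 3",
    "2008-041T10:00:00 CMD_A instrument warm-up",
    "2008-041T10:05:00 CMD_B take image",
    "2008-041T11:30:00 CMD_C downlink",
]
sub = extract_window(rsoe,
                     datetime.strptime("2008-041T09:55:00", FMT),
                     datetime.strptime("2008-041T10:10:00", FMT))
print("\n".join(sub))
```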
NASA Astrophysics Data System (ADS)
Ota, Shunsuke; Deguchi, Daisuke; Kitasaka, Takayuki; Mori, Kensaku; Suenaga, Yasuhito; Hasegawa, Yoshinori; Imaizumi, Kazuyoshi; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi
2008-03-01
This paper presents a method for automated anatomical labeling of bronchial branches (ALBB) extracted from 3D CT datasets. The proposed method constructs classifiers that output the anatomical names of bronchial branches by employing a machine-learning approach. We also present its application to a bronchoscopy guidance system. Since the bronchus has a complex tree structure, bronchoscopists can easily become disoriented and lose their way to a target location. A bronchoscopy guidance system is therefore strongly desired to assist bronchoscopists, and in such a guidance system the automated presentation of anatomical names is quite useful. Although several methods for automated ALBB have been reported, most of them constructed models that take only variations of branching patterns into account and do not consider variations in running direction. Since the running directions of bronchial branches differ greatly among individuals, these methods could not perform ALBB accurately when the running directions differed from those of the models. Our method tries to solve these problems by utilizing a machine-learning approach. The actual procedure consists of three steps: (a) extraction of bronchial tree structures from 3D CT datasets, (b) construction of classifiers using the multi-class AdaBoost technique, and (c) automated classification of bronchial branches using the constructed classifiers. We applied the proposed method to 51 3D CT datasets. The constructed classifiers were evaluated by a leave-one-out scheme. The experimental results showed that the proposed method could assign correct anatomical names to 89.1% of bronchial branches up to the segmental branches. We also confirmed that it is quite useful for assisting bronchoscopy to present the anatomical names of bronchial branches on real bronchoscopic views.
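A minimal sketch of multi-class AdaBoost classification of branches from simple geometric features, using scikit-learn's AdaBoostClassifier in place of the authors' implementation; the feature vectors (running-direction components plus branch length) and the anatomical-name labels are fabricated for illustration.

```python
# Multi-class AdaBoost over toy branch features; data and labels are invented,
# not taken from the study.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
# 60 branches, 4 features each: [dx, dy, dz, length]; three loosely separated groups.
X = np.vstack([rng.normal(loc=m, size=(20, 4)) for m in (0.0, 2.0, 4.0)])
y = np.repeat(["RB1", "RB2", "RB3"], 20)     # hypothetical anatomical-name labels

clf = AdaBoostClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print(clf.predict(X[:3]))                    # predicted anatomical names
```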
NASA Astrophysics Data System (ADS)
Jenerowicz, Małgorzata; Kemper, Thomas
2016-10-01
Every year thousands of people are displaced by conflicts or natural disasters and often gather in large camps. Knowing how many people have gathered is crucial for an efficient relief operation; however, it is often difficult to collect exact information on the total population. This paper presents an improved morphological methodology for the estimation of dwelling structures located in several Internally Displaced Persons (IDP) camps, based on Very High Resolution (VHR) multispectral satellite imagery with pixel sizes of 1 meter or less, including GeoEye-1, WorldView-2, QuickBird-2, Ikonos-2, Pléiades-A and Pléiades-B. The main topic of this paper is the enhancement of the approach: the selection of the feature extraction algorithm, and the improvement and automation of pre-processing and results verification. For the extraction of informal and temporary dwellings, high data quality has to be ensured, so the pre-processing has been extended to include input data hierarchy level assignment as well as data fusion method selection and evaluation. The feature extraction algorithm follows the procedure presented in Jenerowicz, M., Kemper, T., 2011. Optical data are analysed in a cyclic approach comprising image segmentation and geometrical, textural and spectral class modeling, aiming at camp area identification. The successive steps of morphological processing have been combined into one stand-alone application for automatic dwelling detection and enumeration. Actively implemented, these approaches can provide reliable and consistent results, independent of the imaging satellite type and study site location, providing decision support in emergency response for the humanitarian community, such as the United Nations, the European Union and non-governmental relief organizations.
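A minimal sketch of the morphological idea behind dwelling detection: a white top-hat transform highlights small bright structures (such as tents and shelters) against the background, after which connected components can be counted. This SciPy-based snippet is only a simplified stand-in for the published workflow; the synthetic image, structuring-element size, and threshold are assumptions.

```python
# Detect and count small bright blobs in a panchromatic-like image using a
# white top-hat transform followed by connected-component labelling.
import numpy as np
from scipy import ndimage as ndi

rng = np.random.default_rng(1)
img = rng.normal(loc=100.0, scale=5.0, size=(200, 200))    # background
for r, c in rng.integers(10, 190, size=(25, 2)):           # 25 synthetic "dwellings"
    img[r - 2:r + 3, c - 2:c + 3] += 60.0                   # small bright squares

# A structuring element larger than a dwelling but smaller than background
# variations keeps only the small bright structures.
tophat = ndi.white_tophat(img, size=9)
mask = tophat > 30.0
labels, n = ndi.label(mask)
print("estimated dwelling count:", n)
```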
Automated Fluid Feature Extraction from Transient Simulations
NASA Technical Reports Server (NTRS)
Haimes, Robert; Lovely, David
1999-01-01
In the past, feature extraction and identification were interesting concepts, but not required to understand the underlying physics of a steady flow field. This is because the results of the more traditional tools like iso-surfaces, cuts and streamlines were more interactive and easily abstracted so they could be represented to the investigator. These tools worked and properly conveyed the collected information at the expense of much interaction. For unsteady flow-fields, the investigator does not have the luxury of spending time scanning only one "snap-shot" of the simulation. Automated assistance is required in pointing out areas of potential interest contained within the flow. This must not require a heavy compute burden (the visualization should not significantly slow down the solution procedure for co-processing environments like pV3), and methods must be developed to abstract the feature and display it in a manner that physically makes sense. The following is a list of the important physical phenomena found in transient (and steady-state) fluid flow: (1) Shocks, (2) Vortex cores, (3) Regions of recirculation, (4) Boundary layers, (5) Wakes. Three papers and an initial specification for the FX (Fluid eXtraction) toolkit Programmer's Guide are included. The papers, submitted to the AIAA Computational Fluid Dynamics Conference, are entitled: (1) Using Residence Time for the Extraction of Recirculation Regions, (2) Shock Detection from Computational Fluid Dynamics Results and (3) On the Velocity Gradient Tensor and Fluid Feature Extraction.
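A minimal sketch of one standard way to flag vortex-core candidate regions from the velocity gradient tensor, the Q-criterion (Q = 0.5 * (||Omega||^2 - ||S||^2) > 0); this is a generic illustration on an analytic 2D velocity field, not the FX toolkit's algorithm, and the field itself is an assumption.

```python
# Q-criterion on a 2D analytic velocity field, where S and Omega are the
# symmetric and antisymmetric parts of the velocity gradient tensor.
import numpy as np

n = 64
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
U = -Y * np.exp(-(X**2 + Y**2))    # simple swirling (vortex-like) field
V = X * np.exp(-(X**2 + Y**2))

dx = x[1] - x[0]
dUdx, dUdy = np.gradient(U, dx, dx)
dVdx, dVdy = np.gradient(V, dx, dx)

# ||S||^2 and ||Omega||^2 assembled from the velocity gradient components.
S2 = dUdx**2 + dVdy**2 + 0.5 * (dUdy + dVdx) ** 2
O2 = 0.5 * (dUdy - dVdx) ** 2
Q = 0.5 * (O2 - S2)

print("fraction of cells flagged as vortex-core candidates:", float((Q > 0).mean()))
```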
Stanislawski, Larry V.; Falgout, Jeff T.; Buttenfield, Barbara P.
2015-01-01
Hydrographic networks form an important data foundation for cartographic base mapping and for hydrologic analysis. Drainage density patterns for these networks can be derived to characterize local landscape, bedrock and climate conditions, and further inform hydrologic and geomorphological analysis by indicating areas where too few headwater channels have been extracted. But natural drainage density patterns are not consistently available in existing hydrographic data for the United States because compilation and capture criteria historically varied, along with climate, during the period of data collection over the various terrain types throughout the country. This paper demonstrates an automated workflow that is being tested in a high-performance computing environment by the U.S. Geological Survey (USGS) to map natural drainage density patterns at the 1:24,000-scale (24K) for the conterminous United States. Hydrographic network drainage patterns may be extracted from elevation data to guide corrections for existing hydrographic network data. The paper describes three stages in this workflow including data pre-processing, natural channel extraction, and generation of drainage density patterns from extracted channels. The workflow is concurrently implemented by executing procedures on multiple subbasin watersheds within the U.S. National Hydrography Dataset (NHD). Pre-processing defines parameters that are needed for the extraction process. Extraction proceeds in standard fashion: filling sinks, developing flow direction and weighted flow accumulation rasters. Drainage channels with assigned Strahler stream order are extracted within a subbasin and simplified. Drainage density patterns are then estimated with 100-meter resolution and subsequently smoothed with a low-pass filter. The extraction process is found to be of better quality in higher slope terrains. Concurrent processing through the high performance computing environment is shown to facilitate and refine the choice of drainage density extraction parameters and more readily improve extraction procedures than conventional processing.
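A minimal sketch of the final step described above: estimating a drainage-density surface from a binary raster of extracted channel cells with a moving-window estimate followed by a low-pass filter. The cell size, window size, and synthetic channel raster are assumptions, and this is not the USGS workflow itself.

```python
# Drainage density (channel length per unit area) approximated per cell from a
# binary channel raster, then smoothed with a uniform low-pass filter.
import numpy as np
from scipy import ndimage as ndi

cell = 100.0                                   # raster cell size in metres
channels = np.zeros((300, 300), dtype=float)
channels[150, :] = 1.0                         # a synthetic east-west channel
channels[:, 75] = 1.0                          # and a tributary

# Each channel cell contributes roughly one cell length of channel; a 1-km
# moving window converts the channel-cell fraction into km of channel per km^2.
window = int(1000 / cell)
frac = ndi.uniform_filter(channels, size=window)
density_km_per_km2 = frac * (cell / 1000.0) / (cell / 1000.0) ** 2
smoothed = ndi.uniform_filter(density_km_per_km2, size=window)  # low-pass smoothing
print("max drainage density (km/km^2):", round(float(smoothed.max()), 2))
```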
Using text mining techniques to extract phenotypic information from the PhenoCHF corpus.
Alnazzawi, Noha; Thompson, Paul; Batista-Navarro, Riza; Ananiadou, Sophia
2015-01-01
Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. Although the scope of our annotation is currently limited to a single disease, the promising results achieved can stimulate further work into the extraction of phenotypic information for other diseases. The PhenoCHF annotation guidelines and annotations are publicly available at https://code.google.com/p/phenochf-corpus.
Current trends in geomorphological mapping
NASA Astrophysics Data System (ADS)
Seijmonsbergen, A. C.
2012-04-01
Geomorphological mapping is a field currently in motion, driven by technological advances and the availability of new high-resolution data. As a consequence, classic (paper) geomorphological maps, which were the standard for more than 50 years, are rapidly being replaced by digital geomorphological information layers. This is witnessed by the following developments: 1. the conversion of classic paper maps into digital information layers, mainly performed in a digital mapping environment such as a Geographical Information System; 2. updating of the location precision and the content of the converted maps, by adding more geomorphological details taken from high-resolution elevation data and/or high-resolution image data; 3. (semi-)automated extraction and classification of geomorphological features from digital elevation models, broadly separated into unsupervised and supervised classification techniques; and 4. new digital visualization/cartographic techniques and reading interfaces. New digital geomorphological information layers can be based on manual digitization of polygons using DEMs and/or aerial photographs, or prepared through (semi-)automated extraction and delineation of geomorphological features. DEMs are often used as a basis to derive Land Surface Parameter information, which is used as input for (un)supervised classification techniques. Especially when using high-resolution data, object-based classification is used as an alternative to traditional pixel-based classifications, to cluster grid cells into homogeneous objects, which can then be classified as geomorphological features. Classic map content can also be used as training material for the supervised classification of geomorphological features. In the classification process, rule-based protocols, including expert-knowledge input, are used to map specific geomorphological features or entire landscapes. Current (semi-)automated classification techniques are increasingly able to extract morphometric, hydrological, and in the near future also morphogenetic information. As a result, these new opportunities have changed the workflows for geomorphological mapmaking, and their focus has shifted from field-based techniques to more computer-based techniques: for example, traditional pre-field air-photo-based maps are now replaced by maps prepared in a digital mapping environment, and designated field visits using mobile GIS/digital mapping devices now focus on gathering location information and attribute inventories and are highly time-efficient. The resulting 'modern geomorphological maps' are digital collections of geomorphological information layers consisting of georeferenced vector, raster and tabular data which are stored in a digital environment such as a GIS geodatabase, and are easily visualized as, for example, bird's-eye views, as animated 3D displays, on virtual globes, or stored as GeoPDF maps in which georeferenced attribute information can be easily exchanged over the internet. Digital geomorphological information layers are increasingly accessed via web-based services distributed through remote servers. Information can be consulted, or even built using remote geoprocessing servers, by the end user. Therefore, it is no longer only the geomorphologist, but also the professional end user, who dictates the applied use of digital geomorphological information layers.
Janiszewski, J; Schneider, P; Hoffmaster, K; Swyden, M; Wells, D; Fouda, H
1997-01-01
The development and application of membrane solid phase extraction (SPE) in 96-well microtiter plate format is described for the automated analysis of drugs in biological fluids. The small bed volume of the membrane allows elution of the analyte in a very small solvent volume, permitting direct HPLC injection and negating the need for the time consuming solvent evaporation step. A programmable liquid handling station (Quadra 96) was modified to automate all SPE steps. To avoid drying of the SPE bed and to enhance the analytical precision a novel protocol for performing the condition, load and wash steps in rapid succession was utilized. A block of 96 samples can now be extracted in 10 min., about 30 times faster than manual solvent extraction or single cartridge SPE methods. This processing speed complements the high-throughput speed of contemporary high performance liquid chromatography mass spectrometry (HPLC/MS) analysis. The quantitative analysis of a test analyte (Ziprasidone) in plasma demonstrates the utility and throughput of membrane SPE in combination with HPLC/MS. The results obtained with the current automated procedure compare favorably with those obtained using solvent and traditional solid phase extraction methods. The method has been used for the analysis of numerous drug prototypes in biological fluids to support drug discovery efforts.
ERIC Educational Resources Information Center
Thompson, Kate; Kennedy-Clark, Shannon; Wheeler, Penny; Kelly, Nick
2014-01-01
This paper describes a technique for locating indicators of success within the data collected from complex learning environments, proposing an application of e-research to access learner processes and measure and track group progress. The technique combines automated extraction of tense and modality via parts-of-speech tagging with a visualisation…
Parsing and Tagging of Bilingual Dictionary
2003-09-01
Bilingual dictionaries hold great potential as a source of lexical resources for training and testing automated systems for optical character recognition, machine translation, and cross-language information retrieval. In this paper, we describe a system for extracting term lexicons from printed bilingual dictionaries.
Ota, Hiroyuki; Lim, Tae-Kyu; Tanaka, Tsuyoshi; Yoshino, Tomoko; Harada, Manabu; Matsunaga, Tadashi
2006-09-18
A novel, automated system, PNE-1080, equipped with eight automated pestle units and a spectrophotometer, was developed for genomic DNA extraction from maize using aminosilane-modified bacterial magnetic particles (BMPs). The use of aminosilane-modified BMPs allowed highly accurate DNA recovery. The (A260 - A320)/(A280 - A320) ratio of the extracted DNA was 1.9 ± 0.1, and the DNA quality was sufficiently pure for PCR analysis. The PNE-1080 offered rapid assay completion (30 min) with high accuracy. Furthermore, the results of real-time PCR confirmed that our proposed method permitted the accurate determination of genetically modified DNA composition and correlated well with results obtained by conventional cetyltrimethylammonium bromide (CTAB)-based methods.
Meystre, Stephane; Gouripeddi, Ramkiran; Tieder, Joel; Simmons, Jeffrey; Srivastava, Rajendu; Shah, Samir
2017-05-15
Community-acquired pneumonia is a leading cause of pediatric morbidity. Administrative data are often used to conduct comparative effectiveness research (CER) with sufficient sample sizes to enhance detection of important outcomes. However, such studies are prone to misclassification errors because of the variable accuracy of discharge diagnosis codes. The aim of this study was to develop an automated, scalable, and accurate method to determine the presence or absence of pneumonia in children using chest imaging reports. The multi-institutional PHIS+ clinical repository was developed to support pediatric CER by expanding an administrative database of children's hospitals with detailed clinical data. To develop a scalable approach to find patients with bacterial pneumonia more accurately, we developed a Natural Language Processing (NLP) application to extract relevant information from chest diagnostic imaging reports. Domain experts established a reference standard by manually annotating 282 reports to train and then test the NLP application. Findings of pleural effusion, pulmonary infiltrate, and pneumonia were automatically extracted from the reports and then used to automatically classify whether a report was consistent with bacterial pneumonia. Compared with the reference standard of annotated diagnostic imaging reports, the most accurate implementation of machine learning algorithms in our NLP application allowed relevant findings to be extracted with a sensitivity of .939 and a positive predictive value of .925. It allowed reports to be classified with a sensitivity of .71, a positive predictive value of .86, and a specificity of .962. When compared with each of the domain experts manually annotating these reports, the NLP application allowed for significantly higher sensitivity (.71 vs .527) and similar positive predictive value and specificity. NLP-based pneumonia information extraction of pediatric diagnostic imaging reports performed better than domain experts in this pilot study. NLP is an efficient method to extract information from a large collection of imaging reports to facilitate CER. ©Stephane Meystre, Ramkiran Gouripeddi, Joel Tieder, Jeffrey Simmons, Rajendu Srivastava, Samir Shah. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 15.05.2017.
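A minimal sketch of how the reported classification measures relate to a confusion matrix (sensitivity, positive predictive value, specificity); the counts below are made up and are not the study's data.

```python
# Sensitivity = TP/(TP+FN), PPV = TP/(TP+FP), specificity = TN/(TN+FP);
# the confusion-matrix counts here are illustrative, not from the study.
def classification_measures(tp, fp, fn, tn):
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "specificity": tn / (tn + fp),
    }

print(classification_measures(tp=71, fp=12, fn=29, tn=300))
```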
Lerch, Oliver; Temme, Oliver; Daldrup, Thomas
2014-07-01
The analysis of opioids, cocaine, and metabolites from blood serum is a routine task in forensic laboratories. Commonly, the employed methods include many manual or partly automated steps like protein precipitation, dilution, solid phase extraction, evaporation, and derivatization preceding a gas chromatography (GC)/mass spectrometry (MS) or liquid chromatography (LC)/MS analysis. In this study, a comprehensively automated method was developed from a validated, partly automated routine method. This was possible by replicating method parameters on the automated system; only marginal optimization of parameters was necessary. The automation, relying on an x-y-z robot after manual protein precipitation, includes the solid phase extraction, evaporation of the eluate, derivatization (silylation with N-methyl-N-trimethylsilyltrifluoroacetamide, MSTFA), and injection into a GC/MS. A quantitative analysis of almost 170 authentic serum samples and more than 50 authentic samples of other matrices like urine, different tissues, and heart blood for cocaine, benzoylecgonine, methadone, morphine, codeine, 6-monoacetylmorphine, dihydrocodeine, and 7-aminoflunitrazepam was conducted with both methods, proving that the analytical results are equivalent even near the limits of quantification (low ng/ml range). To the best of our knowledge, this application is the first one reported in the literature employing this sample preparation system.
Natural Language Processing in Radiology: A Systematic Review.
Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A
2016-05-01
Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. (©) RSNA, 2016 Online supplemental material is available for this article.
The 1984 NASA/ASEE summer faculty fellowship program
NASA Technical Reports Server (NTRS)
1984-01-01
The assessment of forest productivity and associated nitrogen flux in a number of conifer ecosystems is described. A baseline study of acid precipitation in the Sierra Nevada involves the extraction and integration of a number of data planes describing the terrain, soils, lithology, vegetation cover and structure, and microclimate of the region. The development of automated techniques to extract topographic networks (stream canyons and ridge lines) for use as a landscape skeleton to organize and integrate data sets into an efficient geographical information system is examined. The software is written in both FORTRAN and C, and is portable to a number of different computer environments with minimal modification.
DEXTER: Disease-Expression Relation Extraction from Text.
Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K
2018-01-01
Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research and for the diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from the literature through manual curation can extend expression databases. While many of the existing databases include information from the literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER), to extract information from the literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags behind significantly compared to expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51% and 81.81% for the two evaluations, respectively. Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers, and 826 microRNAs in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress. Database URL: http://biotm.cis.udel.edu/DEXTER.
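A minimal sketch of the kind of pattern that can pick up gene-expression-disease statements in text; this is a toy regular expression, not DEXTER's actual extraction rules, and the sentences are invented.

```python
# Toy pattern for "GENE is over/underexpressed in DISEASE" statements; real
# systems such as DEXTER use far richer linguistic processing than this.
import re

pattern = re.compile(
    r"(?P<gene>[A-Z0-9]{2,})\s+is\s+(?P<direction>over|under)expressed\s+in\s+"
    r"(?P<disease>[\w\s-]+?)(?:[.,]|$)",
    re.IGNORECASE,
)

sentences = [
    "EGFR is overexpressed in lung adenocarcinoma.",
    "MIR21 is underexpressed in normal tissue, according to one report.",
]
for s in sentences:
    m = pattern.search(s)
    if m:
        print(m.group("gene"), m.group("direction") + "expressed", "|",
              m.group("disease").strip())
```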
Towards automated segmentation of cells and cell nuclei in nonlinear optical microscopy.
Medyukhina, Anna; Meyer, Tobias; Schmitt, Michael; Romeike, Bernd F M; Dietzek, Benjamin; Popp, Jürgen
2012-11-01
Nonlinear optical (NLO) imaging techniques based, e.g., on coherent anti-Stokes Raman scattering (CARS) or two-photon excited fluorescence (TPEF) show great potential for biomedical imaging. In order to facilitate the diagnostic process based on NLO imaging, there is a need for automated calculation of quantitative values such as cell density, nucleus-to-cytoplasm ratio, and average nuclear size. Extraction of these parameters is helpful for histological assessment in general and specifically, e.g., for the determination of tumor grades. This requires accurate image segmentation and detection of the locations and boundaries of cells and nuclei. Here we present an image processing approach for the detection of nuclei and cells in co-registered TPEF and CARS images. The algorithm developed utilizes the gray-scale information for the detection of nuclei locations and the gradient information for the delineation of the nuclear and cellular boundaries. The reported approach is capable of automated segmentation of cells and nuclei in multimodal TPEF-CARS images of human brain tumor samples. The results are important for the development of NLO microscopy into a clinically relevant diagnostic tool. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
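A minimal sketch of one common route to automated nucleus segmentation (threshold, distance transform, marker-based watershed) using scikit-image; this is a generic pipeline on a synthetic two-blob image, not the authors' TPEF-CARS method.

```python
# Generic nucleus segmentation: Otsu threshold -> distance transform -> markers
# from local maxima -> watershed; the synthetic image is an assumption.
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

yy, xx = np.mgrid[0:120, 0:120]
img = np.exp(-((yy - 40) ** 2 + (xx - 40) ** 2) / 200.0) \
    + np.exp(-((yy - 75) ** 2 + (xx - 80) ** 2) / 200.0)   # two "nuclei"

mask = img > threshold_otsu(img)
distance = ndi.distance_transform_edt(mask)
coords = peak_local_max(distance, footprint=np.ones((15, 15)), labels=mask)
markers = np.zeros(distance.shape, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
labels = watershed(-distance, markers, mask=mask)
print("segmented nuclei:", labels.max(),
      "mean size (px):", int(mask.sum() / labels.max()))
```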
Ruan, W; Bürkle, T; Dudeck, J
2000-01-01
In this paper we present a data dictionary server for the automated navigation of information sources. The underlying knowledge is represented within a medical data dictionary. The mapping between medical terms and information sources is based on a semantic network. The key aspect of implementing the dictionary server is how to represent the semantic network in a way that is easier to navigate and to operate, i.e. how to abstract the semantic network and to represent it in memory for various operations. This paper describes an object-oriented design based on Java that represents the semantic network in terms of a group of objects. A node and its relationships to its neighbors are encapsulated in one object. Based on such a representation model, several operations have been implemented. They comprise the extraction of parts of the semantic network which can be reached from a given node as well as finding all paths between a start node and a predefined destination node. This solution is independent of any given layout of the semantic structure. Therefore the module, called Giessen Data Dictionary Server can act independent of a specific clinical information system. The dictionary server will be used to present clinical information, e.g. treatment guidelines or drug information sources to the clinician in an appropriate working context. The server is invoked from clinical documentation applications which contain an infobutton. Automated navigation will guide the user to all the information relevant to her/his topic, which is currently available inside our closed clinical network.
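A minimal sketch, in Python rather than the Java design described above, of the two graph operations mentioned: finding all nodes reachable from a given term, and enumerating all paths between a start node and a destination node in a semantic network; the tiny network itself is invented.

```python
# Reachability (BFS) and all-simple-paths (DFS) over a toy semantic network
# stored as an adjacency dict; node names are invented examples.
from collections import deque

network = {
    "pneumonia": ["antibiotic therapy", "chest x-ray"],
    "antibiotic therapy": ["drug information"],
    "chest x-ray": ["radiology guideline"],
    "drug information": [],
    "radiology guideline": ["drug information"],
}

def reachable(graph, start):
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def all_paths(graph, start, goal, path=None):
    path = (path or []) + [start]
    if start == goal:
        return [path]
    return [p for nxt in graph.get(start, []) if nxt not in path
            for p in all_paths(graph, nxt, goal, path)]

print(reachable(network, "pneumonia"))
print(all_paths(network, "pneumonia", "drug information"))
```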
Ni, Yan; Su, Mingming; Qiu, Yunping; Jia, Wei
2017-01-01
ADAP-GC is an automated computational pipeline for untargeted, GC-MS-based metabolomics studies. It takes raw mass spectrometry data as input and carries out a sequence of data processing steps including construction of extracted ion chromatograms, detection of chromatographic peak features, deconvolution of co-eluting compounds, and alignment of compounds across samples. Despite the increased accuracy from the original version to version 2.0 in terms of extracting metabolite information for identification and quantitation, ADAP-GC 2.0 requires appropriate specification of a number of parameters and has difficulty in extracting information on compounds that are in low concentration. To overcome these two limitations, ADAP-GC 3.0 was developed to improve both the robustness and sensitivity of compound detection. In this paper, we report how these goals were achieved and compare ADAP-GC 3.0 against three other software tools including ChromaTOF, AnalyzerPro, and AMDIS that are widely used in the metabolomics community. PMID:27461032
Ni, Yan; Su, Mingming; Qiu, Yunping; Jia, Wei; Du, Xiuxia
2016-09-06
ADAP-GC is an automated computational pipeline for untargeted, GC/MS-based metabolomics studies. It takes raw mass spectrometry data as input and carries out a sequence of data processing steps including construction of extracted ion chromatograms, detection of chromatographic peak features, deconvolution of coeluting compounds, and alignment of compounds across samples. Despite the increased accuracy from the original version to version 2.0 in terms of extracting metabolite information for identification and quantitation, ADAP-GC 2.0 requires appropriate specification of a number of parameters and has difficulty in extracting information on compounds that are in low concentration. To overcome these two limitations, ADAP-GC 3.0 was developed to improve both the robustness and sensitivity of compound detection. In this paper, we report how these goals were achieved and compare ADAP-GC 3.0 against three other software tools including ChromaTOF, AnalyzerPro, and AMDIS that are widely used in the metabolomics community.
Aprea, Eugenio; Gika, Helen; Carlin, Silvia; Theodoridis, Georgios; Vrhovsek, Urska; Mattivi, Fulvio
2011-07-15
A headspace SPME GC-TOF-MS method was developed for the acquisition of metabolite profiles of apple volatiles. As a first step, an experimental design was applied to find out the most appropriate conditions for the extraction of apple volatile compounds by SPME. The selected SPME method was applied in profiling of four different apple varieties by GC-EI-TOF-MS. Full scan GC-MS data were processed by MarkerLynx software for peak picking, normalisation, alignment and feature extraction. Advanced chemometric/statistical techniques (PCA and PLS-DA) were used to explore data and extract useful information. Characteristic markers of each variety were successively identified using the NIST library thus providing useful information for variety classification. The developed HS-SPME sampling method is fully automated and proved useful in obtaining the fingerprint of the volatile content of the fruit. The described analytical protocol can aid in further studies of the apple metabolome. Copyright © 2011 Elsevier B.V. All rights reserved.
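As a hedged illustration of the chemometric step described above, the sketch below runs PCA on a stand-in peak-intensity table with scikit-learn; the MarkerLynx preprocessing and the supervised PLS-DA step are not reproduced, and the data are random placeholders rather than real apple profiles.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Stand-in feature table: 24 apple samples (4 varieties x 6 replicates) x 200 aligned peaks
X = rng.normal(size=(24, 200))
X[:6, :10] += 3.0                       # crude variety-specific marker peaks
variety = np.repeat(["A", "B", "C", "D"], 6)

scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
for v in "ABCD":
    centroid = scores[variety == v].mean(axis=0)
    print(v, np.round(centroid, 2))     # variety centroids in the score plot
```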
Chatterjee, Anirban; Mirer, Paul L; Zaldivar Santamaria, Elvira; Klapperich, Catherine; Sharon, Andre; Sauer-Budge, Alexis F
2010-06-01
The life science and healthcare communities have been redefining the importance of ribonucleic acid (RNA) through the study of small RNA molecules (in RNAi/siRNA technologies), microRNA (in cancer research and stem cell research), and mRNA (gene expression analysis for biologic drug targets). Research in this field increasingly requires efficient and high-throughput isolation techniques for RNA. Currently, several commercial kits are available for isolating RNA from cells. Although the quality and quantity of RNA yielded from these kits are sufficiently good for many purposes, limitations exist in terms of extraction efficiency from small cell populations and the ability to automate the extraction process. Traditionally, automating a process decreases the cost and personnel time while simultaneously increasing the throughput and reproducibility. As the RNA field matures, new methods for automating its extraction, especially from low cell numbers and in high throughput, are needed to achieve these improvements. The technology presented in this article is a step toward this goal. The method is based on a solid-phase extraction technology using a porous polymer monolith (PPM). A novel cell lysis approach and a larger binding surface throughout the PPM extraction column ensure a high yield from small starting samples, increasing sensitivity and reducing indirect costs in cell culture and sample storage. The method ensures a fast and simple procedure for RNA isolation from eukaryotic cells, with a high yield both in terms of quality and quantity. The technique is amenable to automation and streamlined workflow integration, with possible miniaturization of the sample handling process making it suitable for high-throughput applications.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Cáceres, Jesús; Somolinos, Roberto; Pascual, Mario; Martínez, Ignacio; Salvador, Carlos H; Monteagudo, José Luis
2013-01-01
Objective: The objective of this paper is to introduce a new language called ccML, designed to provide convenient pragmatic information to applications using the ISO/EN13606 reference model (RM), such as electronic health record (EHR) extract editors. EHR extracts are presently built using the syntactic and semantic information provided in the RM and constrained by archetypes. The extra ccML information enables the automation of editing the medico-legal context information, which accounts for over 70% of the information in an extract, without modifying the RM information. Materials and Methods: ccML is defined using a W3C XML schema file. Valid ccML files complement the RM with additional pragmatics information. The ccML language grammar is defined using formal language theory as a single-type tree grammar. The new language is tested using an EHR extracts editor application as a proof-of-concept system. Results: Seven ccML PVCodes (predefined value codes) are introduced in this grammar to cope with different realistic EHR editing situations. These seven PVCodes have different interpretation strategies, from direct look-up in the ccML file itself to more complex searches in archetypes or system precomputation. Discussion: The possibility to declare generic types in ccML gives rise to ambiguity during interpretation. The criterion used to overcome ambiguity is that specificity should prevail over generality. The opposite would make the individual specific element declarations useless. Conclusion: A new mark-up language, ccML, is introduced that opens up the possibility of providing applications using the ISO/EN13606 RM with the necessary pragmatics information to be practical and realistic. PMID:23019241
Automated Text Markup for Information Retrieval from an Electronic Textbook of Infectious Disease
Berrios, Daniel C.; Kehler, Andrew; Kim, David K.; Yu, Victor L.; Fagan, Lawrence M.
1998-01-01
The information needs of practicing clinicians frequently require textbook or journal searches. Making these sources available in electronic form improves the speed of these searches, but precision (i.e., the fraction of relevant to total documents retrieved) remains low. Improving the traditional keyword search by transforming search terms into canonical concepts does not improve search precision greatly. Kim et al. have designed and built a prototype system (MYCIN II) for computer-based information retrieval from a forthcoming electronic textbook of infectious disease. The system requires manual indexing by experts in the form of complex text markup. However, this markup process is time-consuming (about 3 person-hours to generate, review, and transcribe the index for each of 218 chapters). We have designed and implemented a system to semiautomate the markup process. The system, Information Extraction for Semiautomated Indexing of Documents (ISAID), uses query models and existing information-extraction tools to provide support for any user, including the author of the source material, to mark up tertiary information sources quickly and accurately.
A thesis on the Development of an Automated SWIFT Edge Detection Algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trujillo, Christopher J.
Throughout the world, scientists and engineers, such as those at Los Alamos National Laboratory, perform research and testing unique to applications aimed at advancing technology and understanding the nature of materials. With this testing comes a need for advanced methods of data acquisition and, most importantly, a means of analyzing and extracting the necessary information from such acquired data. In this thesis, I aim to produce an automated method implementing advanced image processing techniques and tools to analyze SWIFT image datasets for Detonator Technology at Los Alamos National Laboratory. Such an effective method for edge detection and point extraction can prove to be advantageous in analyzing such unique datasets and provide for consistency in producing results.
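The thesis abstract does not disclose the actual SWIFT algorithm; as a generic stand-in for automated edge detection and point extraction, the sketch below applies a Canny detector from scikit-image to a synthetic frame. The parameters and the test image are illustrative only.

```python
import numpy as np
from skimage import feature, draw

# Synthetic stand-in for a SWIFT frame: a bright disc on a dark background
img = np.zeros((200, 200))
rr, cc = draw.disk((100, 100), 60)
img[rr, cc] = 1.0
img += np.random.default_rng(1).normal(scale=0.05, size=img.shape)

edges = feature.canny(img, sigma=3)          # automated edge detection
ys, xs = np.nonzero(edges)                   # extracted edge points for downstream analysis
print(f"{len(xs)} edge points; mean radius ≈ {np.hypot(ys - 100, xs - 100).mean():.1f} px")
```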
The LabTube - a novel microfluidic platform for assay automation in laboratory centrifuges.
Kloke, A; Fiebach, A R; Zhang, S; Drechsel, L; Niekrawietz, S; Hoehl, M M; Kneusel, R; Panthel, K; Steigert, J; von Stetten, F; Zengerle, R; Paust, N
2014-05-07
Assay automation is the key for successful transformation of modern biotechnology into routine workflows. Yet, it requires considerable investment in processing devices and auxiliary infrastructure, which is not cost-efficient for laboratories with low or medium sample throughput or point-of-care testing. To close this gap, we present the LabTube platform, which is based on assay-specific disposable cartridges for processing in laboratory centrifuges. LabTube cartridges comprise interfaces for sample loading and downstream applications and fluidic unit operations for release of prestored reagents, mixing, and solid phase extraction. Process control is achieved by a centrifugally-actuated ballpen mechanism. To demonstrate the workflow and functionality of the LabTube platform, we show two LabTube-automated sample preparation assays from laboratory routines: DNA extractions from whole blood and purification of His-tagged proteins. Equal DNA and protein yields were observed compared to manual reference runs, while LabTube automation could significantly reduce the hands-on time to one minute per extraction.
Integrated Computational System for Aerodynamic Steering and Visualization
NASA Technical Reports Server (NTRS)
Hesselink, Lambertus
1999-01-01
In February of 1994, an effort from the Fluid Dynamics and Information Sciences Divisions at NASA Ames Research Center with McDonnell Douglas Aerospace Company and Stanford University was initiated to develop, demonstrate, validate and disseminate automated software for numerical aerodynamic simulation. The goal of the initiative was to develop a tri-discipline approach encompassing CFD, Intelligent Systems, and Automated Flow Feature Recognition to improve the utility of CFD in the design cycle. This approach would then be represented through an intelligent computational system which could accept an engineer's definition of a problem and construct an optimal and reliable CFD solution. Stanford University's role focused on developing technologies that advance visualization capabilities for analysis of CFD data, extract specific flow features useful for the design process, and compare CFD data with experimental data. During the years 1995-1997, Stanford University focused on developing techniques in the area of tensor visualization and flow feature extraction. Software libraries were created enabling feature extraction and exploration of tensor fields. As a proof of concept, a prototype system called the Integrated Computational System (ICS) was developed to demonstrate the CFD design cycle. The current research effort focuses on finding a quantitative comparison of general vector fields based on topological features. Since the method relies on topological information, grid matching and vector alignment are not needed in the comparison. This is often a problem with many data comparison techniques. In addition, since only topology-based information is stored and compared for each field, there is a significant compression of information that enables large databases to be quickly searched. This report will (1) briefly review the technologies developed during 1995-1997, (2) describe current technologies in the area of comparison techniques, (3) describe the theory of our new method researched during the grant year, (4) summarize a few of the results, and finally (5) discuss work within the last 6 months that is a direct extension of the grant.
Yera, H.; Filisetti, D.; Bastien, P.; Ancelle, T.; Thulliez, P.; Delhaes, L.
2009-01-01
Over the past few years, a number of new nucleic acid extraction methods and extraction platforms using chemistry combined with magnetic or silica particles have been developed, in combination with instruments to facilitate the extraction procedure. The objective of the present study was to investigate the suitability of these automated methods for the isolation of Toxoplasma gondii DNA from amniotic fluid (AF). Therefore, three automated procedures were compared to two commercialized manual extraction methods. The MagNA Pure Compact (Roche), BioRobot EZ1 (Qiagen), and easyMAG (bioMérieux) automated procedures were compared to two manual DNA extraction kits, the QIAamp DNA minikit (Qiagen) and the High Pure PCR template preparation kit (Roche). Evaluation was carried out with two specific Toxoplasma PCRs (targeting the 529-bp repeat element), inhibitor search PCRs, and human beta-globin PCRs. The samples each consisted of 4 ml of AF with or without a calibrated Toxoplasma gondii RH strain suspension (0, 1, 2.5, 5, and 25 tachyzoites/ml). All PCR assays were laboratory-developed real-time PCR assays, using either TaqMan or fluorescent resonance energy transfer probes. A total of 1,178 PCRs were performed, including 978 Toxoplasma PCRs. The automated and manual methods were similar in sensitivity for DNA extraction from T. gondii at the highest concentration (25 Toxoplasma gondii cells/ml). However, our results showed that the DNA extraction procedures led to variable efficacy in isolating low concentrations of tachyzoites in AF samples (<5 Toxoplasma gondii cells/ml), a difference that might have repercussions since low parasite concentrations in AF exist and can lead to congenital toxoplasmosis. PMID:19846633
An Automated Solar Synoptic Analysis Software System
NASA Astrophysics Data System (ADS)
Hong, S.; Lee, S.; Oh, S.; Kim, J.; Lee, J.; Kim, Y.; Lee, J.; Moon, Y.; Lee, D.
2012-12-01
We have developed an automated software system for identifying solar active regions, filament channels, and coronal holes, which are three major solar sources of space weather. Space weather forecasters at the NOAA Space Weather Prediction Center produce solar synoptic drawings on a daily basis to predict solar activities, i.e., solar flares, filament eruptions, high-speed solar wind streams, and co-rotating interaction regions, as well as their possible effects on the Earth. As an attempt to emulate this process in a fully automated and consistent way, we developed a software system named ASSA (Automated Solar Synoptic Analysis). When identifying solar active regions, ASSA uses high-resolution SDO HMI intensitygrams and magnetograms as inputs and provides the McIntosh classification and Mt. Wilson magnetic classification of each active region by applying appropriate image processing techniques such as thresholding, morphology extraction, and region growing. At the same time, it also extracts morphological and physical properties of active regions in a quantitative way for the short-term prediction of flares and CMEs. When identifying filament channels and coronal holes, images from the global H-alpha network and SDO AIA 193 are used for morphological identification, and SDO HMI magnetograms for quantitative verification. The output results of ASSA are routinely checked and validated against NOAA's daily SRS (Solar Region Summary) and UCOHO (URSIgram code for coronal hole information). A couple of preliminary scientific results are to be presented using available output results. ASSA will be deployed at the Korean Space Weather Center and serve its customers in an operational status by the end of 2012.
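ASSA's exact implementation is not given here; the sketch below only illustrates the generic thresholding-plus-morphology-plus-labeling pattern mentioned in the abstract, applied to a synthetic magnetogram with scikit-image. The threshold, structuring element, and minimum-area cut are placeholder choices, not ASSA's settings.

```python
import numpy as np
from skimage import filters, morphology, measure

def detect_regions(magnetogram, min_area=25):
    """Threshold the absolute field, clean it up morphologically, and label regions.
    A generic sketch in the spirit of the described pipeline, not its actual code."""
    strong = np.abs(magnetogram) > filters.threshold_otsu(np.abs(magnetogram))
    strong = morphology.binary_opening(strong, morphology.disk(2))   # remove salt noise
    labels = measure.label(strong)
    return [r for r in measure.regionprops(labels, intensity_image=magnetogram)
            if r.area >= min_area]

rng = np.random.default_rng(0)
mag = rng.normal(scale=5, size=(256, 256))
mag[60:90, 60:90] += 300          # synthetic bipolar "active region"
mag[150:180, 150:190] -= 300
for r in detect_regions(mag):
    print(r.centroid, r.area, round(r.mean_intensity, 1))
```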
Peters, Sonja; Kaal, Erwin; Horsting, Iwan; Janssen, Hans-Gerd
2012-02-24
A new method is presented for the analysis of phenolic acids in plasma based on ion-pairing 'Micro-extraction in packed sorbent' (MEPS) coupled on-line to in-liner derivatisation-gas chromatography-mass spectrometry (GC-MS). The ion-pairing reagent served a dual purpose. It was used both to improve extraction yields of the more polar analytes and as the methyl donor in the automated in-liner derivatisation method. In this way, a fully automated procedure for the extraction, derivatisation and injection of a wide range of phenolic acids in plasma samples has been obtained. An extensive optimisation of the extraction and derivatisation procedure has been performed. The entire method showed excellent repeatabilities of under 10% and linearities of 0.99 or better for all phenolic acids. The limits of detection of the optimised method for the majority of phenolic acids were 10 ng/mL or lower, with three phenolic acids having less-favourable detection limits of around 100 ng/mL. Finally, the newly developed method has been applied in a human intervention trial in which the bioavailability of polyphenols from wine and tea was studied. Forty plasma samples could be analysed within 24 h in a fully automated method including sample extraction, derivatisation and gas chromatographic analysis. Copyright © 2011 Elsevier B.V. All rights reserved.
Composite Wavelet Filters for Enhanced Automated Target Recognition
NASA Technical Reports Server (NTRS)
Chiang, Jeffrey N.; Zhang, Yuhan; Lu, Thomas T.; Chao, Tien-Hsin
2012-01-01
Automated Target Recognition (ATR) systems aim to automate target detection, recognition, and tracking. The current project applies a JPL ATR system to low-resolution sonar and camera videos taken from unmanned vehicles. These sonar images are inherently noisy and difficult to interpret, and pictures taken underwater are unreliable due to murkiness and inconsistent lighting. The ATR system breaks target recognition into three stages: 1) Videos of both sonar and camera footage are broken into frames and preprocessed to enhance images and detect Regions of Interest (ROIs). 2) Features are extracted from these ROIs in preparation for classification. 3) ROIs are classified as true or false positives using a standard Neural Network based on the extracted features. Several preprocessing, feature extraction, and training methods are tested and discussed in this paper.
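As a hedged sketch of the three-stage idea described above (ROIs, hand-crafted features, neural-network classification of true versus false positives), the code below trains a small scikit-learn MLP on synthetic ROIs. The features and data are stand-ins, not the JPL ATR system's actual preprocessing or network.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def roi_features(roi):
    """Simple hand-crafted features per ROI (illustrative stand-ins for stage 2)."""
    return [roi.mean(), roi.std(), roi.max(), (roi > roi.mean()).mean()]

# Synthetic ROIs (stage 1 stand-in): "targets" are brighter patches than clutter
targets = [rng.normal(1.0, 0.3, (16, 16)) for _ in range(200)]
clutter = [rng.normal(0.0, 0.3, (16, 16)) for _ in range(200)]
X = np.array([roi_features(r) for r in targets + clutter])
y = np.array([1] * 200 + [0] * 200)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(Xtr, ytr)
print("held-out accuracy:", round(clf.score(Xte, yte), 3))   # stage 3: true/false positive decision
```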
ICECAP: an integrated, general-purpose, automation-assisted IC50/EC50 assay platform.
Li, Ming; Chou, Judy; King, Kristopher W; Jing, Jing; Wei, Dong; Yang, Liyu
2015-02-01
IC50 and EC50 values are commonly used to evaluate drug potency. Mass spectrometry (MS)-centric bioanalytical and biomarker labs are now conducting IC50/EC50 assays, which, if done manually, are tedious and error-prone. Existing bioanalytical sample preparation automation systems cannot meet IC50/EC50 assay throughput demand. A general-purpose, automation-assisted IC50/EC50 assay platform was developed to automate the calculations of spiking solutions and the matrix solutions preparation scheme, the actual spiking and matrix solutions preparations, as well as the flexible sample extraction procedures after incubation. In addition, the platform also automates the data extraction, nonlinear regression curve fitting, computation of IC50/EC50 values, graphing, and reporting. The automation-assisted IC50/EC50 assay platform can process a whole class of assays with varying assay conditions. In each run, the system can handle up to 32 compounds and up to 10 concentration levels per compound, and it greatly improves IC50/EC50 assay experimental productivity and data processing efficiency. © 2014 Society for Laboratory Automation and Screening.
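The platform's own curve-fitting code is not shown in the abstract; the following sketch illustrates the standard four-parameter logistic fit commonly used to obtain IC50/EC50 values, here with SciPy on simulated data. The parameter values and data are arbitrary.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(c, bottom, top, log_ic50, hill):
    """Four-parameter logistic dose-response model (IC50 parametrized on a log scale)."""
    return bottom + (top - bottom) / (1.0 + (c / np.exp(log_ic50)) ** hill)

conc = np.array([0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100])            # arbitrary units
resp = four_pl(conc, 5, 100, np.log(2.0), 1.2) \
       + np.random.default_rng(0).normal(0, 2, conc.size)              # simulated responses

popt, _ = curve_fit(four_pl, conc, resp, p0=[0.0, 100.0, 0.0, 1.0])
print(f"fitted IC50 ≈ {np.exp(popt[2]):.2f}")                           # should recover ~2.0
```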
Mapping Urban Ecosystem Services Using High Resolution Aerial Photography
NASA Astrophysics Data System (ADS)
Pilant, A. N.; Neale, A.; Wilhelm, D.
2010-12-01
Ecosystem services (ES) are the many life-sustaining benefits we receive from nature: e.g., clean air and water, food and fiber, cultural-aesthetic-recreational benefits, pollination and flood control. The ES concept is emerging as a means of integrating complex environmental and economic information to support informed environmental decision making. The US EPA is developing a web-based National Atlas of Ecosystem Services, with a component for urban ecosystems. Currently, the only wall-to-wall, national scale land cover data suitable for this analysis is the National Land Cover Data (NLCD) at 30 m spatial resolution with 5 and 10 year updates. However, aerial photography is acquired at higher spatial resolution (0.5-3 m) and more frequently (1-5 years, typically) for most urban areas. Land cover was mapped in Raleigh, NC using freely available USDA National Agricultural Imagery Program (NAIP) imagery with 1 m ground sample distance to test the suitability of aerial photography for urban ES analysis. Automated feature extraction techniques were used to extract five land cover classes, and an accuracy assessment was performed using standard techniques. Results will be presented that demonstrate applications to mapping ES in urban environments: greenways, corridors, fragmentation, habitat, impervious surfaces, dark and light pavement (urban heat island). (Figure: automated feature extraction results mapped over a NAIP color aerial photograph of downtown Raleigh, NC, showing land cover and related ecosystem services at the 2-10 m scale, with small features such as individual trees and sidewalks visible and mappable; classes shown are impervious surface (red), trees (dark green), grass (light green), and soil (tan).)
Quantitative pathology in virtual microscopy: history, applications, perspectives.
Kayser, Gian; Kayser, Klaus
2013-07-01
With the emerging success of commercially available personal computers and the rapid progress in the development of information technologies, morphometric analyses of static histological images have been introduced to improve our understanding of the biology of diseases such as cancer. First applications were quantifications of immunohistochemical expression patterns. In addition to object counting and feature extraction, laws of thermodynamics have been applied in morphometric calculations termed syntactic structure analysis. Here, one has to consider that the information of an image can be calculated for separate hierarchical layers such as single pixels, clusters of pixels, segmented small objects, clusters of small objects, and objects of higher order composed of several small objects. Using syntactic structure analysis in histological images, functional states can be extracted and the efficiency of labor in tissues can be quantified. Image standardization procedures, such as shading correction and color normalization, can overcome artifacts blurring clear thresholds. Morphometric techniques are not only useful to learn more about biological features of growth patterns, they can also be helpful in routine diagnostic pathology. In such cases, entropy calculations are applied in analogy to theoretical considerations concerning information content. Thus, regions with high information content can automatically be highlighted. Analysis of the "regions of high diagnostic value" can deliver, in the context of clinical information, site of involvement and patient data (e.g. age, sex), support for histopathological differential diagnoses. It can be expected that quantitative virtual microscopy will open new possibilities for automated histological support. Automated integrated quantification of histological slides also serves for quality assurance. The development and theoretical background of morphometric analyses in histopathology are reviewed, as well as their application and potential future implementation in virtual microscopy. Copyright © 2012 Elsevier GmbH. All rights reserved.
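As a rough illustration of the entropy-based highlighting of "regions of high diagnostic value" mentioned above, the sketch below computes the Shannon entropy of the gray-level histogram per image tile. The tile size, bin count, and selection rule are illustrative assumptions, not the published method.

```python
import numpy as np

def tile_entropy(image, tile=32, bins=64):
    """Shannon entropy of the gray-level histogram in each tile; higher values
    mark candidate information-rich regions (illustrative criterion)."""
    h, w = image.shape
    out = np.zeros((h // tile, w // tile))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i * tile:(i + 1) * tile, j * tile:(j + 1) * tile]
            counts, _ = np.histogram(patch, bins=bins, range=(0, 1))
            p = counts / counts.sum()
            p = p[p > 0]
            out[i, j] = -(p * np.log2(p)).sum()
    return out

rng = np.random.default_rng(0)
img = np.full((128, 128), 0.5)
img[:32, :32] = rng.uniform(0, 1, (32, 32))       # one textured (information-rich) tile
ent = tile_entropy(img)
print(np.unravel_index(ent.argmax(), ent.shape))  # -> (0, 0), the textured tile
```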
Feature extraction via KPCA for classification of gait patterns.
Wu, Jianning; Wang, Jue; Liu, Li
2007-06-01
Automated recognition of gait pattern change is important in medical diagnostics as well as in the early identification of at-risk gait in the elderly. We evaluated the use of Kernel-based Principal Component Analysis (KPCA) to extract more gait features (i.e., to obtain more significant amounts of information about human movement) and thus to improve the classification of gait patterns. 3D gait data of 24 young and 24 elderly participants were acquired using an OPTOTRAK 3020 motion analysis system during normal walking, and a total of 36 gait spatio-temporal and kinematic variables were extracted from the recorded data. KPCA was used first for nonlinear feature extraction to then evaluate its effect on a subsequent classification in combination with learning algorithms such as support vector machines (SVMs). Cross-validation test results indicated that the proposed technique could allow spreading the information about the gait's kinematic structure into more nonlinear principal components, thus providing additional discriminatory information for the improvement of gait classification performance. The feature extraction ability of KPCA was only slightly affected by the choice of kernel function, such as polynomial or radial basis function kernels. The combination of KPCA and SVM could identify young-elderly gait patterns with 91% accuracy, resulting in a markedly improved performance compared to the combination of PCA and SVM. These results suggest that nonlinear feature extraction by KPCA improves the classification of young-elderly gait patterns, and holds considerable potential for future applications in direct dimensionality reduction and interpretation of multiple gait signals.
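A minimal sketch of the KPCA-plus-SVM combination evaluated in the study, using scikit-learn on stand-in gait feature vectors; the kernel parameters, component count, and data are placeholders rather than the authors' settings.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Stand-in for 36 gait variables of 24 young + 24 elderly participants
X = np.vstack([rng.normal(0.0, 1.0, (24, 36)), rng.normal(0.8, 1.2, (24, 36))])
y = np.array([0] * 24 + [1] * 24)

clf = make_pipeline(StandardScaler(),
                    KernelPCA(n_components=10, kernel="rbf", gamma=0.01),
                    SVC(kernel="linear", C=1.0))
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(2))
```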
McArt, Scott H; Spalinger, Donald E; Kennish, John M; Collins, William B
2006-06-01
The protein precipitation assay used by Robbins et al. (1987, Ecology 68:98-107) has been shown to successfully predict the reduction in protein availability to some ruminants due to tannins. The procedure, however, is expensive and laborious, which limits its utility, especially for quantitative ecological or nutritional applications where large numbers of assays may be required. We have modified the method to decrease its cost and increase laboratory efficiency by (1) automating the extraction using Accelerated Solvent Extraction (ASE) and (2) scaling and automating the precipitation reaction, chromatography, and spectrometry with microplate gel filtration and an automated UV-VIS microplate spectrometer. ASE extraction is shown to be as effective at extracting tannins as the hot methanol technique. Additionally, the microplate assay is sensitive and precise. We show that the results from the new technique correspond in a nearly 1:1 relationship to the results of the previous technique. Hence, this method could reliably replace the older method with no loss in relevance to herbivore protein digestion. Moreover, the ASE extraction technique should be applicable to other tannin-protein precipitation assays and possibly other phenolic assays.
Multichannel Convolutional Neural Network for Biological Relation Extraction.
Quan, Chanqin; Hua, Lei; Sun, Xiao; Bai, Wenjun
2016-01-01
The plethora of biomedical relations embedded in medical logs (records) demands researchers' attention. Previous theoretical and practical focuses were restricted to traditional machine learning techniques. However, these methods are susceptible to the issues of "vocabulary gap" and data sparseness and to the difficulty of automating feature extraction. To address the aforementioned issues, in this work we propose a multichannel convolutional neural network (MCCNN) for automated biomedical relation extraction. The proposed model has the following two contributions: (1) it enables the fusion of multiple (e.g., five) versions of word embeddings; (2) the need for manual feature engineering can be obviated by automated feature learning with a convolutional neural network (CNN). We evaluated our model on two biomedical relation extraction tasks: drug-drug interaction (DDI) extraction and protein-protein interaction (PPI) extraction. For the DDI task, our system achieved an overall F-score of 70.2% compared to the standard linear SVM based system (e.g., 67.0%) on the DDIExtraction 2013 challenge dataset. For the PPI task, we evaluated our system on the AIMed and BioInfer PPI corpora; our system exceeded the state-of-the-art ensemble SVM system by 2.7% and 5.6% in F-score.
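The sketch below outlines a multichannel CNN of the kind described (several word-embedding versions stacked as input channels, parallel convolutions, max-pooling over the sentence, and a relation classifier), written with PyTorch. Layer sizes, vocabulary, and embedding weights are random stand-ins, not the published MCCNN configuration.

```python
import torch
import torch.nn as nn

class MultiChannelCNN(nn.Module):
    """Sketch of a multichannel CNN for relation extraction: each channel is a
    different word-embedding version; all weights here are random stand-ins."""
    def __init__(self, vocab_size=5000, emb_dim=50, n_channels=5,
                 n_filters=100, kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        # One embedding table per channel (e.g., five pretrained embedding versions)
        self.embeddings = nn.ModuleList(
            [nn.Embedding(vocab_size, emb_dim) for _ in range(n_channels)])
        self.convs = nn.ModuleList(
            [nn.Conv2d(n_channels, n_filters, (k, emb_dim)) for k in kernel_sizes])
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, tokens):                       # tokens: (batch, seq_len)
        chans = torch.stack([emb(tokens) for emb in self.embeddings], dim=1)
        # chans: (batch, n_channels, seq_len, emb_dim)
        feats = [torch.relu(conv(chans)).squeeze(3).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))      # relation logits

model = MultiChannelCNN()
logits = model(torch.randint(0, 5000, (8, 40)))      # batch of 8 sentences, 40 tokens each
print(logits.shape)                                   # torch.Size([8, 2])
```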
A knowledge-based approach to automated planning for hepatocellular carcinoma.
Zhang, Yujie; Li, Tingting; Xiao, Han; Ji, Weixing; Guo, Ming; Zeng, Zhaochong; Zhang, Jianying
2018-01-01
To build a knowledge-based model of liver cancer for Auto-Planning, a function in Pinnacle that is used as an automated inverse intensity-modulated radiation therapy (IMRT) planning system. Fifty Tomotherapy patients were enrolled to extract the dose-volume histogram (DVH) information and construct the protocol for the Auto-Planning model. Twenty more patients were chosen additionally to test the model. Manual planning and automatic planning were performed blindly for all twenty test patients with the same machine and treatment planning system. The dose distributions of the target and organs at risk (OARs), along with the working time for planning, were evaluated. Statistically significant results showed that automated plans performed better in target conformity index (CI), while the mean target dose was 0.5 Gy higher than in manual plans. The differences between the target homogeneity indexes (HI) of the two methods were not statistically significant. Additionally, the doses to the normal liver, left kidney, and small bowel were significantly reduced with the automated plans. Particularly, the mean dose and V15 of the normal liver were 1.4 Gy and 40.5 cc lower with automated plans, respectively. The mean doses to the left kidney and small bowel were reduced with automated plans by 1.2 Gy and 2.1 Gy, respectively. In addition, working time was also significantly reduced with automated planning. Auto-Planning shows availability and effectiveness in our knowledge-based model for liver cancer. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.
Automated Fluid Feature Extraction from Transient Simulations
NASA Technical Reports Server (NTRS)
Haimes, Robert
1998-01-01
In the past, feature extraction and identification were interesting concepts, but not required to understand the underlying physics of a steady flow field. This is because the results of the more traditional tools like iso-surfaces, cuts and streamlines were more interactive and easily abstracted so they could be represented to the investigator. These tools worked and properly conveyed the collected information at the expense of much interaction. For unsteady flow-fields, the investigator does not have the luxury of spending time scanning only one 'snap-shot' of the simulation. Automated assistance is required in pointing out areas of potential interest contained within the flow. This must not require a heavy compute burden (the visualization should not significantly slow down the solution procedure for co-processing environments like pV3). And methods must be developed to abstract the feature and display it in a manner that physically makes sense. The following is a list of the important physical phenomena found in transient (and steady-state) fluid flow: Shocks; Vortex cores; Regions of Recirculation; Boundary Layers; Wakes.
Software for Partly Automated Recognition of Targets
NASA Technical Reports Server (NTRS)
Opitz, David; Blundell, Stuart; Bain, William; Morris, Matthew; Carlson, Ian; Mangrich, Mark; Selinsky, T.
2002-01-01
The Feature Analyst is a computer program for assisted (partially automated) recognition of targets in images. This program was developed to accelerate the processing of high-resolution satellite image data for incorporation into geographic information systems (GIS). This program creates an advanced user interface that embeds proprietary machine-learning algorithms in commercial image-processing and GIS software. A human analyst provides samples of target features from multiple sets of data, then the software develops a data-fusion model that automatically extracts the remaining features from selected sets of data. The program thus leverages the natural ability of humans to recognize objects in complex scenes, without requiring the user to explain the human visual recognition process by means of lengthy software. Two major subprograms are the reactive agent and the thinking agent. The reactive agent strives to quickly learn the user's tendencies while the user is selecting targets and to increase the user's productivity by immediately suggesting the next set of pixels that the user may wish to select. The thinking agent utilizes all available resources, taking as much time as needed, to produce the most accurate autonomous feature-extraction model possible.
Automated detection of extended sources in radio maps: progress from the SCORPIO survey
NASA Astrophysics Data System (ADS)
Riggi, S.; Ingallinera, A.; Leto, P.; Cavallaro, F.; Bufano, F.; Schillirò, F.; Trigilio, C.; Umana, G.; Buemi, C. S.; Norris, R. P.
2016-08-01
Automated source extraction and parametrization represents a crucial challenge for the next-generation radio interferometer surveys, such as those performed with the Square Kilometre Array (SKA) and its precursors. In this paper, we present a new algorithm, called CAESAR (Compact And Extended Source Automated Recognition), to detect and parametrize extended sources in radio interferometric maps. It is based on a pre-filtering stage, allowing image denoising, compact source suppression and enhancement of diffuse emission, followed by an adaptive superpixel clustering stage for final source segmentation. A parametrization stage provides source flux information and a wide range of morphology estimators for post-processing analysis. We developed CAESAR in a modular software library, also including different methods for local background estimation and image filtering, along with alternative algorithms for both compact and diffuse source extraction. The method was applied to real radio continuum data collected at the Australian Telescope Compact Array (ATCA) within the SCORPIO project, a pathfinder of the Evolutionary Map of the Universe (EMU) survey at the Australian Square Kilometre Array Pathfinder (ASKAP). The source reconstruction capabilities were studied over different test fields in the presence of compact sources, imaging artefacts and diffuse emission from the Galactic plane and compared with existing algorithms. When compared to a human-driven analysis, the designed algorithm was found capable of detecting known target sources and regions of diffuse emission, outperforming alternative approaches over the considered fields.
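CAESAR itself is a dedicated library; purely as an illustration of the superpixel-clustering stage mentioned in the abstract, the sketch below runs SLIC superpixels from scikit-image on a synthetic map and flags unusually bright segments. All parameters and the selection rule are placeholder assumptions, not CAESAR's algorithm.

```python
import numpy as np
from skimage import segmentation, filters

rng = np.random.default_rng(0)
# Synthetic radio map: smooth background noise plus an extended "source"
img = filters.gaussian(rng.normal(size=(256, 256)), sigma=4)
img[100:160, 100:180] += 0.5

# Superpixel clustering stage (analogous in spirit to the described segmentation step)
labels = segmentation.slic(img, n_segments=300, compactness=0.1, channel_axis=None)

# Flag superpixels whose mean brightness is well above the map median
segs = np.unique(labels)
means = np.array([img[labels == s].mean() for s in segs])
bright = segs[means > np.median(means) + 3 * means.std()]
print(f"{segs.size} superpixels, {bright.size} flagged as candidate source segments")
```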
Itri, Jason N; Jones, Lisa P; Kim, Woojin; Boonn, William W; Kolansky, Ana S; Hilton, Susan; Zafar, Hanna M
2014-04-01
Monitoring complications and diagnostic yield for image-guided procedures is an important component of maintaining high quality patient care promoted by professional societies in radiology and accreditation organizations such as the American College of Radiology (ACR) and Joint Commission. These outcome metrics can be used as part of a comprehensive quality assurance/quality improvement program to reduce variation in clinical practice, provide opportunities to engage in practice quality improvement, and contribute to developing national benchmarks and standards. The purpose of this article is to describe the development and successful implementation of an automated web-based software application to monitor procedural outcomes for US- and CT-guided procedures in an academic radiology department. The open source tools PHP: Hypertext Preprocessor (PHP) and MySQL were used to extract relevant procedural information from the Radiology Information System (RIS), auto-populate the procedure log database, and develop a user interface that generates real-time reports of complication rates and diagnostic yield by site and by operator. Utilizing structured radiology report templates resulted in significantly improved accuracy of information auto-populated from radiology reports, as well as greater compliance with manual data entry. An automated web-based procedure log database is an effective tool to reliably track complication rates and diagnostic yield for US- and CT-guided procedures performed in a radiology department.
A displacement pump procedure to load extracts for automated gel permeation chromatography.
Daft, J; Hopper, M; Hensley, D; Sisk, R
1990-01-01
Automated gel permeation chromatography (GPC) effectively separates lipids from pesticides in sample extracts that contain fat. Using a large syringe to load sample extracts manually onto GPC models having 5 mL holding loops is awkward, slow, and potentially hazardous. Loading with a small-volume displacement pump, however, is convenient and fast (ca 1 loop every 20 s). And more importantly, the analyst is not exposed to toxic organic vapors because the loading pump and its connecting lines do not leak in the way that a syringe does.
Jipp, Meike
2016-02-01
I explored whether different cognitive abilities (information-processing ability, working-memory capacity) are needed for expertise development when different types of automation (information vs. decision automation) are employed. It is well documented that expertise development and the employment of automation lead to improved performance. Here, it is argued that a learner's ability to reason about an activity may be hindered by the employment of information automation. Additional feedback needs to be processed, thus increasing the load on working memory and decelerating expertise development. By contrast, the employment of decision automation may stimulate reasoning, increase the initial load on information-processing ability, and accelerate expertise development. Authors of past research have not investigated the interrelations between automation assistance, individual differences, and expertise development. Sixty-one naive learners controlled simulated air traffic with two types of automation: information automation and decision automation. Their performance was captured across 16 trials. Well-established tests were used to assess information-processing ability and working-memory capacity. As expected, learners' performance benefited from expertise development and decision automation. Furthermore, individual differences moderated the effect of the type of automation on expertise development: The employment of only information automation increased the load on working memory during later expertise development. The employment of decision automation initially increased the need to process information. These findings highlight the importance of considering individual differences and expertise development when investigating human-automation interaction. The results are relevant for selecting automation configurations for expertise development. © 2015, Human Factors and Ergonomics Society.
Bass, Ellen J; Baumgart, Leigh A; Shepley, Kathryn Klein
2013-03-01
Displaying both the strategy that information analysis automation employs to make its judgments and the variability in the task environment may improve human judgment performance, especially in cases where this variability impacts the judgment performance of the information analysis automation. This work investigated the contribution of providing either information analysis automation strategy information, task environment information, or both, on human judgment performance in a domain where noisy sensor data are used by both the human and the information analysis automation to make judgments. In a simplified air traffic conflict prediction experiment, 32 participants made probability of horizontal conflict judgments under different display content conditions. After being exposed to the information analysis automation, judgment achievement significantly improved for all participants as compared to judgments without any of the automation's information. Participants provided with additional display content pertaining to cue variability in the task environment had significantly higher aided judgment achievement compared to those provided with only the automation's judgment of a probability of conflict. When designing information analysis automation for environments where the automation's judgment achievement is impacted by noisy environmental data, it may be beneficial to show additional task environment information to the human judge in order to improve judgment performance.
Sankar, Martial; Nieminen, Kaisa; Ragni, Laura; Xenarios, Ioannis; Hardtke, Christian S
2014-02-11
Among various advantages, their small size makes model organisms preferred subjects of investigation. Yet, even in model systems detailed analysis of numerous developmental processes at the cellular level is severely hampered by their scale. For instance, secondary growth of Arabidopsis hypocotyls creates a radial pattern of highly specialized tissues that comprises several thousand cells starting from a few dozen. This dynamic process is difficult to follow because of its scale and because it can only be investigated invasively, precluding comprehensive understanding of the cell proliferation, differentiation, and patterning events involved. To overcome this limitation, we established an automated quantitative histology approach. We acquired hypocotyl cross-sections from tiled high-resolution images and extracted their information content using custom high-throughput image processing and segmentation. Coupled with automated cell type recognition through machine learning, we could establish a cellular resolution atlas that reveals vascular morphodynamics during secondary growth, for example equidistant phloem pole formation. DOI: http://dx.doi.org/10.7554/eLife.01567.001.
Automated structure determination of proteins with the SAIL-FLYA NMR method.
Takeda, Mitsuhiro; Ikeya, Teppei; Güntert, Peter; Kainosho, Masatsune
2007-01-01
The labeling of proteins with stable isotopes enhances the NMR method for the determination of 3D protein structures in solution. Stereo-array isotope labeling (SAIL) provides an optimal stereospecific and regiospecific pattern of stable isotopes that yields sharpened lines, spectral simplification without loss of information, and the ability to collect rapidly and evaluate fully automatically the structural restraints required to solve a high-quality solution structure for proteins up to twice as large as those that can be analyzed using conventional methods. Here, we describe a protocol for the preparation of SAIL proteins by cell-free methods, including the preparation of S30 extract and their automated structure analysis using the FLYA algorithm and the program CYANA. Once efficient cell-free expression of the unlabeled or uniformly labeled target protein has been achieved, the NMR sample preparation of a SAIL protein can be accomplished in 3 d. A fully automated FLYA structure calculation can be completed in 1 d on a powerful computer system.
Sankar, Martial; Nieminen, Kaisa; Ragni, Laura; Xenarios, Ioannis; Hardtke, Christian S
2014-01-01
Among various advantages, their small size makes model organisms preferred subjects of investigation. Yet, even in model systems detailed analysis of numerous developmental processes at the cellular level is severely hampered by their scale. For instance, secondary growth of Arabidopsis hypocotyls creates a radial pattern of highly specialized tissues that comprises several thousand cells starting from a few dozen. This dynamic process is difficult to follow because of its scale and because it can only be investigated invasively, precluding comprehensive understanding of the cell proliferation, differentiation, and patterning events involved. To overcome this limitation, we established an automated quantitative histology approach. We acquired hypocotyl cross-sections from tiled high-resolution images and extracted their information content using custom high-throughput image processing and segmentation. Coupled with automated cell type recognition through machine learning, we could establish a cellular resolution atlas that reveals vascular morphodynamics during secondary growth, for example equidistant phloem pole formation. DOI: http://dx.doi.org/10.7554/eLife.01567.001 PMID:24520159
Streamlined structure elucidation of an unknown compound in a pigment formulation.
Yüce, Imanuel; Morlock, Gertrud E
2016-10-21
A fast and reliable quality control is important for ink manufacturers to ensure a constant production grade of mixtures and chemical formulations, and unknown components attract their attention. Structure-elucidating techniques seem time-consuming in combination with column-based methods, but especially the low solubility of pigment formulations makes the analysis challenging. In contrast, layer chromatography is more tolerant with regard to pigment particles. One PLC plate for NMR and FTIR analyses and one HPTLC plate for recording high-resolution mass spectra and MS/MS spectra and for gathering information on polarity and spectral properties were needed to characterize a structure, exemplarily shown for an unknown component in pigment Red 57:1 to be 3-hydroxy-2-naphthoic acid. A preparative layer chromatography (PLC) workflow was developed that used an Automated Multiple Development 2 (AMD 2) system. The 0.5-mm PLC plate could still be operated in the AMD 2 system and allowed a smooth switch from the analytical to the preparative gradient separation. Through automated gradient development and the resulting focusing of bands, the sharpness of the PLC bands was improved. For NMR, the necessary high load of the target compound on the PLC plate was achieved via a selective solvent extraction that discriminated the polar sample matrix and thus increased the application volume of the extract that could maximally be applied without overloading. By doing so, the yield for NMR analysis was improved by a factor of 9. The effectiveness gain from a simple but carefully chosen extraction solvent is often overlooked, and for educational purposes it was clearly illustrated and demonstrated by an extended solvent screening. Thus, PLC using an automated gradient development after a selective extraction was proven to be a new powerful combination for structural elucidation by NMR. Copyright © 2016 Elsevier B.V. All rights reserved.
Mori, Kensaku; Ota, Shunsuke; Deguchi, Daisuke; Kitasaka, Takayuki; Suenaga, Yasuhito; Iwano, Shingo; Hasegawa, Yosihnori; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi
2009-01-01
This paper presents a method for the automated anatomical labeling of bronchial branches extracted from 3D CT images based on machine learning and combination optimization. We also show applications of anatomical labeling on a bronchoscopy guidance system. This paper performs automated labeling by using machine learning and combination optimization. The actual procedure consists of four steps: (a) extraction of tree structures of the bronchus regions extracted from CT images, (b) construction of AdaBoost classifiers, (c) computation of candidate names for all branches by using the classifiers, (d) selection of best combination of anatomical names. We applied the proposed method to 90 cases of 3D CT datasets. The experimental results showed that the proposed method can assign correct anatomical names to 86.9% of the bronchial branches up to the sub-segmental lobe branches. Also, we overlaid the anatomical names of bronchial branches on real bronchoscopic views to guide real bronchoscopy.
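Step (b) of the described procedure builds AdaBoost classifiers over branch features; the sketch below shows the generic scikit-learn equivalent on stand-in features. The feature set, labels, and the combination-optimization step (d) are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Stand-in branch features: e.g. running direction (3), length, radius, generation number
n = 600
X = rng.normal(size=(n, 6))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int) + 2 * (X[:, 1] > 0.5)   # 4 pseudo "branch names"

clf = AdaBoostClassifier(n_estimators=200, random_state=0)
print("CV accuracy over candidate branch names:",
      cross_val_score(clf, X, y, cv=5).mean().round(2))
```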
Alexovič, Michal; Horstkotte, Burkhard; Solich, Petr; Sabo, Ján
2016-02-04
Simplicity, effectiveness, swiftness, and environmental friendliness - these are the typical requirements for the state of the art development of green analytical techniques. Liquid phase microextraction (LPME) stands for a family of elegant sample pretreatment and analyte preconcentration techniques preserving these principles in numerous applications. By using only fractions of solvent and sample compared to classical liquid-liquid extraction, the extraction kinetics, the preconcentration factor, and the cost efficiency can be increased. Moreover, significant improvements can be made by automation, which is still a hot topic in analytical chemistry. This review surveys comprehensively and in two parts the developments of automation of non-dispersive LPME methodologies performed in static and dynamic modes. Their advantages and limitations and the reported analytical performances are discussed and put into perspective with the corresponding manual procedures. The automation strategies, techniques, and their operation advantages as well as their potentials are further described and discussed. In this first part, an introduction to LPME and their static and dynamic operation modes as well as their automation methodologies is given. The LPME techniques are classified according to the different approaches of protection of the extraction solvent using either a tip-like (needle/tube/rod) support (drop-based approaches), a wall support (film-based approaches), or microfluidic devices. In the second part, the LPME techniques based on porous supports for the extraction solvent such as membranes and porous media are overviewed. An outlook on future demands and perspectives in this promising area of analytical chemistry is finally given. Copyright © 2015 Elsevier B.V. All rights reserved.
Alexovič, Michal; Horstkotte, Burkhard; Solich, Petr; Sabo, Ján
2016-02-11
A critical overview on automation of modern liquid phase microextraction (LPME) approaches based on the liquid impregnation of porous sorbents and membranes is presented. It is the continuation of part 1, in which non-dispersive LPME techniques based on the use of the extraction phase (EP) in the form of a drop, plug, film, or microflow have been surveyed. Compared to the approaches described in part 1, porous materials provide an improved support for the EP. Simultaneously, they allow enlarging its contact surface and reducing the risk of loss by incident flow or by components of the surrounding matrix. Solvent-impregnated membranes or hollow fibres are further ideally suited for analyte extraction with simultaneous or subsequent back-extraction. Their use can therefore improve procedure robustness and reproducibility as well as "open the door" to new operation modes and fields of application. However, additional work and time are required for membrane replacement and renewed impregnation. Automation of porous support-based and membrane-based approaches plays an important role in the achievement of better reliability, rapidness, and reproducibility compared to manual assays. Automated renewal of the extraction solvent and coupling of sample pretreatment with the detection instrumentation can be named as examples. The different LPME methodologies using impregnated membranes and porous supports for the extraction phase, the different strategies of their automation, and their analytical applications are comprehensively described and discussed in this part. Finally, an outlook on future demands and perspectives of the LPME techniques from both parts as a promising area in the field of sample pretreatment is given. Copyright © 2015 Elsevier B.V. All rights reserved.
Automated control of robotic camera tacheometers for measurements of industrial large scale objects
NASA Astrophysics Data System (ADS)
Heimonen, Teuvo; Leinonen, Jukka; Sipola, Jani
2013-04-01
Modern robotic tacheometers equipped with digital cameras (also called imaging total stations) and capable of reflectorless measurement offer new possibilities to gather 3D data. In this paper an automated approach for the tacheometer measurements needed in the dimensional control of industrial large-scale objects is proposed. There are two new contributions in the approach: the automated extraction of the vital points (i.e. the points to be measured) and the automated fine aiming of the tacheometer. The proposed approach proceeds through the following steps: First, the coordinates of the vital points are automatically extracted from the computer-aided design (CAD) data. The extracted design coordinates are then used to aim the tacheometer at the designed location of the points, one after another. However, due to the deviations between the designed and the actual location of the points, the aiming needs to be adjusted. An automated dynamic image-based look-and-move type servoing architecture is proposed for this task. After a successful fine aiming, the actual coordinates of the point in question can be automatically measured by using the measuring functionalities of the tacheometer. The approach was validated experimentally and found to be feasible. On average, 97% of the points actually measured in four different shipbuilding measurement cases were indeed proposed as vital points by the automated extraction algorithm. The accuracy of the results obtained with the automatic control method of the tacheometer was comparable to the results obtained with manual control, and the reliability of the image processing step of the method was found to be high in the laboratory experiments.
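The following is a minimal sketch of a dynamic look-and-move servo loop of the type proposed: measure the image offset of the vital point from the crosshair, command a proportional axis correction, and repeat until the offset falls below a tolerance. The gain, tolerance, and the simulated instrument are hypothetical stand-ins for the real camera processing and tacheometer interface.

```python
def fine_aim(measure_offset, move_axes, gain=0.8, tol_px=2, max_iter=20):
    """Dynamic look-and-move loop (sketch): measure_offset() and move_axes() are
    stand-ins for the camera image processing and the instrument axis commands."""
    for _ in range(max_iter):
        dx, dy = measure_offset()                    # pixel offset from image processing
        if abs(dx) <= tol_px and abs(dy) <= tol_px:
            return True                              # aimed: trigger the reflectorless measurement
        move_axes(-gain * dx, -gain * dy)            # proportional correction (hypothetical gain)
    return False

# Simulated instrument: the apparent offset shrinks as corrections are applied
offset = [40.0, -25.0]
def measure(): return tuple(offset)
def move(ddx, ddy): offset[0] += ddx; offset[1] += ddy
print("converged:", fine_aim(measure, move), "residual:", [round(v, 2) for v in offset])
```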
NASA Astrophysics Data System (ADS)
Löfgren, Lars; Forsberg, Gun-Britt; Ståhlman, Marcus
2016-06-01
In this study we present a simple and rapid method for tissue lipid extraction. Snap-frozen tissue (15-150 mg) is collected in 2 ml homogenization tubes. 500 μl BUME mixture (butanol:methanol [3:1]) is added and automated homogenization of up to 24 frozen samples at a time in less than 60 seconds is performed, followed by a 5-minute single-phase extraction. After the addition of 500 μl heptane:ethyl acetate (3:1) and 500 μl 1% acetic acid, a 5-minute two-phase extraction is performed. Lipids are recovered from the upper phase by automated liquid handling using a standard 96-tip robot. A second two-phase extraction is performed using 500 μl heptane:ethyl acetate (3:1). Validation of the method showed that the extraction recoveries for the investigated lipids, which included sterols, glycerolipids, glycerophospholipids and sphingolipids, were similar to or better than for the Folch method. We also applied the method to lipid extraction of liver and heart and compared the lipid species profiles with profiles generated after Folch and MTBE extraction. We conclude that the BUME method is superior to the Folch method in terms of simplicity, throughput, automation, solvent consumption, economy, health and environment, yet delivers lipid recoveries fully comparable to or better than those of the Folch method.
Automated smoother for the numerical decoupling of dynamics models.
Vilela, Marco; Borges, Carlos C H; Vinga, Susana; Vasconcelos, Ana Tereza R; Santos, Helena; Voit, Eberhard O; Almeida, Jonas S
2007-08-21
Structure identification of dynamic models for complex biological systems is the cornerstone of their reverse engineering. Biochemical Systems Theory (BST) offers a particularly convenient solution because its parameters are kinetic-order coefficients which directly identify the topology of the underlying network of processes. We have previously proposed a numerical decoupling procedure that allows the identification of multivariate dynamic models of complex biological processes. While described here within the context of BST, this procedure has a general applicability to signal extraction. Our original implementation relied on artificial neural networks (ANN), which caused slight, undesirable bias during the smoothing of the time courses. As an alternative, we propose here an adaptation of the Whittaker's smoother and demonstrate its role within a robust, fully automated structure identification procedure. In this report we propose a robust, fully automated solution for signal extraction from time series, which is the prerequisite for the efficient reverse engineering of biological systems models. The Whittaker's smoother is reformulated within the context of information theory and extended by the development of adaptive signal segmentation to account for heterogeneous noise structures. The resulting procedure can be used on arbitrary time series with a nonstationary noise process; it is illustrated here with metabolic profiles obtained from in-vivo NMR experiments. The smoothed solution that is free of parametric bias permits differentiation, which is crucial for the numerical decoupling of systems of differential equations. The method is applicable to signal extraction from time series with a nonstationary noise structure and can be applied to the numerical decoupling of systems of differential equations into algebraic equations, and thus constitutes a rather general tool for the reverse engineering of mechanistic model descriptions from multivariate experimental time series.
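A compact version of the classic Whittaker smoother underlying the proposed procedure is sketched below (penalized least squares with a difference-penalty term, solved with sparse linear algebra). It implements the generic smoother only; the adaptive signal segmentation for heterogeneous noise described in the abstract is not included, and the penalty weight here is an arbitrary choice.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def whittaker_smooth(y, lam=1e4, d=2):
    """Classic Whittaker smoother: minimize ||y - z||^2 + lam * ||D z||^2,
    where D is the d-th order difference operator (generic version, not the
    authors' adaptive, segmented variant)."""
    n = len(y)
    D = sparse.csc_matrix(np.diff(np.eye(n), n=d, axis=0))   # d-th order difference matrix
    A = (sparse.eye(n) + lam * (D.T @ D)).tocsc()
    return spsolve(A, y)

t = np.linspace(0, 4 * np.pi, 400)
noisy = np.sin(t) + np.random.default_rng(0).normal(0, 0.3, t.size)
smooth = whittaker_smooth(noisy, lam=1e3)
print("residual std:", round(np.std(smooth - np.sin(t)), 3))   # typically well below the 0.3 noise level
```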
Automated Extraction of Flow Features
NASA Technical Reports Server (NTRS)
Dorney, Suzanne (Technical Monitor); Haimes, Robert
2005-01-01
Computational Fluid Dynamics (CFD) simulations are routinely performed as part of the design process of most fluid handling devices. In order to efficiently and effectively use the results of a CFD simulation, visualization tools are often used. These tools are used in all stages of the CFD simulation including pre-processing, interim-processing, and post-processing, to interpret the results. Each of these stages requires visualization tools that allow one to examine the geometry of the device, as well as the partial or final results of the simulation. An engineer will typically generate a series of contour and vector plots to better understand the physics of how the fluid is interacting with the physical device. Of particular interest is the detection of features such as shocks, re-circulation zones, and vortices (which will highlight areas of stress and loss). As the demand for CFD analyses continues to increase, the need for automated feature extraction capabilities has become vital. In the past, feature extraction and identification were interesting concepts, but not required in understanding the physics of a steady flow field. This is because the results of the more traditional tools, like iso-surfaces, cuts and streamlines, were more interactive and easily abstracted so they could be represented to the investigator. These tools worked and properly conveyed the collected information at the expense of a great deal of interaction. For unsteady flow-fields, the investigator does not have the luxury of spending time scanning only one "snapshot" of the simulation. Automated assistance is required in pointing out areas of potential interest contained within the flow. This must not require a heavy compute burden (the visualization should not significantly slow down the solution procedure for co-processing environments). Methods must be developed to abstract the feature of interest and display it in a manner that physically makes sense.
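For a concrete sense of what an automated feature detector does, the sketch below flags candidate vortex regions on a uniform 2-D grid by thresholding vorticity magnitude computed from the velocity field. It is a generic textbook-style heuristic shown for illustration only; the grid spacing, threshold fraction, and synthetic vortex are assumptions, not the detection algorithms used in the tool described above.

```python
# Flag grid cells whose vorticity magnitude exceeds a fraction of the field
# maximum: a simple, fully automated "region of potential interest" detector.
import numpy as np

def vortex_candidates(u, v, dx, dy, frac=0.6):
    dv_dx = np.gradient(v, dx, axis=1)
    du_dy = np.gradient(u, dy, axis=0)
    omega = dv_dx - du_dy                      # z-component of vorticity
    return np.abs(omega) > frac * np.abs(omega).max()

# Example: a synthetic Gaussian vortex is flagged near its core.
x, y = np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64))
u = -y * np.exp(-(x**2 + y**2))
v = x * np.exp(-(x**2 + y**2))
mask = vortex_candidates(u, v, dx=2.0 / 63, dy=2.0 / 63)
```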
Automated Extraction of Flow Features
NASA Technical Reports Server (NTRS)
Dorney, Suzanne (Technical Monitor); Haimes, Robert
2004-01-01
Computational Fluid Dynamics (CFD) simulations are routinely performed as part of the design process of most fluid handling devices. In order to efficiently and effectively use the results of a CFD simulation, visualization tools are often used. These tools are used in all stages of the CFD simulation including pre-processing, interim-processing, and post-processing, to interpret the results. Each of these stages requires visualization tools that allow one to examine the geometry of the device, as well as the partial or final results of the simulation. An engineer will typically generate a series of contour and vector plots to better understand the physics of how the fluid is interacting with the physical device. Of particular interest is the detection of features such as shocks, recirculation zones, and vortices (which will highlight areas of stress and loss). As the demand for CFD analyses continues to increase, the need for automated feature extraction capabilities has become vital. In the past, feature extraction and identification were interesting concepts, but not required in understanding the physics of a steady flow field. This is because the results of the more traditional tools, like iso-surfaces, cuts and streamlines, were more interactive and easily abstracted so they could be represented to the investigator. These tools worked and properly conveyed the collected information at the expense of a great deal of interaction. For unsteady flow-fields, the investigator does not have the luxury of spending time scanning only one "snapshot" of the simulation. Automated assistance is required in pointing out areas of potential interest contained within the flow. This must not require a heavy compute burden (the visualization should not significantly slow down the solution procedure for co-processing environments). Methods must be developed to abstract the feature of interest and display it in a manner that physically makes sense.
Extraction of prostatic lumina and automated recognition for prostatic calculus image using PCA-SVM.
Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D Joshua
2011-01-01
Identification of prostatic calculi is an important basis for determining the tissue origin. Computation-assisted diagnosis of prostatic calculi may have promising potential but is currently still little studied. We studied the extraction of prostatic lumina and automated recognition for calculus images. Extraction of lumina from prostate histology images was based on local entropy and Otsu thresholding; recognition used PCA-SVM based on the texture features of prostatic calculi. The SVM classifier showed an average time of 0.1432 seconds, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can recognize the concentric structure and visualized features easily. Therefore, this method is effective for the automated recognition of prostatic calculi.
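The classification stage described here follows a common pattern: project feature vectors onto principal components and feed them to a support vector machine. Below is a minimal scikit-learn sketch of that PCA-plus-SVM pipeline; the placeholder feature matrix, component count, and kernel are illustrative assumptions, not the authors' texture features or settings.

```python
# PCA followed by an SVM classifier, the generic pipeline behind "PCA-SVM".
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.random.rand(200, 40)           # placeholder texture-feature vectors
y = np.random.randint(0, 2, 200)      # placeholder calculus / non-calculus labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```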
Testing of a Composite Wavelet Filter to Enhance Automated Target Recognition in SONAR
NASA Technical Reports Server (NTRS)
Chiang, Jeffrey N.
2011-01-01
Automated Target Recognition (ATR) systems aim to automate target detection, recognition, and tracking. The current project applies a JPL ATR system to low resolution SONAR and camera videos taken from Unmanned Underwater Vehicles (UUVs). These SONAR images are inherently noisy and difficult to interpret, and pictures taken underwater are unreliable due to murkiness and inconsistent lighting. The ATR system breaks target recognition into three stages: 1) Videos of both SONAR and camera footage are broken into frames and preprocessed to enhance images and detect Regions of Interest (ROIs). 2) Features are extracted from these ROIs in preparation for classification. 3) ROIs are classified as true or false positives using a standard Neural Network based on the extracted features. Several preprocessing, feature extraction, and training methods are tested and discussed in this report.
Automated Classification of Heritage Buildings for As-Built Bim Using Machine Learning Techniques
NASA Astrophysics Data System (ADS)
Bassier, M.; Vergauwen, M.; Van Genechten, B.
2017-08-01
Semantically rich three-dimensional models such as Building Information Models (BIMs) are increasingly used in digital heritage. They provide the required information to varying stakeholders during the different stages of the historic building's life cycle, which is crucial in the conservation process. The creation of as-built BIM models is based on point cloud data. However, manually interpreting this data is labour intensive and often leads to misinterpretations. By automatically classifying the point cloud, the information can be processed more efficiently. A key aspect in this automated scan-to-BIM process is the classification of building objects. In this research we look to automatically recognise elements in existing buildings to create compact semantic information models. Our algorithm efficiently extracts the main structural components such as floors, ceilings, roofs, walls and beams despite the presence of significant clutter and occlusions. More specifically, Support Vector Machines (SVM) are proposed for the classification. The algorithm is evaluated using real data from a variety of existing buildings. The results prove that the classifier recognizes the objects with both high precision and recall. As a result, entire data sets are reliably labelled at once. The approach enables experts to better document and process heritage assets.
Toward designing for trust in database automation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duez, P. P.; Jamieson, G. A.
Appropriate reliance on system automation is imperative for safe and productive work, especially in safety-critical systems. It is unsafe to rely on automation beyond its designed use; conversely, it can be both unproductive and unsafe to manually perform tasks that are better relegated to automated tools. Operator trust in automated tools mediates reliance, and trust appears to affect how operators use technology. As automated agents become more complex, the question of trust in automation is increasingly important. In order to achieve proper use of automation, we must engender an appropriate degree of trust that is sensitive to changes in operating functions and context. In this paper, we present research concerning trust in automation in the domain of automated tools for relational databases. Lee and See have provided models of trust in automation. One model developed by Lee and See identifies three key categories of information about the automation that lie along a continuum of attributional abstraction. Purpose-, process-, and performance-related information serve, both individually and through inferences between them, to describe automation in such a way as to engender properly calibrated trust. Thus, one can look at information from different levels of attributional abstraction as a general requirements analysis for information key to appropriate trust in automation. The model of information necessary to engender appropriate trust in automation [1] is a general one. Although it describes categories of information, it does not provide insight on how to determine the specific information elements required for a given automated tool. We have applied the Abstraction Hierarchy (AH) to this problem in the domain of relational databases. The AH serves as a formal description of the automation at several levels of abstraction, ranging from a very abstract purpose-oriented description to a more concrete description of the resources involved in the automated process. The connection between an AH for an automated tool and a list of information elements at the three levels of attributional abstraction is then direct, providing a method for satisfying information requirements for appropriate trust in automation. In this paper, we will present our method for developing specific information requirements for an automated tool, based on a formal analysis of that tool and the models presented by Lee and See. We will show an example of the application of the AH to automation, in the domain of relational database automation, and the resulting set of specific information elements for appropriate trust in the automated tool. Finally, we will comment on the applicability of this approach to the domain of nuclear plant instrumentation. (authors)
Delora, Adam; Gonzales, Aaron; Medina, Christopher S; Mitchell, Adam; Mohed, Abdul Faheem; Jacobs, Russell E; Bearer, Elaine L
2016-01-15
Magnetic resonance imaging (MRI) is a well-developed technique in neuroscience. Limitations in applying MRI to rodent models of neuropsychiatric disorders include the large number of animals required to achieve statistical significance, and the paucity of automation tools for the critical early step in processing, brain extraction, which prepares brain images for alignment and voxel-wise statistics. This novel timesaving automation of template-based brain extraction ("skull-stripping") is capable of quickly and reliably extracting the brain from large numbers of whole head images in a single step. The method is simple to install and requires minimal user interaction. This method is equally applicable to different types of MR images. Results were evaluated with Dice and Jaccard similarity indices and compared in 3D surface projections with other stripping approaches. Statistical comparisons demonstrate that individual variation in brain volumes is preserved. A downloadable software package not otherwise available for extraction of brains from whole head images is included here. This software tool increases speed, can be used with an atlas or a template from within the dataset, and produces masks that need little further refinement. Our new automation can be applied to any MR dataset, since the starting point is a template mask generated specifically for that dataset. The method reliably and rapidly extracts brain images from whole head images, rendering them useable for subsequent analytical processing. This software tool will accelerate the exploitation of mouse models for the investigation of human brain disorders by MRI. Copyright © 2015 Elsevier B.V. All rights reserved.
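Once each head image has been aligned to a template, the extraction step itself reduces to applying a binary brain mask. The snippet below is a minimal sketch of that masking step using nibabel; the file names are hypothetical, and the registration that brings the head into template space (the part the published tool automates at scale) is assumed to have been done elsewhere.

```python
# Apply a template-space brain mask to a co-registered head image
# ("skull-stripping" by masking), saving the brain-only volume.
import nibabel as nib

head = nib.load("head_in_template_space.nii.gz")   # hypothetical input volume
mask = nib.load("template_brain_mask.nii.gz")      # hypothetical binary mask

brain = head.get_fdata() * (mask.get_fdata() > 0.5)   # zero out non-brain voxels
nib.save(nib.Nifti1Image(brain, head.affine, head.header), "brain_extracted.nii.gz")
```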
Ali, Anjum A; Dale, Anders M; Badea, Alexandra; Johnson, G Allan
2005-08-15
We present the automated segmentation of magnetic resonance microscopy (MRM) images of the C57BL/6J mouse brain into 21 neuroanatomical structures, including the ventricular system, corpus callosum, hippocampus, caudate putamen, inferior colliculus, internal capsule, globus pallidus, and substantia nigra. The segmentation algorithm operates on multispectral, three-dimensional (3D) MR data acquired at 90-microm isotropic resolution. Probabilistic information used in the segmentation is extracted from training datasets of T2-weighted, proton density-weighted, and diffusion-weighted acquisitions. Spatial information is employed in the form of prior probabilities of occurrence of a structure at a location (location priors) and the pairwise probabilities between structures (contextual priors). Validation using standard morphometry indices shows good consistency between automatically segmented and manually traced data. Results achieved in the mouse brain are comparable with those achieved in human brain studies using similar techniques. The segmentation algorithm shows excellent potential for routine morphological phenotyping of mouse models.
Dockery, C R; Stefan, A R; Nieuwland, A A; Roberson, S N; Baguley, B M; Hendrix, J E; Morgan, S L
2009-08-01
Systematic designed experiments were employed to find the optimum conditions for extraction of direct, reactive, and vat dyes from cotton fibers prior to forensic characterization. Automated microextractions were coupled with measurements of extraction efficiencies on a microplate reader UV-visible spectrophotometer to enable rapid screening of extraction efficiency as a function of solvent composition. Solvent extraction conditions were also developed to be compatible with subsequent forensic characterization of extracted dyes by capillary electrophoresis with UV-visible diode array detection. The capillary electrophoresis electrolyte successfully used in this work consists of 5 mM ammonium acetate in 40:60 acetonitrile-water at pH 9.3, with the addition of sodium dithionite reducing agent to facilitate analysis of vat dyes. The ultimate goal of these research efforts is enhanced discrimination of trace fiber evidence by analysis of extracted dyes.
Automated measurement of birefringence - Development and experimental evaluation of the techniques
NASA Technical Reports Server (NTRS)
Voloshin, A. S.; Redner, A. S.
1989-01-01
Traditional photoelasticity has started to lose its appeal since it requires a well-trained specialist to acquire and interpret results. A spectral-contents-analysis approach may help to revive this old, but still useful technique. Light intensity of the beam passed through the stressed specimen contains all the information necessary to automatically extract the value of retardation. This is done by using a photodiode array to investigate the spectral contents of the light beam. Three different techniques to extract the value of retardation from the spectral contents of the light are discussed and evaluated. An experimental system was built which demonstrates the ability to evaluate retardation values in real time.
Bass, Ellen J.; Baumgart, Leigh A.; Shepley, Kathryn Klein
2014-01-01
Displaying both the strategy that information analysis automation employs to make its judgments and variability in the task environment may improve human judgment performance, especially in cases where this variability impacts the judgment performance of the information analysis automation. This work investigated the contribution of providing either information analysis automation strategy information, task environment information, or both, on human judgment performance in a domain where noisy sensor data are used by both the human and the information analysis automation to make judgments. In a simplified air traffic conflict prediction experiment, 32 participants made probability of horizontal conflict judgments under different display content conditions. After being exposed to the information analysis automation, judgment achievement significantly improved for all participants as compared to judgments without any of the automation's information. Participants provided with additional display content pertaining to cue variability in the task environment had significantly higher aided judgment achievement compared to those provided with only the automation's judgment of a probability of conflict. When designing information analysis automation for environments where the automation's judgment achievement is impacted by noisy environmental data, it may be beneficial to show additional task environment information to the human judge in order to improve judgment performance. PMID:24847184
Effects of imperfect automation on decision making in a simulated command and control task.
Rovira, Ericka; McGarry, Kathleen; Parasuraman, Raja
2007-02-01
Effects of four types of automation support and two levels of automation reliability were examined. The objective was to examine the differential impact of information and decision automation and to investigate the costs of automation unreliability. Research has shown that imperfect automation can lead to differential effects of stages and levels of automation on human performance. Eighteen participants performed a "sensor to shooter" targeting simulation of command and control. Dependent variables included accuracy and response time of target engagement decisions, secondary task performance, and subjective ratings of mental workload, trust, and self-confidence. Compared with manual performance, reliable automation significantly reduced decision times. Unreliable automation led to greater cost in decision-making accuracy under the higher automation reliability condition for three different forms of decision automation relative to information automation. At low automation reliability, however, there was a cost in performance for both information and decision automation. The results are consistent with a model of human-automation interaction that requires evaluation of the different stages of information processing to which automation support can be applied. If fully reliable decision automation cannot be guaranteed, designers should provide users with information automation support or other tools that allow for inspection and analysis of raw data.
NASA Astrophysics Data System (ADS)
Rüther, Heinz; Martine, Hagai M.; Mtalo, E. G.
This paper presents a novel approach to semiautomatic building extraction in informal settlement areas from aerial photographs. The proposed approach uses a strategy of delineating buildings by optimising their approximate building contour position. Approximate building contours are derived automatically by locating elevation blobs in digital surface models. Building extraction is then effected by means of the snakes algorithm and the dynamic programming optimisation technique. With dynamic programming, the building contour optimisation problem is realized through a discrete multistage process and solved by the "time-delayed" algorithm, as developed in this work. The proposed building extraction approach is a semiautomatic process, with user-controlled operations linking fully automated subprocesses. Inputs into the proposed building extraction system are ortho-images and digital surface models, the latter being generated through image matching techniques. Buildings are modeled as "lumps" or elevation blobs in digital surface models, which are derived by altimetric thresholding of digital surface models. Initial windows for building extraction are provided by projecting the elevation blobs' centre points onto an ortho-image. In the next step, approximate building contours are extracted from the ortho-image by region growing constrained by edges. Approximate building contours thus derived are inputs into the dynamic programming optimisation process in which final building contours are established. The proposed system is tested on two study areas: Marconi Beam in Cape Town, South Africa, and Manzese in Dar es Salaam, Tanzania. Sixty percent of the buildings in the study areas have been extracted and verified, and it is concluded that the proposed approach contributes meaningfully to the extraction of buildings in moderately complex and crowded informal settlement areas.
Effects of automation of information-processing functions on teamwork.
Wright, Melanie C; Kaber, David B
2005-01-01
We investigated the effects of automation as applied to different stages of information processing on team performance in a complex decision-making task. Forty teams of 2 individuals performed a simulated Theater Defense Task. Four automation conditions were simulated with computer assistance applied to realistic combinations of information acquisition, information analysis, and decision selection functions across two levels of task difficulty. Multiple measures of team effectiveness and team coordination were used. Results indicated different forms of automation have different effects on teamwork. Compared with a baseline condition, an increase in automation of information acquisition led to an increase in the ratio of information transferred to information requested; an increase in automation of information analysis resulted in higher team coordination ratings; and automation of decision selection led to better team effectiveness under low levels of task difficulty but at the cost of higher workload. The results support the use of early and intermediate forms of automation related to acquisition and analysis of information in the design of team tasks. Decision-making automation may provide benefits in more limited contexts. Applications of this research include the design and evaluation of automation in team environments.
NASA Space Engineering Research Center for Utilization of Local Planetary Resources
NASA Technical Reports Server (NTRS)
Ramohalli, Kumar; Lewis, John S.
1989-01-01
Progress toward the goal of exploiting extraterrestrial resources for space missions is documented. Some areas of research included are as follows: Propellant and propulsion optimization; Automation of propellant processing with quantitative simulation; Ore reduction through chlorination and free radical production; Characterization of lunar ilmenite and its simulants; Carbothermal reduction of ilmenite with special reference to microgravity chemical reactor design; Gaseous carbonyl extraction and purification of ferrous metals; Overall energy management; and Information management for space processing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanfilippo, Antonio P.; Franklin, Lyndsey; Tratz, Stephen C.
2008-04-01
Frame Analysis has come to play an increasingly stronger role in the study of social movements in Sociology and Political Science. While significant steps have been made in providing a theory of frames and framing, a systematic characterization of the frame concept is still largely lacking and there are no recognized criteria and methods that can be used to identify and marshal frame evidence reliably and in a time- and cost-effective manner. Consequently, current Frame Analysis work is still too reliant on manual annotation and subjective interpretation. The goal of this paper is to present an approach to the representation, acquisition and analysis of frame evidence which leverages Content Analysis, Information Extraction and Semantic Search methods to provide a systematic treatment of Frame Analysis and automate frame annotation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kurt Derr; Milos Manic
Time and location data play a very significant role in a variety of factory automation scenarios, from automated vehicles and robots (their navigation, tracking, and monitoring) to services for optimization and security. In addition, pervasive wireless capabilities combined with time and location information are enabling new applications in areas such as transportation systems, health care, elder care, military, emergency response, critical infrastructure, and law enforcement. A person/object in proximity to certain areas for specific durations of time may pose a risk hazard either to themselves, others, or the environment. This paper presents DSTiPE, a novel fuzzy-based spatio-temporal method for calculating the risk that an object with wireless communications presents to the environment. The presented Matlab-based application for fuzzy spatio-temporal risk cluster extraction is verified on a diagonal vehicle movement example.
NASA-ONERA Collaboration on Human Factors in Aviation Accidents and Incidents
NASA Technical Reports Server (NTRS)
Srivastava, Ashok N.; Fabiani, Patrick
2012-01-01
This is the first annual report jointly prepared by NASA and ONERA on the work performed under the agreement to collaborate on a study of the human factors entailed in aviation accidents and incidents, particularly focused on the consequences of decreases in human performance associated with fatigue. The objective of this agreement is to generate reliable, automated procedures that improve understanding of the levels and characteristics of flight-crew fatigue factors whose confluence will likely result in unacceptable crew performance. This study entails the analyses of numerical and textual data collected during operational flights. NASA and ONERA are collaborating on the development and assessment of automated capabilities for extracting operationally significant information from very large, diverse (textual and numerical) databases; much larger than can be handled practically by human experts.
Deleger, Louise; Brodzinski, Holly; Zhai, Haijun; Li, Qi; Lingren, Todd; Kirkendall, Eric S; Alessandrini, Evaline; Solti, Imre
2013-12-01
To evaluate a proposed natural language processing (NLP) and machine-learning based automated method to risk stratify abdominal pain patients by analyzing the content of the electronic health record (EHR). We analyzed the EHRs of a random sample of 2100 pediatric emergency department (ED) patients with abdominal pain, including all with a final diagnosis of appendicitis. We developed an automated system to extract relevant elements from ED physician notes and lab values and to automatically assign a risk category for acute appendicitis (high, equivocal, or low), based on the Pediatric Appendicitis Score. We evaluated the performance of the system against a manually created gold standard (chart reviews by ED physicians) for recall, specificity, and precision. The system achieved an average F-measure of 0.867 (0.869 recall and 0.863 precision) for risk classification, which was comparable to physician experts. Recall/precision were 0.897/0.952 in the low-risk category, 0.855/0.886 in the high-risk category, and 0.854/0.766 in the equivocal-risk category. The information that the system required as input to achieve high F-measure was available within the first 4 h of the ED visit. Automated appendicitis risk categorization based on EHR content, including information from clinical notes, shows comparable performance to physician chart reviewers as measured by their inter-annotator agreement and represents a promising new approach for computerized decision support to promote application of evidence-based medicine at the point of care.
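The last step of the pipeline, turning extracted clinical elements into a risk category, is essentially a small scoring rule. The sketch below illustrates that idea with a hypothetical weight table and cut-offs loosely modeled on a Pediatric Appendicitis Score; the element names, weights, and thresholds are placeholders, not the validated score or the authors' classifier.

```python
# Map boolean clinical findings (as extracted by NLP) to a score and a
# low / equivocal / high risk category. Weights and cut-offs are hypothetical.
def appendicitis_score(findings):
    weights = {
        "migration_of_pain": 1, "anorexia": 1, "nausea_vomiting": 1,
        "rlq_tenderness": 2, "hop_tenderness": 2, "fever": 1,
        "leukocytosis": 1, "neutrophilia": 1,
    }
    return sum(w for name, w in weights.items() if findings.get(name, False))

def risk_category(score, low_cut=3, high_cut=8):
    if score <= low_cut:
        return "low"
    if score >= high_cut:
        return "high"
    return "equivocal"

example = {"rlq_tenderness": True, "fever": True, "leukocytosis": True}
print(risk_category(appendicitis_score(example)))   # -> "equivocal" (score 4)
```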
Improving the driver-automation interaction: an approach using automation uncertainty.
Beller, Johannes; Heesen, Matthias; Vollrath, Mark
2013-12-01
The aim of this study was to evaluate whether communicating automation uncertainty improves the driver-automation interaction. A false system understanding of infallibility may provoke automation misuse and can lead to severe consequences in case of automation failure. The presentation of automation uncertainty may prevent this false system understanding and, as was shown by previous studies, may have numerous benefits. Few studies, however, have clearly shown the potential of communicating uncertainty information in driving. The current study fills this gap. We conducted a driving simulator experiment, varying the presented uncertainty information between participants (no uncertainty information vs. uncertainty information) and the automation reliability (high vs. low) within participants. Participants interacted with a highly automated driving system while engaging in secondary tasks and were required to cooperate with the automation to drive safely. Quantile regressions and multilevel modeling showed that the presentation of uncertainty information increases the time to collision in the case of automation failure. Furthermore, the data indicated improved situation awareness and better knowledge of fallibility for the experimental group. Consequently, the automation with the uncertainty symbol received higher trust ratings and increased acceptance. The presentation of automation uncertainty through a symbol improves overall driver-automation cooperation. Most automated systems in driving could benefit from displaying reliability information. This display might improve the acceptance of fallible systems and further enhance driver-automation cooperation.
Living systematic reviews: 2. Combining human and machine effort.
Thomas, James; Noel-Storr, Anna; Marshall, Iain; Wallace, Byron; McDonald, Steven; Mavergames, Chris; Glasziou, Paul; Shemilt, Ian; Synnot, Anneliese; Turner, Tari; Elliott, Julian
2017-11-01
New approaches to evidence synthesis, which use human effort and machine automation in mutually reinforcing ways, can enhance the feasibility and sustainability of living systematic reviews. Human effort is a scarce and valuable resource, required when automation is impossible or undesirable, and includes contributions from online communities ("crowds") as well as more conventional contributions from review authors and information specialists. Automation can assist with some systematic review tasks, including searching, eligibility assessment, identification and retrieval of full-text reports, extraction of data, and risk of bias assessment. Workflows can be developed in which human effort and machine automation can each enable the other to operate in more effective and efficient ways, offering substantial enhancement to the productivity of systematic reviews. This paper describes and discusses the potential-and limitations-of new ways of undertaking specific tasks in living systematic reviews, identifying areas where these human/machine "technologies" are already in use, and where further research and development is needed. While the context is living systematic reviews, many of these enabling technologies apply equally to standard approaches to systematic reviewing. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
An annotated corpus with nanomedicine and pharmacokinetic parameters
Lewinski, Nastassja A; Jimenez, Ivan; McInnes, Bridget T
2017-01-01
A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. PMID:29066897
Automated DICOM metadata and volumetric anatomical information extraction for radiation dosimetry
NASA Astrophysics Data System (ADS)
Papamichail, D.; Ploussi, A.; Kordolaimi, S.; Karavasilis, E.; Papadimitroulas, P.; Syrgiamiotis, V.; Efstathopoulos, E.
2015-09-01
Patient-specific dosimetry calculations based on simulation techniques have as a prerequisite the modeling of the modality system and the creation of voxelized phantoms. This procedure requires the knowledge of scanning parameters and patients' information included in a DICOM file as well as image segmentation. However, the extraction of this information is complicated and time-consuming. The objective of this study was to develop a simple graphical user interface (GUI) to (i) automatically extract metadata from every slice image of a DICOM file in a single query and (ii) interactively specify the regions of interest (ROI) without explicit access to the radiology information system. The user-friendly application was developed in the Matlab environment. The user can select a series of DICOM files and manage their text and graphical data. The metadata are automatically formatted and presented to the user as a Microsoft Excel file. The volumetric maps are formed by interactively specifying the ROIs and by assigning a specific value to every ROI. The result is stored in DICOM format for data and trend analysis. The developed GUI is easy and fast to use and constitutes a very useful tool for individualized dosimetry. One of the future goals is to incorporate remote access to a PACS server.
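The same metadata-harvesting idea ports directly to Python. The sketch below reads every DICOM file in a folder with pydicom, pulls a handful of header fields, and writes them to a spreadsheet; the folder name and tag list are illustrative assumptions (dose-relevant fields vary by vendor and modality), and this is not the authors' Matlab GUI.

```python
# Batch-extract selected DICOM header fields into a spreadsheet.
from pathlib import Path
import pandas as pd
import pydicom

TAGS = ["PatientID", "StudyDate", "Modality", "KVP", "ExposureTime", "SliceThickness"]

rows = []
for path in Path("dicom_series").glob("*.dcm"):          # hypothetical folder
    ds = pydicom.dcmread(path, stop_before_pixels=True)  # skip pixel data for speed
    rows.append({tag: getattr(ds, tag, None) for tag in TAGS})

pd.DataFrame(rows).to_excel("dicom_metadata.xlsx", index=False)  # needs openpyxl
```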
Detection of reflecting surfaces by a statistical model
NASA Astrophysics Data System (ADS)
He, Qiang; Chu, Chee-Hung H.
2009-02-01
Remote sensing is widely used to assess the destruction from natural disasters and to plan relief and recovery operations. How to automatically extract useful features and segment interesting objects from digital images, including remote sensing imagery, becomes a critical task for image understanding. Unfortunately, current research on automated feature extraction is ignorant of contextual information. As a result, the fidelity of populating attributes corresponding to interesting features and objects cannot be satisfied. In this paper, we present an exploration of meaningful object extraction that integrates reflecting surfaces. Detection of specular reflecting surfaces can be useful in target identification and can then be applied to environmental monitoring, disaster prediction and analysis, military applications, and counter-terrorism. Our method is based on a statistical model that captures the statistical properties of specular reflecting surfaces. The reflecting surfaces are then detected through cluster analysis.
Discovering gene annotations in biomedical text databases
Cakmak, Ali; Ozsoyoglu, Gultekin
2008-01-01
Background Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. Results In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN has reached a precision level of 78% at a recall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. Conclusion GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate pattern occurrences with similar semantics. Relatively low recall performance of our pattern-based approach may be enhanced either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data, or, alternatively, the statistical enrichment threshold may be adjusted to lower values for applications that put more value on achieving higher recall values. PMID:18325104
Discovering gene annotations in biomedical text databases.
Cakmak, Ali; Ozsoyoglu, Gultekin
2008-03-06
Genes and gene products are frequently annotated with Gene Ontology concepts based on the evidence provided in genomics articles. Manually locating and curating information about a genomic entity from the biomedical literature requires vast amounts of human effort. Hence, there is clearly a need for automated computational tools to annotate the genes and gene products with Gene Ontology concepts by computationally capturing the related knowledge embedded in textual data. In this article, we present an automated genomic entity annotation system, GEANN, which extracts information about the characteristics of genes and gene products in article abstracts from PubMed, and translates the discovered knowledge into Gene Ontology (GO) concepts, a widely-used standardized vocabulary of genomic traits. GEANN utilizes textual "extraction patterns", and a semantic matching framework to locate phrases matching a pattern and produce Gene Ontology annotations for genes and gene products. In our experiments, GEANN has reached a precision level of 78% at a recall level of 61%. On a select set of Gene Ontology concepts, GEANN either outperforms or is comparable to two other automated annotation studies. Use of WordNet for semantic pattern matching improves the precision and recall by 24% and 15%, respectively, and the improvement due to semantic pattern matching becomes more apparent as the Gene Ontology terms become more general. GEANN is useful for two distinct purposes: (i) automating the annotation of genomic entities with Gene Ontology concepts, and (ii) providing existing annotations with additional "evidence articles" from the literature. The use of textual extraction patterns that are constructed based on the existing annotations achieves high precision. The semantic pattern matching framework provides a more flexible pattern matching scheme with respect to "exact matching" with the advantage of locating approximate pattern occurrences with similar semantics. Relatively low recall performance of our pattern-based approach may be enhanced either by employing a probabilistic annotation framework based on the annotation neighbourhoods in textual data, or, alternatively, the statistical enrichment threshold may be adjusted to lower values for applications that put more value on achieving higher recall values.
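The semantic relaxation of exact matching that GEANN relies on can be illustrated with WordNet: two words count as a match if any pair of their synsets is sufficiently similar. The sketch below shows that idea with NLTK; the similarity measure and threshold are illustrative choices, not GEANN's actual matching framework, and it assumes the WordNet corpus has been downloaded.

```python
# Semantic (rather than exact) word matching via WordNet synset similarity.
from nltk.corpus import wordnet as wn   # assumes nltk's WordNet data is installed

def semantic_match(word_a, word_b, threshold=0.7):
    for s1 in wn.synsets(word_a):
        for s2 in wn.synsets(word_b):
            sim = s1.path_similarity(s2)
            if sim is not None and sim >= threshold:
                return True
    return False

print(semantic_match("gene", "cistron"))   # True: the two words share a synset
```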
ERIC Educational Resources Information Center
Chen, Jing; Zhang, Mo; Bejar, Isaac I.
2017-01-01
Automated essay scoring (AES) generally computes essay scores as a function of macrofeatures derived from a set of microfeatures extracted from the text using natural language processing (NLP). In the "e-rater"® automated scoring engine, developed at "Educational Testing Service" (ETS) for the automated scoring of essays, each…
Li, Jingchao; Cao, Yunpeng; Ying, Yulong; Li, Shuying
2016-01-01
Bearing failure is one of the dominant causes of failure and breakdowns in rotating machinery, leading to huge economic loss. Aiming at the nonstationary and nonlinear characteristics of bearing vibration signals as well as the complexity of condition-indicating information distribution in the signals, a novel rolling element bearing fault diagnosis method based on multifractal theory and gray relation theory was proposed in the paper. Firstly, a generalized multifractal dimension algorithm was developed to extract the characteristic vectors of fault features from the bearing vibration signals, which can offer more meaningful and distinguishing information reflecting different bearing health status in comparison with conventional single fractal dimension. After feature extraction by multifractal dimensions, an adaptive gray relation algorithm was applied to implement an automated bearing fault pattern recognition. The experimental results show that the proposed method can identify various bearing fault types as well as severities effectively and accurately. PMID:28036329
Li, Jingchao; Cao, Yunpeng; Ying, Yulong; Li, Shuying
2016-01-01
Bearing failure is one of the dominant causes of failure and breakdowns in rotating machinery, leading to huge economic loss. Aiming at the nonstationary and nonlinear characteristics of bearing vibration signals as well as the complexity of condition-indicating information distribution in the signals, a novel rolling element bearing fault diagnosis method based on multifractal theory and gray relation theory was proposed in the paper. Firstly, a generalized multifractal dimension algorithm was developed to extract the characteristic vectors of fault features from the bearing vibration signals, which can offer more meaningful and distinguishing information reflecting different bearing health status in comparison with conventional single fractal dimension. After feature extraction by multifractal dimensions, an adaptive gray relation algorithm was applied to implement an automated bearing fault pattern recognition. The experimental results show that the proposed method can identify various bearing fault types as well as severities effectively and accurately.
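The gray relation step at the end of this pipeline is a simple template-matching rule: the test feature vector is assigned the fault class whose reference vector yields the highest gray relational grade. The sketch below illustrates that rule; the per-class reference vectors, the distinguishing coefficient, and the use of per-sequence extrema are illustrative simplifications, not the authors' adaptive algorithm.

```python
# Gray relational grade between a reference vector and a test vector, and
# nearest-template classification over a set of fault classes.
import numpy as np

def gray_relational_grade(reference, test, rho=0.5):
    delta = np.abs(np.asarray(reference, float) - np.asarray(test, float))
    coeff = (delta.min() + rho * delta.max()) / (delta + rho * delta.max())
    return coeff.mean()

templates = {                     # hypothetical multifractal-dimension templates
    "normal":     [1.9, 1.7, 1.5, 1.3],
    "inner_race": [2.3, 2.0, 1.6, 1.1],
    "outer_race": [2.6, 2.2, 1.7, 1.0],
}
test_vector = [2.28, 1.98, 1.62, 1.12]
best = max(templates, key=lambda c: gray_relational_grade(templates[c], test_vector))
print(best)                       # -> "inner_race" for this synthetic example
```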
Mills, M.S.; Thurman, E.M.
1992-01-01
Reversed-phase isolation and ion-exchange purification were combined in the automated solid-phase extraction of two polar s-triazine metabolites, 2-amino-4-chloro-6-(isopropylamino)-s-triazine (deethylatrazine) and 2-amino-4-chloro-6-(ethylamino)-s-triazine (deisopropylatrazine), from clay-loam and silt-loam soils and sandy aquifer sediments. First, methanol/water (4/1, v/v) soil extracts were transferred to an automated workstation following evaporation of the methanol phase for the rapid reversed-phase isolation of the metabolites on an octadecyl resin (C18). The retention of the triazine metabolites on C18 decreased substantially when trace methanol concentrations (1%) remained. Furthermore, the retention on C18 increased with decreasing aqueous solubility and increasing alkyl-chain length of the metabolites and parent herbicides, indicating a reversed-phase interaction. The analytes were eluted with ethyl acetate, which left much of the soil organic-matter impurities on the resin. Second, the small-volume organic eluate was purified on an anion-exchange resin (0.5 mL/min) to extract the remaining soil pigments that could foul the ion source of the GC/MS system. Recoveries of the analytes were 75%, using deuterated atrazine as a surrogate, and were comparable to recoveries by Soxhlet extraction. The detection limit was 0.1 μg/kg with a coefficient of variation of 15%. The ease and efficiency of this automated method make it a viable, practical technique for studying triazine metabolites in the environment.
An automated approach for extracting Barrier Island morphology from digital elevation models
NASA Astrophysics Data System (ADS)
Wernette, Phillipe; Houser, Chris; Bishop, Michael P.
2016-06-01
The response and recovery of a barrier island to extreme storms depends on the elevation of the dune base and crest, both of which can vary considerably alongshore and through time. Quantifying the response to and recovery from storms requires that we can first identify and differentiate the dune(s) from the beach and back-barrier, which in turn depends on accurate identification and delineation of the dune toe, crest and heel. The purpose of this paper is to introduce a multi-scale automated approach for extracting beach, dune (dune toe, dune crest and dune heel), and barrier island morphology. The automated approach introduced here extracts the shoreline and back-barrier shoreline based on elevation thresholds, and extracts the dune toe, dune crest and dune heel based on the average relative relief (RR) across multiple spatial scales of analysis. The multi-scale automated RR approach to extracting dune toe, dune crest, and dune heel based upon relative relief is more objective than traditional approaches because every pixel is analyzed across multiple computational scales and the identification of features is based on the calculated RR values. The RR approach out-performed contemporary approaches and represents a fast objective means to define important beach and dune features for predicting barrier island response to storms. The RR method also does not require that the dune toe, crest, or heel are spatially continuous, which is important because dune morphology is likely naturally variable alongshore.
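Relative relief itself is straightforward to compute on a gridded DEM: within a moving window, RR = (z - local minimum) / (local maximum - local minimum), averaged over several window sizes. The sketch below shows that computation with SciPy; the window sizes and the placeholder DEM are illustrative assumptions, and the subsequent step of picking the dune toe, crest, and heel from the RR surface is left out.

```python
# Average relative relief (RR) across multiple moving-window sizes on a DEM grid.
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def relative_relief(dem, window_sizes=(3, 5, 9)):
    rr_layers = []
    for w in window_sizes:
        zmax = maximum_filter(dem, size=w)
        zmin = minimum_filter(dem, size=w)
        rng = np.where(zmax > zmin, zmax - zmin, 1.0)   # guard against flat windows
        rr_layers.append((dem - zmin) / rng)
    return np.mean(rr_layers, axis=0)

dem = np.random.rand(100, 100)    # placeholder elevation grid
rr = relative_relief(dem)         # values near 1 sit at local highs, near 0 at local lows
```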
Klukas, Christian; Chen, Dijun; Pape, Jean-Michel
2014-01-01
High-throughput phenotyping is emerging as an important technology to dissect phenotypic components in plants. Efficient image processing and feature extraction are prerequisites to quantify plant growth and performance based on phenotypic traits. Issues include data management, image analysis, and result visualization of large-scale phenotypic data sets. Here, we present Integrated Analysis Platform (IAP), an open-source framework for high-throughput plant phenotyping. IAP provides user-friendly interfaces, and its core functions are highly adaptable. Our system supports image data transfer from different acquisition environments and large-scale image analysis for different plant species based on real-time imaging data obtained from different spectra. Due to the huge amount of data to manage, we utilized a common data structure for efficient storage and organization of both input data and result data. We implemented a block-based method for automated image processing to extract a representative list of plant phenotypic traits. We also provide tools for built-in data plotting and result export. For validation of IAP, we performed an example experiment that contains 33 maize (Zea mays 'Fernandez') plants, which were grown for 9 weeks in an automated greenhouse with nondestructive imaging. Subsequently, the image data were subjected to automated analysis with the maize pipeline implemented in our system. We found that the computed digital volume and number of leaves correlate with our manually measured data with high accuracy, up to 0.98 and 0.95, respectively. In summary, IAP provides a comprehensive set of functionalities for import/export, management, and automated analysis of high-throughput plant phenotyping data, and its analysis results are highly reliable. PMID:24760818
The current role of on-line extraction approaches in clinical and forensic toxicology.
Mueller, Daniel M
2014-08-01
In today's clinical and forensic toxicological laboratories, automation is of interest because of its ability to optimize processes, to reduce manual workload and handling errors and to minimize exposure to potentially infectious samples. Extraction is usually the most time-consuming step; therefore, automation of this step is reasonable. Currently, from the field of clinical and forensic toxicology, methods using the following on-line extraction techniques have been published: on-line solid-phase extraction, turbulent flow chromatography, solid-phase microextraction, microextraction by packed sorbent, single-drop microextraction and on-line desorption of dried blood spots. Most of these published methods are either single-analyte or multicomponent procedures; methods intended for systematic toxicological analysis are relatively scarce. However, the use of on-line extraction will certainly increase in the near future.
NASA Astrophysics Data System (ADS)
Garfinkle, Noah W.; Selig, Lucas; Perkins, Timothy K.; Calfas, George W.
2017-05-01
Increasing worldwide internet connectivity and access to sources of print and open social media has increased the near real-time availability of textual information. Capabilities to structure and integrate textual data streams can contribute to more meaningful representations of operational environment factors (i.e., Political, Military, Economic, Social, Infrastructure, Information, Physical Environment, and Time [PMESII-PT]) and tactical civil considerations (i.e., Areas, Structures, Capabilities, Organizations, People and Events [ASCOPE]). However, relying upon human analysts to encode this information as it arrives quickly proves intractable. While human analysts possess an ability to comprehend context in unstructured text far beyond that of computers, automated geoparsing (the extraction of locations from unstructured text) can empower analysts to automate sifting through datasets for areas of interest. This research evaluates existing approaches to geoparsing and initiates the research and development of locally improved methods of tagging parts of text as possible locations, resolving possible locations into coordinates, and interfacing such results with human analysts. The objective of this ongoing research is to develop a more contextually complete picture of an area of interest (AOI), including human-geographic context for events. In particular, our research is working to make improvements to geoparsing (i.e., the extraction of spatial context from documents), which requires development, integration, and validation of named-entity recognition (NER) tools, gazetteers, and entity-attribution. This paper provides an overview of NER models and methodologies as applied to geoparsing, explores several challenges encountered, presents preliminary results from the creation of a flexible geoparsing research pipeline, and introduces ongoing and future work with the intention of contributing to the efficient geocoding of information containing valuable insights into human activities in space.
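The first two geoparsing steps, tagging candidate place names and resolving them against a gazetteer, can be prototyped in a few lines. The sketch below uses an off-the-shelf spaCy NER model and a toy in-memory gazetteer; the model name, the example coordinates, and the two-entry gazetteer are stand-ins for the project's own NER tools, gazetteers, and entity-attribution work, and the model must be installed separately.

```python
# Tag GPE/LOC entities with spaCy, then resolve them against a tiny gazetteer.
import spacy

nlp = spacy.load("en_core_web_sm")      # assumes this model has been downloaded
gazetteer = {"Baghdad": (33.31, 44.37), "Mosul": (36.34, 43.13)}  # toy example

def geoparse(text):
    doc = nlp(text)
    return [(ent.text, gazetteer[ent.text])
            for ent in doc.ents
            if ent.label_ in ("GPE", "LOC") and ent.text in gazetteer]

print(geoparse("Reports describe convoy movement from Mosul toward Baghdad."))
```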
Merrill, J; Phillips, A; Keeling, J; Kaushal, R; Senathirajah, Y
2013-01-01
Among the expected benefits of electronic health records (EHRs) is increased reporting of public health information, such as immunization status. State and local immunization registries aid control of vaccine-preventable diseases and help offset fragmentation in healthcare, but reporting is often slow and incomplete. The Primary Care Information Project (PCIP), an initiative of the NYC Department of Health and Mental Hygiene, has implemented EHRs with immunization reporting capability in community settings. To evaluate the effect of automated reporting via an EHR on use and efficiency of reporting to the NY Citywide Immunization Registry, we conducted a secondary analysis of 1.7 million de-identified records submitted between January 2007 and June 2011 by 217 primary care practices enrolled in PCIP, before and after the launch of automated reporting via an EHR. We examined differences in records submitted per day, lag time, and documentation of eligibility for subsidized vaccines. Mean submissions per day did not change. Automated submissions of new and historical records increased by 18% and 98%, respectively. Submissions within 14 days increased from 84% to 87%, and within 2 days increased from 60% to 77%. Median lag time decreased from 13 to 10 days. Documentation of eligibility decreased. Results are significant at p<0.001. Significant improvements in registry use and efficiency of reporting were found after launch of automated reporting via an EHR. A decrease in eligibility documentation was attributed to EHR workflow. The limitations to comprehensive evaluation found in these data, which were extracted from a registry initiated prior to widespread EHR implementation, suggest that reliable evaluation of immunization reporting via the EHR may require modifications to legacy registry databases.
Kim, Yoonjung; Han, Mi-Soon; Kim, Juwon; Kwon, Aerin; Lee, Kyung-A
2014-01-01
A total of 84 nasopharyngeal swab specimens were collected from 84 patients. Viral nucleic acid was extracted by three automated extraction systems: QIAcube (Qiagen, Germany), EZ1 Advanced XL (Qiagen), and MICROLAB Nimbus IVD (Hamilton, USA). Fourteen RNA viruses and two DNA viruses were detected using the Anyplex II RV16 Detection kit (Seegene, Republic of Korea). The EZ1 Advanced XL system demonstrated the best analytical sensitivity for all three viral strains. The nucleic acids extracted by EZ1 Advanced XL showed higher positive rates for virus detection than the others. Meanwhile, the MICROLAB Nimbus IVD system comprised fully automated steps from nucleic acid extraction to PCR setup, which could reduce human errors. For the nucleic acids recovered from nasopharyngeal swab specimens, the QIAcube system showed the fewest false negative results and the best concordance rate, and it may be more suitable for detecting various viruses including RNA and DNA virus strains. Each system showed different sensitivity and specificity for detection of certain viral pathogens and demonstrated different characteristics such as turnaround time and sample capacity. Therefore, these factors should be considered when new nucleic acid extraction systems are introduced to the laboratory.
Automated In Vivo Platform for the Discovery of Functional Food Treatments of Hypercholesterolemia
Littleton, Robert M.; Haworth, Kevin J.; Tang, Hong; Setchell, Kenneth D. R.; Nelson, Sandra; Hove, Jay R.
2013-01-01
The zebrafish is becoming an increasingly popular model system for both automated drug discovery and investigating hypercholesterolemia. Here we combine these aspects and for the first time develop an automated high-content confocal assay for treatments of hypercholesterolemia. We also create two algorithms for automated analysis of cardiodynamic data acquired by high-speed confocal microscopy. The first algorithm computes cardiac parameters solely from the frequency-domain representation of cardiodynamic data while the second uses both frequency- and time-domain data. The combined approach resulted in smaller differences relative to manual measurements. The methods are implemented to test the ability of a methanolic extract of the hawthorn plant (Crataegus laevigata) to treat hypercholesterolemia and its peripheral cardiovascular effects. Results demonstrate the utility of these methods and suggest the extract has both antihypercholesterolemic and positively inotropic properties. PMID:23349685
Automated in vivo platform for the discovery of functional food treatments of hypercholesterolemia.
Littleton, Robert M; Haworth, Kevin J; Tang, Hong; Setchell, Kenneth D R; Nelson, Sandra; Hove, Jay R
2013-01-01
The zebrafish is becoming an increasingly popular model system for both automated drug discovery and investigating hypercholesterolemia. Here we combine these aspects and for the first time develop an automated high-content confocal assay for treatments of hypercholesterolemia. We also create two algorithms for automated analysis of cardiodynamic data acquired by high-speed confocal microscopy. The first algorithm computes cardiac parameters solely from the frequency-domain representation of cardiodynamic data while the second uses both frequency- and time-domain data. The combined approach resulted in smaller differences relative to manual measurements. The methods are implemented to test the ability of a methanolic extract of the hawthorn plant (Crataegus laevigata) to treat hypercholesterolemia and its peripheral cardiovascular effects. Results demonstrate the utility of these methods and suggest the extract has both antihypercholesterolemic and positively inotropic properties.
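The frequency-domain idea in the first algorithm can be reduced to a few lines: the heart rate appears as the dominant peak in the power spectrum of a periodic cardiodynamic trace. The sketch below estimates that peak with NumPy; the sampling rate and the synthetic sinusoidal trace are illustrative assumptions, not the authors' algorithm or data.

```python
# Estimate the dominant oscillation frequency of a cardiodynamic trace via FFT.
import numpy as np

def dominant_rate_hz(signal, fs):
    signal = np.asarray(signal, float) - np.mean(signal)   # remove DC offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    return freqs[np.argmax(spectrum[1:]) + 1]              # skip the zero-frequency bin

fs = 100.0                                  # frames per second (assumed)
t = np.arange(0.0, 10.0, 1.0 / fs)
trace = np.sin(2 * np.pi * 2.5 * t) + 0.2 * np.random.randn(t.size)
print(dominant_rate_hz(trace, fs))          # ~2.5 Hz, i.e. about 150 beats/min
```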
Data is presented showing the progress made towards the development of a new automated system combining solid phase extraction (SPE) with gas chromatography/mass spectrometry for the single run analysis of water samples containing a broad range of acid, base and neutral compounds...
Extraction of Prostatic Lumina and Automated Recognition for Prostatic Calculus Image Using PCA-SVM
Wang, Zhuocai; Xu, Xiangmin; Ding, Xiaojun; Xiao, Hui; Huang, Yusheng; Liu, Jian; Xing, Xiaofen; Wang, Hua; Liao, D. Joshua
2011-01-01
Identification of prostatic calculi is an important basis for determining the tissue origin. Computer-assisted diagnosis of prostatic calculi may have promising potential but is currently understudied. We studied the extraction of prostatic lumina and the automated recognition of calculus images. Lumina were extracted from prostate histology images using local entropy and Otsu thresholding; recognition used PCA-SVM based on the texture features of prostatic calculi. The SVM classifier showed an average run time of 0.1432 s, an average training accuracy of 100%, an average test accuracy of 93.12%, a sensitivity of 87.74%, and a specificity of 94.82%. We concluded that the algorithm, based on texture features and PCA-SVM, can readily recognize the concentric structure and its visual features. Therefore, this method is effective for the automated recognition of prostatic calculi. PMID:21461364
Automated indexing for making of a newspaper article database
NASA Astrophysics Data System (ADS)
Kamio, Tatsuo
Automated indexing has been widely employed in building newspaper article databases. Because a large number of articles appear daily, computers are essential for speeding up database compilation and reducing the manpower involved. However, terms extracted by current automated indexing systems have no link to subject analysis, so they are not keywords in a strict sense. The system of Nihon Keizai Shimbun KK therefore validates keywords to a certain extent by exploiting characteristics peculiar to newspaper articles, using two clues: 1) the location in the article where the extracted term occurs, and 2) whether the subject area of the article corresponds to the thesaurus class of the extracted term. An experiment in assigning keywords that do not occur in the article text was also conducted, with fairly good results.
Topography-Assisted Electromagnetic Platform for Blood-to-PCR in a Droplet
Chiou, Chi-Han; Shin, Dong Jin; Zhang, Yi; Wang, Tza-Huei
2013-01-01
This paper presents an electromagnetically actuated platform for automated sample preparation and detection of nucleic acids. The proposed platform integrates nucleic acid extraction using silica-coated magnetic particles with real-time polymerase chain reaction (PCR) on a single cartridge. Extraction of genomic material was automated by manipulating magnetic particles in droplets using a series of planar coil electromagnets assisted by topographical features, enabling efficient fluidic processing over a variety of buffers and reagents. The functionality of the platform was demonstrated by performing nucleic acid extraction from whole blood, followed by real-time PCR detection of KRAS oncogene. Automated sample processing from whole blood to PCR-ready droplet was performed in 15 minutes. We took a modular approach of decoupling the modules of magnetic manipulation and optical detection from the device itself, enabling a low-complexity cartridge that operates in tandem with simple external instruments. PMID:23835223
A framework for feature extraction from hospital medical data with applications in risk prediction.
Tran, Truyen; Luo, Wei; Phung, Dinh; Gupta, Sunil; Rana, Santu; Kennedy, Richard Lee; Larkins, Ann; Venkatesh, Svetha
2014-12-30
Feature engineering is a time-consuming component of predictive modeling. We propose a versatile platform to automatically extract features for risk prediction, based on a pre-defined and extensible entity schema. The extraction is independent of disease type or risk prediction task. We contrast auto-extracted features with baselines generated from the Elixhauser comorbidities. Hospital medical records were transformed into event sequences, to which filters were applied to extract feature sets capturing diversity in temporal scales and data types. The features were evaluated on a readmission prediction task, comparing with baseline feature sets generated from the Elixhauser comorbidities. The prediction model used logistic regression with elastic net regularization. Prediction horizons of 1, 2, 3, 6, and 12 months were considered for four diverse diseases: diabetes, COPD, mental disorders and pneumonia, with derivation and validation cohorts defined on non-overlapping data-collection periods. For unplanned readmissions, the auto-extracted feature set, using socio-demographic information and medical records, outperformed baselines derived from the socio-demographic information and Elixhauser comorbidities over 20 settings (5 prediction horizons over 4 diseases). In particular, for 30-day prediction the AUCs were: COPD-baseline: 0.60 (95% CI: 0.57, 0.63), auto-extracted: 0.67 (0.64, 0.70); diabetes-baseline: 0.60 (0.58, 0.63), auto-extracted: 0.67 (0.64, 0.69); mental disorders-baseline: 0.57 (0.54, 0.60), auto-extracted: 0.69 (0.64, 0.70); pneumonia-baseline: 0.61 (0.59, 0.63), auto-extracted: 0.70 (0.67, 0.72). The advantages of auto-extracting standard features from complex medical records in a disease- and task-agnostic manner were demonstrated. Auto-extracted features have good predictive power over multiple time horizons. Such feature sets have the potential to form the foundation of complex automated analytic tasks.
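A minimal sketch of the classification stage described above, assuming a synthetic feature matrix in place of the auto-extracted hospital features; the elastic-net logistic regression mirrors the paper's model choice, but the hyperparameters and data are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for auto-extracted features (event counts at several
# temporal scales) and 30-day readmission labels.
rng = np.random.default_rng(0)
X = rng.poisson(lam=1.5, size=(2000, 200)).astype(float)
y = (X[:, :5].sum(axis=1) + rng.normal(0, 2, 2000) > 8).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Logistic regression with elastic net regularization, as in the paper;
# the hyperparameters here are illustrative, not the authors' settings.
model = LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, C=0.1, max_iter=5000)
model.fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Validation AUC: {auc:.2f}")
```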
Automated Data Cleansing in Data Harvesting and Data Migration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martin, Mark; Vowell, Lance; King, Ian
2011-03-16
In the proposal for this project, we noted how the explosion of digitized information available through corporate databases, data stores and online search systems has resulted in the knowledge worker being bombarded by information. Knowledge workers typically spend more than 20-30% of their time seeking and sorting information, and only find the information 50-60% of the time. This information exists as unstructured, semi-structured and structured data. The problem of information overload is compounded by the production of duplicate or near-duplicate information. In addition, near-duplicate items frequently have different origins, creating a situation in which each item may have unique information of value, but their differences are not significant enough to justify maintaining them as separate entities. Effective tools can be provided to eliminate duplicate and near-duplicate information. The proposed approach was to extract unique information from data sets and consolidate that information into a single comprehensive file.
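One common way to flag near-duplicate text items is word shingling with Jaccard similarity; the sketch below illustrates that idea under the assumption of invented documents and an illustrative threshold, and is not necessarily the algorithm used in this project.

```python
def shingles(text, k=2):
    """Return the set of k-word shingles of a document."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

docs = {
    "a": "The quarterly report was filed on Monday by the finance team.",
    "b": "The quarterly report was filed Monday by the finance team.",
    "c": "Sensor calibration procedures were updated for the new hardware.",
}

THRESHOLD = 0.6  # illustrative cut-off for flagging near-duplicates
sets = {name: shingles(text) for name, text in docs.items()}
for x in docs:
    for y in docs:
        if x < y and jaccard(sets[x], sets[y]) >= THRESHOLD:
            print(f"{x} and {y} look like near-duplicates")
```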
Quantity and unit extraction for scientific and technical intelligence analysis
NASA Astrophysics Data System (ADS)
David, Peter; Hawes, Timothy
2017-05-01
Scientific and Technical (S and T) intelligence analysts consume huge amounts of data to understand how scientific progress and engineering efforts affect current and future military capabilities. One of the most important types of information S and T analysts exploit is the quantities discussed in their source material. Frequencies, ranges, size, weight, power, and numerous other properties and measurements describing the performance characteristics of systems and the engineering constraints that define them must be culled from source documents before quantified analysis can begin. Automating the process of finding and extracting the relevant quantities from a wide range of S and T documents is difficult because information about quantities and their units is often contained in unstructured text with ad hoc conventions used to convey their meaning. Currently, even simple tasks, such as searching for documents discussing RF frequencies in a band of interest, are labor intensive and error prone. This research addresses the challenges facing development of a document processing capability that extracts quantities and units from S and T data, and how Natural Language Processing algorithms can be used to overcome these challenges.
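A minimal sketch of regex-based quantity and unit extraction of the kind discussed above; the unit table, normalization factors, and example sentence are assumptions for illustration, not the authors' NLP pipeline.

```python
import re

# Illustrative unit table mapping surface forms to a canonical unit and a
# multiplier into base units (Hz for frequency, m, kg, W).
UNITS = {
    "ghz": ("Hz", 1e9), "mhz": ("Hz", 1e6), "khz": ("Hz", 1e3), "hz": ("Hz", 1.0),
    "km": ("m", 1e3), "m": ("m", 1.0), "cm": ("m", 1e-2),
    "kg": ("kg", 1.0), "g": ("kg", 1e-3),
    "kw": ("W", 1e3), "w": ("W", 1.0),
}
PATTERN = re.compile(
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>" + "|".join(UNITS) + r")\b",
    re.IGNORECASE)

def extract_quantities(text):
    """Yield (raw match, value in base units, canonical unit) triples."""
    for m in PATTERN.finditer(text):
        canonical, factor = UNITS[m.group("unit").lower()]
        yield m.group(0), float(m.group("value")) * factor, canonical

sentence = ("The radar operates between 8.5 GHz and 10 GHz, weighs 45 kg, "
            "and draws 2.2 kW in transmit mode.")
for raw, value, unit in extract_quantities(sentence):
    print(f"{raw!r:14} -> {value:g} {unit}")
```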
PASTE: patient-centered SMS text tagging in a medication management system.
Stenner, Shane P; Johnson, Kevin B; Denny, Joshua C
2012-01-01
To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The objective of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages.
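The F-measures reported above can be computed by comparing system annotations against the manually tagged reference; a minimal sketch, with invented example annotations, is shown below.

```python
def prf(system, reference):
    """Precision, recall, and F-measure of system annotations vs. a reference.

    Both arguments are sets of (message_id, term) pairs.
    """
    true_pos = len(system & reference)
    precision = true_pos / len(system) if system else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Invented mini-example: medication names tagged in three SMS messages.
reference = {(1, "lisinopril"), (2, "metformin"), (3, "ibuprofen")}
system = {(1, "lisinopril"), (2, "metformin"), (3, "advil")}

p, r, f = prf(system, reference)
print(f"precision={p:.2f} recall={r:.2f} F-measure={f:.2f}")
```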
O'Connor, Timothy; Rawat, Siddharth; Markman, Adam; Javidi, Bahram
2018-03-01
We propose a compact imaging system that integrates an augmented reality head mounted device with digital holographic microscopy for automated cell identification and visualization. A shearing interferometer is used to produce holograms of biological cells, which are recorded using customized smart glasses containing an external camera. After image acquisition, segmentation is performed to isolate regions of interest containing biological cells in the field-of-view, followed by digital reconstruction of the cells, which is used to generate a three-dimensional (3D) pseudocolor optical path length profile. Morphological features are extracted from the cell's optical path length map, including mean optical path length, coefficient of variation, optical volume, projected area, projected area to optical volume ratio, cell skewness, and cell kurtosis. Classification is performed using the random forest classifier, support vector machines, and K-nearest neighbor, and the results are compared. Finally, the augmented reality device displays the cell's pseudocolor 3D rendering of its optical path length profile, extracted features, and the identified cell's type or class. The proposed system could allow a healthcare worker to quickly visualize cells using augmented reality smart glasses and extract the relevant information for rapid diagnosis. To the best of our knowledge, this is the first report on the integration of digital holographic microscopy with augmented reality devices for automated cell identification and visualization.
Epileptic seizure detection in EEG signal using machine learning techniques.
Jaiswal, Abeg Kumar; Banka, Haider
2018-03-01
Epilepsy is a well-known nervous system disorder characterized by seizures. Electroencephalograms (EEGs), which capture brain neural activity, can detect epilepsy. Traditional methods for analyzing an EEG signal for epileptic seizure detection are time-consuming. Recently, several automated seizure detection frameworks using machine learning techniques have been proposed to replace these traditional methods. The two basic steps involved in machine learning are feature extraction and classification. Feature extraction reduces the input pattern space by keeping informative features, and the classifier assigns the appropriate class label. In this paper, we propose two effective approaches involving subpattern based PCA (SpPCA) and cross-subpattern correlation-based PCA (SubXPCA) with Support Vector Machine (SVM) for automated seizure detection in EEG signals. Feature extraction was performed using SpPCA and SubXPCA. Both techniques explore the subpattern correlation of EEG signals, which helps in the decision-making process. SVM is used for classification of seizure and non-seizure EEG signals. The SVM was trained with a radial basis kernel. All the experiments have been carried out on the benchmark epilepsy EEG dataset. The entire dataset consists of 500 EEG signals recorded under different scenarios. Seven different experimental cases for classification have been conducted. The classification accuracy was evaluated using tenfold cross validation. The classification results of the proposed approaches have been compared with those of some existing techniques proposed in the literature to establish the claim.
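A minimal sketch of the overall pipeline, assuming synthetic EEG segments and plain PCA in place of the SpPCA/SubXPCA variants; the RBF-kernel SVM and tenfold cross-validation mirror the evaluation protocol, but nothing here reproduces the paper's results.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for EEG segments: 200 signals x 1024 samples each.
rng = np.random.default_rng(1)
normal = rng.normal(0, 1, size=(100, 1024))
seizure = rng.normal(0, 1, size=(100, 1024)) + \
    2.0 * np.sin(2 * np.pi * 5 * np.arange(1024) / 256)  # rhythmic burst
X = np.vstack([normal, seizure])
y = np.array([0] * 100 + [1] * 100)

# PCA feature extraction followed by an RBF-kernel SVM, evaluated with
# tenfold cross-validation.
clf = make_pipeline(StandardScaler(), PCA(n_components=20),
                    SVC(kernel="rbf", C=1.0, gamma="scale"))
scores = cross_val_score(clf, X, y, cv=10)
print(f"Mean 10-fold accuracy: {scores.mean():.2f}")
```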
An accelerated solvent extraction (ASE) device was evaluated as a semi-automated means for extracting arsenicals from quality control (QC) samples and DORM-2 [standard reference material (SRM)]. Unlike conventional extraction procedures, the ASE requires that the sample be dispe...
Dynamic Decision-Making in Multi-Task Environments: Theory and Experimental Results.
1981-03-15
The role of the human operator has shifted from that of a controller to that of a supervisory decision-maker: a monitor of multiple tasks, or a supervisor of several semi-automated subsystems. The operator's primary responsibility in this new role is to extract information from his environment and to integrate it for action selection.
A review of signals used in sleep analysis
Roebuck, A; Monasterio, V; Gederi, E; Osipov, M; Behar, J; Malhotra, A; Penzel, T; Clifford, GD
2014-01-01
This article presents a review of signals used for measuring physiology and activity during sleep and techniques for extracting information from these signals. We examine both clinical needs and biomedical signal processing approaches across a range of sensor types. Issues with recording and analysing the signals are discussed, together with their applicability to various clinical disorders. Both univariate and data fusion (exploiting the diverse characteristics of the primary recorded signals) approaches are discussed, together with a comparison of automated methods for analysing sleep. PMID:24346125
Competitive-Cooperative Automated Reasoning from Distributed and Multiple Source of Data
NASA Astrophysics Data System (ADS)
Fard, Amin Milani
Knowledge extraction from distributed database systems has been investigated over the past decade in order to analyze billions of information records. In this work, a competitive deduction approach in a heterogeneous data grid environment is proposed using classic data mining and statistical methods. By applying a game theory concept in a multi-agent model, we tried to design a policy for hierarchical knowledge discovery and inference fusion. To demonstrate the system in operation, a sample multi-expert system has also been developed.
Open-Source Programming for Automated Generation of Graphene Raman Spectral Maps
NASA Astrophysics Data System (ADS)
Vendola, P.; Blades, M.; Pierre, W.; Jedlicka, S.; Rotkin, S. V.
Raman microscopy is a useful tool for studying the structural characteristics of graphene deposited onto substrates. However, extracting useful information from the Raman spectra requires data processing and 2D map generation. An existing home-built confocal Raman microscope was optimized for graphene samples and programmed to automatically generate Raman spectral maps across a specified area. In particular, an open source data collection scheme was generated to allow the efficient collection and analysis of the Raman spectral data for future use. NSF ECCS-1509786.
Arduino-based automation of a DNA extraction system.
Kim, Kyung-Won; Lee, Mi-So; Ryu, Mun-Ho; Kim, Jong-Won
2015-01-01
There have been many studies to detect infectious diseases with the molecular genetic method. This study presents an automation process for a DNA extraction system based on microfluidics and magnetic bead, which is part of a portable molecular genetic test system. This DNA extraction system consists of a cartridge with chambers, syringes, four linear stepper actuators, and a rotary stepper actuator. The actuators provide a sequence of steps in the DNA extraction process, such as transporting, mixing, and washing for the gene specimen, magnetic bead, and reagent solutions. The proposed automation system consists of a PC-based host application and an Arduino-based controller. The host application compiles a G code sequence file and interfaces with the controller to execute the compiled sequence. The controller executes stepper motor axis motion, time delay, and input-output manipulation. It drives the stepper motor with an open library, which provides a smooth linear acceleration profile. The controller also provides a homing sequence to establish the motor's reference position, and hard limit checking to prevent any over-travelling. The proposed system was implemented and its functionality was investigated, especially regarding positioning accuracy and velocity profile.
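A possible host-side counterpart to the controller described above is sketched below in Python (pyserial), sending a G-code-like sequence and waiting for an acknowledgment after each command; the port, baud rate, command set, and "ok" protocol are all assumptions, since the paper's actual firmware protocol is not specified here.

```python
import time
import serial  # pyserial

# Hypothetical port, baud rate, and "ok" acknowledgment protocol.
PORT, BAUD = "/dev/ttyACM0", 115200

# A minimal G-code-like sequence: home, move an actuator, mix, pause.
SEQUENCE = [
    "G28",            # homing to establish the reference position
    "G1 X12.5 F300",  # move a linear actuator to 12.5 mm at feed rate 300
    "M3",             # hypothetical code: start mixing
    "G4 P5000",       # dwell 5 s
    "M5",             # hypothetical code: stop mixing
]

def run_sequence(commands):
    with serial.Serial(PORT, BAUD, timeout=2) as link:
        time.sleep(2)  # allow the Arduino to reset after the port opens
        for cmd in commands:
            link.write((cmd + "\n").encode("ascii"))
            reply = link.readline().decode("ascii", errors="replace").strip()
            if reply.lower() != "ok":           # assumed acknowledgment token
                raise RuntimeError(f"Controller rejected {cmd!r}: {reply!r}")
            print(f"{cmd} -> {reply}")

if __name__ == "__main__":
    run_sequence(SEQUENCE)
```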
Tso, Ivy F; Rutherford, Saige; Fang, Yu; Angstadt, Mike; Taylor, Stephan F
2018-01-01
How the human brain processes social information is an increasingly researched topic in psychology and neuroscience, advancing our understanding of basic human cognition and psychopathologies. Neuroimaging studies typically seek to isolate one specific aspect of social cognition when trying to map its neural substrates. It is unclear whether the brain activations elicited by different social cognitive processes and task instructions are also spontaneously elicited by general social information. In this study, we investigated whether these brain regions are evoked by the mere presence of social information using an automated meta-analysis and confirmatory data from an independent study of simple appraisal of social vs. non-social images. Results of 1,000 published fMRI studies containing the keyword "social" were subjected to an automated meta-analysis (http://neurosynth.org). To confirm that significant brain regions in the meta-analysis were driven by a social effect, these brain regions were used as regions of interest (ROIs) to extract and compare BOLD fMRI signals of social vs. non-social conditions in the independent study. The NeuroSynth results indicated that the dorsal and ventral medial prefrontal cortex, posterior cingulate cortex, bilateral amygdala, bilateral occipito-temporal junction, right fusiform gyrus, bilateral temporal pole, and right inferior frontal gyrus are commonly engaged in studies with a prominent social element. The social-non-social contrast in the independent study showed a strong resemblance to the NeuroSynth map. ROI analyses revealed that a social effect was credible in 9 out of the 11 NeuroSynth regions in the independent dataset. The findings support the conclusion that the "social brain" is highly sensitive to the mere presence of social information.
Raterink, Robert-Jan; Witkam, Yoeri; Vreeken, Rob J; Ramautar, Rawi; Hankemeier, Thomas
2014-10-21
In the field of bioanalysis, there is an increasing demand for miniaturized, automated, robust sample pretreatment procedures that can be easily connected to direct-infusion mass spectrometry (DI-MS) in order to allow high-throughput screening of drugs and/or their metabolites in complex body fluids such as plasma. Liquid-liquid extraction (LLE) is a common sample pretreatment technique often used for complex aqueous samples in bioanalysis. Despite significant developments in automated and miniaturized LLE procedures, fully automated LLE techniques allowing high-throughput bioanalytical studies on small-volume samples using direct-infusion mass spectrometry have not yet matured. Here, we introduce a new fully automated micro-LLE technique based on gas-pressure assisted mixing followed by passive phase separation, coupled online to nanoelectrospray-DI-MS. Our method was characterized by varying the gas flow and its duration through the solvent mixture. For evaluation of the analytical performance, four drugs were spiked into human plasma, resulting in highly acceptable precision (RSD down to 9%) and linearity (R(2) ranging from 0.990 to 0.998). We demonstrate that our new method not only allows the reliable extraction of analytes from small sample volumes of a few microliters in an automated and high-throughput manner, but also performs comparably to or better than conventional offline LLE, in which the handling of small volumes remains challenging. Finally, we demonstrate the applicability of our method for drug screening on dried blood spots, showing excellent linearity (R(2) of 0.998) and precision (RSD of 9%). In conclusion, we present the proof of principle of a new high-throughput screening platform for bioanalysis based on a new automated micro-LLE method, coupled online to a commercially available nano-ESI-DI-MS.
Karystianis, George; Thayer, Kristina; Wolfe, Mary; Tsafnat, Guy
2017-06-01
Most data extraction efforts in epidemiology are focused on obtaining targeted information from clinical trials. In contrast, limited research has been conducted on the identification of information from observational studies, a major source of human evidence in many fields, including environmental health. The recognition of key epidemiological information (e.g., exposures) through text mining techniques can assist in the automation of systematic reviews and other evidence summaries. We designed and applied a knowledge-driven, rule-based approach to identify targeted information (study design, participant population, exposure, outcome, confounding factors, and the country where the study was conducted) from abstracts of epidemiological studies included in several systematic reviews of environmental health exposures. The rules were based on common syntactical patterns observed in text and are thus not specific to any systematic review. To validate the general applicability of our approach, we compared the data extracted using our approach versus hand curation for 35 epidemiological study abstracts manually selected for inclusion in two systematic reviews. The returned F-score, precision, and recall ranged from 70% to 98%, 81% to 100%, and 54% to 97%, respectively. The highest precision was observed for exposure, outcome and population (100%), while recall was best for exposure and study design with 97% and 89%, respectively. The lowest recall was observed for the population (54%), which also had the lowest F-score (70%). The performance of our text-mining approach demonstrated encouraging results for the identification of targeted information from observational epidemiological study abstracts related to environmental exposures. We have demonstrated that rules based on generic syntactic patterns in one corpus can be applied to other observational study designs by simply interchanging the dictionaries used to identify certain characteristics (e.g., outcomes, exposures). At the document level, the recognised information can assist in the selection and categorization of studies included in a systematic review. Copyright © 2017 Elsevier Inc. All rights reserved.
Ward, Brodie J; Thornton, Ashleigh; Lay, Brendan; Rosenberg, Michael
2017-01-01
Fundamental movement skill (FMS) assessment remains an important tool in classifying individuals' level of FMS proficiency. The collection of FMS performances for assessment and monitoring has remained unchanged over the last few decades, but new motion capture technologies offer opportunities to automate this process. To achieve this, a greater understanding of the human process of movement skill assessment is required. The authors present the rationale and protocols of a project in which they aim to investigate the visual search patterns and information extraction employed by human assessors during FMS assessment, as well as the implementation of the Kinect system for FMS capture.
The utility of an automated electronic system to monitor and audit transfusion practice.
Grey, D E; Smith, V; Villanueva, G; Richards, B; Augustson, B; Erber, W N
2006-05-01
Transfusion laboratories with transfusion committees have a responsibility to monitor transfusion practice and generate improvements in clinical decision-making and red cell usage. However, this can be problematic and expensive because data cannot be readily extracted from most laboratory information systems. To overcome this problem, we developed and introduced a system to electronically extract and collate extensive amounts of data from two laboratory information systems and to link it with ICD10 clinical codes in a new database using standard information technology. Three data files were generated from two laboratory information systems, ULTRA (version 3.2) and TM, using standard information technology scripts. These were patient pre- and post-transfusion haemoglobin, blood group and antibody screen, and cross-match and transfusion data. These data, together with ICD10 codes for surgical cases, were imported into an MS ACCESS database and linked by means of a unique laboratory number. Queries were then run to extract the relevant information, which was processed in Microsoft Excel for graphical presentation. We assessed the utility of this data extraction system to audit transfusion practice in a 600-bed adult tertiary hospital over an 18-month period. A total of 52 MB of data were extracted from the two laboratory information systems for the 18-month period and, together with 2.0 MB of theatre ICD10 data, enabled case-specific transfusion information to be generated. The audit evaluated 15,992 blood group and antibody screens, 25,344 cross-matched red cell units and 15,455 transfused red cell units. Data evaluated included cross-matched to transfusion ratios and pre- and post-transfusion haemoglobin levels for a range of clinical diagnoses. Data showed significant differences between clinical units and by ICD10 code. This method of electronically extracting large amounts of data and linking them with clinical databases has provided a powerful and sustainable tool for monitoring transfusion practice. It has been successfully used to identify areas requiring education, training and clinical guidance and allows for comparison with national haemoglobin-based transfusion guidelines.
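The linkage idea, joining laboratory extracts to theatre ICD10 codes on a unique laboratory number and summarizing cross-match-to-transfusion ratios, can be sketched as follows; the column names and miniature data are invented, and the original work used MS Access and Excel rather than pandas.

```python
import pandas as pd

# Invented miniature extracts; the real data came from the ULTRA and TM systems.
crossmatch = pd.DataFrame({
    "lab_no":   [101, 101, 102, 103, 103, 103],
    "units_xm": [2, 2, 4, 2, 2, 2],      # cross-matched units per request
    "units_tx": [2, 0, 2, 2, 2, 0],      # units actually transfused
})
icd10 = pd.DataFrame({
    "lab_no": [101, 102, 103],
    "icd10":  ["O82", "K92.2", "I25.1"],  # illustrative clinical codes
})

# Link the laboratory data to the clinical codes on the unique lab number.
linked = crossmatch.merge(icd10, on="lab_no", how="left")

# Cross-matched-to-transfused (C:T) ratio per ICD10 code -- a standard
# audit indicator of over-ordering.
summary = (linked.groupby("icd10")[["units_xm", "units_tx"]].sum()
           .assign(ct_ratio=lambda d: d.units_xm / d.units_tx))
print(summary)
```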
Efficacy Evaluation of Different Wavelet Feature Extraction Methods on Brain MRI Tumor Detection
NASA Astrophysics Data System (ADS)
Nabizadeh, Nooshin; John, Nigel; Kubat, Miroslav
2014-03-01
Automated Magnetic Resonance Imaging brain tumor detection and segmentation is a challenging task. Among the available methods, feature-based methods are dominant. While many feature extraction techniques have been employed, it is still not clear which feature extraction method should be preferred. To help improve the situation, we present the results of a study in which we evaluate the efficiency of different wavelet-transform feature extraction methods for brain MRI abnormality detection. Using T1-weighted brain images, Discrete Wavelet Transform (DWT), Discrete Wavelet Packet Transform (DWPT), Dual Tree Complex Wavelet Transform (DTCWT), and Complex Morlet Wavelet Transform (CMWT) methods are applied to construct the feature pool. Three classifiers, Support Vector Machine, K-Nearest Neighbor, and Sparse Representation-Based Classifier, are applied and compared for classifying the selected features. The results show that DTCWT and CMWT features classified with SVM result in the highest classification accuracy, demonstrating the capability of wavelet transform features to be informative in this application.
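A minimal sketch of one branch of the feature pool (DWT sub-band statistics) followed by an SVM, assuming synthetic images in place of T1-weighted slices; the wavelet, decomposition level, and band statistics are illustrative choices, not the study's exact configuration.

```python
import numpy as np
import pywt
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def dwt_features(image, wavelet="db4", level=2):
    """Mean, standard deviation, and energy of each 2-D DWT sub-band."""
    coeffs = pywt.wavedec2(image, wavelet=wavelet, level=level)
    bands = [coeffs[0]] + [band for detail in coeffs[1:] for band in detail]
    feats = []
    for band in bands:
        feats.extend([band.mean(), band.std(), np.sum(band ** 2)])
    return np.array(feats)

# Synthetic stand-ins for brain slices: "abnormal" images carry an extra
# bright blob; this only illustrates the feature/classifier pipeline.
rng = np.random.default_rng(2)
def synth(abnormal):
    img = rng.normal(0, 1, (64, 64))
    if abnormal:
        img[20:30, 20:30] += 4.0
    return img

labels = [0] * 40 + [1] * 40
X = np.array([dwt_features(synth(lbl)) for lbl in labels])
y = np.array(labels)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("5-fold accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```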
Semi-automated 96-well liquid-liquid extraction for quantitation of drugs in biological fluids.
Zhang, N; Hoffman, K L; Li, W; Rossi, D T
2000-02-01
A semi-automated liquid-liquid extraction (LLE) technique for biological fluid sample preparation was introduced for the quantitation of four drugs in rat plasma. All liquid transferring during the sample preparation was automated using a Tomtec Quadra 96 Model 320 liquid handling robot, which processed up to 96 samples in parallel. The samples were in either 96-deep-well plate or tube-rack format. One plate of samples can be prepared in approximately 1.5 h, and the 96-well plate is directly compatible with the autosampler of an LC/MS system. Selection of organic solvents and recoveries are discussed. Also, precision, relative error, linearity and quantitation of the semi-automated LLE method are estimated for four example drugs using LC/MS/MS with a multiple reaction monitoring (MRM) approach. The applicability of this method and future directions are evaluated.
An accelerated solvent extraction (ASE) device was evaluated as a semi-automated means of extracting arsenicals from ribbon kelp. Objective was to investigate effect of experimentally controllable ASE parameters (pressure, temperature, static time and solvent composition) on extr...
Purschke, Kirsten; Heinl, Sonja; Lerch, Oliver; Erdmann, Freidoon; Veit, Florian
2016-06-01
The analysis of Δ(9)-tetrahydrocannabinol (THC) and its metabolites 11-hydroxy-Δ(9)-tetrahydrocannabinol (11-OH-THC) and 11-nor-9-carboxy-Δ(9)-tetrahydrocannabinol (THC-COOH) from blood serum is a routine task in forensic toxicology laboratories. For examination of consumption habits, the concentration of the phase I metabolite THC-COOH is used. Recommendations for interpretation of analysis values in medical-psychological assessments (regranting of driver's licenses, Germany) include threshold values for the free, unconjugated THC-COOH. Using a fully automated two-step liquid-liquid extraction, THC, 11-OH-THC, and free, unconjugated THC-COOH were extracted from blood serum, silylated with N-methyl-N-(trimethylsilyl) trifluoroacetamide (MSTFA), and analyzed by GC/MS. The automation was carried out by an x-y-z sample robot equipped with modules for shaking, centrifugation, and solvent evaporation. This method was based on a previously developed manual sample preparation method. Validation guidelines of the Society of Toxicological and Forensic Chemistry (GTFCh) were fulfilled for both methods, of which the automated one is the focus of this article. Limits of detection and quantification for THC were 0.3 and 0.6 μg/L, for 11-OH-THC were 0.1 and 0.8 μg/L, and for THC-COOH were 0.3 and 1.1 μg/L, when extracting only 0.5 mL of blood serum. Therefore, the required limit of quantification for THC of 1 μg/L in driving under the influence of cannabis cases in Germany (and other countries) can be reached, and the method can be employed in that context. Real and external control samples were analyzed, and a round robin test was passed successfully. To date, the method is employed in daily routine at the Institute of Legal Medicine in Giessen, Germany. Automation helps to avoid errors during sample preparation and reduces the workload of the laboratory personnel. Due to its flexibility, the analysis system can be employed for other liquid-liquid extractions as well. To the best of our knowledge, this is the first publication on a comprehensively automated classical liquid-liquid extraction workflow in the field of forensic toxicological analysis. Graphical abstract: GC/MS with MPS Dual Head at the Institute of Legal Medicine, Giessen, Germany. Modules from left to right: (quick) Mix (for LLE), wash station, tray 1 (vials for extracts), solvent reservoir, (m) VAP (for extract evaporation), Solvent Filling Station (solvent supply), cooled tray 2 (vials for serum samples), and centrifuge (for phase separation).
A simple automated instrument for DNA extraction in forensic casework.
Montpetit, Shawn A; Fitch, Ian T; O'Donnell, Patrick T
2005-05-01
The Qiagen BioRobot EZ1 is a small, rapid, and reliable automated DNA extraction instrument capable of extracting DNA from up to six samples in as few as 20 min using magnetic bead technology. The San Diego Police Department Crime Laboratory has validated the BioRobot EZ1 for the DNA extraction of evidence and reference samples in forensic casework. The BioRobot EZ1 was evaluated for use on a variety of different evidence sample types including blood, saliva, and semen evidence. The performance of the BioRobot EZ1 with regard to DNA recovery and potential cross-contamination was also assessed. DNA yields obtained with the BioRobot EZ1 were comparable to those from organic extraction. The BioRobot EZ1 was effective at removing PCR inhibitors, which often co-purify with DNA in organic extractions. The incorporation of the BioRobot EZ1 into forensic casework has streamlined the DNA analysis process by reducing the need for labor-intensive phenol-chloroform extractions.
Automated detection of optical counterparts to GRBs with RAPTOR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wozniak, P. R.; Vestrand, W. T.; Evans, S.
2006-05-19
The RAPTOR system (RAPid Telescopes for Optical Response) is an array of several distributed robotic telescopes that automatically respond to GCN localization alerts. Raptor-S is a 0.4-m telescope with a 24 arcmin field of view employing a 1k x 1k Marconi CCD detector, and has already detected prompt optical emission from several GRBs within the first minute of the explosion. We present a real-time data analysis and alert system for automated identification of optical transients in Raptor-S GRB response data down to the sensitivity limit of approximately 19 mag. Our custom data processing pipeline is designed to minimize the time required to reliably identify transients and extract actionable information. The system utilizes a networked PostgreSQL database server for catalog access and distributes email alerts with successful detections.
Research highlights: microfluidics meets big data.
Tseng, Peter; Weaver, Westbrook M; Masaeli, Mahdokht; Owsley, Keegan; Di Carlo, Dino
2014-03-07
In this issue we highlight a collection of recent work in which microfluidic parallelization and automation have been employed to address the increasing need for large amounts of quantitative data concerning cellular function--from correlating microRNA levels to protein expression, increasing the throughput and reducing the noise when studying protein dynamics in single-cells, and understanding how signal dynamics encodes information. The painstaking dissection of cellular pathways one protein at a time appears to be coming to an end, leading to more rapid discoveries which will inevitably translate to better cellular control--in producing useful gene products and treating disease at the individual cell level. From these studies it is also clear that development of large scale mutant or fusion libraries, automation of microscopy, image analysis, and data extraction will be key components as microfluidics contributes its strengths to aid systems biology moving forward.
NASA - easyJet Collaboration on the Human Factors Monitoring Program (HFMP) Study
NASA Technical Reports Server (NTRS)
Srivistava, Ashok N.; Barton, Phil
2012-01-01
This is the first annual report jointly prepared by NASA and easyJet on the work performed under the agreement to collaborate on a study of the many factors entailed in flight- and cabin-crew fatigue and documenting the decreases in performance associated with fatigue. The objective of this Agreement is to generate reliable, automated procedures that improve understanding of the levels and characteristics of flight- and cabin-crew fatigue factors, both latent and proximate, whose confluence will likely result in unacceptable flight crew performance. This study entails the analyses of numerical and textual data collected during operational flights. NASA and easyJet are both interested in assessing and testing NASA's automated capabilities for extracting operationally significant information from very large, diverse (textual and numerical) databases, much larger than can be handled practically by human experts.
Automated retinal vessel type classification in color fundus images
NASA Astrophysics Data System (ADS)
Yu, H.; Barriga, S.; Agurto, C.; Nemeth, S.; Bauman, W.; Soliz, P.
2013-02-01
Automated retinal vessel type classification is an essential first step toward machine-based quantitative measurement of various vessel topological parameters and identifying vessel abnormalities and alternations in cardiovascular disease risk analysis. This paper presents a new and accurate automatic artery and vein classification method developed for arteriolar-to-venular width ratio (AVR) and artery and vein tortuosity measurements in regions of interest (ROI) of 1.5 and 2.5 optic disc diameters from the disc center, respectively. This method includes illumination normalization, automatic optic disc detection and retinal vessel segmentation, feature extraction, and a partial least squares (PLS) classification. Normalized multi-color information, color variation, and multi-scale morphological features are extracted on each vessel segment. We trained the algorithm on a set of 51 color fundus images using manually marked arteries and veins. We tested the proposed method in a previously unseen test data set consisting of 42 images. We obtained an area under the ROC curve (AUC) of 93.7% in the ROI of AVR measurement and 91.5% of AUC in the ROI of tortuosity measurement. The proposed AV classification method has the potential to assist automatic cardiovascular disease early detection and risk analysis.
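The partial least squares step can be illustrated with a minimal sketch that regresses a 0/1 artery label on a few hypothetical per-segment features and thresholds the prediction; the features and data are synthetic and do not reproduce the paper's feature set or results.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split

# Synthetic per-segment features standing in for the normalized multi-color
# and morphological features: arteries are assumed brighter, with a more
# visible central reflex, and slightly narrower than veins.
rng = np.random.default_rng(3)
n = 300
is_artery = rng.integers(0, 2, n)
brightness = rng.normal(0.55, 0.10, n) + 0.15 * is_artery
reflex = rng.normal(0.20, 0.05, n) + 0.10 * is_artery
width = rng.normal(6.0, 1.0, n) - 0.8 * is_artery
X = np.column_stack([brightness, reflex, width])
y = is_artery.astype(float)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Partial least squares used as a classifier: regress the 0/1 label on the
# features and threshold the prediction at 0.5.
pls = PLSRegression(n_components=2)
pls.fit(X_tr, y_tr)
pred = (pls.predict(X_te).ravel() > 0.5).astype(float)
print(f"Test accuracy: {(pred == y_te).mean():.2f}")
```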
Software for Partly Automated Recognition of Targets
NASA Technical Reports Server (NTRS)
Opitz, David; Blundell, Stuart; Bain, William; Morris, Matthew; Carlson, Ian; Mangrich, Mark
2003-01-01
The Feature Analyst is a computer program for assisted (partially automated) recognition of targets in images. This program was developed to accelerate the processing of high-resolution satellite image data for incorporation into geographic information systems (GIS). This program creates an advanced user interface that embeds proprietary machine-learning algorithms in commercial image-processing and GIS software. A human analyst provides samples of target features from multiple sets of data, then the software develops a data-fusion model that automatically extracts the remaining features from selected sets of data. The program thus leverages the natural ability of humans to recognize objects in complex scenes, without requiring the user to explain the human visual recognition process by means of lengthy software. Two major subprograms are the reactive agent and the thinking agent. The reactive agent strives to quickly learn the user's tendencies while the user is selecting targets and to increase the user's productivity by immediately suggesting the next set of pixels that the user may wish to select. The thinking agent utilizes all available resources, taking as much time as needed, to produce the most accurate autonomous feature-extraction model possible.
Crowdsourcing the Measurement of Interstate Conflict
2016-01-01
Much of the data used to measure conflict is extracted from news reports. This is typically accomplished using either expert coders to quantify the relevant information or machine coders to automatically extract data from documents. Although expert coding is costly, it produces quality data. Machine coding is fast and inexpensive, but the data are noisy. To diminish the severity of this tradeoff, we introduce a method for analyzing news documents that uses crowdsourcing, supplemented with computational approaches. The new method is tested on documents about Militarized Interstate Disputes, and its accuracy ranges between about 68 and 76 percent. This is shown to be a considerable improvement over automated coding, and to cost less and be much faster than expert coding. PMID:27310427
Challenges in Extracting Information From Large Hydrogeophysical-monitoring Datasets
NASA Astrophysics Data System (ADS)
Day-Lewis, F. D.; Slater, L. D.; Johnson, T.
2012-12-01
Over the last decade, new automated geophysical data-acquisition systems have enabled collection of increasingly large and information-rich geophysical datasets. Concurrent advances in field instrumentation, web services, and high-performance computing have made real-time processing, inversion, and visualization of large three-dimensional tomographic datasets practical. Geophysical-monitoring datasets have provided high-resolution insights into diverse hydrologic processes including groundwater/surface-water exchange, infiltration, solute transport, and bioremediation. Despite the high information content of such datasets, extraction of quantitative or diagnostic hydrologic information is challenging. Visual inspection and interpretation for specific hydrologic processes is difficult for datasets that are large, complex, and (or) affected by forcings (e.g., seasonal variations) unrelated to the target hydrologic process. New strategies are needed to identify salient features in spatially distributed time-series data and to relate temporal changes in geophysical properties to hydrologic processes of interest while effectively filtering unrelated changes. Here, we review recent work using time-series and digital-signal-processing approaches in hydrogeophysics. Examples include applications of cross-correlation, spectral, and time-frequency (e.g., wavelet and Stockwell transforms) approaches to (1) identify salient features in large geophysical time series; (2) examine correlation or coherence between geophysical and hydrologic signals, even in the presence of non-stationarity; and (3) condense large datasets while preserving information of interest. Examples demonstrate analysis of large time-lapse electrical tomography and fiber-optic temperature datasets to extract information about groundwater/surface-water exchange and contaminant transport.
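A minimal sketch of one of the approaches mentioned above, lagged cross-correlation between a hydrologic driver and a geophysical response, assuming synthetic daily series with an imposed three-day delay.

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation of y against x shifted by each lag (in samples).

    A positive lag means changes in x lead changes in y.
    """
    lags = np.arange(-max_lag, max_lag + 1)
    out = []
    for lag in lags:
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        out.append(np.corrcoef(a, b)[0, 1])
    return lags, np.array(out)

# Synthetic daily series: river stage and a resistivity-derived signal that
# responds to stage with a ~3-day delay plus seasonal and random components.
rng = np.random.default_rng(4)
days = np.arange(365)
stage = np.sin(2 * np.pi * days / 365) + 0.3 * rng.normal(size=365)
resistivity = np.roll(stage, 3) + 0.3 * rng.normal(size=365)

lags, r = lagged_correlation(stage, resistivity, max_lag=10)
best = lags[np.argmax(r)]
print(f"Strongest correlation r={r.max():.2f} at lag {best} days")
```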
Towards the automated analysis and database development of defibrillator data from cardiac arrest.
Eftestøl, Trygve; Sherman, Lawrence D
2014-01-01
During resuscitation of cardiac arrest victims, a variety of information in electronic format is recorded as part of the documentation of the patient care contact and in order to be provided for case review for quality improvement. Such review requires considerable effort and resources. There is also the problem of interobserver effects. We show that it is possible to efficiently analyze resuscitation episodes automatically using a minimal set of the available information. A minimal set of variables is defined which describes therapeutic events (compression sequences and defibrillations) and corresponding patient response events (annotated rhythm transitions). From this, a state sequence representation of the resuscitation episode is constructed, and an algorithm is developed for reasoning with this representation and extracting review variables automatically. As a case study, the method is applied to the data abstraction process used in the King County EMS. The automatically generated variables are compared to the original ones, with accuracies ≥ 90% for 18 variables and ≥ 85% for the remaining four variables. It is possible to use the information present in the CPR process data recorded by the AED, along with rhythm and chest compression annotations, to automate the episode review.
Bao, X Y; Huang, W J; Zhang, K; Jin, M; Li, Y; Niu, C Z
2018-04-18
There is a huge amount of diagnostic and treatment information in electronic medical records (EMRs), which is a concrete manifestation of clinicians' actual diagnosis and treatment details. Many sections of EMRs, such as chief complaints, history of present illness, past history, differential diagnosis, diagnostic imaging, and surgical records, which reflect the details of diagnosis and treatment in the clinical process, are written as Chinese natural-language narrative. How to extract effective information from these Chinese narrative text data and organize it into tabular form for medical research analysis, so that real-world clinical data can be put to practical use, is a difficult problem in Chinese medical data processing. Based on EMR narrative text data from a tertiary hospital in China, a method combining the learning of customized information extraction rules with rule-based information extraction is proposed. The overall method consists of three steps. (1) Step 1: a random sample of 600 records (including history of present illness, past history, personal history, family history, etc.) was extracted from the electronic medical record data as a raw corpus. Using our Chinese clinical narrative text annotation platform, trained clinicians and nurses marked the tokens and phrases in the corpus to be extracted (with history of diabetes as an example). (2) Step 2: based on the annotated clinical text corpus, extraction templates were first summarized and induced. These templates were then rewritten as extraction rules using regular expressions in the Perl programming language. Using these extraction rules as the basic knowledge base, we developed extraction packages in Perl for extracting data from the EMR text. Finally, the extracted data items were organized in tabular format for later use in clinical research or hospital surveillance. (3) Step 3: the proposed methods were evaluated and validated on the National Clinical Service Data Integration Platform, and the extraction results were checked with combined manual and automated verification, demonstrating the effectiveness of the method. For all patients with diabetes as the diagnosed disease in the Department of Endocrinology of the hospital, 1,436 patients in total were discharged in 2015; the extraction results for history of diabetes from their medical records showed a recall of 87.6%, an accuracy of 99.5%, and an F-score of 0.93. For the 10% sample of patients with diabetes (1,223 patients in total) discharged from the same department in August 2017, the extraction results for history of diabetes showed a recall of 89.2%, an accuracy of 99.2%, and an F-score of 0.94. This study mainly adopts a combination of natural language processing and rule-based information extraction, and designs and implements an algorithm for extracting customized information from unstructured Chinese electronic medical record text. It achieves better results than existing work.
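The rule-based idea can be sketched in Python in the spirit of the paper's Perl regular expressions, with a negation rule checked before a positive rule; the patterns and example sentences below are invented for illustration and are not the study's actual rule base.

```python
import re

# Invented example rules: a negation pattern is checked before the
# positive pattern for "history of diabetes".
NEGATION = re.compile(r"(否认|无)糖尿病(病)?史")    # "denies/no history of diabetes"
POSITIVE = re.compile(r"糖尿病(病)?史(\d+)?余?年?")  # "history of diabetes (N years)"

def diabetes_history(past_history_text):
    """Return 'negated', 'present', or 'not mentioned' for one EMR record."""
    if NEGATION.search(past_history_text):
        return "negated"
    if POSITIVE.search(past_history_text):
        return "present"
    return "not mentioned"

records = [
    "既往史：糖尿病史10余年，口服二甲双胍控制血糖。",  # history of diabetes, 10+ years
    "既往史：否认糖尿病史，否认高血压史。",            # denies history of diabetes
    "既往史：高血压病史5年。",                         # hypertension only
]
for text in records:
    print(diabetes_history(text), "<-", text)
```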
NASA Astrophysics Data System (ADS)
Martinis, Sandro; Clandillon, Stephen; Twele, André; Huber, Claire; Plank, Simon; Maxant, Jérôme; Cao, Wenxi; Caspard, Mathilde; May, Stéphane
2016-04-01
Optical and radar satellite remote sensing have proven to provide essential crisis information in case of natural disasters, humanitarian relief activities and civil security issues in a growing number of cases through mechanisms such as the Copernicus Emergency Management Service (EMS) of the European Commission or the International Charter 'Space and Major Disasters'. The aforementioned programs and initiatives make use of satellite-based rapid mapping services aimed at delivering reliable and accurate crisis information after natural hazards. Although these services are increasingly operational, they need to be continuously updated and improved through research and development (R&D) activities. The principal objective of ASAPTERRA (Advancing SAR and Optical Methods for Rapid Mapping), the ESA-funded R&D project being described here, is to improve, automate and, hence, speed-up geo-information extraction procedures in the context of natural hazards response. This is performed through the development, implementation, testing and validation of novel image processing methods using optical and Synthetic Aperture Radar (SAR) data. The methods are mainly developed based on data of the German radar satellites TerraSAR-X and TanDEM-X, the French satellite missions Pléiades-1A/1B as well as the ESA missions Sentinel-1/2, with the aim to better characterize the potential and limitations of these sensors and their synergy. The resulting algorithms and techniques are evaluated in real case applications during rapid mapping activities. The project is focussed on three types of natural hazards: floods, landslides and fires. Within this presentation, an overview of the main methodological developments in each topic is given and demonstrated in selected test areas. The following developments are presented in the context of flood mapping: a fully automated Sentinel-1 based processing chain for detecting open flood surfaces, a method for the improved detection of flooded vegetation in Sentinel-1 data using Entropy/Alpha decomposition, unsupervised Wishart Classification, and object-based post-classification, as well as semi-automatic approaches for extracting inundated areas and flood traces in rural and urban areas from VHR and HR optical imagery using machine learning techniques. Methodological developments related to fires are the implementation of fast and robust methods for mapping burnt scars using change detection procedures based on SAR (Sentinel-1, TerraSAR-X) and HR optical (e.g. SPOT, Sentinel-2) data, as well as the extraction of 3D surface and volume change information from Pléiades stereo-pairs. In the context of landslides, fast and transferable change detection procedures based on SAR (TerraSAR-X) and optical (SPOT) data, as well as methods for extracting the extent of landslides based only on polarimetric VHR SAR (TerraSAR-X) data, are presented.
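For the flood-mapping component, a widely used building block is log-ratio change detection between pre- and post-event SAR amplitude images with automatic thresholding; the sketch below illustrates that generic idea on synthetic data and is not the project's Sentinel-1 processing chain.

```python
import numpy as np
from skimage.filters import threshold_otsu

# Synthetic pre- and post-event SAR amplitude images: open water appears
# dark (low backscatter), so flooding lowers amplitude in the flooded block.
rng = np.random.default_rng(5)
pre = rng.gamma(shape=4.0, scale=25.0, size=(200, 200))     # speckled land
post = pre * rng.gamma(shape=4.0, scale=0.25, size=pre.shape)
post[60:140, 60:140] *= 0.15                                # newly flooded area

# Log-ratio change image followed by automatic thresholding; pixels whose
# backscatter dropped strongly are labelled as open flood surface.
log_ratio = np.log10((pre + 1e-6) / (post + 1e-6))
flood_mask = log_ratio > threshold_otsu(log_ratio)
print(f"Flooded fraction of the scene: {flood_mask.mean():.1%}")
```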
Greenspoon, Susan A; Ban, Jeffrey D; Sykes, Karen; Ballard, Elizabeth J; Edler, Shelley S; Baisden, Melissa; Covington, Brian L
2004-01-01
Robotic systems are commonly utilized for the extraction of database samples. However, the application of robotic extraction to forensic casework samples is a more daunting task. Such a system must be versatile enough to accommodate a wide range of samples that may contain greatly varying amounts of DNA, but it must also pose no more risk of contamination than the manual DNA extraction methods. This study demonstrates that the BioMek 2000 Laboratory Automation Workstation, used in combination with the DNA IQ System, is versatile enough to accommodate the wide range of samples typically encountered by a crime laboratory. The use of a silica coated paramagnetic resin, as with the DNA IQ System, facilitates the adaptation of an open well, hands off, robotic system to the extraction of casework samples since no filtration or centrifugation steps are needed. Moreover, the DNA remains tightly coupled to the silica coated paramagnetic resin for the entire process until the elution step. A short pre-extraction incubation step is necessary prior to loading samples onto the robot and it is at this step that most modifications are made to accommodate the different sample types and substrates commonly encountered with forensic evidentiary samples. Sexual assault (mixed stain) samples, cigarette butts, blood stains, buccal swabs, and various tissue samples were successfully extracted with the BioMek 2000 Laboratory Automation Workstation and the DNA IQ System, with no evidence of contamination throughout the extensive validation studies reported here.
DARHT Multi-intelligence Seismic and Acoustic Data Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stevens, Garrison Nicole; Van Buren, Kendra Lu; Hemez, Francois M.
The purpose of this report is to document the analysis of seismic and acoustic data collected at the Dual-Axis Radiographic Hydrodynamic Test (DARHT) facility at Los Alamos National Laboratory for robust, multi-intelligence decision making. The data utilized herein are obtained from two tri-axial seismic sensors and three acoustic sensors, resulting in a total of nine data channels. The goal of this analysis is to develop a generalized, automated framework to determine internal operations at DARHT using informative features extracted from measurements collected external to the facility. Our framework involves four components: (1) feature extraction, (2) data fusion, (3) classification, and finally (4) robustness analysis. Two approaches are taken for extracting features from the data. The first of these, generic feature extraction, involves extraction of statistical features from the nine data channels. The second approach, event detection, identifies specific events relevant to traffic entering and leaving the facility as well as explosive activities at DARHT and nearby explosive testing sites. Event detection is completed using a two-stage method, first utilizing signatures in the frequency domain to identify outliers and second extracting short-duration events of interest among these outliers by evaluating residuals of an autoregressive exogenous time series model. Features extracted from each data set are then fused to perform analysis with a multi-intelligence paradigm, where information from multiple data sets is combined to generate more information than is available through analysis of each independently. The fused feature set is used to train a statistical classifier and predict the state of operations to inform a decision maker. We demonstrate this classification using both generic statistical features and event detection and provide a comparison of the two methods. Finally, the concept of decision robustness is presented through a preliminary analysis where uncertainty is added to the system through noise in the measurements.
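The second-stage event detection described above can be sketched with a purely autoregressive model (the exogenous input is omitted here), flagging samples whose one-step prediction residuals exceed a robust threshold; the data, model order, and threshold are invented for illustration.

```python
import numpy as np

def ar_residuals(x, order=10):
    """Fit an AR(order) model by least squares and return one-step residuals."""
    # Design matrix of lagged samples: row t holds [x[t-1], ..., x[t-order]]
    X = np.column_stack([x[order - k - 1:len(x) - k - 1] for k in range(order)])
    y = x[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

# Synthetic seismic-like channel: colored background noise plus two short
# transient bursts standing in for vehicle or detonation events.
rng = np.random.default_rng(6)
n = 5000
background = np.convolve(rng.normal(size=n), np.ones(20) / 20, mode="same")
signal = background.copy()
for start in (1500, 3600):
    signal[start:start + 50] += 1.5 * rng.normal(size=50)

resid = ar_residuals(signal, order=10)
threshold = 5 * np.median(np.abs(resid)) / 0.6745        # robust sigma estimate
events = np.flatnonzero(np.abs(resid) > threshold) + 10  # shift back by AR order
print("Samples flagged as events:", events[:10], "...")
```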
Kim, Youngjun; Gobbel, Glenn Temple; Matheny, Michael E; Redd, Andrew; Bray, Bruce E; Heidenreich, Paul; Bolton, Dan; Heavirland, Julia; Kelly, Natalie; Reeves, Ruth; Kalsy, Megha; Goldstein, Mary Kane; Meystre, Stephane M
2018-01-01
Background: We developed an accurate, stakeholder-informed, automated, natural language processing (NLP) system to measure the quality of heart failure (HF) inpatient care, and explored the potential for adoption of this system within an integrated health care system. Objective: To accurately automate a United States Department of Veterans Affairs (VA) quality measure for inpatients with HF. Methods: We automated the HF quality measure Congestive Heart Failure Inpatient Measure 19 (CHI19) that identifies whether a given patient has left ventricular ejection fraction (LVEF) <40%, and if so, whether an angiotensin-converting enzyme inhibitor or angiotensin-receptor blocker was prescribed at discharge if there were no contraindications. We used documents from 1083 unique inpatients from eight VA medical centers to develop a reference standard (RS) to train (n=314) and test (n=769) the Congestive Heart Failure Information Extraction Framework (CHIEF). We also conducted semi-structured interviews (n=15) for stakeholder feedback on implementation of the CHIEF. Results: The CHIEF classified each hospitalization in the test set with a sensitivity (SN) of 98.9% and positive predictive value of 98.7%, compared with an RS and SN of 98.5% for available External Peer Review Program assessments. Of the 1083 patients available for the NLP system, the CHIEF evaluated and classified 100% of cases. Stakeholders identified potential implementation facilitators and clinical uses of the CHIEF. Conclusions: The CHIEF provided complete data for all patients in the cohort and could potentially improve the efficiency, timeliness, and utility of HF quality measurements. PMID:29335238
Nanthini, B. Suguna; Santhi, B.
2017-01-01
Background: Epilepsy arises when repeated seizures occur in the brain. The electroencephalogram (EEG) test provides valuable information about brain function and can be used to detect brain disorders, especially epilepsy. In this study, an automated seizure detection model is introduced. Materials and Methods: The EEG signals are decomposed into sub-bands by discrete wavelet transform using the db2 (Daubechies) wavelet. Eight statistical features, four gray-level co-occurrence matrix features, and Renyi entropy estimates with four different orders are extracted from the raw EEG and its sub-bands. A genetic algorithm (GA) is used to select eight relevant features from the 16-dimensional feature set. The model has been trained and tested successfully for EEG signals using a support vector machine (SVM) classifier. The performance of the SVM classifier is evaluated on two different databases. Results: The study has been carried out through two different analyses and achieved satisfactory performance for automated seizure detection using relevant features as the input to the SVM classifier. Conclusion: Relevant features selected by the GA give better accuracy for seizure detection. PMID:28781480
Effects of Automation Types on Air Traffic Controller Situation Awareness and Performance
NASA Technical Reports Server (NTRS)
Sethumadhavan, A.
2009-01-01
The Joint Planning and Development Office has proposed the introduction of automated systems to help air traffic controllers handle the increasing volume of air traffic in the next two decades (JPDO, 2007). Because fully automated systems leave operators out of the decision-making loop (e.g., Billings, 1991), it is important to determine the right level and type of automation that will keep air traffic controllers in the loop. This study examined the differences in the situation awareness (SA) and collision detection performance of individuals when they worked with information acquisition, information analysis, decision and action selection and action implementation automation to control air traffic (Parasuraman, Sheridan, & Wickens, 2000). When the automation was unreliable, the time taken to detect an upcoming collision was significantly longer for all the automation types compared with the information acquisition automation. This poor performance following automation failure was mediated by SA, with lower SA yielding poor performance. Thus, the costs associated with automation failure are greater when automation is applied to higher order stages of information processing. Results have practical implications for automation design and development of SA training programs.
Tools for automating spacecraft ground systems: The Intelligent Command and Control (ICC) approach
NASA Technical Reports Server (NTRS)
Stoffel, A. William; Mclean, David
1996-01-01
The practical application of scripting languages and World Wide Web tools to the support of spacecraft ground system automation is reported on. The mission activities and the automation tools used at the Goddard Space Flight Center (MD) are reviewed. The use of the Tool Command Language (TCL) and the Practical Extraction and Report Language (PERL) scripting tools for automating mission operations is discussed, together with the application of different tools for the Compton Gamma Ray Observatory ground system.
Rinaldi, Fabio; Schneider, Gerold; Kaljurand, Kaarel; Hess, Michael; Andronis, Christos; Konstandi, Ourania; Persidis, Andreas
2007-02-01
The amount of new discoveries (as published in the scientific literature) in the biomedical area is growing at an exponential rate. This growth makes it very difficult to filter the most relevant results, and thus the extraction of the core information becomes very expensive. Therefore, there is a growing interest in text processing approaches that can deliver selected information from scientific publications, which can limit the amount of human intervention normally needed to gather those results. This paper presents and evaluates an approach aimed at automating the process of extracting functional relations (e.g. interactions between genes and proteins) from scientific literature in the biomedical domain. The approach, using a novel dependency-based parser, is based on a complete syntactic analysis of the corpus. We have implemented a state-of-the-art text mining system for biomedical literature, based on a deep-linguistic, full-parsing approach. The results are validated on two different corpora: the manually annotated genomics information access (GENIA) corpus and the automatically annotated arabidopsis thaliana circadian rhythms (ATCR) corpus. We show how a deep-linguistic approach (contrary to common belief) can be used in a real world text mining application, offering high-precision relation extraction, while at the same time retaining a sufficient recall.
Nguyen, Ha T.; Pearce, Joshua M.; Harrap, Rob; Barber, Gerald
2012-01-01
A methodology is provided for the application of Light Detection and Ranging (LiDAR) to automated solar photovoltaic (PV) deployment analysis on the regional scale. Challenges in urban information extraction and management for solar PV deployment assessment are determined and quantitative solutions are offered. This paper provides the following contributions: (i) a methodology that is consistent with recommendations from existing literature advocating the integration of cross-disciplinary competences in remote sensing (RS), GIS, computer vision and urban environmental studies; (ii) a robust methodology that can work with low-resolution, incomplete data and reconstruct vegetation and buildings separately but concurrently; (iii) recommendations for future generations of software. A case study is presented as an example of the methodology. Experience from the case study, such as the trade-off between time consumption and data quality, is discussed to highlight the need for connectivity between demographic information, electrical engineering schemes and GIS, and a typical fraction of solar-useful roofs extracted by the method. Finally, conclusions are developed to provide a final methodology for extracting the most useful information from the lowest-resolution and least comprehensive data to provide solar electric assessments over large areas, which can be adapted anywhere in the world. PMID:22666044
Gandola, Emanuele; Antonioli, Manuela; Traficante, Alessio; Franceschini, Simone; Scardi, Michele; Congestri, Roberta
2016-05-01
Toxigenic cyanobacteria are one of the main health risks associated with water resources worldwide, as their toxins can affect humans and fauna exposed via drinking water, aquaculture and recreation. Microscopy monitoring of cyanobacteria in water bodies and massive growth systems is a routine operation for cell abundance and growth estimation. Here we present ACQUA (Automated Cyanobacterial Quantification Algorithm), a new fully automated image analysis method designed for filamentous genera in Bright field microscopy. A pre-processing algorithm has been developed to highlight filaments of interest from background signals due to other phytoplankton and dust. A spline-fitting algorithm has been designed to recombine interrupted and crossing filaments in order to perform accurate morphometric analysis and to extract the surface pattern information of highlighted objects. In addition, 17 specific pattern indicators have been developed and used as input data for a machine-learning algorithm dedicated to the recognition between five widespread toxic or potentially toxic filamentous genera in freshwater: Aphanizomenon, Cylindrospermopsis, Dolichospermum, Limnothrix and Planktothrix. The method was validated using freshwater samples from three Italian volcanic lakes comparing automated vs. manual results. ACQUA proved to be a fast and accurate tool to rapidly assess freshwater quality and to characterize cyanobacterial assemblages in aquatic environments. Copyright © 2016 Elsevier B.V. All rights reserved.
Ni, Yizhao; Kennebeck, Stephanie; Dexheimer, Judith W; McAneney, Constance M; Tang, Huaxiu; Lingren, Todd; Li, Qi; Zhai, Haijun; Solti, Imre
2015-01-01
Objectives (1) To develop an automated eligibility screening (ES) approach for clinical trials in an urban tertiary care pediatric emergency department (ED); (2) to assess the effectiveness of natural language processing (NLP), information extraction (IE), and machine learning (ML) techniques on real-world clinical data and trials. Data and methods We collected eligibility criteria for 13 randomly selected, disease-specific clinical trials actively enrolling patients between January 1, 2010 and August 31, 2012. In parallel, we retrospectively selected data fields including demographics, laboratory data, and clinical notes from the electronic health record (EHR) to represent profiles of all 202795 patients visiting the ED during the same period. Leveraging NLP, IE, and ML technologies, the automated ES algorithms identified patients whose profiles matched the trial criteria to reduce the pool of candidates for staff screening. The performance was validated on both a physician-generated gold standard of trial–patient matches and a reference standard of historical trial–patient enrollment decisions, where workload, mean average precision (MAP), and recall were assessed. Results Compared with the case without automation, the workload with automated ES was reduced by 92% on the gold standard set, with a MAP of 62.9%. The automated ES achieved a 450% increase in trial screening efficiency. The findings on the gold standard set were confirmed by large-scale evaluation on the reference set of trial–patient matches. Discussion and conclusion By exploiting the text of trial criteria and the content of EHRs, we demonstrated that NLP-, IE-, and ML-based automated ES could successfully identify patients for clinical trials. PMID:25030032
Frégeau, Chantal J; Lett, C Marc; Elliott, Jim; Yensen, Craig; Fourney, Ron M
2008-05-01
An automated process has been developed for the analysis of forensic casework samples using TECAN Genesis RSP 150/8 or Freedom EVO liquid handling workstations equipped exclusively with nondisposable tips. Robot tip cleaning routines have been incorporated strategically within the DNA extraction process as well as at the end of each session. Alternative options were examined for cleaning the tips and different strategies were employed to verify cross-contamination. A 2% sodium hypochlorite wash (1/5th dilution of the 10.8% commercial bleach stock) proved to be the best overall approach for preventing cross-contamination of samples processed using our automated protocol. The bleach wash steps do not adversely impact the short tandem repeat (STR) profiles developed from DNA extracted robotically and allow for major cost savings through the implementation of fixed tips. We have demonstrated that robotic workstations equipped with fixed pipette tips can be used with confidence with properly designed tip washing routines to process casework samples using an adapted magnetic bead extraction protocol.
Martin Fabritius, Marie; Broillet, Alain; König, Stefan; Weinmann, Wolfgang
2018-06-04
Adsorption of volatiles in gaseous phase to activated charcoal strip (ACS) is one possibility for the extraction and concentration of ignitable liquid residues (ILRs) from fire debris in arson investigations. Besides liquid extraction using carbon dioxide or hexane, automated thermo-desorption can be used to transfer adsorbed residues to direct analysis by gas chromatography-mass spectrometry (GC-MS). We present a fire debris analysis work-flow with headspace adsorption of volatiles onto ACS and subsequent automated thermo-desorption (ATD) GC-MS analysis. Only a small portion of the ACS is inserted in the ATD tube for thermal desorption coupled to GC-MS, allowing for subsequent confirmation analysis with another portion of the same ACS. This approach is a promising alternative to the routinely used ACS method with solvent extraction of retained volatiles, and the application to fire debris analysis is demonstrated. Copyright © 2018 Elsevier B.V. All rights reserved.
Towards a characterization of information automation systems on the flight deck
NASA Astrophysics Data System (ADS)
Dudley, Rachel Feddersen
This thesis summarizes research to investigate the characteristics that define information automation systems used on aircraft flight decks and the significant impacts that these characteristics have on pilot performance. Major accomplishments of the work include the development of a set of characteristics that describe information automation systems on the flight deck and an experiment designed to study a subset of these characteristics. Information automation systems on the flight deck are responsible for the collection, processing, analysis, and presentation of data to the flightcrew. These systems pose human factors issues and challenges that must be considered by designers of these systems. Based on a previously developed formal definition of information automation for aircraft flight deck systems, an analysis process was developed and conducted to reach a refined set of information automation characteristics. In this work, characteristics are defined as a set of properties or attributes that describe an information automation system's operation or behavior, which can be used to identify and assess potential human factors issues. Hypotheses were formed for a subset of the characteristics: Automation Visibility, Information Quality, and Display Complexity. An experimental investigation was developed to measure performance impacts related to these characteristics, which showed mixed results of expected and surprising findings, with many interactions. A set of recommendations were then developed based on the experimental observations. Ensuring that the right information is presented to pilots at the right time and in the appropriate manner is the job of flight deck system designers. This work provides a foundation for developing recommendations and guidelines specific to information automation on the flight deck with the goal of improving the design and evaluation of information automation systems before they are implemented.
Automated Point Cloud Correspondence Detection for Underwater Mapping Using AUVs
NASA Technical Reports Server (NTRS)
Hammond, Marcus; Clark, Ashley; Mahajan, Aditya; Sharma, Sumant; Rock, Stephen
2015-01-01
An algorithm for automating correspondence detection between point clouds composed of multibeam sonar data is presented. This allows accurate initialization for point cloud alignment techniques even in cases where accurate inertial navigation is not available, such as iceberg profiling or vehicles with low-grade inertial navigation systems. Techniques from computer vision literature are used to extract, label, and match keypoints between "pseudo-images" generated from these point clouds. Image matches are refined using RANSAC and information about the vehicle trajectory. The resulting correspondences can be used to initialize an iterative closest point (ICP) registration algorithm to estimate accumulated navigation error and aid in the creation of accurate, self-consistent maps. The results presented use multibeam sonar data obtained from multiple overlapping passes of an underwater canyon in Monterey Bay, California. Using strict matching criteria, the method detects 23 between-swath correspondence events in a set of 155 pseudo-images with zero false positives. Using less conservative matching criteria doubles the number of matches but introduces several false positive matches as well. Heuristics based on known vehicle trajectory information are used to eliminate these.
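The keypoint-matching idea described above can be sketched with standard computer-vision tools, assuming the sonar pseudo-images are available as 8-bit grayscale arrays. This uses ORB features and a RANSAC-estimated homography as stand-ins; the authors' actual feature choice, matching criteria, and use of vehicle-trajectory heuristics are not reproduced here.

```python
import cv2
import numpy as np

def match_pseudo_images(img_a, img_b, min_matches=10):
    """Detect and match keypoints between two sonar 'pseudo-images' (8-bit grayscale
    arrays) and estimate a geometric transform with RANSAC to reject outlier matches."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None                              # not enough evidence for a correspondence
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None
    return H, int(inlier_mask.sum())             # transform estimate and inlier count
```

The inlier count can then serve as a correspondence score before handing the aligned point clouds to an ICP registration step.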
Hosseini, Masoud; Jones, Josette; Faiola, Anthony; Vreeman, Daniel J; Wu, Huanmei; Dixon, Brian E
2017-10-01
Due to the nature of information generation in health care, clinical documents contain duplicate and sometimes conflicting information. Recent implementation of Health Information Exchange (HIE) mechanisms in which clinical summary documents are exchanged among disparate health care organizations can proliferate duplicate and conflicting information. To reduce information overload, a system to automatically consolidate information across multiple clinical summary documents was developed for an HIE network. The system receives any number of Continuity of Care Documents (CCDs) and outputs a single, consolidated record. To test the system, a randomly sampled corpus of 522 CCDs representing 50 unique patients was extracted from a large HIE network. The automated methods were compared to manual consolidation of information for three key sections of the CCD: problems, allergies, and medications. Manual consolidation of 11,631 entries was completed in approximately 150h. The same data were automatically consolidated in 3.3min. The system successfully consolidated 99.1% of problems, 87.0% of allergies, and 91.7% of medications. Almost all of the inaccuracies were caused by issues involving the use of standardized terminologies within the documents to represent individual information entries. This study represents a novel, tested tool for de-duplication and consolidation of CDA documents, which is a major step toward improving information access and the interoperability among information systems. While more work is necessary, automated systems like the one evaluated in this study will be necessary to meet the informatics needs of providers and health systems in the future. Copyright © 2017 Elsevier Inc. All rights reserved.
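A toy sketch of the consolidation idea, de-duplicating entries across CCDs on their coded concept (or on normalized text when no code is present), is shown below. The field names and merge policy are assumptions for illustration only and do not represent the evaluated system.

```python
from collections import defaultdict

def consolidate(entries):
    """Merge problem/allergy/medication entries from many CCDs into one list,
    de-duplicating on the coded concept when a standard code is present and
    falling back to the normalized free-text name otherwise."""
    merged, sources = {}, defaultdict(set)
    for e in entries:                                   # one dict per CCD entry
        if e.get("code"):
            key = (e.get("code_system"), e["code"])
        else:
            key = ("text", e["name"].strip().lower())
        merged.setdefault(key, e)                       # keep the first occurrence
        sources[key].add(e["document_id"])
    return [dict(entry, source_count=len(sources[k])) for k, entry in merged.items()]

ccds = [
    {"document_id": "ccd-1", "code_system": "RxNorm", "code": "197361", "name": "Amlodipine 5 MG"},
    {"document_id": "ccd-2", "code_system": "RxNorm", "code": "197361", "name": "amlodipine 5 mg tablet"},
    {"document_id": "ccd-3", "code_system": None, "code": None, "name": "Penicillin allergy"},
]
print(consolidate(ccds))   # two consolidated entries, the first seen in two documents
```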
Senger, Stefan; Bartek, Luca; Papadatos, George; Gaulton, Anna
2015-12-01
First public disclosure of new chemical entities often takes place in patents, which makes them an important source of information. However, with an ever increasing number of patent applications, manual processing and curation on such a large scale becomes even more challenging. An alternative approach better suited for this large corpus of documents is the automated extraction of chemical structures. A number of patent chemistry databases generated by using the latter approach are now available but little is known that can help to manage expectations when using them. This study aims to address this by comparing two such freely available sources, SureChEMBL and IBM SIIP (IBM Strategic Intellectual Property Insight Platform), with manually curated commercial databases. When looking at the percentage of chemical structures successfully extracted from a set of patents, using SciFinder as our reference, 59 and 51 % were also found in our comparison in SureChEMBL and IBM SIIP, respectively. When performing this comparison with compounds as starting point, i.e. establishing if for a list of compounds the databases provide the links between chemical structures and patents they appear in, we obtained similar results. SureChEMBL and IBM SIIP found 62 and 59 %, respectively, of the compound-patent pairs obtained from Reaxys. In our comparison of automatically generated vs. manually curated patent chemistry databases, the former successfully provided approximately 60 % of links between chemical structure and patents. It needs to be stressed that only a very limited number of patents and compound-patent pairs were used for our comparison. Nevertheless, our results will hopefully help to manage expectations of users of patent chemistry databases of this type and provide a useful framework for more studies like ours as well as guide future developments of the workflows used for the automated extraction of chemical structures from patents. The challenges we have encountered whilst performing this study highlight that more needs to be done to make such assessments easier. Above all, more adequate, preferably open access to relevant 'gold standards' is required.
An automated procedure for detection of IDP's dwellings using VHR satellite imagery
NASA Astrophysics Data System (ADS)
Jenerowicz, Malgorzata; Kemper, Thomas; Soille, Pierre
2011-11-01
This paper presents the results for the estimation of dwellings structures in Al Salam IDP Camp, Southern Darfur, based on Very High Resolution multispectral satellite images obtained by implementation of Mathematical Morphology analysis. A series of image processing procedures, feature extraction methods and textural analysis have been applied in order to provide reliable information about dwellings structures. One of the issues in this context is related to similarity of the spectral response of thatched dwellings' roofs and the surroundings in the IDP camps, where the exploitation of multispectral information is crucial. This study shows the advantage of automatic extraction approach and highlights the importance of detailed spatial and spectral information analysis based on multi-temporal dataset. The additional data fusion of high-resolution panchromatic band with lower resolution multispectral bands of WorldView-2 satellite has positive influence on results and thereby can be useful for humanitarian aid agency, providing support of decisions and estimations of population especially in situations when frequent revisits by space imaging system are the only possibility of continued monitoring.
Liu, Yao-Min; Zhang, Feng-Ping; Jiao, Bao-Yu; Rao, Jin-Yu; Leng, Geng
2017-04-14
An automated, home-constructed, and low cost dispersive liquid-liquid microextraction (DLLME) device directly coupled to a high performance liquid chromatography (HPLC) - cold vapour atomic fluorescence spectroscopy (CVAFS) system was designed and developed for the determination of trace concentrations of methylmercury (MeHg+), ethylmercury (EtHg+) and inorganic mercury (Hg2+) in natural waters. With a simple, miniaturized and efficient automated DLLME system, nanogram amounts of these mercury species were extracted from natural water samples and injected into a hyphenated HPLC-CVAFS for quantification. The complete analytical procedure, including chelation, extraction, phase separation, collection and injection of the extracts, as well as HPLC-CVAFS quantification, was automated. Key parameters, such as the type and volume of the chelation, extraction and dispersive solvents, aspiration speed, sample pH, salt effect and matrix effect, were thoroughly investigated. Under the optimum conditions, the linear range was 10-1200 ng/L for EtHg+ and 5-450 ng/L for MeHg+ and Hg2+. Limits of detection were 3.0 ng/L for EtHg+ and 1.5 ng/L for MeHg+ and Hg2+. Reproducibility and recoveries were assessed by spiking three natural water samples with different Hg concentrations, giving recoveries from 88.4-96.1% and relative standard deviations <5.1%. Copyright © 2017 Elsevier B.V. All rights reserved.
Rovira, Ericka; Cross, Austin; Leitch, Evan; Bonaceto, Craig
2014-09-01
The impact of a decision support tool designed to embed contextual mission factors was investigated. Contextual information may enable operators to infer the appropriateness of data underlying the automation's algorithm. Research has shown the costs of imperfect automation are more detrimental than perfectly reliable automation when operators are provided with decision support tools. Operators may trust and rely on the automation more appropriately if they understand the automation's algorithm. The need to develop decision support tools that are understandable to the operator provides the rationale for the current experiment. A total of 17 participants performed a simulated rapid retasking of intelligence, surveillance, and reconnaissance (ISR) assets task with manual, decision automation, or contextual decision automation differing in two levels of task demand: low or high. Automation reliability was set at 80%, resulting in participants experiencing a mixture of reliable and automation failure trials. Dependent variables included ISR coverage and response time of replanning routes. Reliable automation significantly improved ISR coverage when compared with manual performance. Although performance suffered under imperfect automation, contextual decision automation helped to reduce some of the decrements in performance. Contextual information helps overcome the costs of imperfect decision automation. Designers may mitigate some of the performance decrements experienced with imperfect automation by providing operators with interfaces that display contextual information, that is, the state of factors that affect the reliability of the automation's recommendation.
NASA Astrophysics Data System (ADS)
Milgram, David L.; Kahn, Philip; Conner, Gary D.; Lawton, Daryl T.
1988-12-01
The goal of this effort is to develop and demonstrate prototype processing capabilities for a knowledge-based system to automatically extract and analyze features from Synthetic Aperture Radar (SAR) imagery. This effort constitutes Phase 2 funding through the Defense Small Business Innovative Research (SBIR) Program. Previous work examined the feasibility of and technology issues involved in the development of an automated linear feature extraction system. This final report documents this examination and the technologies involved in automating this image understanding task. In particular, it reports on a major software delivery containing an image processing algorithmic base, a perceptual structures manipulation package, a preliminary hypothesis management framework and an enhanced user interface.
PASTE: patient-centered SMS text tagging in a medication management system
Johnson, Kevin B; Denny, Joshua C
2011-01-01
Objective To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Design Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The object of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. Measurements A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Results Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Conclusion Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages. PMID:21984605
Aviation obstacle auto-extraction using remote sensing information
NASA Astrophysics Data System (ADS)
Zimmer, N.; Lugsch, W.; Ravenscroft, D.; Schiefele, J.
2008-10-01
An obstacle, in the aviation context, may be any natural or man-made object, fixed or movable, permanent or temporary. Currently, the most common way to detect relevant aviation obstacles from an aircraft or helicopter for navigation and collision avoidance is to merge infrared imagery with synthetic obstacle data. Several algorithms have been established that utilize synthetic and infrared images to generate obstacle information. There may be situations, however, where such a system is error-prone and cannot consistently determine the current environment; these situations can be avoided when the system knows the true position of the obstacle. The quality of the obstacle data strongly depends on the quality of the source data, such as maps and official publications. In some countries, such as newly industrializing and developing countries, obstacle information of adequate quality and quantity is not available. The aviation world has two specifications, RTCA DO-276A and ICAO ANNEX 15 Ch. 10, which describe the requirements for aviation obstacle data. Meeting these requirements is essential for compliance and for supporting systems based on these specifications, e.g. 3D obstacle warning systems, where accurate coordinates based on WGS-84 are a necessity. Existing and soon-to-exist high-quality aerial and satellite remote sensing data make it feasible to consider automated aviation obstacle data origination. This paper will describe the feasibility of auto-extracting aviation obstacles from remote sensing data, considering the limitations of imaging and extraction technologies. Quality parameters and the possible resolution of auto-extracted obstacle data will be discussed and presented.
Duval, Kristin; Aubin, Rémy A; Elliott, James; Gorn-Hondermann, Ivan; Birnboim, H Chaim; Jonker, Derek; Fourney, Ron M; Frégeau, Chantal J
2010-02-01
Archival tissue preserved in fixative constitutes an invaluable resource for histological examination, molecular diagnostic procedures and for DNA typing analysis in forensic investigations. However, available material is often limited in size and quantity. Moreover, recovery of DNA is often severely compromised by the presence of covalent DNA-protein cross-links generated by formalin, the most prevalent fixative. We describe the evaluation of buffer formulations, sample lysis regimens and DNA recovery strategies and define optimized manual and automated procedures for the extraction of high quality DNA suitable for molecular diagnostics and genotyping. Using a 3-step enzymatic digestion protocol carried out in the absence of dithiothreitol, we demonstrate that DNA can be efficiently released from cells or tissues preserved in buffered formalin or the alcohol-based fixative GenoFix. This preparatory procedure can then be integrated to traditional phenol/chloroform extraction, a modified manual DNA IQ or automated DNA IQ/Te-Shake-based extraction in order to recover DNA for downstream applications. Quantitative recovery of high quality DNA was best achieved from specimens archived in GenoFix and extracted using magnetic bead capture.
ACME, a GIS tool for Automated Cirque Metric Extraction
NASA Astrophysics Data System (ADS)
Spagnolo, Matteo; Pellitero, Ramon; Barr, Iestyn D.; Ely, Jeremy C.; Pellicer, Xavier M.; Rea, Brice R.
2017-02-01
Regional-scale studies of glacial cirque metrics provide key insights into the (palaeo)environment related to the formation of these erosional landforms. The growing availability of high-resolution terrain models means that more glacial cirques can be identified and mapped in the future. However, the extraction of their metrics still largely relies on time-consuming manual techniques or on combinations of more or less obsolete GIS tools. In this paper, a newly coded toolbox is provided for the automated, and comparatively quick, extraction of 16 key glacial cirque metrics, including length, width, circularity, planar and 3D area, elevation, slope, aspect, plan closure and hypsometry. The set of tools, named ACME (Automated Cirque Metric Extraction), is coded in Python, runs in one of the most commonly used GIS packages (ArcGIS) and has a user-friendly interface. A polygon layer of mapped cirques is required for all metrics, while a Digital Terrain Model and a point layer of cirque threshold midpoints are needed to run some of the tools. Results from ACME are comparable to those from other techniques and can be obtained rapidly, allowing large cirque datasets to be analysed and potentially important regional trends to be highlighted.
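A few of the planar metrics listed above can be computed directly from a cirque outline polygon, as in the sketch below using Shapely rather than ArcGIS. The circularity definition (4*pi*area/perimeter^2) and the bounding-box length/width proxies are illustrative assumptions and may differ from ACME's own definitions.

```python
import math
from shapely.geometry import Polygon

def cirque_metrics(outline_coords):
    """Planar metrics for one mapped cirque outline (projected coordinates, metres).
    Circularity here is 4*pi*area/perimeter**2, which equals 1.0 for a circle."""
    poly = Polygon(outline_coords)
    area, perimeter = poly.area, poly.length
    minx, miny, maxx, maxy = poly.bounds
    return {
        "planar_area_m2": area,
        "perimeter_m": perimeter,
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "bbox_length_m": maxx - minx,
        "bbox_width_m": maxy - miny,
    }

# Roughly circular test outline of about 500 m radius.
outline = [(500 * math.cos(2 * math.pi * i / 64), 500 * math.sin(2 * math.pi * i / 64))
           for i in range(64)]
print(cirque_metrics(outline))
```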
NASA Astrophysics Data System (ADS)
Dong, Di; Li, Ziwei; Liu, Zhaoqin; Yu, Yang
2014-03-01
This paper focuses on automated extraction and monitoring of coastlines along Caofeidian, China, by remote sensing techniques using multi-temporal Landsat imagery. Caofeidian, one of the most active economic regions in China, has experienced dramatic change due to intensified human activities, such as land reclamation. These processes have caused morphological changes of the Caofeidian shoreline. In this study, shoreline extraction and change analysis are investigated. An algorithm based on image texture and mathematical morphology is proposed to automate coastline extraction. We tested this approach and found that it is capable of extracting coastlines from TM and ETM+ images with little manual modification. The detected coastline vectors are then imported into ArcGIS software, and the Digital Shoreline Analysis System (DSAS) is used to calculate change rates (the end point rate and the linear regression rate). The results show that remarkable coastline changes, especially accretion, are observed in some parts of the study area. The abnormal accretion is mostly attributed to the large-scale land reclamation carried out in Caofeidian during 2003 and 2004. We therefore conclude that various construction projects, especially the land reclamation project, have changed the Caofeidian shorelines greatly, far beyond their natural rates of change.
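For reference, the two DSAS change rates mentioned above reduce to simple calculations along each transect: the end point rate is net shoreline movement divided by elapsed time, and the linear regression rate is the slope of a least-squares fit through all shoreline positions. The sketch below shows both with made-up transect values; DSAS itself performs these computations within ArcGIS.

```python
import numpy as np

def shoreline_change_rates(dates, positions):
    """Change rates along one DSAS-style transect. dates: decimal years of each
    shoreline; positions: distance (m) from the baseline to each shoreline."""
    dates, positions = np.asarray(dates, float), np.asarray(positions, float)
    epr = (positions[-1] - positions[0]) / (dates[-1] - dates[0])   # end point rate
    lrr = np.polyfit(dates, positions, 1)[0]                        # linear regression rate
    return epr, lrr

epr, lrr = shoreline_change_rates([1990.5, 2000.4, 2003.6, 2010.3],
                                  [120.0, 135.0, 180.0, 260.0])
print(f"EPR = {epr:.1f} m/yr, LRR = {lrr:.1f} m/yr")
```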
ChemEngine: harvesting 3D chemical structures of supplementary data from PDF files.
Karthikeyan, Muthukumarasamy; Vyas, Renu
2016-01-01
Digital access to chemical journals has resulted in a vast array of molecular information that is now available in supplementary material files in PDF format. However, extracting this molecular information, generally from a PDF document, is a daunting task. Here we present an approach to harvest 3D molecular data from the supporting information of scientific research articles that are normally available from publishers' resources. In order to demonstrate the feasibility of extracting truly computable molecules from PDF files in a fast and efficient manner, we have developed a Java-based application, ChemEngine. This program recognizes textual patterns in the supplementary data and generates standard molecular structure data (bond matrix, atomic coordinates) that can be subjected to a multitude of computational processes automatically. The methodology has been demonstrated via several case studies on different formats of coordinate data stored in supplementary information files, wherein ChemEngine selectively harvested the atomic coordinates and interpreted them as molecules with high accuracy. The reusability of the extracted molecular coordinate data was demonstrated by computing single point energies that were in close agreement with the original computed data provided with the articles. It is envisaged that the methodology will enable large-scale conversion of molecular information from supplementary files available in PDF format into a collection of ready-to-compute molecular data, creating an automated workflow for advanced computational processes. The software, along with source code and instructions, is available at https://sourceforge.net/projects/chemengine/files/?source=navbar.
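The core pattern-recognition step, pulling element/x/y/z lines out of text dumped from a PDF supplement and re-emitting them as a standard XYZ block, can be illustrated in a few lines of Python. This is only a toy analogue of ChemEngine (which is Java-based and handles many coordinate formats); the regular expression and sample text are assumptions made here.

```python
import re

COORD = re.compile(r"^\s*([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$",
                   re.MULTILINE)

def harvest_xyz(text, title="extracted molecule"):
    """Pull 'element x y z' lines out of free text (e.g. text dumped from a PDF
    supplementary file) and emit a standard XYZ block."""
    atoms = COORD.findall(text)
    lines = [str(len(atoms)), title]
    lines += [f"{el:<2s} {x:>12s} {y:>12s} {z:>12s}" for el, x, y, z in atoms]
    return "\n".join(lines)

sample = """
Optimized geometry (Angstrom)
C    0.000000    0.000000    0.000000
O    1.215000    0.000000    0.000000
H   -0.540000    0.935000    0.000000
H   -0.540000   -0.935000    0.000000
"""
print(harvest_xyz(sample, "formaldehyde (illustrative)"))
```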
Characterizing rainfall in the Tenerife island
NASA Astrophysics Data System (ADS)
Díez-Sierra, Javier; del Jesus, Manuel; Losada Rodriguez, Inigo
2017-04-01
In many locations, rainfall data are collected through networks of meteorological stations. The data collection process is nowadays automated in many places, leading to the development of big databases of rainfall data covering extensive areas of territory. However, managers, decision makers and engineering consultants tend not to extract most of the information contained in these databases due to the lack of specific software tools for their exploitation. Here we present the modeling and development effort put in place in the Tenerife island in order to develop MENSEI-L, a software tool capable of automatically analyzing a complete rainfall database to simplify the extraction of information from observations. MENSEI-L makes use of weather type information derived from atmospheric conditions to separate the complete time series into homogeneous groups where statistical distributions are fitted. Normal and extreme regimes are obtained in this manner. MENSEI-L is also able to complete missing data in the time series and to generate synthetic stations by using Kriging techniques. These techniques also serve to generate the spatial regimes of precipitation, both normal and extreme ones. MENSEI-L makes use of weather type information to also provide a stochastic three-day probability forecast for rainfall.
BioEve Search: A Novel Framework to Facilitate Interactive Literature Search
Ahmed, Syed Toufeeq; Davulcu, Hasan; Tikves, Sukru; Nair, Radhika; Zhao, Zhongming
2012-01-01
Background. Recent advances in computational and biological methods in last two decades have remarkably changed the scale of biomedical research and with it began the unprecedented growth in both the production of biomedical data and amount of published literature discussing it. An automated extraction system coupled with a cognitive search and navigation service over these document collections would not only save time and effort, but also pave the way to discover hitherto unknown information implicitly conveyed in the texts. Results. We developed a novel framework (named “BioEve”) that seamlessly integrates Faceted Search (Information Retrieval) with Information Extraction module to provide an interactive search experience for the researchers in life sciences. It enables guided step-by-step search query refinement, by suggesting concepts and entities (like genes, drugs, and diseases) to quickly filter and modify search direction, and thereby facilitating an enriched paradigm where user can discover related concepts and keywords to search while information seeking. Conclusions. The BioEve Search framework makes it easier to enable scalable interactive search over large collection of textual articles and to discover knowledge hidden in thousands of biomedical literature articles with ease. PMID:22693501
Deep learning in color: towards automated quark/gluon jet discrimination
Komiske, Patrick T.; Metodiev, Eric M.; Schwartz, Matthew D.
2017-01-25
Artificial intelligence offers the potential to automate challenging data-processing tasks in collider physics. Here, to establish its prospects, we explore to what extent deep learning with convolutional neural networks can discriminate quark and gluon jets better than observables designed by physicists. Our approach builds upon the paradigm that a jet can be treated as an image, with intensity given by the local calorimeter deposits. We supplement this construction by adding color to the images, with red, green and blue intensities given by the transverse momentum in charged particles, transverse momentum in neutral particles, and pixel-level charged particle counts. Overall, the deep networks match or outperform traditional jet variables. We also find that, while various simulations produce different quark and gluon jets, the neural networks are surprisingly insensitive to these differences, similar to traditional observables. This suggests that the networks can extract robust physical information from imperfect simulations.
Deep learning in color: towards automated quark/gluon jet discrimination
NASA Astrophysics Data System (ADS)
Komiske, Patrick T.; Metodiev, Eric M.; Schwartz, Matthew D.
2017-01-01
Artificial intelligence offers the potential to automate challenging data-processing tasks in collider physics. To establish its prospects, we explore to what extent deep learning with convolutional neural networks can discriminate quark and gluon jets better than observables designed by physicists. Our approach builds upon the paradigm that a jet can be treated as an image, with intensity given by the local calorimeter deposits. We supplement this construction by adding color to the images, with red, green and blue intensities given by the transverse momentum in charged particles, transverse momentum in neutral particles, and pixel-level charged particle counts. Overall, the deep networks match or outperform traditional jet variables. We also find that, while various simulations produce different quark and gluon jets, the neural networks are surprisingly insensitive to these differences, similar to traditional observables. This suggests that the networks can extract robust physical information from imperfect simulations.
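The jet-image construction described in this abstract amounts to binning constituents in (eta, phi) around the jet axis into three channels. The sketch below shows that step with NumPy on toy constituents; the grid size, image half-width, and the omission of the paper's preprocessing (centering, normalization) are simplifications made here.

```python
import numpy as np

def jet_image(eta, phi, pt, charge, npix=33, half_width=0.4):
    """Build a 3-channel jet image on an npix x npix grid around the jet axis:
    R = charged-particle pT, G = neutral-particle pT, B = charged-particle counts."""
    bins = np.linspace(-half_width, half_width, npix + 1)
    charged = charge != 0
    channels = [
        np.histogram2d(eta[charged], phi[charged], bins=bins, weights=pt[charged])[0],
        np.histogram2d(eta[~charged], phi[~charged], bins=bins, weights=pt[~charged])[0],
        np.histogram2d(eta[charged], phi[charged], bins=bins)[0],
    ]
    return np.stack(channels, axis=-1)

# Toy jet: 50 random constituents in (eta, phi) relative to the jet axis.
rng = np.random.default_rng(2)
img = jet_image(rng.normal(0, 0.2, 50), rng.normal(0, 0.2, 50),
                rng.exponential(5.0, 50), rng.integers(-1, 2, 50))
print(img.shape)   # (33, 33, 3)
```

The resulting image stack would then feed a standard convolutional network for quark/gluon classification.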
NASA Astrophysics Data System (ADS)
Balasubramanian, Priya S.; Guo, Jiaqi; Yao, Xinwen; Qu, Dovina; Lu, Helen H.; Hendon, Christine P.
2017-02-01
The directionality of collagen fibers across the anterior cruciate ligament (ACL) as well as the insertion of this key ligament into bone are important for understanding the mechanical integrity and functionality of this complex tissue. Quantitative analysis of three-dimensional fiber directionality is of particular interest due to the physiological, mechanical, and biological heterogeneity inherent across the ACL-to-bone junction, the behavior of the ligament under mechanical stress, and the usefulness of this information in designing tissue engineered grafts. We have developed an algorithm to characterize Optical Coherence Tomography (OCT) image volumes of the ACL. We present an automated algorithm for measuring ligamentous fiber angles, and extracting attenuation and backscattering coefficients of ligament, interface, and bone regions within mature and immature bovine ACL insertion samples. Future directions include translating this algorithm for real time processing to allow three-dimensional volumetric analysis within dynamically moving samples.
Uranga, Jon; Arrizabalaga, Haritz; Boyra, Guillermo; Hernandez, Maria Carmen; Goñi, Nicolas; Arregui, Igor; Fernandes, Jose A; Yurramendi, Yosu; Santiago, Josu
2017-01-01
This study presents a methodology for the automated analysis of commercial medium-range sonar signals for detecting presence/absence of bluefin tuna (Thunnus thynnus) in the Bay of Biscay. The approach uses image processing techniques to analyze sonar screenshots. For each sonar image we extracted measurable regions and analyzed their characteristics. Scientific data was used to classify each region into a class ("tuna" or "no-tuna") and build a dataset to train and evaluate classification models by using supervised learning. The methodology performed well when validated with commercial sonar screenshots, and has the potential to automatically analyze high volumes of data at a low cost. This represents a first milestone towards the development of acoustic, fishery-independent indices of abundance for bluefin tuna in the Bay of Biscay. Future research lines and additional alternatives to inform stock assessments are also discussed.
Deep learning in color: towards automated quark/gluon jet discrimination
DOE Office of Scientific and Technical Information (OSTI.GOV)
Komiske, Patrick T.; Metodiev, Eric M.; Schwartz, Matthew D.
Artificial intelligence offers the potential to automate challenging data-processing tasks in collider physics. Here, to establish its prospects, we explore to what extent deep learning with convolutional neural networks can discriminate quark and gluon jets better than observables designed by physicists. Our approach builds upon the paradigm that a jet can be treated as an image, with intensity given by the local calorimeter deposits. We supplement this construction by adding color to the images, with red, green and blue intensities given by the transverse momentum in charged particles, transverse momentum in neutral particles, and pixel-level charged particle counts. Overall, the deep networks match or outperform traditional jet variables. We also find that, while various simulations produce different quark and gluon jets, the neural networks are surprisingly insensitive to these differences, similar to traditional observables. This suggests that the networks can extract robust physical information from imperfect simulations.
Automated metastatic brain lesion detection: a computer aided diagnostic and clinical research tool
NASA Astrophysics Data System (ADS)
Devine, Jeremy; Sahgal, Arjun; Karam, Irene; Martel, Anne L.
2016-03-01
The accurate localization of brain metastases in magnetic resonance (MR) images is crucial for patients undergoing stereotactic radiosurgery (SRS) to ensure that all neoplastic foci are targeted. Computer automated tumor localization and analysis can improve both of these tasks by eliminating inter and intra-observer variations during the MR image reading process. Lesion localization is accomplished using adaptive thresholding to extract enhancing objects. Each enhancing object is represented as a vector of features which includes information on object size, symmetry, position, shape, and context. These vectors are then used to train a random forest classifier. We trained and tested the image analysis pipeline on 3D axial contrast-enhanced MR images with the intention of localizing the brain metastases. In our cross validation study and at the most effective algorithm operating point, we were able to identify 90% of the lesions at a precision rate of 60%.
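A reduced sketch of the pipeline described above, adaptive thresholding to extract enhancing objects, per-object feature vectors, and a random-forest classifier, is shown below using scikit-image and scikit-learn. The block size, the particular region properties, and the placeholder training data are assumptions; the paper's feature set (including symmetry and context features) is richer than this.

```python
import numpy as np
from skimage.filters import threshold_local
from skimage.measure import label, regionprops
from sklearn.ensemble import RandomForestClassifier

def candidate_features(slice_2d, block_size=51):
    """Segment enhancing objects in one contrast-enhanced slice with adaptive
    thresholding and describe each object with a small geometric/intensity vector."""
    mask = slice_2d > threshold_local(slice_2d, block_size)
    feats = []
    for region in regionprops(label(mask), intensity_image=slice_2d):
        feats.append([region.area, region.eccentricity, region.solidity,
                      region.mean_intensity, *region.centroid])
    return np.asarray(feats)

rng = np.random.default_rng(3)
print(candidate_features(rng.random((128, 128))).shape)   # (n_candidates, 6)

# A random forest would then separate lesions from false positives, given
# candidate feature vectors labelled against expert annotations (placeholders here).
X_candidates = rng.random((200, 6))
y_candidates = rng.integers(0, 2, 200)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_candidates, y_candidates)
```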
Natural language processing of spoken diet records (SDRs).
Lacson, Ronilda; Long, William
2006-01-01
Dietary assessment is a fundamental aspect of nutritional evaluation that is essential for management of obesity as well as for assessing dietary impact on chronic diseases. Various methods have been used for dietary assessment including written records, 24-hour recalls, and food frequency questionnaires. The use of mobile phones to provide real-time dietary records provides potential advantages for accessibility, ease of use and automated documentation. However, understanding even a perfect transcript of spoken dietary records (SDRs) is challenging for people. This work presents a first step towards automatic analysis of SDRs. Our approach consists of four steps - identification of food items, identification of food quantifiers, classification of food quantifiers and temporal annotation. Our method enables automatic extraction of dietary information from SDRs, which in turn allows automated mapping to a Diet History Questionnaire dietary database. Our model has an accuracy of 90%. This work demonstrates the feasibility of automatically processing SDRs.
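The four-step analysis can be caricatured with a tiny lexicon-based tagger, as below. The lexicons, quantifier classes, and temporal vocabulary are invented for illustration and are far smaller than the custom lexicons and knowledge sources the study actually used.

```python
import re

FOOD_LEXICON = {"oatmeal", "banana", "coffee", "chicken", "rice"}
QUANTIFIERS = {"cup": "volume", "cups": "volume", "bowl": "volume",
               "glass": "volume", "slice": "count", "ounces": "weight"}
TIME_WORDS = {"breakfast", "lunch", "dinner", "snack", "morning", "evening"}

def tag_sdr(transcript):
    """Tag a spoken-diet-record transcript: mark food items, quantifiers
    (with a coarse class), and temporal anchors using small lookup lexicons."""
    tags = []
    for token in re.findall(r"[a-z]+", transcript.lower()):
        if token in FOOD_LEXICON:
            tags.append((token, "FOOD"))
        elif token in QUANTIFIERS:
            tags.append((token, f"QUANTIFIER/{QUANTIFIERS[token]}"))
        elif token in TIME_WORDS:
            tags.append((token, "TIME"))
    return tags

print(tag_sdr("For breakfast I had a bowl of oatmeal and two cups of coffee"))
```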
Automated recommendation for cervical cancer screening and surveillance.
Wagholikar, Kavishwar B; MacLaughlin, Kathy L; Casey, Petra M; Kastner, Thomas M; Henry, Michael R; Hankey, Ronald A; Peters, Steve G; Greenes, Robert A; Chute, Christopher G; Liu, Hongfang; Chaudhry, Rajeev
2014-01-01
Because of the complexity of cervical cancer prevention guidelines, clinicians often fail to follow best-practice recommendations. Moreover, existing clinical decision support (CDS) systems generally recommend a cervical cytology every three years for all female patients, which is inappropriate for patients with abnormal findings that require surveillance at shorter intervals. To address this problem, we developed a decision tree-based CDS system that integrates national guidelines to provide comprehensive guidance to clinicians. Validation was performed in several iterations by comparing recommendations generated by the system with those of clinicians for 333 patients. The CDS system extracted relevant patient information from the electronic health record and applied the guideline model with an overall accuracy of 87%. Providers without CDS assistance needed an average of 1 minute 39 seconds to decide on recommendations for management of abnormal findings. Overall, our work demonstrates the feasibility and potential utility of automated recommendation system for cervical cancer screening and surveillance.
Uranga, Jon; Arrizabalaga, Haritz; Boyra, Guillermo; Hernandez, Maria Carmen; Goñi, Nicolas; Arregui, Igor; Fernandes, Jose A.; Yurramendi, Yosu; Santiago, Josu
2017-01-01
This study presents a methodology for the automated analysis of commercial medium-range sonar signals for detecting presence/absence of bluefin tuna (Tunnus thynnus) in the Bay of Biscay. The approach uses image processing techniques to analyze sonar screenshots. For each sonar image we extracted measurable regions and analyzed their characteristics. Scientific data was used to classify each region into a class (“tuna” or “no-tuna”) and build a dataset to train and evaluate classification models by using supervised learning. The methodology performed well when validated with commercial sonar screenshots, and has the potential to automatically analyze high volumes of data at a low cost. This represents a first milestone towards the development of acoustic, fishery-independent indices of abundance for bluefin tuna in the Bay of Biscay. Future research lines and additional alternatives to inform stock assessments are also discussed. PMID:28152032
Multilevel Contextual 3-D CNNs for False Positive Reduction in Pulmonary Nodule Detection.
Dou, Qi; Chen, Hao; Yu, Lequan; Qin, Jing; Heng, Pheng-Ann
2017-07-01
False positive reduction is one of the most crucial components in an automated pulmonary nodule detection system, which plays an important role in lung cancer diagnosis and early treatment. The objective of this paper is to effectively address the challenges in this task and therefore to accurately discriminate the true nodules from a large number of candidates. We propose a novel method employing three-dimensional (3-D) convolutional neural networks (CNNs) for false positive reduction in automated pulmonary nodule detection from volumetric computed tomography (CT) scans. Compared with its 2-D counterparts, the 3-D CNNs can encode richer spatial information and extract more representative features via their hierarchical architecture trained with 3-D samples. More importantly, we further propose a simple yet effective strategy to encode multilevel contextual information to meet the challenges coming with the large variations and hard mimics of pulmonary nodules. The proposed framework has been extensively validated in the LUNA16 challenge held in conjunction with ISBI 2016, where we achieved the highest competition performance metric (CPM) score in the false positive reduction track. Experimental results demonstrated the importance and effectiveness of integrating multilevel contextual information into 3-D CNN framework for automated pulmonary nodule detection in volumetric CT data. While our method is tailored for pulmonary nodule detection, the proposed framework is general and can be easily extended to many other 3-D object detection tasks from volumetric medical images, where the targeting objects have large variations and are accompanied by a number of hard mimics.
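A small 3-D CNN of the kind this abstract describes can be sketched in PyTorch as follows. The architecture, patch size, and the note about fusing several receptive-field scales are illustrative assumptions, not the configuration that achieved the reported LUNA16 result.

```python
import torch
import torch.nn as nn

class NoduleNet3D(nn.Module):
    """Small 3-D CNN scoring a candidate patch (1 x 20 x 36 x 36 voxels) as nodule
    vs. false positive. Multilevel context can be added by running networks of this
    kind on patches cropped at several scales and fusing their outputs."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 5 * 9 * 9, 128), nn.ReLU(),
            nn.Linear(128, 2),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

logits = NoduleNet3D()(torch.randn(4, 1, 20, 36, 36))
print(logits.shape)   # torch.Size([4, 2])
```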
Manavella, Valeria; Romano, Federica; Garrone, Federica; Terzini, Mara; Bignardi, Cristina; Aimetti, Mario
2017-06-01
The aim of this study was to present and validate a novel procedure for the quantitative volumetric assessment of extraction sockets that combines cone-beam computed tomography (CBCT) and image processing techniques. The CBCT dataset of 9 severely resorbed extraction sockets was analyzed by means of two image processing software packages, ImageJ and Mimics, using manual and automated segmentation techniques. These techniques were also applied to 5-mm spherical aluminum markers of known volume and to a polyvinyl chloride model of one alveolar socket scanned with Micro-CT to test their accuracy. Statistical differences in alveolar socket volume were found between the different methods of volumetric analysis (P<0.0001). The automated segmentation using Mimics was the most reliable and accurate method, with a relative error of 1.5%, considerably smaller than the errors of 7% and 10% introduced by the manual method using Mimics and the automated method using ImageJ. The proposed automated segmentation protocol for the three-dimensional rendering of alveolar sockets showed more accurate results, excellent inter-observer similarity and increased user friendliness. The clinical application of this method enables a three-dimensional evaluation of extraction socket healing after reconstructive procedures and during follow-up visits.
Transforming microbial genotyping: a robotic pipeline for genotyping bacterial strains.
O'Farrell, Brian; Haase, Jana K; Velayudhan, Vimalkumar; Murphy, Ronan A; Achtman, Mark
2012-01-01
Microbial genotyping increasingly deals with large numbers of samples, and data are commonly evaluated by unstructured approaches, such as spreadsheets. The efficiency, reliability and throughput of genotyping would benefit from the automation of manual manipulations within the context of sophisticated data storage. We developed a medium-throughput genotyping pipeline for MultiLocus Sequence Typing (MLST) of bacterial pathogens. This pipeline was implemented through a combination of four automated liquid handling systems, a Laboratory Information Management System (LIMS) consisting of a variety of dedicated commercial operating systems and programs, including a Sample Management System, plus numerous Python scripts. All tubes and microwell racks were bar-coded and their locations and status were recorded in the LIMS. We also created a hierarchical set of items that could be used to represent bacterial species, their products and experiments. The LIMS allowed reliable, semi-automated, traceable bacterial genotyping from initial single colony isolation and sub-cultivation through DNA extraction and normalization to PCRs, sequencing and MLST sequence trace evaluation. We also describe robotic sequencing to facilitate cherry-picking of sequence dropouts. This pipeline is user-friendly, with a throughput of 96 strains within 10 working days at a total cost of < €25 per strain. Since developing this pipeline, >200,000 items were processed by two to three people. Our sophisticated automated pipeline can be implemented by a small microbiology group without extensive external support, and provides a general framework for semi-automated bacterial genotyping of large numbers of samples at low cost.
Transforming Microbial Genotyping: A Robotic Pipeline for Genotyping Bacterial Strains
Velayudhan, Vimalkumar; Murphy, Ronan A.; Achtman, Mark
2012-01-01
Microbial genotyping increasingly deals with large numbers of samples, and data are commonly evaluated by unstructured approaches, such as spreadsheets. The efficiency, reliability and throughput of genotyping would benefit from the automation of manual manipulations within the context of sophisticated data storage. We developed a medium-throughput genotyping pipeline for MultiLocus Sequence Typing (MLST) of bacterial pathogens. This pipeline was implemented through a combination of four automated liquid handling systems, a Laboratory Information Management System (LIMS) consisting of a variety of dedicated commercial operating systems and programs, including a Sample Management System, plus numerous Python scripts. All tubes and microwell racks were bar-coded and their locations and status were recorded in the LIMS. We also created a hierarchical set of items that could be used to represent bacterial species, their products and experiments. The LIMS allowed reliable, semi-automated, traceable bacterial genotyping from initial single colony isolation and sub-cultivation through DNA extraction and normalization to PCRs, sequencing and MLST sequence trace evaluation. We also describe robotic sequencing to facilitate cherry-picking of sequence dropouts. This pipeline is user-friendly, with a throughput of 96 strains within 10 working days at a total cost of < €25 per strain. Since developing this pipeline, >200,000 items were processed by two to three people. Our sophisticated automated pipeline can be implemented by a small microbiology group without extensive external support, and provides a general framework for semi-automated bacterial genotyping of large numbers of samples at low cost. PMID:23144721
[Medical imaging in tumor precision medicine: opportunities and challenges].
Xu, Jingjing; Tan, Yanbin; Zhang, Minming
2017-05-25
Tumor precision medicine is an emerging approach to tumor diagnosis, treatment and prevention that takes into account individual variability in environment, lifestyle and genetic information. It builds on the medical imaging innovations developed during the past decades, including new hardware, new imaging agents, standardized protocols, image analysis and multimodal imaging fusion technology. The development of automated and reproducible analysis algorithms has also made it possible to extract large amounts of information from image-based features. With the continuous development and mining of tumor clinical and imaging databases, radiogenomics, radiomics and artificial intelligence have been flourishing. These new technological advances therefore bring new opportunities and challenges to the application of imaging in tumor precision medicine.
Video Analytics for Indexing, Summarization and Searching of Video Archives
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trease, Harold E.; Trease, Lynn L.
This paper will be submitted to the proceedings of The Eleventh IASTED International Conference on Signal and Image Processing. Given a video or video archive, how does one effectively and quickly summarize, classify, and search the information contained within the data? This paper addresses these issues by describing a process for the automated generation of a table-of-contents and keyword, topic-based index tables that can be used to catalogue, summarize, and search large amounts of video data. Having the ability to index and search the information contained within the videos, beyond just metadata tags, provides a mechanism to extract and identify "useful" content from image and video data.
A biomedical information system for retrieval and manipulation of NHANES data.
Mukherjee, Sukrit; Martins, David; Norris, Keith C; Jenders, Robert A
2013-01-01
The retrieval and manipulation of data from large public databases like the U.S. National Health and Nutrition Examination Survey (NHANES) may require sophisticated statistical software and significant expertise that may be unavailable in the university setting. In response, we have developed the Data Retrieval And Manipulation System (DReAMS), an automated information system that handles all processes of data extraction and cleaning and then joins different subsets to produce analysis-ready output. The system is a browser-based data warehouse application in which the input data from flat files or operational systems are aggregated in a structured way so that the desired data can be read, recoded, queried and extracted efficiently. The current pilot implementation of the system provides access to a limited portion of the NHANES database. We plan to increase the amount of data available through the system in the near future and to extend the techniques to other large databases in the CDU archive, which currently holds about 53 databases.
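As a hedged illustration of the kind of extraction-and-join step DReAMS automates, the sketch below merges two NHANES components on the shared respondent identifier (SEQN) with pandas and writes an analysis-ready extract. The file names, variable names and recoding are placeholders chosen for illustration, not part of the DReAMS implementation.

```python
import pandas as pd

# Two illustrative NHANES components in SAS transport (XPT) format;
# the file names are placeholders for whichever survey cycle is needed.
demo = pd.read_sas("DEMO_J.XPT", format="xport")   # demographics
bpx = pd.read_sas("BPX_J.XPT", format="xport")     # blood pressure exam

# SEQN is the respondent sequence number shared across NHANES components.
merged = demo.merge(bpx, on="SEQN", how="inner")

# Recode a categorical variable and drop records with a missing outcome
# to produce an analysis-ready extract.
merged["sex"] = merged["RIAGENDR"].map({1: "male", 2: "female"})
analysis_ready = merged.dropna(subset=["BPXSY1"])
analysis_ready.to_csv("nhanes_extract.csv", index=False)
```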
Classification of product inspection items using nonlinear features
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.; Lee, H.-W.
1998-03-01
Automated processing and classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. This approach involves two main steps: preprocessing and classification. Preprocessing locates individual items and segments ones that touch using a modified watershed algorithm. The second stage involves extraction of features that allow discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper. We use a new nonlinear feature extraction scheme called the maximum representation and discriminating feature (MRDF) extraction method to compute nonlinear features that are used as inputs to a classifier. The MRDF is shown to provide better classification and a better ROC (receiver operating characteristic) curve than other methods.
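The preprocessing stage above separates touching nuts with a modified watershed algorithm. The sketch below shows a generic distance-transform watershed in Python (SciPy/scikit-image) as a stand-in for that step; it is not the authors' exact algorithm, and the marker footprint size is illustrative.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def split_touching_items(binary_mask):
    """Separate touching objects in a binary image with a distance-transform
    watershed (a generic stand-in for a modified watershed step)."""
    distance = ndi.distance_transform_edt(binary_mask)
    # Local maxima of the distance map provide one marker per item.
    coords = peak_local_max(distance, footprint=np.ones((15, 15)), labels=binary_mask)
    markers = np.zeros(distance.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    # Flood the inverted distance map from the markers, confined to the mask.
    return watershed(-distance, markers, mask=binary_mask)
```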
Automated extraction of single H atoms with STM: tip state dependency
NASA Astrophysics Data System (ADS)
Møller, Morten; Jarvis, Samuel P.; Guérinet, Laurent; Sharp, Peter; Woolley, Richard; Rahe, Philipp; Moriarty, Philip
2017-02-01
The atomistic structure of the tip apex plays a crucial role in performing reliable atomic-scale surface and adsorbate manipulation using scanning probe techniques. We have developed an automated extraction routine for controlled removal of single hydrogen atoms from the H:Si(100) surface. The set of atomic extraction protocols detect a variety of desorption events during scanning tunneling microscope (STM)-induced modification of the hydrogen-passivated surface. The influence of the tip state on the probability for hydrogen removal was examined by comparing the desorption efficiency for various classifications of STM topographs (rows, dimers, atoms, etc). We find that dimer-row-resolving tip apices extract hydrogen atoms most readily and reliably (and with least spurious desorption), while tip states which provide atomic resolution counter-intuitively have a lower probability for single H atom removal.
Lindley, C.E.; Stewart, J.T.; Sandstrom, M.W.
1996-01-01
A sensitive and reliable gas chromatographic/mass spectrometric (GC/MS) method for determining acetochlor in environmental water samples was developed. The method involves automated extraction of the herbicide from a filtered 1 L water sample through a C18 solid-phase extraction column, elution from the column with hexane-isopropyl alcohol (3 + 1), and concentration of the extract with nitrogen gas. The herbicide is quantitated by capillary-column GC/MS with selected-ion monitoring of 3 characteristic ions. The single-operator method detection limit for reagent water samples is 0.0015 µg/L. Mean recoveries ranged from about 92 to 115% for 3 water matrixes fortified at 0.05 and 0.5 µg/L. Average single-operator precision, over the course of 1 week, was better than 5%.
GenePublisher: Automated analysis of DNA microarray data.
Knudsen, Steen; Workman, Christopher; Sicheritz-Ponten, Thomas; Friis, Carsten
2003-07-01
GenePublisher, a system for automatic analysis of data from DNA microarray experiments, has been implemented with a web interface at http://www.cbs.dtu.dk/services/GenePublisher. Raw data are uploaded to the server together with a specification of the data. The server performs normalization, statistical analysis and visualization of the data. The results are run against databases of signal transduction pathways, metabolic pathways and promoter sequences in order to extract more information. The results of the entire analysis are summarized in report form and returned to the user.
Translation lexicon acquisition from bilingual dictionaries
NASA Astrophysics Data System (ADS)
Doermann, David S.; Ma, Huanfeng; Karagol-Ayan, Burcu; Oard, Douglas W.
2001-12-01
Bilingual dictionaries hold great potential as a source of lexical resources for training automated systems for optical character recognition, machine translation and cross-language information retrieval. In this work we describe a system for extracting term lexicons from printed copies of bilingual dictionaries. We describe our approach to page and definition segmentation and entry parsing. We have used the approach to parse a number of dictionaries and demonstrate the results for retrieval using a French-English Dictionary to generate a translation lexicon and a corpus of English queries applied to French documents to evaluation cross-language IR.
DOE Office of Scientific and Technical Information (OSTI.GOV)
The four-dimensional scattering function S(Q,ω) obtained by inelastic neutron scattering measurements provides unique "dynamical fingerprints" of the spin state and interactions present in complex magnetic materials. Extracting this information, however, is currently a slow and complex process that may take an expert, depending on the complexity of the system, up to several weeks of painstaking work to complete. Spin Wave Genie was created to abstract and automate this process. It strives both to reduce the time to complete this analysis and to make these calculations more accessible to a broader group of scientists and engineers.
Adaptive automation of human-machine system information-processing functions.
Kaber, David B; Wright, Melanie C; Prinzel, Lawrence J; Clamann, Michael P
2005-01-01
The goal of this research was to describe the ability of human operators to interact with adaptive automation (AA) applied to various stages of complex systems information processing, defined in a model of human-automation interaction. Forty participants operated a simulation of an air traffic control task. Automated assistance was adaptively applied to information acquisition, information analysis, decision making, and action implementation aspects of the task based on operator workload states, which were measured using a secondary task. The differential effects of the forms of automation were determined and compared with a manual control condition. Results of two 20-min trials of AA or manual control revealed a significant effect of the type of automation on performance, particularly during manual control periods as part of the adaptive conditions. Humans appear to better adapt to AA applied to sensory and psychomotor information-processing functions (action implementation) than to AA applied to cognitive functions (information analysis and decision making), and AA is superior to completely manual control. Potential applications of this research include the design of automation to support air traffic controller information processing.
Roi Detection and Vessel Segmentation in Retinal Image
NASA Astrophysics Data System (ADS)
Sabaz, F.; Atila, U.
2017-11-01
Diabetes affects the structure of the eye, disrupting vision and eventually leading to loss of sight. Depending on the stage of the disease, called diabetic retinopathy, patients may experience blurred vision or sudden loss of vision. Automated detection of vessels in retinal images is useful for diagnosing eye diseases, classifying disease and supporting other clinical trials. The shape and structure of the vessels give information about the severity and stage of the disease. Automatic and fast detection of vessels allows a quick diagnosis so that treatment can begin promptly. ROI detection and vessel extraction methods for retinal images are described in this study. It is shown that the Frangi filter used in image processing can be successfully applied to the detection and extraction of vessels.
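scikit-image ships a Frangi vesselness filter, so a minimal sketch of the vessel-extraction step described above might look as follows. The green-channel choice and Otsu threshold are common conventions assumed here, not prescribed by the study, and the scale range is illustrative.

```python
import numpy as np
from skimage import io, filters

# Read a fundus image and use the green channel, where vessel contrast
# is typically highest.
rgb = io.imread("retina.png")
green = rgb[..., 1].astype(float) / 255.0

# Frangi vesselness over a range of scales, then a simple global threshold
# to obtain a binary vessel map.
vesselness = filters.frangi(green, sigmas=np.arange(1, 6), black_ridges=True)
vessel_mask = vesselness > filters.threshold_otsu(vesselness)
```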
Fast fringe pattern phase demodulation using FIR Hilbert transformers
NASA Astrophysics Data System (ADS)
Gdeisat, Munther; Burton, David; Lilley, Francis; Arevalillo-Herráez, Miguel
2016-01-01
This paper suggests the use of FIR Hilbert transformers to extract the phase of fringe patterns. This method is computationally faster than any known spatial method that produces wrapped phase maps. Also, the algorithm does not require any parameters to be adjusted which are dependent upon the specific fringe pattern that is being processed, or upon the particular setup of the optical fringe projection system that is being used. It is therefore particularly suitable for full algorithmic automation. The accuracy and validity of the suggested method has been tested using both computer-generated and real fringe patterns. This novel algorithm has been proposed for its advantages in terms of computational processing speed as it is the fastest available method to extract the wrapped phase information from a fringe pattern.
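A minimal sketch of the idea, assuming an odd-length FIR Hilbert transformer designed with scipy.signal.remez: the quadrature output is combined with the original fringe profile to form an analytic signal whose angle is the wrapped phase. The filter length and band edges are illustrative, not the paper's values.

```python
import numpy as np
from scipy.signal import remez, convolve

def wrapped_phase(fringe_row, numtaps=63):
    """Estimate the wrapped phase of a 1-D fringe profile with an FIR
    Hilbert transformer (a simplified sketch of the general approach)."""
    x = fringe_row - np.mean(fringe_row)            # suppress the DC term
    # Odd-length (type III) Hilbert transformer covering most of the band.
    h = remez(numtaps, [0.03, 0.47], [1.0], type="hilbert")
    quadrature = convolve(x, h, mode="same")        # ~90-degree shifted copy
    return np.angle(x + 1j * quadrature)            # wrapped phase in (-pi, pi]
```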
Topaz, Maxim; Lai, Kenneth; Dowding, Dawn; Lei, Victor J; Zisberg, Anna; Bowles, Kathryn H; Zhou, Li
2016-12-01
Electronic health records are increasingly used by nurses, with up to 80% of the health data recorded as free text. However, only a few studies have developed nursing-relevant tools that help busy clinicians identify the information they need at the point of care. This study developed and validated one of the first automated natural language processing applications to extract wound information (wound type, pressure ulcer stage, wound size, anatomic location, and wound treatment) from free-text clinical notes. First, two human annotators manually reviewed a purposeful training sample (n=360) and a random test sample (n=1100) of clinical notes (including 50% discharge summaries and 50% outpatient notes), identified wound cases, and created a gold standard dataset. We then trained and tested our natural language processing system (known as MTERMS) to process the wound information. Finally, we assessed our automated approach by comparing system-generated findings against the gold standard. We also compared the prevalence of wound cases identified from free-text data with coded diagnoses in the structured data. The testing dataset included 101 notes (9.2%) with wound information. The overall system performance was good (F-measure, a composite measure of the system's accuracy, of 92.7%), with best results for wound treatment (F-measure=95.7%) and poorest results for wound size (F-measure=81.9%). Only 46.5% of wound notes had a structured code for a wound diagnosis. The natural language processing system achieved good performance on a subset of randomly selected discharge summaries and outpatient notes. In more than half of the wound notes there were no coded wound diagnoses, which highlights the significance of using natural language processing to enrich clinical decision making. Our future steps will include expansion of the application's information coverage to other relevant wound factors and validation of the model with external data. Copyright © 2016 Elsevier Ltd. All rights reserved.
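For reference, the F-measure quoted above is the harmonic mean of precision and recall; the small helper below makes the definition explicit (a generic formula, not code from the study).

```python
def f_measure(true_positives, false_positives, false_negatives):
    """F-measure: harmonic mean of precision and recall."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)
```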
Ramírez Fernández, María del Mar; Van Durme, Filip; Wille, Sarah M R; di Fazio, Vincent; Kummer, Natalie; Samyn, Nele
2014-06-01
The aim of this work was to automate a sample preparation procedure extracting morphine, hydromorphone, oxymorphone, norcodeine, codeine, dihydrocodeine, oxycodone, 6-monoacetyl-morphine, hydrocodone, ethylmorphine, benzoylecgonine, cocaine, cocaethylene, tramadol, meperidine, pentazocine, fentanyl, norfentanyl, buprenorphine, norbuprenorphine, propoxyphene, methadone and 2-ethylidene-1,5-dimethyl-3,3-diphenylpyrrolidine from urine samples. Samples were extracted by solid-phase extraction (SPE) with cation exchange cartridges using a TECAN Freedom Evo 100 base robotic system, including a hydrolysis step prior to extraction when required. Block modules were carefully selected in order to use the same consumable material as in manual procedures and to reduce cost and/or manual sample transfers. Moreover, the present configuration included pressure-monitored pipetting, which increases pipetting accuracy and detects sampling errors. The compounds were then separated in a chromatographic run of 9 min using a BEH Phenyl analytical column on an ultra-performance liquid chromatography-tandem mass spectrometry system. Optimization of the SPE was performed with different wash conditions and elution solvents. Intra- and inter-day relative standard deviations (RSDs) were within ±15% and bias was within ±15% for most of the compounds. Recovery was >69% (RSD < 11%) and matrix effects ranged from 1 to 26% when compensated with the internal standard. The limits of quantification ranged from 3 to 25 ng/mL depending on the compound. No cross-contamination in the automated SPE system was observed. The extracted samples were stable for 72 h in the autosampler (4°C). This method was applied to authentic samples (from forensic and toxicology cases) and to proficiency testing schemes containing cocaine, heroin, buprenorphine and methadone, offering fast and reliable results. Automation resulted in improved precision and accuracy, and minimal operator intervention, leading to safer sample handling and less time-consuming procedures.
Creation of a virtual cutaneous tissue bank
NASA Astrophysics Data System (ADS)
LaFramboise, William A.; Shah, Sujal; Hoy, R. W.; Letbetter, D.; Petrosko, P.; Vennare, R.; Johnson, Peter C.
2000-04-01
Cellular and non-cellular constituents of skin contain fundamental morphometric features and structural patterns that correlate with tissue function. High-resolution digital image acquisition was performed using an automated system and proprietary software to assemble adjacent images and create a contiguous, lossless digital representation of individual microscope slide specimens. Serial extraction, evaluation and statistical analysis of cutaneous features were performed using an automated analysis system to derive normal cutaneous parameters comprising essential structural skin components. Automated digital cutaneous analysis allows fast extraction of microanatomic data with accuracy approximating manual measurement. The process provides rapid assessment of features both within individual specimens and across sample populations. The images, component data, and statistical analyses comprise a bioinformatics database that serves as an architectural blueprint for skin tissue engineering and as a diagnostic standard of comparison for pathologic specimens.
Automated labeling of bibliographic data extracted from biomedical online journals
NASA Astrophysics Data System (ADS)
Kim, Jongwoo; Le, Daniel X.; Thoma, George R.
2003-01-01
A prototype system has been designed to automate the extraction of bibliographic data (e.g., article title, authors, abstract, affiliation and others) from online biomedical journals to populate the National Library of Medicine's MEDLINE database. This paper describes a key module in this system: the labeling module that employs statistics and fuzzy rule-based algorithms to identify segmented zones in an article's HTML pages as specific bibliographic data. Results from experiments conducted with 1,149 medical articles from forty-seven journal issues are presented.
Kim, Brian J; Merchant, Madhur; Zheng, Chengyi; Thomas, Anil A; Contreras, Richard; Jacobsen, Steven J; Chien, Gary W
2014-12-01
Natural language processing (NLP) software programs have been widely developed to transform complex free text into simplified organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMR, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR. An NLP program was generated by a software engineer to extract key variables from prostatectomy reports in the EMR within our healthcare system, which included the TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing NLP results to a gold standard compiled by two blinded manual reviewers for 100 random pathology reports. NLP demonstrated 100% accuracy for identifying the Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in <1 second, whereas the manual reviewers averaged 3.2 minutes per report. This novel program demonstrated high accuracy and efficiency identifying key pathologic details from the prostatectomy report within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. It may also facilitate future urologic research through the rapid and automated creation of large databases.
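The rules of the in-house NLP program are not given here, so the following is only a hypothetical, regular-expression-based sketch of how fields such as the Gleason grade and surgical margin status might be pulled from free text; the patterns and field names are illustrative and do not represent the production system.

```python
import re

# Hypothetical patterns; the production rules are not published in the abstract.
GLEASON = re.compile(r"Gleason\s+(?:score\s+)?(\d)\s*\+\s*(\d)", re.IGNORECASE)
MARGIN = re.compile(r"margins?\s+(?:are\s+)?(negative|positive)", re.IGNORECASE)

def parse_prostatectomy(report_text):
    """Pull a few structured fields out of a free-text pathology report."""
    result = {}
    m = GLEASON.search(report_text)
    if m:
        primary, secondary = int(m.group(1)), int(m.group(2))
        result["gleason"] = f"{primary}+{secondary}={primary + secondary}"
    m = MARGIN.search(report_text)
    if m:
        result["surgical_margin"] = m.group(1).lower()
    return result

print(parse_prostatectomy("Gleason score 3+4. Surgical margins are negative."))
```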
Yamaguchi, Akemi; Matsuda, Kazuyuki; Uehara, Masayuki; Honda, Takayuki; Saito, Yasunori
2016-02-04
We report a novel automated device for nucleic acid extraction, which consists of a mechanical control system and a disposable cassette. The cassette is composed of a bottle, a capillary tube, and a chamber. After sample injection in the bottle, the sample is lysed, and nucleic acids are adsorbed on the surface of magnetic silica beads. These magnetic beads are transported and are vibrated through the washing reagents in the capillary tube under the control of the mechanical control system, and thus, the nucleic acid is purified without centrifugation. The purified nucleic acid is automatically extracted in 3 min for the polymerase chain reaction (PCR). The nucleic acid extraction is dependent on the transport speed and the vibration frequency of the magnetic beads, and optimizing these two parameters provided better PCR efficiency than the conventional manual procedure. There was no difference between the detection limits of our novel device and that of the conventional manual procedure. We have already developed the droplet-PCR machine, which can amplify and detect specific nucleic acids rapidly and automatically. Connecting the droplet-PCR machine to our novel automated extraction device enables PCR analysis within 15 min, and this system can be made available as a point-of-care testing in clinics as well as general hospitals. Copyright © 2015 Elsevier B.V. All rights reserved.
O'Connor, Annette M; Totton, Sarah C; Cullen, Jonah N; Ramezani, Mahmood; Kalivarapu, Vijay; Yuan, Chaohui; Gilbert, Stephen B
2018-01-01
Systematic reviews are increasingly using data from preclinical animal experiments in evidence networks. Further, there are ever-increasing efforts to automate aspects of the systematic review process. When assessing systematic bias and unit-of-analysis errors in preclinical experiments, it is critical to understand the study design elements employed by investigators. Such information can also inform prioritization of automation efforts that allow the identification of the most common issues. The aim of this study was to identify the design elements used by investigators in preclinical research in order to inform unique aspects of assessment of bias and error in preclinical research. Using 100 preclinical experiments each related to brain trauma and toxicology, we assessed design elements described by the investigators. We evaluated Methods and Materials sections of reports for descriptions of the following design elements: 1) use of comparison group, 2) unit of allocation of the interventions to study units, 3) arrangement of factors, 4) method of factor allocation to study units, 5) concealment of the factors during allocation and outcome assessment, 6) independence of study units, and 7) nature of factors. Many investigators reported using design elements that suggested the potential for unit-of-analysis errors, i.e., descriptions of repeated measurements of the outcome (94/200) and descriptions of potential for pseudo-replication (99/200). Use of complex factor arrangements was common, with 112 experiments using some form of factorial design (complete, incomplete or split-plot-like). In the toxicology dataset, 20 of the 100 experiments appeared to use a split-plot-like design, although no investigators used this term. The common use of repeated measures and factorial designs means understanding bias and error in preclinical experimental design might require greater expertise than simple parallel designs. Similarly, use of complex factor arrangements creates novel challenges for accurate automation of data extraction and bias and error assessment in preclinical experiments.
Single-trial event-related potential extraction through one-unit ICA-with-reference
NASA Astrophysics Data System (ADS)
Lih Lee, Wee; Tan, Tele; Falkmer, Torbjörn; Leung, Yee Hong
2016-12-01
Objective. In recent years, ICA has been one of the more popular methods for extracting event-related potential (ERP) at the single-trial level. It is a blind source separation technique that allows the extraction of an ERP without making strong assumptions on the temporal and spatial characteristics of an ERP. However, the problem with traditional ICA is that the extraction is not direct and is time-consuming due to the need for source selection processing. In this paper, the application of an one-unit ICA-with-Reference (ICA-R), a constrained ICA method, is proposed. Approach. In cases where the time-region of the desired ERP is known a priori, this time information is utilized to generate a reference signal, which is then used for guiding the one-unit ICA-R to extract the source signal of the desired ERP directly. Main results. Our results showed that, as compared to traditional ICA, ICA-R is a more effective method for analysing ERP because it avoids manual source selection and it requires less computation thus resulting in faster ERP extraction. Significance. In addition to that, since the method is automated, it reduces the risks of any subjective bias in the ERP analysis. It is also a potential tool for extracting the ERP in online application.
Pathak, Jyotishman; Bailey, Kent R; Beebe, Calvin E; Bethard, Steven; Carrell, David S; Chen, Pei J; Dligach, Dmitriy; Endle, Cory M; Hart, Lacey A; Haug, Peter J; Huff, Stanley M; Kaggal, Vinod C; Li, Dingcheng; Liu, Hongfang; Marchant, Kyle; Masanz, James; Miller, Timothy; Oniki, Thomas A; Palmer, Martha; Peterson, Kevin J; Rea, Susan; Savova, Guergana K; Stancl, Craig R; Sohn, Sunghwan; Solbrig, Harold R; Suesse, Dale B; Tao, Cui; Taylor, David P; Westberg, Les; Wu, Stephen; Zhuo, Ning; Chute, Christopher G
2013-01-01
Research objective To develop scalable informatics infrastructure for normalization of both structured and unstructured electronic health record (EHR) data into a unified, concept-based model for high-throughput phenotype extraction. Materials and methods Software tools and applications were developed to extract information from EHRs. Representative and convenience samples of both structured and unstructured data from two EHR systems—Mayo Clinic and Intermountain Healthcare—were used for development and validation. Extracted information was standardized and normalized to meaningful use (MU) conformant terminology and value set standards using Clinical Element Models (CEMs). These resources were used to demonstrate semi-automatic execution of MU clinical-quality measures modeled using the Quality Data Model (QDM) and an open-source rules engine. Results Using CEMs and open-source natural language processing and terminology services engines—namely, Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) and Common Terminology Services (CTS2)—we developed a data-normalization platform that ensures data security, end-to-end connectivity, and reliable data flow within and across institutions. We demonstrated the applicability of this platform by executing a QDM-based MU quality measure that determines the percentage of patients between 18 and 75 years with diabetes whose most recent low-density lipoprotein cholesterol test result during the measurement year was <100 mg/dL on a randomly selected cohort of 273 Mayo Clinic patients. The platform identified 21 and 18 patients for the denominator and numerator of the quality measure, respectively. Validation results indicate that all identified patients meet the QDM-based criteria. Conclusions End-to-end automated systems for extracting clinical information from diverse EHR systems require extensive use of standardized vocabularies and terminologies, as well as robust information models for storing, discovering, and processing that information. This study demonstrates the application of modular and open-source resources for enabling secondary use of EHR data through normalization into standards-based, comparable, and consistent format for high-throughput phenotyping to identify patient cohorts. PMID:24190931
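As a toy illustration of executing the QDM-based measure described above, the sketch below computes the denominator and numerator for the diabetes LDL measure over already-normalized patient records. The field names are hypothetical stand-ins for data that the normalization platform would supply; this is not the rules-engine implementation used in the study.

```python
from datetime import date

def diabetes_ldl_measure(patients, period_end=date(2012, 12, 31)):
    """Share of diabetic patients aged 18-75 whose most recent LDL result
    is < 100 mg/dL. `patients` is a list of dicts with hypothetical fields."""
    denominator, numerator = [], []
    for p in patients:
        age = (period_end - p["birth_date"]).days // 365
        if p["has_diabetes"] and 18 <= age <= 75:
            denominator.append(p["id"])
            ldl_results = sorted(p["ldl_results"], key=lambda r: r["date"])
            if ldl_results and ldl_results[-1]["value_mg_dl"] < 100:
                numerator.append(p["id"])
    return denominator, numerator
```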
Automating Network Node Behavior Characterization by Mining Communication Patterns
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carroll, Thomas E.; Chikkagoudar, Satish; Arthur-Durett, Kristine M.
Enterprise networks of scale are complex, dynamic computing environments that respond to evolving business objectives and requirements. Characterizing system behaviors in these environments is essential for network management and cyber security operations. Characterization of a system's communication is typical and is supported using network flow information (NetFlow). Related work has characterized behavior using theoretical graph metrics; the results are often difficult for enterprise staff to interpret. We propose a different approach, where flow information is mapped to sets of tags that contextualize the data in terms of network principals and enterprise concepts. Frequent patterns are then extracted and are expressed as behaviors. Behaviors can be compared, identifying systems expressing similar behaviors. We evaluate the approach using flow information collected by a third party.
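A minimal sketch of the tag-and-count idea, assuming simplified NetFlow records and illustrative tagging rules; the production system's tag vocabulary and pattern mining are richer than this pairwise count.

```python
from collections import Counter
from itertools import combinations

def flow_to_tags(flow):
    """Map one simplified NetFlow record to contextual tags (illustrative rules)."""
    tags = set()
    if flow["dst_port"] == 53:
        tags.add("dns-client")
    if flow["dst_port"] in (80, 443):
        tags.add("web-client")
    if flow["bytes"] > 10_000_000:
        tags.add("bulk-transfer")
    if flow["dst_ip"].startswith("10."):
        tags.add("internal-destination")
    return frozenset(tags)

def frequent_tag_patterns(flows, min_support=0.05):
    """Count co-occurring tag pairs across flows and keep the frequent ones."""
    counts = Counter()
    for flow in flows:
        for pair in combinations(sorted(flow_to_tags(flow)), 2):
            counts[pair] += 1
    threshold = min_support * len(flows)
    return {pair: n for pair, n in counts.items() if n >= threshold}
```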
Automation and decision support in interactive consumer products.
Sauer, J; Rüttinger, B
2007-06-01
This article presents two empirical studies (n = 30, n = 48) that are concerned with different forms of automation in interactive consumer products. The goal of the studies was to evaluate the effectiveness of two types of automation: perceptual augmentation (i.e. supporting users' information acquisition and analysis); and control integration (i.e. supporting users' action selection and implementation). Furthermore, the effectiveness of on-product information (i.e. labels attached to product) in supporting automation design was evaluated. The findings suggested greater benefits for automation in control integration than in perceptual augmentation alone, which may be partly due to the specific requirements of consumer product usage. If employed appropriately, on-product information can be a helpful means of information conveyance. The article discusses the implications of automation design in interactive consumer products while drawing on automation models from the work environment.
DB4US: A Decision Support System for Laboratory Information Management.
Carmona-Cejudo, José M; Hortas, Maria Luisa; Baena-García, Manuel; Lana-Linati, Jorge; González, Carlos; Redondo, Maximino; Morales-Bueno, Rafael
2012-11-14
Until recently, laboratory automation has focused primarily on improving hardware. Future advances are concentrated on intelligent software since laboratories performing clinical diagnostic testing require improved information systems to address their data processing needs. In this paper, we propose DB4US, an application that automates information related to laboratory quality indicators information. Currently, there is a lack of ready-to-use management quality measures. This application addresses this deficiency through the extraction, consolidation, statistical analysis, and visualization of data related to the use of demographics, reagents, and turn-around times. The design and implementation issues, as well as the technologies used for the implementation of this system, are discussed in this paper. To develop a general methodology that integrates the computation of ready-to-use management quality measures and a dashboard to easily analyze the overall performance of a laboratory, as well as automatically detect anomalies or errors. The novelty of our approach lies in the application of integrated web-based dashboards as an information management system in hospital laboratories. We propose a new methodology for laboratory information management based on the extraction, consolidation, statistical analysis, and visualization of data related to demographics, reagents, and turn-around times, offering a dashboard-like user web interface to the laboratory manager. The methodology comprises a unified data warehouse that stores and consolidates multidimensional data from different data sources. The methodology is illustrated through the implementation and validation of DB4US, a novel web application based on this methodology that constructs an interface to obtain ready-to-use indicators, and offers the possibility to drill down from high-level metrics to more detailed summaries. The offered indicators are calculated beforehand so that they are ready to use when the user needs them. The design is based on a set of different parallel processes to precalculate indicators. The application displays information related to tests, requests, samples, and turn-around times. The dashboard is designed to show the set of indicators on a single screen. DB4US was deployed for the first time in the Hospital Costa del Sol in 2008. In our evaluation we show the positive impact of this methodology for laboratory professionals, since the use of our application has reduced the time needed for the elaboration of the different statistical indicators and has also provided information that has been used to optimize the usage of laboratory resources by the discovery of anomalies in the indicators. DB4US users benefit from Internet-based communication of results, since this information is available from any computer without having to install any additional software. The proposed methodology and the accompanying web application, DB4US, automates the processing of information related to laboratory quality indicators and offers a novel approach for managing laboratory-related information, benefiting from an Internet-based communication mechanism. The application of this methodology has been shown to improve the usage of time, as well as other laboratory resources.
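As a hedged sketch of the kind of pre-calculated indicator DB4US exposes, the snippet below derives turn-around-time summaries per test with pandas; the input file and column names are hypothetical, not the application's actual schema.

```python
import pandas as pd

# Hypothetical export with one row per completed laboratory request.
df = pd.read_csv("requests.csv", parse_dates=["received_at", "reported_at"])
df["tat_minutes"] = (df["reported_at"] - df["received_at"]).dt.total_seconds() / 60

# Ready-to-use indicators per test type, as a dashboard panel would show them.
indicators = (
    df.groupby("test_code")["tat_minutes"]
      .agg(n="count", median_tat="median", p90_tat=lambda s: s.quantile(0.9))
      .reset_index()
)
print(indicators.head())
```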
Deriving novel relationships from the scientific literature is an important adjunct to datamining activities for complex datasets in genomics and high-throughput screening activities. Automated text-mining algorithms can be used to extract relevant content from the literature and...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Covington, E; Younge, K; Chen, X
Purpose: To evaluate the effectiveness of an automated plan check tool to improve first-time plan quality as well as standardize and document performance of physics plan checks. Methods: The Plan Checker Tool (PCT) uses the Eclipse Scripting API to check and compare data from the treatment planning system (TPS) and treatment management system (TMS). PCT was created to improve first-time plan quality, reduce patient delays, increase efficiency of our electronic workflow, and standardize and partially automate plan checks in the TPS. A framework was developed which can be configured with different reference values and types of checks. One example is the prescribed dose check, where PCT flags the user when the planned dose and the prescribed dose disagree. PCT includes a comprehensive checklist of automated and manual checks that are documented when performed by the user. A PDF report is created and automatically uploaded into the TMS. Prior to and during PCT development, errors caught during plan checks and patient delays were tracked in order to prioritize which checks should be automated, and the most common and significant errors were determined. Results: Nineteen of 33 checklist items were automated with data extracted by the PCT. These include checks for prescription, reference point and machine scheduling errors, which are three of the top six causes of patient delays related to physics and dosimetry. Since the clinical roll-out, no delays have been due to errors that are automatically flagged by the PCT. Development continues to automate the remaining checks. Conclusion: With PCT, 57% of the physics plan checklist has been partially or fully automated. Treatment delays have declined since release of the PCT for clinical use. By tracking delays and errors, we have been able to measure the effectiveness of automating checks and are using this information to prioritize future development. This project was supported in part by P01CA059827.
The development of a strategy for the implementation of automation in a bioanalytical laboratory.
Mole, D; Mason, R J; McDowall, R D
1993-03-01
Laboratory automation is equipment, instrumentation, software and techniques that are classified into four groups: instrument automation; communications; data to information conversion; and information management. This new definition is necessary to understand the role that automation can play in achieving the aims and objectives of a laboratory within its organization. To undertake automation projects effectively, a laboratory automation strategy is outlined which requires an intimate knowledge of an organization and the target environment to implement individual automation projects.
Orbital transfer vehicle launch operations study: Automated technology knowledge base, volume 4
NASA Technical Reports Server (NTRS)
1986-01-01
A simplified retrieval strategy for compiling automation-related bibliographies from NASA/RECON is presented. Two subsets of NASA Thesaurus subject terms were extracted: a primary list, which is used to obtain an initial set of citations; and a secondary list, which is used to limit or further specify a large initial set of citations. These subject term lists are presented in Appendix A as the Automated Technology Knowledge Base (ATKB) Thesaurus.
Novas, Romulo Bourget; Fazan, Valeria Paula Sassoli; Felipe, Joaquim Cezar
2016-02-01
Nerve morphometry is known to produce relevant information for the evaluation of several phenomena, such as nerve repair, regeneration, implant, transplant, aging, and different human neuropathies. Manual morphometry is laborious, tedious, time consuming, and subject to many sources of error. Therefore, in this paper, we propose a new method for the automated morphometry of myelinated fibers in cross-section light microscopy images. Images from the recurrent laryngeal nerve of adult rats and the vestibulocochlear nerve of adult guinea pigs were used herein. The proposed pipeline for fiber segmentation is based on the techniques of competitive clustering and concavity analysis. The segmentation method was evaluated by comparing the automatic segmentation with manual segmentation. To further evaluate the method using morphometric features extracted from the segmented images, the distributions of these features were tested for statistically significant differences. The method achieved high overall sensitivity and very low false-positive rates per image. We detected no statistical difference between the distributions of the features extracted from the manual and the pipeline segmentations. The method presented good overall performance, showing widespread potential in experimental and clinical settings by allowing large-scale image analysis and, thus, leading to more reliable results.
Evaluation Of A Powder-Free DNA Extraction Method For Skeletal Remains.
Harrel, Michelle; Mayes, Carrie; Gangitano, David; Hughes-Stamm, Sheree
2018-02-07
Bones are often recovered in forensic investigations, including missing persons and mass disasters. While traditional DNA extraction methods rely on grinding bone into powder prior to DNA purification, the TBone Ex buffer (DNA Chip Research Inc.) digests bone chips without powdering. In this study, six bones were extracted using the TBone Ex kit in conjunction with the PrepFiler ® BTA™ DNA extraction kit (Thermo Fisher Scientific) both manually and via an automated platform. Comparable amounts of DNA were recovered from a 50 mg bone chip using the TBone Ex kit and 50 mg of powdered bone with the PrepFiler ® BTA™ kit. However, automated DNA purification decreased DNA yield (p < 0.05). Nevertheless, short tandem repeat (STR) success was comparable across all methods tested. This study demonstrates that digestion of whole bone fragments is an efficient alternative to powdering bones for DNA extraction without compromising downstream STR profile quality. © 2018 American Academy of Forensic Sciences.
On Feature Extraction from Large Scale Linear LiDAR Data
NASA Astrophysics Data System (ADS)
Acharjee, Partha Pratim
Airborne light detection and ranging (LiDAR) can generate co-registered elevation and intensity map over large terrain. The co-registered 3D map and intensity information can be used efficiently for different feature extraction application. In this dissertation, we developed two algorithms for feature extraction, and usages of features for practical applications. One of the developed algorithms can map still and flowing waterbody features, and another one can extract building feature and estimate solar potential on rooftops and facades. Remote sensing capabilities, distinguishing characteristics of laser returns from water surface and specific data collection procedures provide LiDAR data an edge in this application domain. Furthermore, water surface mapping solutions must work on extremely large datasets, from a thousand square miles, to hundreds of thousands of square miles. National and state-wide map generation/upgradation and hydro-flattening of LiDAR data for many other applications are two leading needs of water surface mapping. These call for as much automation as possible. Researchers have developed many semi-automated algorithms using multiple semi-automated tools and human interventions. This reported work describes a consolidated algorithm and toolbox developed for large scale, automated water surface mapping. Geometric features such as flatness of water surface, higher elevation change in water-land interface and, optical properties such as dropouts caused by specular reflection, bimodal intensity distributions were some of the linear LiDAR features exploited for water surface mapping. Large-scale data handling capabilities are incorporated by automated and intelligent windowing, by resolving boundary issues and integrating all results to a single output. This whole algorithm is developed as an ArcGIS toolbox using Python libraries. Testing and validation are performed on a large datasets to determine the effectiveness of the toolbox and results are presented. Significant power demand is located in urban areas, where, theoretically, a large amount of building surface area is also available for solar panel installation. Therefore, property owners and power generation companies can benefit from a citywide solar potential map, which can provide available estimated annual solar energy at a given location. An efficient solar potential measurement is a prerequisite for an effective solar energy system in an urban area. In addition, the solar potential calculation from rooftops and building facades could open up a wide variety of options for solar panel installations. However, complex urban scenes make it hard to estimate the solar potential, partly because of shadows cast by the buildings. LiDAR-based 3D city models could possibly be the right technology for solar potential mapping. Although, most of the current LiDAR-based local solar potential assessment algorithms mainly address rooftop potential calculation, whereas building facades can contribute a significant amount of viable surface area for solar panel installation. In this paper, we introduce a new algorithm to calculate solar potential of both rooftop and building facades. Solar potential received by the rooftops and facades over the year are also investigated in the test area.
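Of the water-surface cues listed above, surface flatness is the simplest to illustrate. The sketch below flags cells whose local elevation range falls below a tolerance, which is only one ingredient of the full toolbox (intensity dropouts, boundary handling and large-scale tiling are omitted); the window size and tolerance are illustrative.

```python
import numpy as np

def flatness_water_mask(dem, window=5, flat_tol=0.05):
    """Label DEM cells whose local elevation range is near zero as candidate
    water, a simplified version of the 'flat surface' cue described above."""
    pad = window // 2
    padded = np.pad(dem, pad, mode="edge")
    rows, cols = dem.shape
    local_range = np.empty_like(dem, dtype=float)
    for i in range(rows):
        for j in range(cols):
            patch = padded[i:i + window, j:j + window]
            local_range[i, j] = patch.max() - patch.min()
    return local_range < flat_tol
```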
Automated video-based assessment of surgical skills for training and evaluation in medical schools.
Zia, Aneeq; Sharma, Yachna; Bettadapura, Vinay; Sarin, Eric L; Ploetz, Thomas; Clements, Mark A; Essa, Irfan
2016-09-01
Routine evaluation of basic surgical skills in medical schools requires considerable time and effort from supervising faculty. For each surgical trainee, a supervisor has to observe the trainees in person. Alternatively, supervisors may use training videos, which reduces some of the logistical overhead. All these approaches however are still incredibly time consuming and involve human bias. In this paper, we present an automated system for surgical skills assessment by analyzing video data of surgical activities. We compare different techniques for video-based surgical skill evaluation. We use techniques that capture the motion information at a coarser granularity using symbols or words, extract motion dynamics using textural patterns in a frame kernel matrix, and analyze fine-grained motion information using frequency analysis. We were successfully able to classify surgeons into different skill levels with high accuracy. Our results indicate that fine-grained analysis of motion dynamics via frequency analysis is most effective in capturing the skill relevant information in surgical videos. Our evaluations show that frequency features perform better than motion texture features, which in-turn perform better than symbol-/word-based features. Put succinctly, skill classification accuracy is positively correlated with motion granularity as demonstrated by our results on two challenging video datasets.
Harper, Kathryn A; Meiklejohn, Kelly A; Merritt, Richard T; Walker, Jessica; Fisher, Constance L; Robertson, James M
2018-02-01
Hairs are commonly submitted as evidence to forensic laboratories, but standard nuclear DNA analysis is not always possible. Mitochondria (mt) provide another source of genetic material; however, manual isolation is laborious. In a proof-of-concept study, we assessed pressure cycling technology (PCT; an automated approach that subjects samples to varying cycles of high and low pressure) for extracting mtDNA from single, short hairs without roots. Using three microscopically similar donors, we determined the ideal PCT conditions and compared those yields to those obtained using the traditional manual micro-tissue grinder method. Higher yields were recovered from grinder extracts, but yields from PCT extracts exceeded the requirements for forensic analysis, with the DNA quality confirmed through sequencing. Automated extraction of mtDNA from hairs without roots using PCT could be useful for forensic laboratories processing numerous samples.
Evaluation of automated cell disruptor methods for oomycetous and ascomycetous model organisms
USDA-ARS?s Scientific Manuscript database
Two automated cell disruptor-based methods for RNA extraction; disruption of thawed cells submerged in TRIzol Reagent (method QP), and direct disruption of frozen cells on dry ice (method CP), were optimized for a model oomycete, Phytophthora capsici, and compared with grinding in a mortar and pestl...
An Automated Approach to Extracting River Bank Locations from Aerial Imagery Using Image Texture
2013-01-01
[Figure caption: Atchafalaya River, LA. Map data: Google, United States Department of Agriculture Farm Service Agency, Europa Technologies.] ...traverse morphologically smooth landscapes including rivers in sand or ice. Within these limitations, we hold that this technique represents a valuable...
10 CFR 1017.28 - Processing on Automated Information Systems (AIS).
Code of Federal Regulations, 2010 CFR
2010-01-01
UNCLASSIFIED CONTROLLED NUCLEAR INFORMATION, Physical Protection Requirements, § 1017.28 Processing on Automated Information Systems (AIS): UCNI may be processed or produced on any AIS that complies with the guidance in OMB...
ERIC Educational Resources Information Center
Kiratsov, P.
1983-01-01
Discusses the design and organization of the Automated Information Centre, a centralized automated scientific and technical information service established within the main organ of Bulgaria's National System for Scientific and Technical Information, with UNESCO and United Nations Development Program assistance. Problems and perspectives for…
Validated Automatic Brain Extraction of Head CT Images
Muschelli, John; Ullman, Natalie L.; Mould, W. Andrew; Vespa, Paul; Hanley, Daniel F.; Crainiceanu, Ciprian M.
2015-01-01
Background X-ray Computed Tomography (CT) imaging of the brain is commonly used in diagnostic settings. Although CT scans are primarily used in clinical practice, they are increasingly used in research. A fundamental processing step in brain imaging research is brain extraction – the process of separating the brain tissue from all other tissues. Methods for brain extraction have either been 1) validated but not fully automated, or 2) fully automated and informally proposed, but never formally validated. Aim To systematically analyze and validate the performance of FSL's brain extraction tool (BET) on head CT images of patients with intracranial hemorrhage. This was done by comparing the manual gold standard with the results of several versions of automatic brain extraction and by estimating the reliability of automated segmentation of longitudinal scans. The effects of the choice of BET parameters and data smoothing are studied and reported. Methods All images were thresholded using a 0 – 100 Hounsfield units (HU) range. In one variant of the pipeline, data were smoothed using a 3-dimensional Gaussian kernel (σ = 1 mm³) and re-thresholded to 0 – 100 HU; in the other, data were not smoothed. BET was applied using 1 of 3 fractional intensity (FI) thresholds: 0.01, 0.1, or 0.35 and any holes in the brain mask were filled. For validation against a manual segmentation, 36 images from patients with intracranial hemorrhage were selected from 19 different centers from the MISTIE (Minimally Invasive Surgery plus recombinant-tissue plasminogen activator for Intracerebral Evacuation) stroke trial. Intracranial masks of the brain were manually created by one expert CT reader. The resulting brain tissue masks were quantitatively compared to the manual segmentations using sensitivity, specificity, accuracy, and the Dice Similarity Index (DSI). Brain extraction performance across smoothing and FI thresholds was compared using the Wilcoxon signed-rank test. The intracranial volume (ICV) of each scan was estimated by multiplying the number of voxels in the brain mask by the dimensions of each voxel for that scan. From this, we calculated the ICV ratio comparing manual and automated segmentation: ICV_automated / ICV_manual. To estimate the performance in a large number of scans, brain masks were generated from the 6 BET pipelines for 1095 longitudinal scans from 129 patients. Failure rates were estimated from visual inspection. ICV of each scan was estimated and an intraclass correlation (ICC) was estimated using a one-way ANOVA. Results Smoothing images improves brain extraction results using BET for all measures except specificity (all p < 0.01, uncorrected), irrespective of the FI threshold. Using an FI of 0.01 or 0.1 performed better than 0.35. Thus, all reported results refer only to smoothed data using an FI of 0.01 or 0.1. Using an FI of 0.01 had a higher median sensitivity (0.9901) than an FI of 0.1 (0.9884, median difference: 0.0014, p < 0.001), accuracy (0.9971 vs. 0.9971; median difference: 0.0001, p < 0.001), and DSI (0.9895 vs. 0.9894; median difference: 0.0004, p < 0.001) and lower specificity (0.9981 vs. 0.9982; median difference: −0.0001, p < 0.001). These measures are all very high, indicating that a range of FI values may produce visually indistinguishable brain extractions. Using smoothed data and an FI of 0.01, the mean (SD) ICV ratio was 1.002 (0.008); the mean being close to 1 indicates the ICV estimates are similar for automated and manual segmentation.
In the 1095 longitudinal scans, this pipeline had a low failure rate (5.2%) and the ICC estimate was high (0.929, 95% CI: 0.91, 0.945) for successfully extracted brains. Conclusion BET performs well at brain extraction on thresholded, 1 mm³-smoothed CT images with an FI of 0.01 or 0.1. Smoothing before applying BET is an important step not previously discussed in the literature. Analysis code is provided. PMID:25862260
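The two headline metrics, the Dice Similarity Index and the ICV ratio, are straightforward to compute from binary masks. The sketch below shows generic NumPy implementations of those definitions; it is not the analysis code released with the paper.

```python
import numpy as np

def dice_similarity(mask_a, mask_b):
    """Dice Similarity Index: DSI = 2|A ∩ B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def icv_ratio(auto_mask, manual_mask, voxel_volume_mm3):
    """ICV ratio = ICV_automated / ICV_manual, with ICV = voxel count x voxel volume."""
    icv_auto = np.count_nonzero(auto_mask) * voxel_volume_mm3
    icv_manual = np.count_nonzero(manual_mask) * voxel_volume_mm3
    return icv_auto / icv_manual
```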
NASA Astrophysics Data System (ADS)
Vasuki, Yathunanthan; Holden, Eun-Jung; Kovesi, Peter; Micklethwaite, Steven
2014-08-01
Recent advances in data acquisition technologies, such as Unmanned Aerial Vehicles (UAVs), have led to a growing interest in capturing high-resolution rock surface images. However, due to the large volumes of data that can be captured in a short flight, efficient analysis of this data brings new challenges, especially the time it takes to digitise maps and extract orientation data. We outline a semi-automated method that allows efficient mapping of geological faults using photogrammetric data of rock surfaces, which was generated from aerial photographs collected by a UAV. Our method harnesses advanced automated image analysis techniques and human data interaction to rapidly map structures and then calculate their dip and dip directions. Geological structures (faults, joints and fractures) are first detected from the primary photographic dataset and the equivalent three dimensional (3D) structures are then identified within a 3D surface model generated by structure from motion (SfM). From this information the location, dip and dip direction of the geological structures are calculated. A structure map generated by our semi-automated method obtained a recall rate of 79.8% when compared against a fault map produced using expert manual digitising and interpretation methods. The semi-automated structure map was produced in 10 min whereas the manual method took approximately 7 h. In addition, the dip and dip direction calculation, using our automated method, shows a mean±standard error of 1.9°±2.2° and 4.4°±2.6° respectively with field measurements. This shows the potential of using our semi-automated method for accurate and efficient mapping of geological structures, particularly from remote, inaccessible or hazardous sites.
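Dip and dip direction follow directly from the normal of a plane fitted to the 3D points of a mapped structure. The sketch below uses a least-squares (SVD) fit with an east-north-up coordinate convention as a generic illustration, not the authors' exact implementation.

```python
import numpy as np

def dip_and_dip_direction(points_xyz):
    """Fit a plane to 3-D points on a structure (x=east, y=north, z=up) and
    return its dip and dip direction in degrees."""
    pts = np.asarray(points_xyz, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The singular vector with the smallest singular value is the plane normal.
    _, _, vt = np.linalg.svd(centered)
    normal = vt[-1]
    if normal[2] < 0:                      # orient the normal upward
        normal = -normal
    dip = np.degrees(np.arccos(np.clip(normal[2], -1.0, 1.0)))
    # Horizontal projection of the upward normal points down-dip.
    dip_direction = np.degrees(np.arctan2(normal[0], normal[1])) % 360.0
    return dip, dip_direction
```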
Fusion of monocular cues to detect man-made structures in aerial imagery
NASA Technical Reports Server (NTRS)
Shufelt, Jefferey; Mckeown, David M.
1991-01-01
The extraction of buildings from aerial imagery is a complex problem for automated computer vision. It requires locating regions in a scene that possess properties distinguishing them as man-made objects as opposed to naturally occurring terrain features. It is reasonable to assume that no single detection method can correctly delineate or verify buildings in every scene. A cooperative-methods paradigm is useful in approaching the building extraction problem. Using this paradigm, each extraction technique provides information which can be added or assimilated into an overall interpretation of the scene. Thus, the main objective is to explore the development of computer vision system that integrates the results of various scene analysis techniques into an accurate and robust interpretation of the underlying three dimensional scene. The problem of building hypothesis fusion in aerial imagery is discussed. Building extraction techniques are briefly surveyed, including four building extraction, verification, and clustering systems. A method for fusing the symbolic data generated by these systems is described, and applied to monocular image and stereo image data sets. Evaluation methods for the fusion results are described, and the fusion results are analyzed using these methods.
NASA Astrophysics Data System (ADS)
Bredfeldt, Jeremy S.; Liu, Yuming; Pehlke, Carolyn A.; Conklin, Matthew W.; Szulczewski, Joseph M.; Inman, David R.; Keely, Patricia J.; Nowak, Robert D.; Mackie, Thomas R.; Eliceiri, Kevin W.
2014-01-01
Second-harmonic generation (SHG) imaging can help reveal interactions between collagen fibers and cancer cells. Quantitative analysis of SHG images of collagen fibers is challenged by the heterogeneity of collagen structures and low signal-to-noise ratio often found while imaging collagen in tissue. The role of collagen in breast cancer progression can be assessed post acquisition via enhanced computation. To facilitate this, we have implemented and evaluated four algorithms for extracting fiber information, such as number, length, and curvature, from a variety of SHG images of collagen in breast tissue. The image-processing algorithms included a Gaussian filter, SPIRAL-TV filter, Tubeness filter, and curvelet-denoising filter. Fibers are then extracted using an automated tracking algorithm called fiber extraction (FIRE). We evaluated the algorithm performance by comparing length, angle and position of the automatically extracted fibers with those of manually extracted fibers in twenty-five SHG images of breast cancer. We found that the curvelet-denoising filter followed by FIRE, a process we call CT-FIRE, outperforms the other algorithms under investigation. CT-FIRE was then successfully applied to track collagen fiber shape changes over time in an in vivo mouse model for breast cancer.
KAM (Knowledge Acquisition Module): A tool to simplify the knowledge acquisition process
NASA Technical Reports Server (NTRS)
Gettig, Gary A.
1988-01-01
Analysts, knowledge engineers and information specialists are faced with increasing volumes of time-sensitive data in text form, either as free text or highly structured text records. Rapid access to the relevant data in these sources is essential. However, due to the volume and organization of the contents, and limitations of human memory and association, frequently: (1) important information is not located in time; (2) reams of irrelevant data are searched; and (3) interesting or critical associations are missed due to physical or temporal gaps involved in working with large files. The Knowledge Acquisition Module (KAM) is a microcomputer-based expert system designed to assist knowledge engineers, analysts, and other specialists in extracting useful knowledge from large volumes of digitized text and text-based files. KAM formulates non-explicit, ambiguous, or vague relations, rules, and facts into a manageable and consistent formal code. A library of system rules or heuristics is maintained to control the extraction of rules, relations, assertions, and other patterns from the text. These heuristics can be added, deleted or customized by the user. The user can further control the extraction process with optional topic specifications. This allows the user to cluster extracts based on specific topics. Because KAM formalizes diverse knowledge, it can be used by a variety of expert systems and automated reasoning applications. KAM can also perform important roles in computer-assisted training and skill development. Current research efforts include the applicability of neural networks to aid in the extraction process and the conversion of these extracts into standard formats.
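KAM's heuristic library is specific to the system; the toy sketch below only illustrates the general idea of a user-extensible set of extraction rules applied to free text, with optional topic filtering. The patterns, relation templates and example sentence are hypothetical, not KAM's actual rules.

```python
import re

# Hypothetical, user-editable library of extraction heuristics (pattern -> relation template).
HEURISTICS = [
    (re.compile(r"(\w+) is a kind of (\w+)", re.I), "isa({0}, {1})"),
    (re.compile(r"(\w+) causes (\w+)", re.I), "causes({0}, {1})"),
]

def extract(text, topic=None):
    """Apply each heuristic to the text and emit formalized assertions, optionally filtered by topic."""
    facts = []
    for pattern, template in HEURISTICS:
        for match in pattern.finditer(text):
            fact = template.format(*match.groups())
            if topic is None or topic.lower() in fact.lower():
                facts.append(fact)
    return facts

print(extract("Corrosion causes failure. A sensor is a kind of transducer."))
```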
Ghani, Milad; Font Picó, Maria Francesca; Salehinia, Shima; Palomino Cabello, Carlos; Maya, Fernando; Berlier, Gloria; Saraji, Mohammad; Cerdà, Víctor; Turnes Palomino, Gemma
2017-03-10
We present for the first time the application of metal-organic framework (MOF) mixed-matrix disks (MMD) for the automated flow-through solid-phase extraction (SPE) of environmental pollutants. Zirconium terephthalate UiO-66 and UiO-66-NH2 MOFs with different sizes (90, 200 and 300 nm) have been incorporated into mechanically stable polyvinylidene difluoride (PVDF) disks. The performance of the MOF-MMDs for automated SPE of seven substituted phenols prior to HPLC analysis has been evaluated using the sequential injection analysis technique. MOF-MMDs enabled the simultaneous extraction of phenols with the concomitant size exclusion of molecules of larger size. The best extraction performance was obtained using a MOF-MMD containing 90 nm UiO-66-NH2 crystals. Using the selected MOF-MMD, detection limits ranging from 0.1 to 0.2 μg L-1 were obtained. Relative standard deviations ranged from 3.9 to 5.3% intra-day, and 4.7-5.7% inter-day. Membrane batch-to-batch reproducibility was from 5.2 to 6.4%. Three different groundwater samples were analyzed with the proposed method using MOF-MMDs, obtaining recoveries ranging from 90 to 98% for all tested analytes. Copyright © 2017 Elsevier B.V. All rights reserved.
Sultana, Nadia; Gunning, Sean; Furst, Stephen J; Garrard, Kenneth P; Dow, Thomas A; Vinueza, Nelson R
2018-05-19
Textile fiber is a common form of transferable trace evidence at the crime scene. Different techniques such as microscopy or spectroscopy are currently being used for trace fiber analysis. Dye characterization in trace fiber adds an important molecular specificity during the analysis. In this study, we performed a direct trace fiber analysis method via dye characterization by a novel automated microfluidics device (MFD) dye extraction system coupled with a quadrupole-time-of-flight (Q-TOF) mass spectrometer (MS). The MFD system used an in-house automated procedure, which requires only 10 μL of organic solvent for the extraction. The total extraction and identification time by the system is under 12 min. A variety of sulfonated azo and anthraquinone dyes were analyzed from ∼1 mm long nylon fiber samples. This methodology successfully characterized multiple dyes (≥3 dyes) from a single fiber thread. Additionally, it was possible to do dye characterization from single fibers with a diameter of ∼10 μm. The MFD-MS system was used for elemental composition and isotopic distribution analysis, while MFD-MS/MS was used for structural characterization of dyes on fibers. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Roeth, O.; Zaum, D.; Brenner, C.
2017-05-01
Highly automated driving (HAD) requires maps not only of high spatial precision but also of unprecedented up-to-dateness. Traditionally, small, highly specialized fleets of measurement vehicles are used to generate such maps. Nevertheless, for achieving city-wide or even nation-wide coverage, automated map update mechanisms based on very large vehicle fleet data gain importance, since highly frequent measurements can only be obtained using such an approach. Furthermore, the processing of imprecise mass data, in contrast to few dedicated highly accurate measurements, calls for a high degree of automation. We present a method for the generation of lane-accurate road network maps from vehicle trajectory data (GPS or better). Our approach therefore allows today's connected vehicle fleets to be exploited for the generation of HAD maps. The presented algorithm is based on elementary building blocks that guarantee useful lane models and uses a Reversible Jump Markov chain Monte Carlo method to explore the model parameters in order to reconstruct the model most likely to have emitted the input data. The approach is applied to a challenging urban real-world scenario with different trajectory accuracy levels and is evaluated against a LIDAR-based ground truth map.
Sun, Wanxin; Chang, Shi; Tai, Dean C S; Tan, Nancy; Xiao, Guangfa; Tang, Huihuan; Yu, Hanry
2008-01-01
Liver fibrosis is associated with an abnormal increase in extracellular matrix in chronic liver diseases. Quantitative characterization of fibrillar collagen in intact tissue is essential for both fibrosis studies and clinical applications. Commonly used methods, histological staining followed by either semiquantitative or computerized image analysis, have limited sensitivity and accuracy and suffer from operator-dependent variation. The fibrillar collagen in sinusoids of normal livers could be observed through second-harmonic generation (SHG) microscopy. The two-photon excited fluorescence (TPEF) images, recorded simultaneously with SHG, clearly revealed the hepatocyte morphology. We have systematically optimized the parameters for the quantitative SHG/TPEF imaging of liver tissue and developed fully automated image analysis algorithms to extract information on collagen changes and cell necrosis. Subtle changes in the distribution and amount of collagen and cell morphology are quantitatively characterized in SHG/TPEF images. Compared with traditional staining, such as Masson's trichrome and Sirius red, SHG/TPEF is a sensitive quantitative tool for automated collagen characterization in liver tissue. Our system allows for enhanced detection and quantification of sinusoidal collagen fibers in fibrosis research and clinical diagnostics.
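As a generic illustration of this kind of automated collagen quantification (not the authors' optimized SHG/TPEF pipeline), the sketch below thresholds an SHG channel with Otsu's method and reports a collagen area fraction; the file name and small-object size cutoff are placeholders.

```python
import numpy as np
from skimage import io, filters, morphology

def collagen_area_fraction(shg_path):
    """Segment fibrillar collagen in an SHG image and return the fraction of the field it covers."""
    shg = io.imread(shg_path).astype(float)
    mask = shg > filters.threshold_otsu(shg)          # global Otsu threshold on the SHG channel
    mask = morphology.remove_small_objects(mask, 16)  # drop isolated noise pixels
    return mask.sum() / mask.size

print(collagen_area_fraction("shg_liver_section.tif"))  # placeholder file name
```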
Vivekanandhan, Sapthagirivasan; Subramaniam, Janarthanam; Mariamichael, Anburajan
2016-10-01
Hip fractures due to osteoporosis are increasing progressively across the globe. It is also difficult for such patients to undergo dual-energy X-ray absorptiometry scans because of the complicated protocol and associated cost. The utilisation of computed tomography for fracture treatment has become common in clinical practice. It would be helpful for orthopaedic clinicians if they could obtain some additional information related to bone strength for better treatment planning. The aim of our study was to develop an automated system to segment the femoral neck region, extract the cortical and trabecular bone parameters, and assess the bone strength using an isotropic volume construction from clinical computed tomography images. The right hip computed tomography and right femur dual-energy X-ray absorptiometry measurements were taken from 50 south-Indian females aged 30-80 years. Each computed tomography image volume was re-constructed to form isotropic volumes. An automated system incorporating active contour models was used to segment the neck region. A minimum distance boundary method was applied to isolate the cortical and trabecular bone components. The trabecular bone was enhanced and segmented using a trabecular enrichment approach. The cortical and trabecular bone features were extracted and statistically compared with dual-energy X-ray absorptiometry measured femur neck bone mineral density. The extracted bone measures demonstrated a significant correlation with neck bone mineral density (r > 0.7, p < 0.001). The inclusion of cortical measures, along with the trabecular measures extracted after isotropic volume construction and trabecular enrichment approach procedures, resulted in better estimation of bone strength. The findings suggest that the proposed system, using low-dose clinical computed tomography images, could eventually be helpful in osteoporosis diagnosis and its treatment planning. © IMechE 2016.
Medical ADP Systems: Automated Medical Records Hold Promise to Improve Patient Care
1991-01-01
automated medical records. The report discusses the potential benefits that automation could make to the quality of patient care and the factors that impede...information systems, but no organization has fully automated one of the most critical types of information, patient medical records. The patient medical record...its review of automated medical records. GAO’s objectives in this study were to identify the (1) benefits of automating patient records and (2) factors
Ma, Jian; Casey, Cameron P.; Zheng, Xueyun; Ibrahim, Yehia M.; Wilkins, Christopher S.; Renslow, Ryan S.; Thomas, Dennis G.; Payne, Samuel H.; Monroe, Matthew E.; Smith, Richard D.; Teeguarden, Justin G.; Baker, Erin S.; Metz, Thomas O.
2017-01-01
Abstract Motivation: Drift tube ion mobility spectrometry coupled with mass spectrometry (DTIMS-MS) is increasingly implemented in high throughput omics workflows, and new informatics approaches are necessary for processing the associated data. To automatically extract arrival times for molecules measured by DTIMS at multiple electric fields and compute their associated collisional cross sections (CCS), we created the PNNL Ion Mobility Cross Section Extractor (PIXiE). The primary application presented for this algorithm is the extraction of data that can then be used to create a reference library of experimental CCS values for use in high throughput omics analyses. Results: We demonstrate the utility of this approach by automatically extracting arrival times and calculating the associated CCSs for a set of endogenous metabolites and xenobiotics. The PIXiE-generated CCS values were within error of those calculated using commercially available instrument vendor software. Availability and implementation: PIXiE is an open-source tool, freely available on Github. The documentation, source code of the software, and a GUI can be found at https://github.com/PNNL-Comp-Mass-Spec/PIXiE and the source code of the backend workflow library used by PIXiE can be found at https://github.com/PNNL-Comp-Mass-Spec/IMS-Informed-Library. Contact: erin.baker@pnnl.gov or thomas.metz@pnnl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28505286
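The stepped-field workflow that PIXiE automates rests on standard drift-tube relations. The sketch below is a generic illustration rather than PIXiE's own code: arrival times measured at several drift voltages are regressed against 1/V to obtain the mobility, which the Mason-Schamp equation converts to a CCS. All instrument parameters are placeholders supplied by the caller.

```python
import numpy as np

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_BOLTZ = 1.380649e-23       # Boltzmann constant, J/K
DA_TO_KG = 1.66053906660e-27 # atomic mass unit, kg

def ccs_from_stepped_field(arrival_times_s, drift_voltages_V, drift_length_m,
                           temp_K, pressure_Pa, ion_mass_Da, gas_mass_Da, charge=1):
    """Estimate a collisional cross section (in square angstroms) from stepped-field drift-tube data."""
    # t_arrival = (L^2 / K) * (1/V) + t0  ->  the slope of t versus 1/V gives the mobility K.
    slope, _t0 = np.polyfit(1.0 / np.asarray(drift_voltages_V), np.asarray(arrival_times_s), 1)
    K = drift_length_m**2 / slope                          # mobility, m^2 V^-1 s^-1
    N = pressure_Pa / (K_BOLTZ * temp_K)                   # buffer-gas number density, m^-3
    mu = ion_mass_Da * gas_mass_Da / (ion_mass_Da + gas_mass_Da) * DA_TO_KG  # reduced mass, kg
    # Mason-Schamp equation.
    omega = (3 * charge * E_CHARGE) / (16 * N) * np.sqrt(2 * np.pi / (mu * K_BOLTZ * temp_K)) / K
    return omega * 1e20                                    # m^2 -> A^2
```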
Electronic Data Interchange in Procurement
1990-04-01
contract management and order processing systems. This conversion of automated information to paper and back to automated form is not only slow and...automated purchasing computer and the contractor’s order processing computer through telephone lines, as illustrated in Figure 1-1. Computer-to-computer...into the contractor’s order processing or contract management system. This approach - converting automated information to paper and back to automated
EEG artifact elimination by extraction of ICA-component features using image processing algorithms.
Radüntz, T; Scouten, J; Hochmuth, O; Meffert, B
2015-03-30
Artifact rejection is a central issue when dealing with electroencephalogram recordings. Although independent component analysis (ICA) separates data into linearly independent components (ICs), the classification of these components as artifact or EEG signal still requires visual inspection by experts. In this paper, we achieve automated artifact elimination using linear discriminant analysis (LDA) for classification of feature vectors extracted from ICA components via image processing algorithms. We compare the performance of this automated classifier to visual classification by experts and identify range filtering as a feature extraction method with great potential for automated IC artifact recognition (accuracy rate 88%). We obtain almost the same level of recognition performance for geometric features and local binary pattern (LBP) features. Compared to existing automated solutions, the proposed method has two main advantages: First, it does not depend on direct recording of artifact signals, which then, e.g., have to be subtracted from the contaminated EEG. Second, it is not limited to a specific number or type of artifact. In summary, the present method is an automatic, reliable, real-time capable and practical tool that reduces the time-intensive manual selection of ICs for artifact removal. The results are very promising despite the relatively small channel resolution of 25 electrodes. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
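A minimal sketch of the classification stage described above, assuming feature vectors have already been computed from the IC scalp maps; the local-range filter and histogram summarization shown here are generic stand-ins, not the authors' exact feature extraction.

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def range_filter_features(topo_image, size=3, n_bins=16):
    """Local range filter (max - min in a neighbourhood) of an IC scalp map, summarized as a histogram."""
    rng = maximum_filter(topo_image, size) - minimum_filter(topo_image, size)
    hist, _ = np.histogram(rng, bins=n_bins, density=True)
    return hist

def train_and_classify(X_train, y_train, X_test):
    """X_*: stacked feature vectors; y_train: expert labels (1 = artifact IC, 0 = brain IC)."""
    lda = LinearDiscriminantAnalysis()
    lda.fit(X_train, y_train)
    return lda.predict(X_test)   # components predicted as 1 are removed before back-projection
```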
Mao, Jin; Moore, Lisa R; Blank, Carrine E; Wu, Elvis Hsin-Hui; Ackerman, Marcia; Ranade, Sonali; Cui, Hong
2016-12-13
The large-scale analysis of phenomic data (i.e., full phenotypic traits of an organism, such as shape, metabolic substrates, and growth conditions) in microbial bioinformatics has been hampered by the lack of tools to rapidly and accurately extract phenotypic data from existing legacy text in the field of microbiology. To quickly obtain knowledge on the distribution and evolution of microbial traits, an information extraction system needed to be developed to extract phenotypic characters from large numbers of taxonomic descriptions so they can be used as input to existing phylogenetic analysis software packages. We report the development and evaluation of Microbial Phenomics Information Extractor (MicroPIE, version 0.1.0). MicroPIE is a natural language processing application that uses a robust supervised classification algorithm (Support Vector Machine) to identify characters from sentences in prokaryotic taxonomic descriptions, followed by a combination of algorithms applying linguistic rules with groups of known terms to extract characters as well as character states. The input to MicroPIE is a set of taxonomic descriptions (clean text). The output is a taxon-by-character matrix, with taxa in the rows and a set of 42 pre-defined characters (e.g., optimum growth temperature) in the columns. The performance of MicroPIE was evaluated against a gold standard matrix and another student-made matrix. Results show that, compared to the gold standard, MicroPIE extracted 21 characters (50%) with a Relaxed F1 score > 0.80 and 16 characters (38%) with Relaxed F1 scores ranging between 0.50 and 0.80. Inclusion of a character prediction component (SVM) improved the overall performance of MicroPIE, notably the precision. Evaluated against the same gold standard, MicroPIE performed significantly better than the undergraduate students. MicroPIE is a promising new tool for the rapid and efficient extraction of phenotypic character information from prokaryotic taxonomic descriptions. However, further development, including incorporation of ontologies, will be necessary to improve the performance of the extraction for some character types.
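The character-prediction component can be illustrated with a generic supervised sentence classifier. The sketch below (TF-IDF features plus a linear SVM over a hypothetical labelled sentence set) conveys the idea but is not MicroPIE's pipeline, which additionally applies linguistic rules and term lists.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Hypothetical training data: sentences from taxonomic descriptions labelled with a character class.
sentences = [
    "Optimal growth occurs at 37 degrees C.",
    "Cells are rod-shaped, 0.5 by 2.0 micrometers.",
    "Growth is observed at pH 6.0-8.0.",
]
labels = ["optimum growth temperature", "cell shape", "pH range"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(sentences, labels)
print(clf.predict(["Grows best at a temperature of 30 degrees C."]))
```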
Information Retrieval (SPIRES) and Library Automation (BALLOTS) at Stanford University.
ERIC Educational Resources Information Center
Ferguson, Douglas, Ed.
At Stanford University, two major projects have been involved jointly in library automation and information retrieval since 1968: BALLOTS (Bibliographic Automation of Large Library Operations) and SPIRES (Stanford Physics Information Retrieval System). In early 1969, two prototype applications were activated using the jointly developed systems…
NASA Astrophysics Data System (ADS)
Zhou, Y.; Zhao, H.; Hao, H.; Wang, C.
2018-05-01
Accurate remote sensing water extraction is one of the primary tasks in watershed ecological environment studies. The Yanhe water system is characterized by small water volumes and narrow river channels, which makes conventional water extraction methods such as the Normalized Difference Water Index (NDWI) difficult to apply. A new Multi-Spectral Threshold segmentation of the NDWI (MST-NDWI) water extraction method is proposed to achieve accurate water extraction in the Yanhe watershed. In the MST-NDWI method, the spectral characteristics of water bodies and typical backgrounds in the Landsat/TM images of the Yanhe watershed are first evaluated. Multi-spectral thresholds (TM1, TM4, TM5) based on maximum likelihood are applied before NDWI water extraction to segment out built-up lands and small linear rivers. With the proposed method, a water map is extracted from the 2010 Landsat/TM images of the watershed in China. An accuracy assessment is conducted to compare the proposed method with conventional water indexes such as NDWI, Modified NDWI (MNDWI), the Enhanced Water Index (EWI), and the Automated Water Extraction Index (AWEI). The results show that the MST-NDWI method achieves better water extraction accuracy in the Yanhe watershed and effectively suppresses confusing background objects compared to the conventional water indexes. By integrating NDWI with multi-spectral threshold segmentation, the MST-NDWI method yields richer information and markedly better accuracy for water extraction in the Yanhe watershed.
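A simplified sketch of the combined threshold-plus-NDWI idea is given below; band roles follow Landsat/TM conventions (TM2 = green, TM4 = near infrared), and the threshold values are illustrative placeholders rather than the maximum-likelihood thresholds derived in the study.

```python
import numpy as np

def mst_ndwi_water_mask(tm1, tm2, tm4, tm5, ndwi_thresh=0.0,
                        tm1_max=0.15, tm4_max=0.12, tm5_max=0.08):
    """Toy multi-spectral-threshold + NDWI water mask (reflectance inputs as 2-D arrays)."""
    # Pre-segmentation: suppress built-up land and mixed pixels with per-band reflectance caps
    # (placeholder values; the paper derives its thresholds by maximum likelihood).
    candidate = (tm1 < tm1_max) & (tm4 < tm4_max) & (tm5 < tm5_max)
    # NDWI = (green - NIR) / (green + NIR); positive values indicate open water.
    ndwi = (tm2 - tm4) / (tm2 + tm4 + 1e-9)
    return candidate & (ndwi > ndwi_thresh)
```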
Text Mining in Biomedical Domain with Emphasis on Document Clustering.
Renganathan, Vinaitheerthan
2017-07-01
With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.
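A compact illustration of the document-clustering step such reviews describe: TF-IDF features plus k-means over a handful of toy abstracts. The example texts and cluster count are arbitrary.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

abstracts = [
    "Gene expression profiling of breast cancer tumours.",
    "Deep sequencing reveals novel transcripts in breast tissue.",
    "A randomized trial of statin therapy for cardiovascular risk.",
    "Blood pressure lowering drugs and stroke prevention.",
]

X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)   # cluster assignment for each abstract
```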
Campone, Luca; Piccinelli, Anna Lisa; Celano, Rita; Russo, Mariateresa; Valdés, Alberto; Ibáñez, Clara; Rastrelli, Luca
2015-04-01
According to current demands and future perspectives in food safety, this study reports a fast and fully automated analytical method for the simultaneous analysis of the highly toxic and widespread mycotoxins aflatoxins (AFs) and ochratoxin A (OTA) in dried fruits, a high-risk foodstuff. The method is based on pressurized liquid extraction (PLE), with aqueous methanol (30%) at 110 °C, of the slurried dried fruit and online solid-phase extraction (online SPE) cleanup of the PLE extracts with a C18 cartridge. The purified sample was directly analysed by ultra-high-pressure liquid chromatography-tandem mass spectrometry (UHPLC-MS/MS) for sensitive and selective determination of AFs and OTA. The proposed analytical procedure was validated for different dried fruits (vine fruit, fig and apricot), providing method detection and quantification limits much lower than the AFs and OTA maximum levels imposed by EU regulation in dried fruit for direct human consumption. Also, recoveries (83-103%) and repeatability (RSD < 8, n = 3) meet the performance criteria required by EU regulation for the determination of the levels of mycotoxins in foodstuffs. The main advantage of the proposed method is full automation of the whole analytical procedure that reduces the time and cost of the analysis, sample manipulation and solvent consumption, enabling high-throughput analysis and highly accurate and precise results.
Object extraction in photogrammetric computer vision
NASA Astrophysics Data System (ADS)
Mayer, Helmut
This paper discusses the state and promising directions of automated object extraction in photogrammetric computer vision, also considering practical aspects arising for digital photogrammetric workstations (DPW). A review of the state of the art shows that there are only a few practically successful systems on the market. Therefore, important issues for the practical success of automated object extraction are identified. A sound and, most importantly, powerful theoretical background is the basis. Here, we particularly point to statistical modeling. Testing makes clear which of the approaches are suited best and how useful they are in practice. A key to the commercial success of a practical system is efficient user interaction. As the means for data acquisition are changing, new promising application areas such as extremely detailed three-dimensional (3D) urban models for virtual television or mission rehearsal evolve.
Automated extraction of subdural electrode grid from post-implant MRI scans for epilepsy surgery
NASA Astrophysics Data System (ADS)
Pozdin, Maksym A.; Skrinjar, Oskar
2005-04-01
This paper presents an automated algorithm for extraction of a Subdural Electrode Grid (SEG) from post-implant MRI scans for epilepsy surgery. Post-implant MRI scans are corrupted by the image artifacts caused by implanted electrodes. The artifacts appear as dark spherical voids, and given that the cerebrospinal fluid is also dark in T1-weighted MRI scans, it is a difficult and time-consuming task to manually locate the SEG position relative to brain structures of interest. The proposed algorithm reliably and accurately extracts the SEG from a post-implant MRI scan, i.e. finds its shape and position relative to brain structures of interest. The algorithm was validated against manually determined electrode locations, and the average error was 1.6 mm for the three tested subjects.
Wavelet analysis for wind fields estimation.
Leite, Gladeston C; Ushizima, Daniela M; Medeiros, Fátima N S; de Lima, Gilson G
2010-01-01
Wind field analysis from synthetic aperture radar images allows the estimation of wind direction and speed based on image descriptors. In this paper, we propose a framework to automate wind direction retrieval based on wavelet decomposition associated with spectral processing. We extend existing undecimated wavelet transform approaches by including the à trous transform with a B3 spline scaling function, in addition to other wavelet bases such as Gabor and Mexican hat. The purpose is to extract more reliable directional information when wind speed values range from 5 to 10 m s-1. Using C-band empirical models, associated with the estimated directional information, we calculate local wind speed values and compare our results with QuikSCAT scatterometer data. The proposed approach has potential application in the evaluation of oil spills and wind farms.
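The undecimated ("à trous") decomposition with the B3 spline kernel can be sketched compactly; the code below is a generic 2-D transform and omits the spectral processing and C-band model steps of the full directional-retrieval chain.

```python
import numpy as np
from scipy import ndimage

def atrous_b3(image, n_scales=4):
    """Undecimated (a trous) wavelet decomposition of a 2-D field with the B3-spline kernel."""
    base = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
    c = image.astype(float)
    planes = []
    for j in range(n_scales):
        # Dilate the kernel by inserting 2**j - 1 zeros between taps (the "holes").
        kernel = np.zeros((len(base) - 1) * 2**j + 1)
        kernel[:: 2**j] = base
        smooth = ndimage.convolve1d(c, kernel, axis=0, mode="reflect")
        smooth = ndimage.convolve1d(smooth, kernel, axis=1, mode="reflect")
        planes.append(c - smooth)   # wavelet (detail) plane at scale j
        c = smooth
    return planes, c                # detail planes plus the final smooth residual
```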
Theory research of seam recognition and welding torch pose control based on machine vision
NASA Astrophysics Data System (ADS)
Long, Qiang; Zhai, Peng; Liu, Miao; He, Kai; Wang, Chunyang
2017-03-01
As the automation requirements of welding increase, this paper proposes a method for extracting welding seam information with a vision sensor and verifies it by simulation in MATLAB. In addition, to improve the quality of robotic automatic welding, a method for obtaining welding-torch pose-control information with a visual sensor is developed. Considering the demands of welding technology and engineering practice, the relevant coordinate systems and variables are strictly defined, a mathematical model of the welding torch pose is established, and its feasibility is verified through MATLAB simulation. This work lays a foundation for the development of a high-precision, high-quality off-line programming system for welding.
National Aeronautics and Space Administration's (NASA) Automated Information Security Handbook
NASA Technical Reports Server (NTRS)
Roback, E.
1991-01-01
The NASA Automated Information Security Handbook provides NASA's overall approach to automated information systems security including discussions of such aspects as: program goals and objectives, assignment of responsibilities, risk assessment, foreign national access, contingency planning and disaster recovery, awareness training, procurement, certification, planning, and special considerations for microcomputers.
NASA Technical Reports Server (NTRS)
Maille, Nicolas P.; Statler, Irving C.; Ferryman, Thomas A.; Rosenthal, Loren; Shafto, Michael G.
2006-01-01
The objective of the Aviation System Monitoring and Modeling (ASMM) project of NASA's Aviation Safety and Security Program was to develop technologies that will enable proactive management of safety risk, which entails identifying the precursor events and conditions that foreshadow most accidents. This presents a particular challenge in the aviation system where people are key components and human error is frequently cited as a major contributing factor or cause of incidents and accidents. In the aviation "world", information about what happened can be extracted from quantitative data sources, but the experiential account of the incident reporter is the best available source of information about why an incident happened. This report describes a conceptual model and an approach to automated analyses of textual data sources for the subjective perspective of the reporter of the incident to aid in understanding why an incident occurred. It explores a first-generation process for routinely searching large databases of textual reports of aviation incidents or accidents, and reliably analyzing them for causal factors of human behavior (the why of an incident). We have defined a generic structure of information that is postulated to be a sound basis for defining similarities between aviation incidents. Based on this structure, we have introduced the simplifying structure, which we call the Scenario, as a pragmatic guide for identifying similarities of what happened based on the objective parameters that define the Context and the Outcome of a Scenario. We believe that it will be possible to design an automated analysis process guided by the structure of the Scenario that will help aviation-safety experts understand the systemic issues that are conducive to human error.
Greenspoon, S A; Sykes, K L V; Ban, J D; Pollard, A; Baisden, M; Farr, M; Graham, N; Collins, B L; Green, M M; Christenson, C C
2006-12-20
Human genome, pharmaceutical and research laboratories have long enjoyed the application of robotics to performing repetitive laboratory tasks. However, the utilization of robotics in forensic laboratories for processing casework samples is relatively new and poses particular challenges. Since the quantity and quality (a mixture versus a single source sample, the level of degradation, the presence of PCR inhibitors) of the DNA contained within a casework sample is unknown, particular attention must be paid to procedural susceptibility to contamination, as well as DNA yield, especially as it pertains to samples with little biological material. The Virginia Department of Forensic Science (VDFS) has successfully automated forensic casework DNA extraction utilizing the DNA IQ™ System in conjunction with the Biomek 2000 Automation Workstation. Human DNA quantitation is also performed in a near complete automated fashion utilizing the AluQuant Human DNA Quantitation System and the Biomek 2000 Automation Workstation. Recently, the PCR setup for casework samples has been automated, employing the Biomek 2000 Automation Workstation and Normalization Wizard, Genetic Identity version, which utilizes the quantitation data, imported into the software, to create a customized automated method for DNA dilution, unique to that plate of DNA samples. The PCR Setup software method, used in conjunction with the Normalization Wizard method and written for the Biomek 2000, functions to mix the diluted DNA samples, transfer the PCR master mix, and transfer the diluted DNA samples to PCR amplification tubes. Once the process is complete, the DNA extracts, still on the deck of the robot in PCR amplification strip tubes, are transferred to pre-labeled 1.5 mL tubes for long-term storage using an automated method. The automation of these steps in the process of forensic DNA casework analysis has been accomplished by performing extensive optimization, validation and testing of the software methods.
New techniques for positron emission tomography in the study of human neurological disorders
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuhl, D.E.
1992-07-01
The general goals of the physics and kinetic modeling projects are to: (1) improve the quantitative information extractable from PET images, and (2) develop, implement and optimize tracer kinetic models for new PET neurotransmitter/receptor ligands aided by computer simulations. Work towards improving PET quantification has included projects evaluating: (1) iterative reconstruction algorithms using supplemental boundary information, (2) automated registration of dynamic PET emission and transmission data using sinogram edge detection, and (3) automated registration of multiple subjects to a common coordinate system, including the use of non-linear warping methods. Simulation routines have been developed providing more accurate representation of data generated from neurotransmitter/receptor studies. Routines consider data generated from complex compartmental models, high or low specific activity administrations, non-specific binding, pre- or post-injection of cold or competing ligands, temporal resolution of the data, and radiolabeled metabolites. Computer simulations and human PET studies have been performed to optimize kinetic models for four new neurotransmitter/receptor ligands, [11C]TRB (muscarinic), [11C]flumazenil (benzodiazepine), [18F]GBR12909 (dopamine), and [11C]NMPB (muscarinic).
NASA Technical Reports Server (NTRS)
Ferryman, Thomas A.; Posse, Christian; Rosenthal, Loren J.; Srivastava, Ashok N.; Statler, Irving C.
2006-01-01
The objective of the Aviation System Monitoring and Modeling project of NASA's Aviation Safety and Security Program was to develop technologies to enable proactive management of safety risk, which entails identifying the precursor events and conditions that foreshadow most accidents. Information about what happened can be extracted from quantitative data sources, but the experiential account of the incident reporter is the best available source of information about why an incident happened. In Volume I, the concept of the Scenario was introduced as a pragmatic guide for identifying similarities of what happened based on the objective parameters that define the Context and the Outcome of a Scenario. In this Volume II, that study continues into the analyses of the free narratives to gain understanding as to why the incident occurred from the reporter's perspective. While this is just the first experiment, the results of our approach are encouraging and indicate that it will be possible to design an automated analysis process guided by the structure of the Scenario that can achieve the level of consistency and reliability of human analysis of narrative reports.
Remote imagery for unmanned ground vehicles: the future of path planning for ground robotics
NASA Astrophysics Data System (ADS)
Frederick, Philip A.; Theisen, Bernard L.; Ward, Derek
2006-10-01
Remote Imagery for Unmanned Ground Vehicles (RIUGV) uses a combination of high-resolution multi-spectral satellite imagery and advanced commercial off-the-shelf (COTS) object-oriented image processing software to provide automated terrain feature extraction and classification. This information, along with elevation data, infrared imagery, a vehicle mobility model and various meta-data (local weather reports, Zobler Soil map, etc.), is fed into automated path planning software to provide a stand-alone ability to generate rapidly updateable dynamic mobility maps for Manned or Unmanned Ground Vehicles (MGVs or UGVs). These polygon-based mobility maps can reside on an individual platform or a tactical network. When new information is available, change files are generated and ingested into existing mobility maps based on user-selected criteria. Bandwidth concerns are mitigated by the use of shape files for the representation of the data (e.g. each object in the scene is represented by a shape file and thus can be transmitted individually). User input (desired level of stealth, required time of arrival, etc.) determines the priority in which objects are tagged for updates. This paper will also discuss the planned July 2006 field experiment.
Fish, Kenneth N; Sweet, Robert A; Deo, Anthony J; Lewis, David A
2008-11-13
A number of human brain diseases have been associated with disturbances in the structure and function of cortical synapses. Answering fundamental questions about the synaptic machinery in these disease states requires the ability to image and quantify small synaptic structures in tissue sections and to evaluate protein levels at these major sites of function. We developed a new automated segmentation imaging method specifically to answer such fundamental questions. The method takes advantage of advances in spinning disk confocal microscopy, and combines information from multiple iterations of a fluorescence intensity/morphological segmentation protocol to construct three-dimensional object masks of immunoreactive (IR) puncta. This new methodology is unique in that high- and low-fluorescing IR puncta are equally masked, allowing for quantification of the number of fluorescently-labeled puncta in tissue sections. In addition, the shape of the final object masks highly represents their corresponding original data. Thus, the object masks can be used to extract information about the IR puncta (e.g., average fluorescence intensity of proteins of interest). Importantly, the segmentation method presented can be easily adapted for use with most existing microscopy analysis packages.
Sweeney, Elizabeth M.; Vogelstein, Joshua T.; Cuzzocreo, Jennifer L.; Calabresi, Peter A.; Reich, Daniel S.; Crainiceanu, Ciprian M.; Shinohara, Russell T.
2014-01-01
Machine learning is a popular method for mining and analyzing large collections of medical data. We focus on a particular problem from medical research, supervised multiple sclerosis (MS) lesion segmentation in structural magnetic resonance imaging (MRI). We examine the extent to which the choice of machine learning or classification algorithm and feature extraction function impacts the performance of lesion segmentation methods. As quantitative measures derived from structural MRI are important clinical tools for research into the pathophysiology and natural history of MS, the development of automated lesion segmentation methods is an active research field. Yet, little is known about what drives performance of these methods. We evaluate the performance of automated MS lesion segmentation methods, which consist of a supervised classification algorithm composed with a feature extraction function. These feature extraction functions act on the observed T1-weighted (T1-w), T2-weighted (T2-w) and fluid-attenuated inversion recovery (FLAIR) MRI voxel intensities. Each MRI study has a manual lesion segmentation that we use to train and validate the supervised classification algorithms. Our main finding is that the differences in predictive performance are due more to differences in the feature vectors, rather than the machine learning or classification algorithms. Features that incorporate information from neighboring voxels in the brain were found to increase performance substantially. For lesion segmentation, we conclude that it is better to use simple, interpretable, and fast algorithms, such as logistic regression, linear discriminant analysis, and quadratic discriminant analysis, and to develop the features to improve performance. PMID:24781953
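The paper's central finding, that feature design (especially neighbourhood information) matters more than the classifier, can be illustrated with a generic voxel-wise sketch. The smoothing scale, array layout and choice of logistic regression here are assumptions, not the authors' exact feature set.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.linear_model import LogisticRegression

def voxel_features(t1, t2, flair, brain_mask, sigma=2.0):
    """Stack raw and locally smoothed intensities so each voxel carries neighbourhood information."""
    channels = []
    for vol in (t1, t2, flair):
        channels.append(vol[brain_mask])
        channels.append(gaussian_filter(vol, sigma)[brain_mask])   # neighbourhood context
    return np.column_stack(channels)

def train_lesion_classifier(t1, t2, flair, brain_mask, manual_lesion_mask):
    """Train a simple, interpretable voxel classifier from a manual lesion segmentation."""
    X = voxel_features(t1, t2, flair, brain_mask)
    y = manual_lesion_mask[brain_mask].astype(int)
    return LogisticRegression(max_iter=1000).fit(X, y)
```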
Kema, I P; Meijer, W G; Meiborg, G; Ooms, B; Willemse, P H; de Vries, E G
2001-10-01
Profiling of the plasma indoles tryptophan, 5-hydroxytryptophan (5-HTP), serotonin, and 5-hydroxyindoleacetic acid (5-HIAA) is useful in the diagnosis and follow-up of patients with carcinoid tumors. We describe an automated method for the profiling of these indoles in protein-containing matrices as well as the plasma indole concentrations in healthy controls and patients with carcinoid tumors. Plasma, cerebrospinal fluid, and tissue homogenates were prepurified by automated on-line solid-phase extraction (SPE) in Hysphere Resin SH SPE cartridges containing strong hydrophobic polystyrene resin. Analytes were eluted from the SPE cartridge by column switching. Subsequent separation and detection were performed by reversed-phase HPLC combined with fluorometric detection in a total cycle time of 20 min. We obtained samples from 14 healthy controls and 17 patients with metastasized midgut carcinoid tumors for plasma indole analysis. In the patient group, urinary excretion of 5-HIAA and serotonin was compared with concentrations of plasma indoles. Within- and between-series CVs for indoles in platelet-rich plasma were 0.6-6.2% and 3.7-12%, respectively. Results for platelet-rich plasma serotonin compared favorably with those obtained by single-component analysis. Plasma 5-HIAA, but not 5-HTP was detectable in 8 of 17 patients with carcinoid tumors. In the patient group, platelet-rich plasma total tryptophan correlated negatively with platelet-rich plasma serotonin (P = 0.021; r = -0.56), urinary 5-HIAA (P = 0.003; r = -0.68), and urinary serotonin (P <0.0001; r = -0.80). The present chromatographic approach reduces analytical variation and time needed for analysis and gives more detailed information about metabolic deviations in indole metabolism than do manual, single-component analyses.
Griffanti, Ludovica; Zamboni, Giovanna; Khan, Aamira; Li, Linxin; Bonifacio, Guendalina; Sundaresan, Vaanathi; Schulz, Ursula G; Kuker, Wilhelm; Battaglini, Marco; Rothwell, Peter M; Jenkinson, Mark
2016-11-01
Reliable quantification of white matter hyperintensities of presumed vascular origin (WMHs) is increasingly needed, given the presence of these MRI findings in patients with several neurological and vascular disorders, as well as in elderly healthy subjects. We present BIANCA (Brain Intensity AbNormality Classification Algorithm), a fully automated, supervised method for WMH detection, based on the k-nearest neighbour (k-NN) algorithm. Relative to previous k-NN based segmentation methods, BIANCA offers different options for weighting the spatial information, local spatial intensity averaging, and different options for the choice of the number and location of the training points. BIANCA is multimodal and highly flexible so that the user can adapt the tool to their protocol and specific needs. We optimised and validated BIANCA on two datasets with different MRI protocols and patient populations (a "predominantly neurodegenerative" and a "predominantly vascular" cohort). BIANCA was first optimised on a subset of images for each dataset in terms of overlap and volumetric agreement with a manually segmented WMH mask. The correlation between the volumes extracted with BIANCA (using the optimised set of options), the volumes extracted from the manual masks and visual ratings showed that BIANCA is a valid alternative to manual segmentation. The optimised set of options was then applied to the whole cohorts and the resulting WMH volume estimates showed good correlations with visual ratings and with age. Finally, we performed a reproducibility test, to evaluate the robustness of BIANCA, and compared BIANCA performance against existing methods. Our findings suggest that BIANCA, which will be freely available as part of the FSL package, is a reliable method for automated WMH segmentation in large cross-sectional cohort studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
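A heavily simplified sketch in the spirit of BIANCA (which ships with FSL and offers many more options): intensity features are augmented with scaled voxel coordinates so that the k-NN search can weight spatial information. The weighting factor, k and array layout are illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_wmh_probability(train_intensities, train_coords, train_labels,
                        test_intensities, test_coords, spatial_weight=0.5, k=40):
    """k-NN voxel classification with down-weighted spatial features (labels: 1 = WMH, 0 = non-WMH)."""
    X_train = np.column_stack([train_intensities, spatial_weight * train_coords])
    X_test = np.column_stack([test_intensities, spatial_weight * test_coords])
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, train_labels)
    return knn.predict_proba(X_test)[:, 1]   # per-voxel WMH probability, to be thresholded afterwards
```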
Sang-Mook Lee; Neil A. Clark; Philip A. Araman
2003-01-01
Foliage transparency in trees is an important indicator for forest health assessment. This paper helps advance transparency measurement research by presenting methods of automatic tree boundary extraction and foliage transparency estimation from digital images taken from the ground of open-grown trees. Extraction of proper boundaries of tree crowns is the...
Liljeqvist, Gösta T H; Staff, Michael; Puech, Michele; Blom, Hans; Torvaldsen, Siranda
2011-06-06
Influenza intelligence in New South Wales (NSW), Australia is derived mainly from emergency department (ED) presentations and hospital and intensive care admissions, which represent only a portion of influenza-like illness (ILI) in the population. A substantial amount of the remaining data lies hidden in general practice (GP) records. Previous attempts in Australia to gather ILI data from GPs have given them extra work. We explored the possibility of applying automated data extraction from GP records in sentinel surveillance in an Australian setting. The two research questions asked in designing the study were: Can syndromic ILI data be extracted automatically from routine GP data? How do ILI trends in sentinel general practice compare with ILI trends in EDs? We adapted a software program already capable of automated data extraction to identify records of patients with ILI in routine electronic GP records in two of the most commonly used commercial programs. This tool was applied in sentinel sites to gather retrospective data for May-October 2007-2009 and in real-time for the same interval in 2010. The data were compared with that provided by the Public Health Real-time Emergency Department Surveillance System (PHREDSS) and with ED data for the same periods. The GP surveillance tool identified seasonal trends in ILI both retrospectively and in near real-time. The curve of seasonal ILI was more responsive and less volatile than that of PHREDSS on a local area level. The number of weekly ILI presentations ranged from 8 to 128 at GP sites and from 0 to 18 in EDs in non-pandemic years. Automated data extraction from routine GP records offers a means to gather data without introducing any additional work for the practitioner. Adding this method to current surveillance programs will enhance their ability to monitor ILI and to detect early warning signals of new ILI events.
Lehotay, Steven J; Han, Lijun; Sapozhnikova, Yelena
2016-01-01
This study demonstrated the application of an automated high-throughput mini-cartridge solid-phase extraction (mini-SPE) cleanup for the rapid low-pressure gas chromatography-tandem mass spectrometry (LPGC-MS/MS) analysis of pesticides and environmental contaminants in QuEChERS extracts of foods. Cleanup efficiencies and breakthrough volumes using different mini-SPE sorbents were compared using avocado, salmon, pork loin, and kale as representative matrices. Optimum extract load volume was 300 µL for the 45 mg mini-cartridges containing 20/12/12/1 (w/w/w/w) anh. MgSO4/PSA (primary secondary amine)/C18/CarbonX sorbents used in the final method. In method validation to demonstrate high-throughput capabilities and performance results, 230 spiked extracts of 10 different foods (apple, kiwi, carrot, kale, orange, black olive, wheat grain, dried basil, pork, and salmon) underwent automated mini-SPE cleanup and analysis over the course of 5 days. In all, 325 analyses for 54 pesticides and 43 environmental contaminants (3 analyzed together) were conducted using the 10 min LPGC-MS/MS method without changing the liner or retuning the instrument. Merely 1 mg equivalent sample injected achieved <5 ng g-1 limits of quantification. With the use of internal standards, method validation results showed that 91 of the 94 analytes, including pairs, achieved satisfactory results (70-120% recovery and RSD ≤ 25%) in the 10 tested food matrices (n = 160). Matrix effects were typically less than ±20%, mainly due to the use of analyte protectants, and minimal human review of software data processing was needed due to summation function integration of analyte peaks. This study demonstrated that the automated mini-SPE + LPGC-MS/MS method yielded accurate results in rugged, high-throughput operations with minimal labor and data review.
Versatile electrophoresis-based self-test platform.
Guijt, Rosanne M
2015-03-01
Lab on a Chip technology offers the possibility to extract chemical information from a complex sample in a simple, automated way without the need for a laboratory setting. In the health care sector, this chemical information could be used as a diagnostic tool, for example to inform dosing. In this issue, the research underpinning a family of electrophoresis-based point-of-care devices for self-testing of ionic analytes in various sample matrices is described [Electrophoresis 2015, 36, 712-721.]. Hardware, software, and methodological changes made to improve the overall analytical performance in terms of accuracy, precision, detection limit, and reliability are discussed. In addition to the main focus of lithium monitoring, new applications are included, such as the use of the platform for veterinary purposes and for sodium and creatinine measurements. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Dawes, Martin; Pluye, Pierre; Shea, Laura; Grad, Roland; Greenberg, Arlene; Nie, Jian-Yun
2007-01-01
Information retrieval in primary care is becoming more difficult as the volume of medical information held in electronic databases expands. The lexical structure of this information might permit automatic indexing and improved retrieval. To determine the possibility of identifying the key elements of clinical studies, namely Patient-Population-Problem, Exposure-Intervention, Comparison, Outcome, Duration and Results (PECODR), from abstracts of medical journals. We used a convenience sample of 20 synopses from the journal Evidence-Based Medicine (EBM) and their matching original journal article abstracts obtained from PubMed. Three independent primary care professionals identified PECODR-related extracts of text. Rules were developed to define each PECODR element and the selection process of characters, words, phrases and sentences. From the extracts of text related to PECODR elements, potential lexical patterns that might help identify those elements were proposed and assessed using NVivo software. A total of 835 PECODR-related text extracts containing 41,263 individual text characters were identified from 20 EBM journal synopses. There were 759 extracts in the corresponding PubMed abstracts containing 31,947 characters. PECODR elements were found in nearly all abstracts and synopses with the exception of duration. There was agreement on 86.6% of the extracts from the 20 EBM synopses and 85.0% on the corresponding PubMed abstracts. After consensus this rose to 98.4% and 96.9% respectively. We found potential text patterns in the Comparison, Outcome and Results elements of both EBM synopses and PubMed abstracts. Some phrases and words are used frequently and are specific for these elements in both synopses and abstracts. Results suggest a PECODR-related structure exists in medical abstracts and that there might be lexical patterns specific to these elements. More sophisticated computer-assisted lexical-semantic analysis might refine these results, and pave the way to automating PECODR indexing, and improve information retrieval in primary care.
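The kind of lexical pattern reported for the Comparison, Outcome and Results elements can be sketched with simple regular expressions; the patterns below are illustrative guesses, not the patterns identified in the study.

```python
import re

# Illustrative lexical patterns for three PECODR elements (hypothetical, not the study's own rules).
PATTERNS = {
    "Comparison": re.compile(r"\b(versus|vs\.?|compared with|compared to|placebo)\b", re.I),
    "Outcome": re.compile(r"\b(mortality|relapse|remission|risk of \w+|incidence of \w+)\b", re.I),
    "Results": re.compile(r"(\d+(\.\d+)?%|p\s*[<=>]\s*0?\.\d+|95% CI|relative risk|odds ratio)", re.I),
}

def tag_sentence(sentence):
    """Return the PECODR elements whose patterns fire on a sentence."""
    return [element for element, pattern in PATTERNS.items() if pattern.search(sentence)]

print(tag_sentence("Mortality was 12% with drug A versus 18% with placebo (p = 0.03)."))
```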
Document Form and Character Recognition using SVM
NASA Astrophysics Data System (ADS)
Park, Sang-Sung; Shin, Young-Geun; Jung, Won-Kyo; Ahn, Dong-Kyu; Jang, Dong-Sik
2009-08-01
With the development of computers and information communication, Electronic Data Interchange (EDI) has been growing. Optical Character Recognition (OCR), a pattern recognition technology, supports EDI and has converted much formerly manual work into automated processing. However, building a more complete document database still requires considerable manual work to exclude erroneous recognition results. To address this problem, we propose a document-form-based character recognition method. The proposed method is divided into a document form recognition part and a character recognition part. In particular, for character recognition, character images are binarized and more accurate feature values are extracted for classification using an SVM algorithm.
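A minimal character-classification sketch in the same spirit (binarized glyph images classified by an SVM), using scikit-learn's bundled digits data as a stand-in for scanned form fields; this is an illustration, not the paper's system.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()
X = (digits.data > 7).astype(float)            # crude binarization of the 0-16 grey levels
X_train, X_test, y_train, y_test = train_test_split(X, digits.target, random_state=0)

svm = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
print("held-out accuracy:", svm.score(X_test, y_test))
```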
Text Mining for Protein Docking
Badal, Varsha D.; Kundrotas, Petras J.; Vakser, Ilya A.
2015-01-01
The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound benchmark set, significantly increasing the docking success rate. PMID:26650466
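One concrete piece of such an extraction step, pulling candidate residue mentions out of retrieved abstracts for use as docking constraints, can be sketched with a regular expression; the pattern is a simplification, not the procedure's actual text-mining grammar.

```python
import re

# Matches residue mentions such as "Arg273" or "Tyr-45" (simplified, hypothetical pattern).
THREE_LETTER = "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
RESIDUE = re.compile(rf"\b(?:{THREE_LETTER})-?\s?\d+\b", re.I)

def residue_mentions(abstract_text):
    """Return candidate residue mentions to be used as docking constraints."""
    return RESIDUE.findall(abstract_text)

print(residue_mentions("Mutation of Arg273 and Tyr-45 abolished binding to the receptor."))
```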
Robandt, Paul P; Reda, Louis J; Klette, Kevin L
2008-10-01
A fully automated system utilizing a liquid handler and an online solid-phase extraction (SPE) device coupled with liquid chromatography-tandem mass spectrometry (LC-MS-MS) was designed to process, detect, and quantify benzoylecgonine (BZE), meta-hydroxybenzoylecgonine (m-OH BZE), para-hydroxybenzoylecgonine (p-OH BZE), and norbenzoylecgonine (nor-BZE) metabolites in human urine. The method was linear for BZE, m-OH BZE, and p-OH BZE from 1.2 to 10,000 ng/mL with limits of detection (LOD) and quantification (LOQ) of 1.2 ng/mL. Nor-BZE was linear from 5 to 10,000 ng/mL with an LOD and LOQ of 1.2 and 5 ng/mL, respectively. The intrarun precision measured as the coefficient of variation of 10 replicates of a 100 ng/mL control was less than 2.6%, and the interrun precision for 5 replicates of the same control across 8 batches was less than 4.8% for all analytes. No assay interference was noted from controls containing cocaine, cocaethylene, and ecgonine methyl ester. Excellent data concordance (R2 > 0.994) was found for direct comparison of the automated SPE-LC-MS-MS procedure and an existing gas chromatography-MS procedure using 94 human urine samples previously determined to be positive for BZE. The automated specimen handling and SPE procedure, when compared to the traditional extraction schema, eliminates the human factors of specimen handling, processing, extraction, and derivatization, thereby reducing labor costs and rework resulting from batch handling issues, and may reduce the number of fume hoods required in the laboratory.
Frégeau, Chantal J; Lett, C Marc; Fourney, Ron M
2010-10-01
A semi-automated DNA extraction process for casework samples based on the Promega DNA IQ™ system was optimized and validated on TECAN Genesis 150/8 and Freedom EVO robotic liquid handling stations configured with fixed tips and a TECAN TE-Shake™ unit. The use of an orbital shaker during the extraction process promoted efficiency with respect to DNA capture, magnetic bead/DNA complex washes and DNA elution. Validation studies determined the reliability and limitations of this shaker-based process. Reproducibility with regard to DNA yields for the tested robotic workstations proved to be excellent and not significantly different from that offered by the manual phenol/chloroform extraction. DNA extraction of animal:human blood mixtures contaminated with soil demonstrated that a human profile was detectable even in the presence of abundant animal blood. For exhibits containing small amounts of biological material, concordance studies confirmed that DNA yields for this shaker-based extraction process are equivalent to or greater than those observed with phenol/chloroform extraction as well as our original validated automated magnetic bead percolation-based extraction process. Our data further support the increasing use of robotics for the processing of casework samples. Crown Copyright © 2009. Published by Elsevier Ireland Ltd. All rights reserved.
An automated method for depth-dependent crustal anisotropy detection with receiver function
NASA Astrophysics Data System (ADS)
Licciardi, Andrea; Piana Agostinetti, Nicola
2015-04-01
Crustal seismic anisotropy can be generated by a variety of geological factors (e.g., alignment of minerals/cracks, presence of fluids, etc.). In the case of the transversely isotropic media approximation, information about the strength and orientation of the anisotropic symmetry axis (including dip) can be extracted from the analysis of P-to-S conversions by means of teleseismic receiver functions (RF). Classically this has been achieved through probabilistic inversion encoding a forward solver for anisotropic media. This approach strongly relies on a priori choices regarding Earth's crust parameterization and velocity structure, requires an extensive knowledge of the RF method and involves time-consuming trial and error steps. We present an automated method for reducing the non-uniqueness in this kind of inversion and for retrieving depth-dependent seismic anisotropy parameters in the crust with a resolution of some hundreds of meters. The method involves a multi-frequency approach (for better absolute Vs determination) and the decomposition of the RF data-set into its azimuthal harmonics (to separate the effects of the isotropic and anisotropic components). A first inversion of the isotropic component (zero-order harmonics) by means of a Reversible jump Markov Chain Monte Carlo (RjMCMC) provides the posterior probability distribution for the position of the velocity jumps at depth, from which information on the number of layers and the S-wave velocity structure below a broadband seismic station can be extracted. This information, together with that encoded in the first-order harmonics, is jointly used in an automated way to: (1) determine the number of anisotropic layers and their approximate position at depth, and (2) narrow the search boundaries for layer thickness and S-wave velocity. Finally, an inversion is carried out with a Neighbourhood Algorithm (NA), where the free parameters are represented by the anisotropic structure beneath the seismic station. We tested the method against synthetic RF with correlated Gaussian noise to investigate the resolution power for multiple and thin (1-5 km) anisotropic layers in the crust. The algorithm correctly retrieves the true models for the number and the position of the anisotropic layers, their strength and the orientation of the anisotropic symmetry axis, although the trend direction is better constrained than the dip angle. The method is then applied to a real data-set and the results compared with previous RF studies.
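A hedged sketch of the azimuthal-harmonic decomposition step mentioned above: receiver-function amplitudes observed at several back-azimuths are fit, time sample by time sample, with zero- and first-order harmonics A0(t) + A1(t)cos(phi) + B1(t)sin(phi). The synthetic input and the least-squares formulation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def harmonic_decomposition(rf, baz_deg):
    """rf: (n_events, n_samples) radial RFs; baz_deg: (n_events,) back-azimuths."""
    phi = np.radians(baz_deg)
    # Design matrix for [A0, A1, B1]; higher-order terms could be appended analogously.
    G = np.column_stack([np.ones_like(phi), np.cos(phi), np.sin(phi)])
    coeffs, *_ = np.linalg.lstsq(G, rf, rcond=None)   # shape (3, n_samples)
    return coeffs[0], coeffs[1], coeffs[2]            # A0, A1, B1 time series

# Toy example: an isotropic pulse plus a first-order azimuthal modulation.
t = np.linspace(0.0, 10.0, 200)
baz = np.array([10.0, 80.0, 150.0, 220.0, 290.0, 350.0])
iso = np.exp(-(t - 3.0) ** 2)
rf = iso + 0.3 * np.cos(np.radians(baz))[:, None] * np.exp(-(t - 5.0) ** 2)
A0, A1, B1 = harmonic_decomposition(rf, baz)
print(A0.shape, float(A1.max()))   # A0 captures the isotropic part; A1 the cos term
```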
NASA Astrophysics Data System (ADS)
Yu, Peter; Eyles, Nick; Sookhan, Shane
2015-10-01
Resolving the origin(s) of drumlins and related megaridges in areas of megascale glacial lineations (MSGL) left by paleo-ice sheets is critical to understanding how ancient ice sheets interacted with their sediment beds. MSGL is now linked with fast-flowing ice streams, but there is a broad range of erosional and depositional models. Further progress relies on constraining fluxes of subglacial sediment at the ice sheet base, which in turn depends on morphological data such as landform shape and elongation and, most importantly, landform volume. Past practice in determining shape has employed a broad range of geomorphological methods, ranging from strictly visualisation techniques to more complex semi-automated and automated drumlin extraction methods. This paper reviews and builds on currently available visualisation, semi-automated and automated extraction methods and presents a new Curvature Based Relief Separation (CBRS) technique for drumlin mapping. This uses curvature analysis to generate a base level from which topography can be normalized and drumlin volume can be derived. This methodology is tested using a high resolution (3 m) LiDAR elevation dataset from the Wadena Drumlin Field, Minnesota, USA, which was constructed by the Wadena Lobe of the Laurentide Ice Sheet ca. 20,000 years ago and which as a whole contains 2000 drumlins across an area of 7500 km². This analysis demonstrates that CBRS provides an objective and robust procedure for automated drumlin extraction. There is strong agreement with manually selected landforms, but the method is also capable of resolving features that were not detectable manually, thereby considerably expanding the known population of streamlined landforms. CBRS provides an effective automatic method for visualisation of large areas of the streamlined beds of former ice sheets and for modelling sediment fluxes below ice sheets.
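A simplified sketch of the general idea behind a curvature-based relief separation, under stated assumptions: curvature (here approximated by the Laplacian of a smoothed DEM) picks likely base-level cells, a base surface is interpolated from them, and normalized relief and volume follow by subtraction. The smoothing, threshold, and interpolation choices are placeholders; the published CBRS workflow differs in detail.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace
from scipy.interpolate import griddata

def relief_separation(dem, cell_size, curv_thresh=0.0, sigma=2.0):
    smooth = gaussian_filter(dem, sigma)
    curvature = laplace(smooth) / cell_size**2         # > 0 ~ concave-up (base-level) cells
    base_mask = curvature > curv_thresh                 # candidate base-level cells
    rows, cols = np.indices(dem.shape)
    pts = np.column_stack([rows[base_mask], cols[base_mask]])
    base = griddata(pts, dem[base_mask], (rows, cols), method="linear")
    base = np.where(np.isnan(base), dem, base)          # fall back to the DEM at the edges
    relief = np.clip(dem - base, 0.0, None)              # normalized positive relief
    volume = relief.sum() * cell_size**2                 # per-landform masking omitted here
    return relief, volume

# Toy DEM: a single Gaussian "drumlin" on a gently sloping bed.
y, x = np.mgrid[0:100, 0:100]
dem = 0.02 * x + 8.0 * np.exp(-(((x - 50) ** 2) + ((y - 50) ** 2)) / 200.0)
relief, volume = relief_separation(dem, cell_size=3.0)
print(round(float(volume), 1))
```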
NASA Astrophysics Data System (ADS)
Krappe, Sebastian; Benz, Michaela; Wittenberg, Thomas; Haferlach, Torsten; Münzenmayer, Christian
2015-03-01
The morphological analysis of bone marrow smears is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually with the use of a bright field microscope. This is a time-consuming, partly subjective and tedious process. Furthermore, repeated examinations of a slide yield intra- and inter-observer variances. For this reason an automation of morphological bone marrow analysis is pursued. This analysis comprises several steps: image acquisition and smear detection, cell localization and segmentation, feature extraction and cell classification. The automated classification of bone marrow cells depends on the automated cell segmentation and the choice of adequate features extracted from different parts of the cell. In this work we focus on the evaluation of support vector machines (SVMs) and random forests (RFs) for the differentiation of bone marrow cells into 16 different classes, including immature and abnormal cell classes. Data sets of different segmentation quality are used to test the two approaches. Automated solutions for the morphological analysis of bone marrow smears could use such a classifier to pre-classify bone marrow cells, thereby shortening the examination duration.
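A hedged sketch of the classifier comparison described above: the same feature vectors (random placeholders standing in for morphological cell features) are fed to an SVM and a random forest and evaluated by cross-validation over 16 classes. The feature extraction and the real data sets are not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(800, 40))        # placeholder cell feature vectors
y = rng.integers(0, 16, size=800)     # 16 bone marrow cell classes

for name, clf in [("SVM (RBF)", SVC(kernel="rbf", C=10.0, gamma="scale")),
                  ("Random forest", RandomForestClassifier(n_estimators=300, random_state=0))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```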
SIDECACHE: Information access, management and dissemination framework for web services.
Doderer, Mark S; Burkhardt, Cory; Robbins, Kay A
2011-06-14
Many bioinformatics algorithms and data sets are deployed using web services so that the results can be explored via the Internet and easily integrated into other tools and services. These services often include data from other sites that is accessed either dynamically or through file downloads. Developers of these services face several problems because of the dynamic nature of the information from the upstream services. Many publicly available repositories of bioinformatics data frequently update their information. When such an update occurs, the developers of the downstream service may also need to update. For file downloads, this process is typically performed manually, followed by a web service restart. Requests for information obtained by dynamic access of upstream sources are sometimes subject to rate restrictions. SideCache provides a framework for deploying web services that integrate information extracted from other databases and from web sources that are periodically updated. This situation occurs frequently in biotechnology where new information is being continuously generated and the latest information is important. SideCache provides several types of services including proxy access and rate control, local caching, and automatic web service updating. We have used the SideCache framework to automate the deployment and updating of a number of bioinformatics web services and tools that extract information from remote primary sources such as NCBI, NCIBI, and Ensembl. The SideCache framework also has been used to share research results through the use of a SideCache derived web service.
Bravo, Dayana; Clari, María Ángeles; Costa, Elisa; Muñoz-Cobo, Beatriz; Solano, Carlos; José Remigia, María; Navarro, David
2011-08-01
Limited data are available on the performance of different automated extraction platforms and commercially available quantitative real-time PCR (QRT-PCR) methods for the quantitation of cytomegalovirus (CMV) DNA in plasma. We compared the performance characteristics of the Abbott mSample preparation system DNA kit on the m24 SP instrument (Abbott), the High Pure viral nucleic acid kit on the COBAS AmpliPrep system (Roche), and the EZ1 Virus 2.0 kit on the BioRobot EZ1 extraction platform (Qiagen) coupled with the Abbott CMV PCR kit, the LightCycler CMV Quant kit (Roche), and the Q-CMV complete kit (Nanogen), for both plasma specimens from allogeneic stem cell transplant (Allo-SCT) recipients (n = 42) and the OptiQuant CMV DNA panel (AcroMetrix). The EZ1 system displayed the highest extraction efficiency over a wide range of CMV plasma DNA loads, followed by the m24 and the AmpliPrep methods. The Nanogen PCR assay yielded higher mean CMV plasma DNA values than the Abbott and the Roche PCR assays, regardless of the platform used for DNA extraction. Overall, the effects of the extraction method and the QRT-PCR used on CMV plasma DNA load measurements were less pronounced for specimens with high CMV DNA content (>10,000 copies/ml). The performance characteristics of the extraction methods and QRT-PCR assays evaluated herein for clinical samples were extensible to the cell-based standards from AcroMetrix. In conclusion, different automated systems are not equally efficient for CMV DNA extraction from plasma specimens, and the plasma CMV DNA loads measured by commercially available QRT-PCRs can differ significantly. The above findings should be taken into consideration for the establishment of cutoff values for the initiation or cessation of preemptive antiviral therapies and for the interpretation of data from clinical studies in the Allo-SCT setting.
Object-oriented classification of drumlins from digital elevation models
NASA Astrophysics Data System (ADS)
Saha, Kakoli
Drumlins are common elements of glaciated landscapes which are easily identified by their distinct morphometric characteristics, including shape, length/width ratio, elongation ratio, and uniform direction. To date, most researchers have mapped drumlins by tracing contours on maps, or through on-screen digitization directly on top of hillshaded digital elevation models (DEMs). This paper seeks to utilize the unique morphometric characteristics of drumlins and investigates automated extraction of the landforms as objects from DEMs by Definiens Developer software (V.7), using the 30 m United States Geological Survey National Elevation Dataset DEM as input. The Chautauqua drumlin field in Pennsylvania and upstate New York, USA was chosen as a study area. As the study area is large (covering approximately 2500 sq. km), small test areas were selected for initial testing of the method. Individual polygons representing the drumlins were extracted from the elevation data set by automated recognition, using Definiens' Multiresolution Segmentation tool, followed by rule-based classification. Subsequently, parameters such as length, width, length-width ratio, perimeter and area were measured automatically. To test the accuracy of the method, a second base map was produced by manual on-screen digitization of drumlins from topographic maps, and the same morphometric parameters were extracted from the mapped landforms using Definiens Developer. Statistical comparison showed a high agreement between the two methods, confirming that object-oriented classification can be used for mapping these landforms. The proposed method represents an attempt to solve the problem by providing a generalized rule-set for mass extraction of drumlins. To check its scalability, the automated extraction process was next applied to a larger area. Results showed that the proposed method is as successful for the bigger area as it was for the smaller test areas.
Competition for DNA binding sites using Promega DNA IQ™ paramagnetic beads.
Frégeau, Chantal J; De Moors, Anick
2012-09-01
The Promega DNA IQ™ system is easily amenable to automation and has been an integral part of standard operating procedures for many forensic laboratories, including those of the Royal Canadian Mounted Police (RCMP), since 2004. Because some samples that should have yielded DNA failed to do so with our validated automated DNA IQ™-based protocol, the competition for binding sites on the DNA IQ™ magnetic beads was examined more closely. Heme from heavily blooded samples interfered slightly with DNA binding. Increasing the concentration of Proteinase K during lysis of these samples did not enhance DNA recovery. However, diluting the sample lysate following lysis, prior to DNA extraction, overcame the reduction in DNA yield and preserved portions of the lysates for subsequent manual or automated extraction. Dye/chemicals from black denim lysates competed for binding sites on the DNA IQ™ beads and significantly reduced DNA recovery. Increasing the size or number of black denim cuttings during lysis had a direct adverse effect on DNA yield from various blood volumes. The dilution approach was successful on these samples and permitted the extraction of high DNA yields. Alternatively, shortening the incubation time for cell lysis to 30 min, instead of the usual overnight at 56 °C, prevented competition from black denim dye/chemicals and increased DNA yields. Crown Copyright © 2011. Published by Elsevier Ireland Ltd. All rights reserved.
A data-driven approach for quality assessment of radiologic interpretations.
Hsu, William; Han, Simon X; Arnold, Corey W; Bui, Alex At; Enzmann, Dieter R
2016-04-01
Given the increasing emphasis on delivering high-quality, cost-efficient healthcare, improved methodologies are needed to measure the accuracy and utility of ordered diagnostic examinations in achieving the appropriate diagnosis. Here, we present a data-driven approach for performing automated quality assessment of radiologic interpretations using other clinical information (e.g., pathology) as a reference standard for individual radiologists, subspecialty sections, imaging modalities, and entire departments. Downstream diagnostic conclusions from the electronic medical record are utilized as "truth" to which upstream diagnoses generated by radiology are compared. The described system automatically extracts and compares patient medical data to characterize concordance between clinical sources. Initial results are presented in the context of breast imaging, matching 18 101 radiologic interpretations with 301 pathology diagnoses and achieving a precision and recall of 84% and 92%, respectively. The presented data-driven method highlights the challenges of integrating multiple data sources and the application of information extraction tools to facilitate healthcare quality improvement. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Meyer, M.T.; Mills, M.S.; Thurman, E.M.
1993-01-01
An automated solid-phase extraction (SPE) method was developed for the pre-concentration of chloroacetanilide and triazine herbicides, and two triazine metabolites, from 100-ml water samples. Breakthrough experiments for the C18 SPE cartridge show that the two triazine metabolites are not fully retained and that increasing flow-rate decreases their retention. Standard curve r² values of 0.998-1.000 for each compound were consistently obtained and a quantitation level of 0.05 µg/l was achieved for each compound tested. More than 10,000 surface and ground water samples have been analyzed by this method.
Automated process for solvent separation of organic/inorganic substance
Schweighardt, F.K.
1986-07-29
There is described an automated process for the solvent separation of organic/inorganic substances that operates continuously and unattended and eliminates potential errors resulting from subjectivity and the aging of the sample during analysis. In the process, metered amounts of one or more solvents are passed sequentially through a filter containing the sample under the direction of a microprocessor control apparatus. The mixture in the filter is agitated by ultrasonic cavitation for a timed period and the filtrate is collected. The filtrate of each solvent extraction is collected individually and the residue on the filter element is collected to complete the extraction process. 4 figs.
Automated process for solvent separation of organic/inorganic substance
Schweighardt, Frank K.
1986-01-01
There is described an automated process for the solvent separation of organic/inorganic substances that operates continuously and unattended and eliminates potential errors resulting from subjectivity and the aging of the sample during analysis. In the process, metered amounts of one or more solvents are passed sequentially through a filter containing the sample under the direction of a microprocessor control apparatus. The mixture in the filter is agitated by ultrasonic cavitation for a timed period and the filtrate is collected. The filtrate of each solvent extraction is collected individually and the residue on the filter element is collected to complete the extraction process.
Automated apparatus for solvent separation of a coal liquefaction product stream
Schweighardt, Frank K.
1985-01-01
An automated apparatus for the solvent separation of a coal liquefaction product stream that operates continuously and unattended and eliminates potential errors resulting from subjectivity and the aging of the sample during analysis. In use of the apparatus, metered amounts of one or more solvents are passed sequentially through a filter containing the sample under the direction of a microprocessor control means. The mixture in the filter is agitated by means of ultrasonic cavitation for a timed period and the filtrate is collected. The filtrate of each solvent extraction is collected individually and the residue on the filter element is collected to complete the extraction process.
Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L
2012-07-01
Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
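A minimal sketch of indel-region identification in a protein alignment, in the spirit of the tool described above: columns are scored by gap fraction (with Shannon entropy available for flagging fast-evolving sites), and runs of gap-rich columns are reported as indel regions. The thresholds and the toy alignment are illustrative; SeqFIRE's actual rules and output formats are richer than this.

```python
import math

def column_stats(alignment):
    ncol = len(alignment[0])
    for j in range(ncol):
        col = [seq[j] for seq in alignment]
        gap_frac = col.count("-") / len(col)
        counts = {c: col.count(c) for c in set(col) if c != "-"}
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
        yield j, gap_frac, entropy

def indel_regions(alignment, gap_thresh=0.5):
    regions, start = [], None
    for j, gap_frac, _ in column_stats(alignment):
        if gap_frac >= gap_thresh and start is None:
            start = j
        elif gap_frac < gap_thresh and start is not None:
            regions.append((start, j - 1)); start = None
    if start is not None:
        regions.append((start, len(alignment[0]) - 1))
    return regions

aln = ["MKV--LLAT", "MKVAALLAT", "MKV---LAT", "MKV--LLAT"]
print(indel_regions(aln))   # [(3, 4)] for this toy alignment
```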
Automation of lidar-based hydrologic feature extraction workflows using GIS
NASA Astrophysics Data System (ADS)
Borlongan, Noel Jerome B.; de la Cruz, Roel M.; Olfindo, Nestor T.; Perez, Anjillyn Mae C.
2016-10-01
With the advent of LiDAR technology, higher resolution datasets have become available for use in different remote sensing and GIS applications. One significant application of LiDAR datasets in the Philippines is in resource feature extraction. Feature extraction using LiDAR datasets requires complex and repetitive workflows which can take researchers a long time to execute and supervise manually. The Development of the Philippine Hydrologic Dataset for Watersheds from LiDAR Surveys (PHD), a project under the Nationwide Detailed Resources Assessment Using LiDAR (Phil-LiDAR 2) program, created a set of scripts, the PHD Toolkit, to automate its processes and workflows necessary for hydrologic feature extraction, specifically Streams and Drainages, Irrigation Network, and Inland Wetlands, using LiDAR datasets. These scripts are written in Python and can be added to the ArcGIS® environment as a toolbox. The toolkit is currently being used as an aid for researchers in hydrologic feature extraction by simplifying the workflows, eliminating human errors when providing the inputs, and providing quick and easy-to-use tools for repetitive tasks. This paper discusses the actual implementation of the different workflows developed by Phil-LiDAR 2 Project 4 in Streams, Irrigation Network and Inland Wetlands extraction.
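A generic sketch of the kind of stream-extraction workflow such a toolkit wraps (fill, flow direction, flow accumulation, threshold, vectorize), written against the standard arcpy Spatial Analyst tools. The paths, the accumulation threshold, and the overall structure are assumptions for illustration; this is not the PHD Toolkit itself.

```python
import arcpy
from arcpy.sa import Fill, FlowDirection, FlowAccumulation, Con, StreamToFeature

arcpy.CheckOutExtension("Spatial")
arcpy.env.workspace = r"C:\lidar\workspace.gdb"   # hypothetical workspace

dem = "lidar_dtm"                                  # hypothetical LiDAR-derived DTM
filled = Fill(dem)                                 # remove spurious sinks
fdr = FlowDirection(filled)                        # D8 flow direction
fac = FlowAccumulation(fdr)                        # contributing cell counts
streams = Con(fac > 5000, 1)                       # accumulation threshold is an assumption
StreamToFeature(streams, fdr, "streams_line", "SIMPLIFY")
arcpy.CheckInExtension("Spatial")
```

Packaging steps like these as script tools in an ArcGIS toolbox is what allows the repetitive workflows to run without manual supervision.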
DB4US: A Decision Support System for Laboratory Information Management
Hortas, Maria Luisa; Baena-García, Manuel; Lana-Linati, Jorge; González, Carlos; Redondo, Maximino; Morales-Bueno, Rafael
2012-01-01
Background Until recently, laboratory automation has focused primarily on improving hardware. Future advances are concentrated on intelligent software, since laboratories performing clinical diagnostic testing require improved information systems to address their data processing needs. In this paper, we propose DB4US, an application that automates the management of information related to laboratory quality indicators. Currently, there is a lack of ready-to-use management quality measures. This application addresses this deficiency through the extraction, consolidation, statistical analysis, and visualization of data related to the use of demographics, reagents, and turn-around times. The design and implementation issues, as well as the technologies used for the implementation of this system, are discussed in this paper. Objective To develop a general methodology that integrates the computation of ready-to-use management quality measures and a dashboard to easily analyze the overall performance of a laboratory, as well as automatically detect anomalies or errors. The novelty of our approach lies in the application of integrated web-based dashboards as an information management system in hospital laboratories. Methods We propose a new methodology for laboratory information management based on the extraction, consolidation, statistical analysis, and visualization of data related to demographics, reagents, and turn-around times, offering a dashboard-like user web interface to the laboratory manager. The methodology comprises a unified data warehouse that stores and consolidates multidimensional data from different data sources. The methodology is illustrated through the implementation and validation of DB4US, a novel web application based on this methodology that constructs an interface to obtain ready-to-use indicators, and offers the possibility to drill down from high-level metrics to more detailed summaries. The offered indicators are calculated beforehand so that they are ready to use when the user needs them. The design is based on a set of different parallel processes to precalculate indicators. The application displays information related to tests, requests, samples, and turn-around times. The dashboard is designed to show the set of indicators on a single screen. Results DB4US was deployed for the first time in the Hospital Costa del Sol in 2008. In our evaluation we show the positive impact of this methodology for laboratory professionals, since the use of our application has reduced the time needed for the elaboration of the different statistical indicators and has also provided information that has been used to optimize the usage of laboratory resources by the discovery of anomalies in the indicators. DB4US users benefit from Internet-based communication of results, since this information is available from any computer without having to install any additional software. Conclusions The proposed methodology and the accompanying web application, DB4US, automate the processing of information related to laboratory quality indicators and offer a novel approach for managing laboratory-related information, benefiting from an Internet-based communication mechanism. The application of this methodology has been shown to improve the usage of time, as well as other laboratory resources. PMID:23608745
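A hedged illustration of one of the precalculated indicators described above: turn-around time (TAT) summarized per test type from request/report timestamps using pandas. The column names and the toy data are assumptions, not DB4US's actual schema.

```python
import pandas as pd

# Assumed export from the laboratory information system.
df = pd.DataFrame({
    "test": ["glucose", "glucose", "troponin", "troponin"],
    "requested": pd.to_datetime(["2008-03-01 08:00", "2008-03-01 09:10",
                                 "2008-03-01 08:05", "2008-03-01 10:00"]),
    "reported":  pd.to_datetime(["2008-03-01 09:30", "2008-03-01 10:05",
                                 "2008-03-01 08:50", "2008-03-01 11:20"]),
})
df["tat_minutes"] = (df["reported"] - df["requested"]).dt.total_seconds() / 60
indicator = df.groupby("test")["tat_minutes"].agg(["median", "mean", "max"])
print(indicator)   # ready-to-use TAT summary, one row per test type
```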
Chidambaram, Valliammai; Brewster, Philip J.; Jordan, Kristine C.; Hurdle, John F.
2013-01-01
The United States, indeed the world, struggles with a serious obesity epidemic. The costs of this epidemic in terms of healthcare dollar expenditures and human morbidity/mortality are staggering. Surprisingly, clinicians are ill-equipped in general to advise patients on effective, longitudinal weight loss strategies. We argue that one factor hindering clinicians and patients in effective shared decision-making about weight loss is the absence of a metric that can be reasoned about and monitored over time, as clinicians do routinely with, say, serum lipid levels or HgA1C. We propose that a dietary quality measure championed by the USDA and NCI, the HEI-2005/2010, is an ideal metric for this purpose. We describe a new tool, the quality Dietary Information Extraction Tool (qDIET), which is a step toward an automated, self-sustaining process that can link retail grocery purchase data to the appropriate USDA databases to permit the calculation of the HEI-2005/2010. PMID:24551333
Soleymani, Ali; Pennekamp, Frank; Petchey, Owen L.; Weibel, Robert
2015-01-01
Recent advances in tracking technologies such as GPS or video tracking systems describe the movement paths of individuals in unprecedented detail and are increasingly used in different fields, including ecology. However, extracting information from raw movement data requires advanced analysis techniques, for instance to infer behaviors expressed during a certain period of the recorded trajectory, or gender or species identity in cases where data are obtained from remote tracking. In this paper, we address how different movement features affect the ability to automatically classify species identity, using a dataset of unicellular microbes (i.e., ciliates). Previously, morphological attributes and simple movement metrics, such as speed, were used for classifying ciliate species. Here, we demonstrate that adding advanced movement features, in particular those based on the discrete wavelet transform, to morphological features can improve classification. These results may have practical applications in automated monitoring of waste water facilities as well as environmental monitoring of aquatic systems. PMID:26680591
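A hedged sketch of wavelet-based movement features of the kind referred to above: each trajectory's speed series is decomposed with a discrete wavelet transform and the per-band energies are used as features for classification. The wavelet, decomposition level, placeholder trajectories, and classifier settings are illustrative choices, not the study's configuration.

```python
import numpy as np
import pywt
from sklearn.ensemble import RandomForestClassifier

def dwt_energy_features(speed, wavelet="db4", level=4):
    coeffs = pywt.wavedec(speed, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs])   # one energy value per band

rng = np.random.default_rng(1)
# Placeholder trajectories: two "species" with different speed fluctuation scales.
X = np.array([dwt_energy_features(rng.normal(1.0, s, 256))
              for s in ([0.1] * 50 + [0.4] * 50)])
y = np.array([0] * 50 + [1] * 50)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print("training accuracy:", clf.score(X, y))
```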
Analysis of {10-12} twinning via automated atomistic post-processing methods
NASA Astrophysics Data System (ADS)
Barrett, Christopher D.
2017-05-01
{10-12} twinning is the most prominent and most studied twin mode in hexagonal close-packed materials. Many works have been devoted to describing its nucleation, growth and interactions with other defects. Despite this, gaps and disagreements remain in the literature regarding some fundamental aspects of the twinning process. A rigorous understanding of the twinning process is imperative because without it higher scale models of plasticity cannot accurately capture deformation in important materials such as Mg, Ti, Zr and Zn. Motivated by this necessity, we have studied {10-12} twinning using molecular dynamics, focusing on automated processing techniques which can extract mechanistic information generalisable to continuum scale deformation. This demonstrates for the first time the automatic identification of twinning dislocation lines and Burgers vectors, and the elasto-plastic decomposition of the deformation gradient inside and around a twin embryo. These results confirm predictions of most authors regarding the dislocation-based twin growth process, while contradicting others who have argued that {10-12} twin growth stems from a shuffling process with no dislocation line.
Chidambaram, Valliammai; Brewster, Philip J; Jordan, Kristine C; Hurdle, John F
2013-01-01
The United States, indeed the world, struggles with a serious obesity epidemic. The costs of this epidemic in terms of healthcare dollar expenditures and human morbidity/mortality are staggering. Surprisingly, clinicians are ill-equipped in general to advise patients on effective, longitudinal weight loss strategies. We argue that one factor hindering clinicians and patients in effective shared decision-making about weight loss is the absence of a metric that can be reasoned about and monitored over time, as clinicians do routinely with, say, serum lipid levels or HgA1C. We propose that a dietary quality measure championed by the USDA and NCI, the HEI-2005/2010, is an ideal metric for this purpose. We describe a new tool, the quality Dietary Information Extraction Tool (qDIET), which is a step toward an automated, self-sustaining process that can link retail grocery purchase data to the appropriate USDA databases to permit the calculation of the HEI-2005/2010.
Automatic drawing for traffic marking with MMS LIDAR intensity
NASA Astrophysics Data System (ADS)
Takahashi, G.; Takeda, H.; Shimano, Y.
2014-05-01
Upgrading the database of CYBER JAPAN has been strategically promoted because the "Basic Act on Promotion of Utilization of Geographical Information" was enacted in May 2007. In particular, there is high demand for road information, which forms a framework of this database. Therefore, road inventory mapping work has to be accurate and must eliminate variation caused by individual human operators. Further, the large number of traffic markings that are periodically maintained and possibly changed requires an efficient method for updating spatial data. Currently, we apply manual photogrammetric drawing for mapping traffic markings. However, this method is not sufficiently efficient in terms of the required productivity, and data variation can arise from individual operators. In contrast, Mobile Mapping Systems (MMS) and high-density Laser Imaging Detection and Ranging (LIDAR) scanners are rapidly gaining popularity. The aim of this study is to build an efficient method for automatically drawing traffic markings using MMS LIDAR data. The key idea of the method is extracting lines using a Hough transform strategically focused on changes in local reflection intensity along scan lines; note that the method must handle every type of traffic marking. In this paper, we discuss a highly accurate and operator-independent method that applies the following steps: (1) binarizing LIDAR points by intensity and extracting higher intensity points; (2) generating a Triangulated Irregular Network (TIN) from the higher intensity points; (3) deleting arcs by length and generating outline polygons on the TIN; (4) generating buffers from the outline polygons; (5) extracting points within the buffers from the original LIDAR points; (6) extracting local-intensity-changing points along scan lines from the extracted points; (7) extracting lines from the intensity-changing points through a Hough transform; and (8) connecting lines to generate automated traffic marking mapping data.
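A hedged sketch of steps (7)-(8) above: local intensity-change points, assumed here to be already extracted from the scan lines and rasterized into a binary image, are turned into line segments with a probabilistic Hough transform. The rasterization and all parameter values are illustrative assumptions.

```python
import numpy as np
from skimage.transform import probabilistic_hough_line

# Placeholder binary raster of intensity-changing points: two marking edges.
img = np.zeros((200, 200), dtype=bool)
img[20:180, 60] = True      # left edge of a marking
img[20:180, 75] = True      # right edge of a marking

lines = probabilistic_hough_line(img, threshold=10, line_length=50, line_gap=5)
for (x0, y0), (x1, y1) in lines:
    print(f"segment from ({x0},{y0}) to ({x1},{y1})")   # candidate marking outlines
```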
Modular workcells: modern methods for laboratory automation.
Felder, R A
1998-12-01
Laboratory automation is beginning to become an indispensable survival tool for laboratories facing difficult market competition. However, estimates suggest that only 8% of laboratories will be able to afford total laboratory automation systems. Therefore, automation vendors have developed alternative hardware configurations called 'modular automation' to fit the smaller laboratory. Modular automation consists of consolidated analyzers, integrated analyzers, modular workcells, and pre- and post-analytical automation. These terms will be defined in this paper. Using a modular automation model, the automated core laboratory will become a site where laboratory data is evaluated by trained professionals to provide diagnostic information to practising physicians. Modern software information management and process control tools will complement modular hardware. Proper standardization that will allow vendor-independent modular configurations will assure the success of this revolutionary new technology.
Solid phase microextraction (SPME) has revolutionized the way samples are extracted, enabling rapid, automated, and solventless extraction of many different sample types, including air, water, soil, and biological samples. As such, SPME is widely used for environmental, food, fo...
Dependency-based long short term memory network for drug-drug interaction extraction.
Wang, Wei; Yang, Xi; Yang, Canqun; Guo, Xiaowei; Zhang, Xiang; Wu, Chengkun
2017-12-28
Drug-drug interaction (DDI) extraction needs assistance from automated methods to cope with the explosively increasing volume of biomedical texts. In recent years, deep neural network based models have been developed to address this need and have made significant progress in relation identification. We propose a dependency-based deep neural network model for DDI extraction. By introducing dependency-based techniques into a bi-directional long short term memory network (Bi-LSTM), we build three channels, namely a Linear channel, a DFS channel and a BFS channel. Each channel is constructed from three network layers: an embedding layer, an LSTM layer and a max pooling layer, from bottom up. In the embedding layer, we extract two types of features, one distance-based and the other dependency-based. In the LSTM layer, a Bi-LSTM is instantiated in each channel to better capture relation information. Max pooling is then used to obtain the most salient features from the entire encoded sequence. Finally, we concatenate the outputs of all channels and feed them to the softmax layer for relation identification. To the best of our knowledge, our model achieves new state-of-the-art performance with an F-score of 72.0% on the DDIExtraction 2013 corpus. Moreover, our approach obtains a much higher recall than existing methods. The dependency-based Bi-LSTM model can learn effective relation information with less feature engineering in the task of DDI extraction. Furthermore, the experimental results show that our model excels at balancing precision and recall.
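A minimal PyTorch sketch of one channel of the architecture described above (embedding layer, bidirectional LSTM, max pooling over time), with three such channels concatenated and passed to a linear layer whose logits feed a softmax loss. The vocabulary size, dimensions, class count, and toy input are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class Channel(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out_dim = 2 * hidden

    def forward(self, tokens):                 # tokens: (batch, seq_len) integer ids
        h, _ = self.lstm(self.emb(tokens))     # (batch, seq_len, 2*hidden)
        return h.max(dim=1).values             # max pooling over time

class DDIModel(nn.Module):
    def __init__(self, n_classes=5):           # class count is an assumption
        super().__init__()
        self.channels = nn.ModuleList([Channel() for _ in range(3)])  # Linear/DFS/BFS
        self.classifier = nn.Linear(sum(c.out_dim for c in self.channels), n_classes)

    def forward(self, linear_ids, dfs_ids, bfs_ids):
        feats = [c(x) for c, x in zip(self.channels, (linear_ids, dfs_ids, bfs_ids))]
        return self.classifier(torch.cat(feats, dim=1))   # logits; softmax applied at loss time

model = DDIModel()
dummy = torch.randint(0, 5000, (4, 30))
print(model(dummy, dummy, dummy).shape)        # torch.Size([4, 5])
```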
Data-driven Ontology Development: A Case Study at NASA's Atmospheric Science Data Center
NASA Astrophysics Data System (ADS)
Hertz, J.; Huffer, E.; Kusterer, J.
2012-12-01
Well-founded ontologies are key to enabling transformative semantic technologies and accelerating scientific research. One example is semantically enabled search and discovery, making scientific data accessible and more understandable by accurately modeling a complex domain. The ontology creation process remains a challenge for many anxious to pursue semantic technologies. The key may be that the creation process -- whether formal, community-based, automated or semi-automated -- should encompass not only a foundational core and supplemental resources but also a focus on the purpose or mission the ontology is created to support. Are there tools or processes to de-mystify, assess or enhance the resulting ontology? We suggest that comparison and analysis of a domain-focused ontology can be made using text engineering tools for information extraction, tokenizers, named entity transducers and others. The results are analyzed to ensure the ontology reflects the core purpose of the domain's mission and that the ontology integrates and describes the supporting data in the language of the domain - how the science is analyzed and discussed among all users of the data. Commonalities and relationships among domain resources describing the Clouds and Earth's Radiant Energy (CERES) Bi-Directional Scan (BDS) datasets from NASA's Atmospheric Science Data Center are compared. The domain resources include: a formal ontology created for CERES; scientific works such as papers, conference proceedings and notes; information extracted from the datasets (i.e., header metadata); and BDS scientific documentation (Algorithm Theoretical Basis Documents, collection guides, data quality summaries and others). These resources are analyzed using the open source software General Architecture for Text Engineering, a mature framework for computational tasks involving human language.
Fuzzy-based propagation of prior knowledge to improve large-scale image analysis pipelines
Mikut, Ralf
2017-01-01
Many automatically analyzable scientific questions are well-posed and a variety of information about expected outcomes is available a priori. Although often neglected, this prior knowledge can be systematically exploited to make automated analysis operations sensitive to a desired phenomenon or to evaluate extracted content with respect to this prior knowledge. For instance, the performance of processing operators can be greatly enhanced by a more focused detection strategy and by direct information about the ambiguity inherent in the extracted data. We present a new concept that increases the result quality awareness of image analysis operators by estimating and distributing the degree of uncertainty involved in their output based on prior knowledge. This allows the use of simple processing operators that are suitable for analyzing large-scale spatiotemporal (3D+t) microscopy images without compromising result quality. On the foundation of fuzzy set theory, we transform available prior knowledge into a mathematical representation and extensively use it to enhance the result quality of various processing operators. These concepts are illustrated on a typical bioimage analysis pipeline comprised of seed point detection, segmentation, multiview fusion and tracking. The functionality of the proposed approach is further validated on a comprehensive simulated 3D+t benchmark data set that mimics embryonic development and on large-scale light-sheet microscopy data of a zebrafish embryo. The general concept introduced in this contribution represents a new approach to efficiently exploit prior knowledge to improve the result quality of image analysis pipelines. The generality of the concept makes it applicable to practically any field with processing strategies that are arranged as linear pipelines. The automated analysis of terabyte-scale microscopy data will especially benefit from sophisticated and efficient algorithms that enable a quantitative and fast readout. PMID:29095927
Model-based approach to the detection and classification of mines in sidescan sonar.
Reed, Scott; Petillot, Yvan; Bell, Judith
2004-01-10
This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set differs from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes suggest that the approach outlined here could also be applied to SAR image analysis.
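A small worked sketch of the Dempster-Shafer combination step used for the final classification decision: two mass functions over the frame {mine, not_mine} (plus the uncertain set) are fused with Dempster's rule of combination. The numeric masses are illustrative, not values from the paper.

```python
from itertools import product

def dempster_combine(m1, m2):
    """m1, m2: dicts mapping frozenset hypotheses to mass; returns the combined masses."""
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    k = 1.0 - conflict                       # renormalize after discarding conflicting mass
    return {h: v / k for h, v in combined.items()}

MINE, NOT = frozenset({"mine"}), frozenset({"not_mine"})
EITHER = MINE | NOT
m_shadow = {MINE: 0.6, NOT: 0.1, EITHER: 0.3}     # evidence from shadow features
m_highlight = {MINE: 0.5, NOT: 0.2, EITHER: 0.3}  # evidence from highlight features
print(dempster_combine(m_shadow, m_highlight))
```

The normalization by 1 - conflict is what redistributes the mass assigned to contradictory hypotheses, which is why the fused belief in "mine" ends up higher than either source alone in this toy example.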
Automated analysis of biological oscillator models using mode decomposition.
Konopka, Tomasz
2011-04-01
Oscillating signals produced by biological systems have shapes, described by their Fourier spectra, that can potentially reveal the mechanisms that generate them. Extracting this information from measured signals is interesting for the validation of theoretical models, discovery and classification of interaction types, and for optimal experiment design. An automated workflow is described for the analysis of oscillating signals. A software package is developed to match signal shapes to hundreds of a priori viable model structures defined by a class of first-order differential equations. The package computes parameter values for each model by exploiting the mode decomposition of oscillating signals and formulating the matching problem in terms of systems of simultaneous polynomial equations. On the basis of the computed parameter values, the software returns a list of models consistent with the data. In validation tests with synthetic datasets, it not only shortlists those model structures used to generate the data but also shows that excellent fits can sometimes be achieved with alternative equations. The listing of all consistent equations is indicative of how further invalidation might be achieved with additional information. When applied to data from a microarray experiment on mice, the procedure finds several candidate model structures to describe interactions related to the circadian rhythm. This shows that experimental data on oscillators is indeed rich in information about gene regulation mechanisms. The software package is available at http://babylone.ulb.ac.be/autoosc/.
2015-01-01
Background Sufficient knowledge of molecular and genetic interactions, which comprise the entire basis of the functioning of living systems, is one of the necessary requirements for successfully answering almost any research question in the field of biology and medicine. To date, more than 24 million scientific papers can be found in PubMed, with many of them containing descriptions of a wide range of biological processes. The analysis of such tremendous amounts of data requires the use of automated text-mining approaches. Although a handful of tools have recently been developed to meet this need, none of them provide error-free extraction of highly detailed information. Results The ANDSystem package was developed for the reconstruction and analysis of molecular genetic networks based on an automated text-mining technique. It provides a detailed description of the various types of interactions between genes, proteins, microRNA's, metabolites, cellular components, pathways and diseases, taking into account the specificity of cell lines and organisms. Although the accuracy of ANDSystem is comparable to other well known text-mining tools, such as Pathway Studio and STRING, it outperforms them in having the ability to identify an increased number of interaction types. Conclusion The use of ANDSystem, in combination with Pathway Studio and STRING, can improve the quality of the automated reconstruction of molecular and genetic networks. ANDSystem should provide a useful tool for researchers working in a number of different fields, including biology, biotechnology, pharmacology and medicine. PMID:25881313
NASA Astrophysics Data System (ADS)
Neubert, A.; Fripp, J.; Engstrom, C.; Schwarz, R.; Lauer, L.; Salvado, O.; Crozier, S.
2012-12-01
Recent advances in high resolution magnetic resonance (MR) imaging of the spine provide a basis for the automated assessment of intervertebral disc (IVD) and vertebral body (VB) anatomy. High resolution three-dimensional (3D) morphological information contained in these images may be useful for early detection and monitoring of common spine disorders, such as disc degeneration. This work proposes an automated approach to extract the 3D segmentations of lumbar and thoracic IVDs and VBs from MR images using statistical shape analysis and registration of grey level intensity profiles. The algorithm was validated on a dataset of volumetric scans of the thoracolumbar spine of asymptomatic volunteers obtained on a 3T scanner using the relatively new 3D T2-weighted SPACE pulse sequence. Manual segmentations and expert radiological findings of early signs of disc degeneration were used in the validation. There was good agreement between manual and automated segmentation of the IVD and VB volumes with the mean Dice scores of 0.89 ± 0.04 and 0.91 ± 0.02 and mean absolute surface distances of 0.55 ± 0.18 mm and 0.67 ± 0.17 mm respectively. The method compares favourably to existing 3D MR segmentation techniques for VBs. This is the first time IVDs have been automatically segmented from 3D volumetric scans and shape parameters obtained were used in preliminary analyses to accurately classify (100% sensitivity, 98.3% specificity) disc abnormalities associated with early degenerative changes.
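A brief sketch of the overlap metric quoted above: the Dice similarity coefficient between a manual and an automated binary segmentation, Dice = 2|A∩B| / (|A| + |B|). The toy masks are placeholders.

```python
import numpy as np

def dice(a, b):
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

manual = np.zeros((64, 64), dtype=bool); manual[20:40, 20:40] = True
auto = np.zeros((64, 64), dtype=bool);  auto[22:42, 20:40] = True
print(f"Dice = {dice(manual, auto):.3f}")   # 0.900 for this toy overlap
```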
Chemical name extraction based on automatic training data generation and rich feature set.
Yan, Su; Spangler, W Scott; Chen, Ying
2013-01-01
The automation of extracting chemical names from text has significant value to biomedical and life science research. A major barrier in this task is the difficulty of obtaining sizable, good-quality data to train a reliable entity extraction model. Another difficulty is the selection of informative features of chemical names, since comprehensive domain knowledge of chemistry nomenclature is required. Leveraging random text generation techniques, we explore the idea of automatically creating training sets for the task of chemical name extraction. Assuming the availability of an incomplete list of chemical names, called a dictionary, we are able to generate well-controlled, random, yet realistic chemical-like training documents. We statistically analyze the construction of chemical names based on the incomplete dictionary, and propose a series of new features, without relying on any domain knowledge. Compared to state-of-the-art models learned from manually labeled data and domain knowledge, our solution shows better or comparable results in annotating real-world data with less human effort. Moreover, we report an interesting observation about the language of chemical names: both the structural and semantic components of chemical names follow a Zipfian distribution, which resembles many natural languages.
Personal authentication using hand vein triangulation and knuckle shape.
Kumar, Ajay; Prathyusha, K Venkata
2009-09-01
This paper presents a new approach to authenticating individuals using triangulation of hand vein images and simultaneous extraction of knuckle shape information. The proposed method is fully automated and employs palm dorsal hand vein images acquired from low-cost, near infrared, contactless imaging. The knuckle tips are used as key points for image normalization and extraction of the region of interest. The matching scores are generated in two parallel stages: (i) hierarchical matching scores from the four topologies of triangulation in the binarized vein structures and (ii) scores from the geometrical features consisting of knuckle point perimeter distances in the acquired images. The weighted score-level combination of these two matching scores is used to authenticate individuals. The experimental results from the proposed system using contactless palm dorsal hand vein images are promising (equal error rate of 1.14%) and suggest a more user-friendly alternative for user identification.
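A hedged sketch of the score-level fusion and equal error rate (EER) reported above: vein-triangulation and knuckle-geometry matching scores are combined with a weighted sum, and the EER is read off where the false accept and false reject rates cross. The synthetic score distributions, the weight, and the threshold grid are placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
genuine_vein, genuine_knuckle = rng.normal(0.8, 0.1, 500), rng.normal(0.7, 0.15, 500)
impostor_vein, impostor_knuckle = rng.normal(0.4, 0.1, 500), rng.normal(0.35, 0.15, 500)

w = 0.6                                   # assumed weight for the vein score
genuine = w * genuine_vein + (1 - w) * genuine_knuckle
impostor = w * impostor_vein + (1 - w) * impostor_knuckle

thresholds = np.linspace(0.0, 1.0, 1001)
far = np.array([(impostor >= t).mean() for t in thresholds])   # false accept rate
frr = np.array([(genuine < t).mean() for t in thresholds])     # false reject rate
eer_idx = np.argmin(np.abs(far - frr))
print(f"EER ~ {(far[eer_idx] + frr[eer_idx]) / 2:.3%} at threshold {thresholds[eer_idx]:.2f}")
```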
Drop-on-Demand Single Cell Isolation and Total RNA Analysis
Moon, Sangjun; Kim, Yun-Gon; Dong, Lingsheng; Lombardi, Michael; Haeggstrom, Edward; Jensen, Roderick V.; Hsiao, Li-Li; Demirci, Utkan
2011-01-01
Technologies that rapidly isolate viable single cells from heterogeneous solutions have significantly contributed to the field of medical genomics. Challenges remain both to enable efficient extraction, isolation and patterning of single cells from heterogeneous solutions as well as to keep them alive during the process due to a limited degree of control over single cell manipulation. Here, we present a microdroplet based method to isolate and pattern single cells from heterogeneous cell suspensions (10% target cell mixture), preserve viability of the extracted cells (97.0±0.8%), and obtain genomic information from isolated cells compared to the non-patterned controls. The cell encapsulation process is both experimentally and theoretically analyzed. Using the isolated cells, we identified 11 stem cell markers among 1000 genes and compare to the controls. This automated platform enabling high-throughput cell manipulation for subsequent genomic analysis employs fewer handling steps compared to existing methods. PMID:21412416
Text Mining in Biomedical Domain with Emphasis on Document Clustering
2017-01-01
Objectives With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048
NASA Astrophysics Data System (ADS)
Rocha, José Celso; Passalia, Felipe José; Matos, Felipe Delestro; Takahashi, Maria Beatriz; Maserati, Marc Peter, Jr.; Alves, Mayra Fernanda; de Almeida, Tamie Guibu; Cardoso, Bruna Lopes; Basso, Andrea Cristina; Nogueira, Marcelo Fábio Gouveia
2017-12-01
There is currently no objective, real-time and non-invasive method for evaluating the quality of mammalian embryos. In this study, we processed images of in vitro produced bovine blastocysts to obtain a deeper comprehension of the embryonic morphological aspects that are related to the standard evaluation of blastocysts. Information was extracted from 482 digital images of blastocysts. The resulting imaging data were individually evaluated by three experienced embryologists who graded their quality. To avoid evaluation bias, each image was related to the modal value of the evaluations. Automated image processing produced 36 quantitative variables for each image. The images, the modal and individual quality grades, and the variables extracted could potentially be used in the development of artificial intelligence techniques (e.g., evolutionary algorithms and artificial neural networks), multivariate modelling and the study of defined structures of the whole blastocyst.
Machine learning in soil classification.
Bhattacharya, B; Solomatine, D P
2006-03-01
In a number of engineering problems, e.g. in geotechnics and petroleum engineering, intervals of measured series data (signals) must be assigned a class while maintaining the constraint of contiguity, and standard classification methods can be inadequate. Classification in this case requires the involvement of an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. Firstly, a segmentation algorithm is developed and applied to segment the measured signals. Secondly, the salient features of these segments are extracted using the boundary energy method. Classifiers that assign classes to the segments are then built from the measured data and the extracted features; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing, and satisfactory results were obtained.
NASA Technical Reports Server (NTRS)
Kaupp, V. H.; Macdonald, H. C.; Waite, W. P.; Stiles, J. A.; Frost, F. S.; Shanmugam, K. S.; Smith, S. A.; Narayanan, V.; Holtzman, J. C. (Principal Investigator)
1982-01-01
Computer-generated radar simulations and mathematical geologic terrain models were used to establish the optimum radar sensor operating parameters for geologic research. An initial set of mathematical geologic terrain models was created for three basic landforms and families of simulated radar images were prepared from these models for numerous interacting sensor, platform, and terrain variables. The tradeoffs between the various sensor parameters and the quantity and quality of the extractable geologic data were investigated as well as the development of automated techniques of digital SAR image analysis. Initial work on a texture analysis of SEASAT SAR imagery is reported. Computer-generated radar simulations are shown for combinations of two geologic models and three SAR angles of incidence.
Nucleic Acid Extraction from Synthetic Mars Analog Soils for in situ Life Detection
NASA Astrophysics Data System (ADS)
Mojarro, Angel; Ruvkun, Gary; Zuber, Maria T.; Carr, Christopher E.
2017-08-01
Biological informational polymers such as nucleic acids have the potential to provide unambiguous evidence of life beyond Earth. To this end, we are developing an automated in situ life-detection instrument that integrates nucleic acid extraction and nanopore sequencing: the Search for Extra-Terrestrial Genomes (SETG) instrument. Our goal is to isolate and determine the sequence of nucleic acids from extant or preserved life on Mars, if, for example, there is common ancestry to life on Mars and Earth. As is true of metagenomic analysis of terrestrial environmental samples, the SETG instrument must isolate nucleic acids from crude samples and then determine the DNA sequence of the unknown nucleic acids. Our initial DNA extraction experiments resulted in low to undetectable amounts of DNA due to soil chemistry-dependent soil-DNA interactions, namely adsorption to mineral surfaces, binding to divalent/trivalent cations, destruction by iron redox cycling, and acidic conditions. Subsequently, we developed soil-specific extraction protocols that increase DNA yields through a combination of desalting, utilization of competitive binders, and promotion of anaerobic conditions. Our results suggest that a combination of desalting and utilizing competitive binders may establish a "universal" nucleic acid extraction protocol suitable for analyzing samples from diverse soils on Mars.
Nucleic Acid Extraction from Synthetic Mars Analog Soils for in situ Life Detection.
Mojarro, Angel; Ruvkun, Gary; Zuber, Maria T; Carr, Christopher E
2017-08-01
Biological informational polymers such as nucleic acids have the potential to provide unambiguous evidence of life beyond Earth. To this end, we are developing an automated in situ life-detection instrument that integrates nucleic acid extraction and nanopore sequencing: the Search for Extra-Terrestrial Genomes (SETG) instrument. Our goal is to isolate and determine the sequence of nucleic acids from extant or preserved life on Mars, if, for example, there is common ancestry to life on Mars and Earth. As is true of metagenomic analysis of terrestrial environmental samples, the SETG instrument must isolate nucleic acids from crude samples and then determine the DNA sequence of the unknown nucleic acids. Our initial DNA extraction experiments resulted in low to undetectable amounts of DNA due to soil chemistry-dependent soil-DNA interactions, namely adsorption to mineral surfaces, binding to divalent/trivalent cations, destruction by iron redox cycling, and acidic conditions. Subsequently, we developed soil-specific extraction protocols that increase DNA yields through a combination of desalting, utilization of competitive binders, and promotion of anaerobic conditions. Our results suggest that a combination of desalting and utilizing competitive binders may establish a "universal" nucleic acid extraction protocol suitable for analyzing samples from diverse soils on Mars. Key Words: Life-detection instruments-Nucleic acids-Mars-Panspermia. Astrobiology 17, 747-760.
Identification of Cichlid Fishes from Lake Malawi Using Computer Vision
Joo, Deokjin; Kwan, Ye-seul; Song, Jongwoo; Pinho, Catarina; Hey, Jody; Won, Yong-Jin
2013-01-01
Background The explosively radiating evolution of the cichlid fishes of Lake Malawi has yielded an amazing number of haplochromine species, estimated at as many as 500 to 800, with a surprising degree of diversity not only in color and stripe pattern but also in the shape of the jaw and body. As these morphological diversities have been a central subject of adaptive speciation and taxonomic classification, such high diversity could serve as a foundation for automating species identification of cichlids. Methodology/Principal Findings Here we demonstrate a method for automatic classification of the Lake Malawi cichlids based on computer vision and geometric morphometrics. To this end we developed a pipeline that integrates multiple image processing tools to automatically extract informative features of color and stripe patterns from a large set of photographic images of wild cichlids. The extracted information was evaluated by the statistical classifiers Support Vector Machine and Random Forests. Both classifiers performed better when body shape information was added to the color and stripe features; body shape variables boosted the classification accuracy by about 10%. The programs were able to classify 594 live cichlid individuals belonging to 12 different classes (species and sexes) with an average accuracy of 78%, in contrast to a mere 42% success rate by the human eye. The variables that contributed most to the accuracy were body height and the hue of the most frequent color. Conclusions Computer vision showed notable performance in extracting information from the color and stripe patterns of Lake Malawi cichlids, although the information was not enough for error-free species identification. Our results indicate that there appears to be an unavoidable difficulty in automatic species identification of cichlid fishes, which may arise from short divergence times and gene flow between closely related species. PMID:24204918
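As a concrete illustration of the classification step described in the abstract above, the sketch below feeds a placeholder feature matrix (standing in for the extracted color, stripe, and body-shape descriptors) to the two classifiers named there. The data, feature count, and parameter choices are invented, not the study's.

```python
# Minimal sketch (not the authors' pipeline): classifying cichlid feature
# vectors with the two classifiers named in the abstract. The feature matrix
# X and labels y are random placeholders for features extracted beforehand
# by an image-processing step.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(594, 20))          # placeholder for extracted features
y = rng.integers(0, 12, size=594)       # 12 classes (species x sex)

for name, clf in [
    ("SVM", make_pipeline(StandardScaler(), SVC(kernel="rbf"))),
    ("Random Forest", RandomForestClassifier(n_estimators=500, random_state=0)),
]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.2f}")
```

Scaling is applied before the SVM because RBF kernels are sensitive to feature magnitudes; the random forest needs no scaling.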
Automated extraction of clinical traits of multiple sclerosis in electronic medical records
Davis, Mary F; Sriram, Subramaniam; Bush, William S; Denny, Joshua C; Haines, Jonathan L
2013-01-01
Objectives The clinical course of multiple sclerosis (MS) is highly variable, and research data collection is costly and time consuming. We evaluated natural language processing techniques applied to electronic medical records (EMR) to identify MS patients and the key clinical traits of their disease course. Materials and methods We used four algorithms based on ICD-9 codes, text keywords, and medications to identify individuals with MS from a de-identified, research version of the EMR at Vanderbilt University. Using a training dataset of the records of 899 individuals, algorithms were constructed to identify and extract detailed information regarding the clinical course of MS from the text of the medical records, including clinical subtype, presence of oligoclonal bands, year of diagnosis, year and origin of first symptom, Expanded Disability Status Scale (EDSS) scores, timed 25-foot walk scores, and MS medications. Algorithms were evaluated on a test set validated by two independent reviewers. Results We identified 5789 individuals with MS. For all clinical traits extracted, precision was at least 87% and specificity was greater than 80%. Recall values for clinical subtype, EDSS scores, and timed 25-foot walk scores were greater than 80%. Discussion and conclusion This collection of clinical data represents one of the largest databases of detailed, clinical traits available for research on MS. This work demonstrates that detailed clinical information is recorded in the EMR and can be extracted for research purposes with high reliability. PMID:24148554
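For readers unfamiliar with this kind of trait extraction, the fragment below shows the general flavor of a keyword/regular-expression rule for pulling EDSS and timed 25-foot walk values out of note text. The patterns and the example note are invented and are not the published algorithms.

```python
# Illustrative sketch only: extracting EDSS and timed 25-foot walk values
# from free-text notes with regular expressions. Note text and patterns are
# invented for demonstration.
import re

note = "Relapsing-remitting MS. EDSS 3.5 today; timed 25-foot walk 6.2 seconds."

edss = re.search(r"\bEDSS\s*(?:score)?\s*[:=]?\s*(\d(?:\.\d)?)", note, re.I)
t25fw = re.search(r"25[- ]foot walk\s*[:=]?\s*(\d+(?:\.\d+)?)\s*s", note, re.I)

print("EDSS:", float(edss.group(1)) if edss else None)
print("T25FW (s):", float(t25fw.group(1)) if t25fw else None)
```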
A Case Study of Reverse Engineering Integrated in an Automated Design Process
NASA Astrophysics Data System (ADS)
Pescaru, R.; Kyratsis, P.; Oancea, G.
2016-11-01
This paper presents a design methodology which automates the generation of curves extracted from point clouds obtained by digitizing physical objects. The methodology is demonstrated on a product from the consumer goods industry, namely a footwear-type product with a complex shape containing many curves. The final result is the automated generation of wrapping curves, surfaces, and solids according to the characteristics of the customer's foot and the preferences for the chosen model, which leads to the development of customized products.
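One building block of such a workflow, sketched below under assumed inputs, is fitting a smooth parametric spline through an ordered slice of digitized points so that a wrapping curve can be regenerated at any resolution. The point coordinates are synthetic.

```python
# A minimal sketch of curve generation from digitized points: fit a smoothing
# parametric spline through an ordered cross-section and resample it densely.
import numpy as np
from scipy.interpolate import splprep, splev

theta = np.linspace(0, np.pi, 40)
x = 12 * np.cos(theta) + np.random.normal(0, 0.05, 40)   # digitized slice (cm)
y = 5 * np.sin(theta) + np.random.normal(0, 0.05, 40)

tck, _ = splprep([x, y], s=0.1)              # smoothing spline through the points
xs, ys = splev(np.linspace(0, 1, 200), tck)  # resampled wrapping curve
print(len(xs), "curve points generated")
```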
Reaction schemes visualized in network form: the syntheses of strychnine as an example.
Proudfoot, John R
2013-05-24
Representation of synthesis sequences in network form provides an effective method for comparing multiple reaction schemes and an opportunity to emphasize features, such as reaction scale, that are often relegated to experimental sections. An example of data formatting that allows construction of network maps in Cytoscape is presented, along with maps that illustrate the comparison of multiple reaction sequences, comparison of scaffold changes within sequences, and consolidation to highlight common key intermediates used across sequences. The 17 different synthetic routes reported for strychnine are used as an example basis set. The reaction maps presented required significant data extraction and curation; a standardized tabular format for reporting reaction information, if applied consistently, could allow the automated combination of reaction information across different sources.
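A minimal sketch of the kind of tabular-to-network conversion the abstract argues for, using networkx and a GraphML export that Cytoscape can read. The column names and reaction rows are invented placeholders.

```python
# Sketch under assumed column names: turn a tabular reaction record
# (starting material, product, step label, scale) into a directed network.
import csv, io
import networkx as nx

table = io.StringIO(
    "source,target,step,scale_mg\n"
    "compound_A,compound_B,Step 1,500\n"
    "compound_B,strychnine,Step 2,20\n"
)

G = nx.DiGraph()
for row in csv.DictReader(table):
    G.add_edge(row["source"], row["target"],
               step=row["step"], scale_mg=float(row["scale_mg"]))

nx.write_graphml(G, "reaction_map.graphml")  # GraphML is importable in Cytoscape
print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```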
Wavelet Analysis for Wind Fields Estimation
Leite, Gladeston C.; Ushizima, Daniela M.; Medeiros, Fátima N. S.; de Lima, Gilson G.
2010-01-01
Wind field analysis from synthetic aperture radar images allows the estimation of wind direction and speed based on image descriptors. In this paper, we propose a framework to automate wind direction retrieval based on wavelet decomposition associated with spectral processing. We extend existing undecimated wavelet transform approaches by including the à trous transform with a B3 spline scaling function, in addition to other wavelet bases such as Gabor and Mexican hat. The purpose is to extract more reliable directional information when wind speed values range from 5 to 10 m s−1. Using C-band empirical models, associated with the estimated directional information, we calculate local wind speed values and compare our results with QuikSCAT scatterometer data. The proposed approach has potential application in the evaluation of oil spills and wind farms. PMID:22219699
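As a heavily simplified, toy analogue of spectral direction retrieval (with no wavelet stage), the snippet below reads the dominant orientation of a streak-like pattern from the peak of its 2D power spectrum. The pattern and parameters are synthetic and not drawn from the paper.

```python
# Toy spectral orientation estimate: locate the strongest non-DC peak of the
# 2D power spectrum and convert its position to an orientation angle.
import numpy as np

rng = np.random.default_rng(0)
x, y = np.meshgrid(np.arange(256), np.arange(256))
patch = np.sin(2 * np.pi * (x * np.cos(0.4) + y * np.sin(0.4)) / 20)  # streaks
patch += 0.3 * rng.normal(size=patch.shape)

spec = np.abs(np.fft.fftshift(np.fft.fft2(patch))) ** 2
spec[128, 128] = 0                                   # suppress the DC peak
ky, kx = np.unravel_index(np.argmax(spec), spec.shape)
angle = np.degrees(np.arctan2(ky - 128, kx - 128)) % 180
print(f"dominant orientation ~ {angle:.1f} degrees")
```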
Natural Language Processing Methods and Systems for Biomedical Ontology Learning
Liu, Kaihong; Hogan, William R.; Crowley, Rebecca S.
2010-01-01
While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they must achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships as well as difficulty in updating the ontology as knowledge changes. Methodologies developed in the fields of natural language processing, information extraction, information retrieval and machine learning provide techniques for automating the enrichment of an ontology from free-text documents. In this article, we review existing methodologies and developed systems, and discuss how existing methods can benefit the development of biomedical ontologies. PMID:20647054
Automated choroidal segmentation method in human eye with 1050nm optical coherence tomography
NASA Astrophysics Data System (ADS)
Liu, Cindy; Wang, Ruikang K.
2014-02-01
Choroidal thickness (ChT), defined as the distance between the retinal pigment epithelium (RPE) and the choroid-sclera interface (CSI), is highly correlated with various ocular disorders such as high myopia, diabetic retinopathy, and central serous chorioretinopathy. Long-wavelength optical coherence tomography (OCT) can penetrate to the CSI, making measurement of the ChT possible. The ability to accurately segment the CSI and RPE is important for extracting clinical information. However, automated CSI segmentation is challenging due to the weak boundary in the lower choroid and inconsistent texture with varied blood vessels. We propose a K-means clustering based automated algorithm that is effective in segmenting the CSI and RPE. The performance of the method was evaluated using 531 frames from 4 normal subjects. The RPE and CSI segmentation time was about 0.3 seconds per frame, and the average time was around 0.5 seconds per frame with correction among frames, which is faster than reported algorithms. The results from the proposed method are consistent with manual segmentation results. Further work includes optimizing the algorithm to cover more OCT images captured from patients and increasing the processing speed and robustness of the segmentation method.
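The snippet below is a highly simplified sketch, not the authors' algorithm: it clusters the pixels of a synthetic B-scan into intensity classes with K-means and derives a crude per-column boundary estimate, purely to illustrate how clustering can seed interface detection.

```python
# Simplified sketch: K-means intensity clustering of an OCT B-scan, followed
# by a crude per-column boundary estimate. The B-scan here is synthetic.
import numpy as np
from sklearn.cluster import KMeans

bscan = np.random.rand(256, 512)                # placeholder OCT frame
labels = (KMeans(n_clusters=3, n_init=10, random_state=0)
          .fit_predict(bscan.reshape(-1, 1))
          .reshape(bscan.shape))

# Crude proxy for a boundary: for each A-scan column, the deepest row that
# still belongs to the brightest cluster.
bright = np.argmax([bscan.reshape(-1)[labels.reshape(-1) == k].mean()
                    for k in range(3)])
boundary_rows = np.array([(np.where(col == bright)[0].max()
                           if (col == bright).any() else -1)
                          for col in labels.T])
print(boundary_rows[:10])
```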
Weber, Emanuel; Pinkse, Martijn W. H.; Bener-Aksam, Eda; Vellekoop, Michael J.; Verhaert, Peter D. E. M.
2012-01-01
We present a fully automated setup for performing in-line mass spectrometry (MS) analysis of conditioned media in cell cultures, in particular focusing on the peptides therein. The goal is to assess peptides secreted by cells in different culture conditions. The developed system is compatible with MS as analytical technique, as this is one of the most powerful analysis methods for peptide detection and identification. Proof of concept was achieved using the well-known mating-factor signaling in baker's yeast, Saccharomyces cerevisiae. Our concept system holds 1 mL of cell culture medium and allows maintaining a yeast culture for at least 40 hours with continuous supernatant extraction (and medium replenishing). The device's small dimensions result in reduced costs for reagents and open perspectives towards full integration on-chip. Experimental data that can be obtained are time-resolved peptide profiles in a yeast culture, including information about the appearance of mating-factor-related peptides. We emphasize that the system operates without any manual intervention or pipetting steps, which allows for an improved overall sensitivity compared to non-automated alternatives. MS data confirmed previously reported aspects of the physiology of the yeast-mating process. Moreover, mating-factor breakdown products (as well as evidence for a potentially responsible protease) were found. PMID:23091722
Automated diagnosis of autism: in search of a mathematical marker.
Bhat, Shreya; Acharya, U Rajendra; Adeli, Hojjat; Bairy, G Muralidhar; Adeli, Amir
2014-01-01
Autism is a type of neurodevelopmental disorder affecting the memory, behavior, emotion, learning ability, and communication of an individual. Early detection of the abnormality, due to irregular processing in the brain, can be achieved using electroencephalograms (EEG). The variations in the EEG signals cannot be deciphered by mere visual inspection. Computer-aided diagnostic tools can be used to recognize the subtle and invisible information present in the irregular EEG pattern and diagnose autism. This paper presents a state-of-the-art review of automated EEG-based diagnosis of autism. Various time-domain, frequency-domain, time-frequency-domain, and nonlinear-dynamics methods for the analysis of autistic EEG signals are described briefly. A focus of the review is the use of nonlinear dynamics and chaos theory to discover mathematical biomarkers for the diagnosis of autism, analogous to biological markers. A combination of time-frequency and nonlinear dynamic analysis is the most effective approach to characterize the nonstationary and chaotic physiological signals for automated EEG-based diagnosis of autism spectrum disorder (ASD). The features extracted using these nonlinear methods can be used as mathematical markers to detect the early stage of autism and aid clinicians in their diagnosis. This will expedite the administration of appropriate therapies to treat the disorder.
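As one example of the nonlinear descriptors surveyed in such reviews, the snippet below computes a common formulation of the Katz fractal dimension for a simulated EEG epoch; both the formulation chosen and the epoch are illustrative only.

```python
# One common formulation of the Katz fractal dimension of a 1-D signal,
# applied here to a simulated epoch rather than recorded EEG.
import numpy as np

def katz_fd(x):
    dists = np.abs(np.diff(x))
    L = dists.sum()                    # total "length" of the waveform
    d = np.abs(x - x[0]).max()         # maximum distance from the first sample
    n = len(x) - 1                     # number of steps
    return np.log10(n) / (np.log10(n) + np.log10(d / L))

eeg_epoch = np.random.randn(1024)
print("Katz FD:", round(katz_fd(eeg_epoch), 3))
```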
Automatic mapping of the base of aquifer — A case study from Morrill, Nebraska
Gulbrandsen, Mats Lundh; Ball, Lyndsay B.; Minsley, Burke J.; Hansen, Thomas Mejer
2017-01-01
When a geologist sets up a geologic model, various types of disparate information may be available, such as exposures, boreholes, and (or) geophysical data. In recent years, the amount of geophysical data available has been increasing, a trend that is only expected to continue. It is nontrivial (and often, in practice, impossible) for the geologist to take all the details of the geophysical data into account when setting up a geologic model. We have developed an approach that allows for the objective quantification of information from geophysical data and borehole observations in a way that is easy to integrate in the geologic modeling process. This will allow the geologist to make a geologic interpretation that is consistent with the geophysical information at hand. We have determined that automated interpretation of geologic layer boundaries using information from boreholes and geophysical data alone can provide a good geologic layer model, even before manual interpretation has begun. The workflow is implemented on a set of boreholes and airborne electromagnetic (AEM) data from Morrill, Nebraska. From the borehole logs, information about the depth to the base of aquifer (BOA) is extracted and used together with the AEM data to map a surface that represents this geologic contact. Finally, a comparison between our automated approach and a previous manual mapping of the BOA in the region validates the quality of the proposed method and suggests that this workflow will allow a much faster and objective geologic modeling process that is consistent with the available data.
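A conceptual sketch of the surface-mapping step, not the published workflow: base-of-aquifer depth picks (as would come from borehole logs and AEM inversions) are interpolated onto a regular grid. All coordinates and depths below are invented.

```python
# Conceptual sketch: interpolate scattered base-of-aquifer (BOA) depth picks
# onto a regular grid to produce a continuous surface.
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(1)
xy = rng.uniform(0, 10_000, size=(200, 2))                   # pick locations (m)
boa_depth = 30 + 0.002 * xy[:, 0] + rng.normal(0, 2, 200)    # depth to BOA (m)

gx, gy = np.meshgrid(np.linspace(0, 10_000, 101),
                     np.linspace(0, 10_000, 101))
boa_surface = griddata(xy, boa_depth, (gx, gy), method="linear")
print(np.nanmin(boa_surface), np.nanmax(boa_surface))
```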
Ni, Yizhao; Lingren, Todd; Hall, Eric S; Leonard, Matthew; Melton, Kristin; Kirkendall, Eric S
2018-05-01
Timely identification of medication administration errors (MAEs) promises great benefits for mitigating medication errors and associated harm. Despite previous efforts utilizing computerized methods to monitor medication errors, sustaining effective and accurate detection of MAEs remains challenging. In this study, we developed a real-time MAE detection system and evaluated its performance prior to system integration into institutional workflows. Our prospective observational study included automated MAE detection of 10 high-risk medications and fluids for patients admitted to the neonatal intensive care unit at Cincinnati Children's Hospital Medical Center during a 4-month period. The automated system extracted real-time medication use information from the institutional electronic health records and identified MAEs using logic-based rules and natural language processing techniques. The MAE summary was delivered via a real-time messaging platform to promote reduction of patient exposure to potential harm. System performance was validated using a physician-generated gold standard of MAE events, and results were compared with those of current practice (incident reporting and trigger tools). Physicians identified 116 MAEs from 10,104 medication administrations during the study period. Compared to current practice, the sensitivity with automated MAE detection was improved significantly from 4.3% to 85.3% (P = .009), with a positive predictive value of 78.0%. Furthermore, the system showed potential to reduce patient exposure to harm, from 256 min to 35 min (P < .001). The automated system demonstrated improved capacity for identifying MAEs while guarding against alert fatigue. It also showed promise for reducing patient exposure to potential harm following MAE events.
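The logic-based side of such a detector can be as simple as a dose-tolerance rule; the toy check below is illustrative only and uses invented records and thresholds, not the deployed system's rules.

```python
# Toy rule-based MAE check: flag an administration whose documented amount
# deviates from the ordered dose by more than a tolerance. Illustrative data.
orders = {"order-1": {"drug": "gentamicin", "dose_mg": 10.0}}
administrations = [
    {"order_id": "order-1", "drug": "gentamicin", "given_mg": 10.2},
    {"order_id": "order-1", "drug": "gentamicin", "given_mg": 14.0},
]

TOLERANCE = 0.10   # allow 10% deviation from the ordered dose

for adm in administrations:
    ordered = orders[adm["order_id"]]["dose_mg"]
    deviation = abs(adm["given_mg"] - ordered) / ordered
    if deviation > TOLERANCE:
        print(f"Potential MAE: {adm['drug']} given {adm['given_mg']} mg "
              f"vs ordered {ordered} mg ({deviation:.0%} deviation)")
```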
Automated Design Tools for Integrated Mixed-Signal Microsystems (NeoCAD)
2005-02-01
The program developed Model Order Reduction (MOR) tools, system-level mixed-signal circuit synthesis and optimization tools, parasitic extraction tools, and fast time-domain mixed-signal circuit simulation (HAARSPICE) algorithms for the IC design flow. (Mission area: Command and Control. Keywords: mixed-signal circuit simulation, parasitic extraction, time-domain simulation, IC design flow, model order reduction. Only fragments of the report's front matter were recoverable.)
Holographic Labeling And Reading Machine For Authentication And Security Applications
Weber, David C.; Trolinger, James D.
1999-07-06
A holographic security label and automated reading machine for marking and subsequently authenticating any object, such as an identification badge, a pass, a ticket, a manufactured part, or a package, is described. The security label is extremely difficult to copy or even to read by unauthorized persons. The system comprises a holographic security label that has been created with a coded reference wave, whose specification can be kept secret. The label contains information that can be extracted only with the coded reference wave, which is derived from a holographic key; this restricts access to the information to the possessor of the key. A reading machine accesses the information contained in the label and compares it with data stored in the machine through the application of a joint transform correlator, which is also equipped with a reference hologram that adds additional security to the procedure.
NASA Astrophysics Data System (ADS)
Molinari, Filippo; Acharya, Rajendra; Zeng, Guang; Suri, Jasjit S.
2011-03-01
The carotid intima-media thickness (IMT) is the most widely used marker for the progression of atherosclerosis and the onset of cardiovascular disease. Computer-aided measurements improve accuracy but usually require user interaction. In this paper we characterize a new and completely automated technique for carotid segmentation and IMT measurement based on the merits of two previously developed techniques. We used an integrated approach of intelligent image feature extraction and line fitting for automatically locating the carotid artery in the image frame, followed by wall interface extraction based on a Gaussian edge operator. We call our system CARES. We validated CARES on a multi-institutional database of 300 carotid ultrasound images. IMT measurement bias was 0.032 +/- 0.141 mm, better than other automated techniques and comparable to that of user-driven methodologies. CARES processed 96% of the images, leading to a figure of merit of 95.7%. CARES ensures complete automation and high accuracy in IMT measurement; hence it could be a suitable clinical tool for processing large datasets in multicenter studies involving atherosclerosis.
Cest Analysis: Automated Change Detection from Very-High Remote Sensing Images
NASA Astrophysics Data System (ADS)
Ehlers, M.; Klonus, S.; Jarmer, T.; Sofina, N.; Michel, U.; Reinartz, P.; Sirmacek, B.
2012-08-01
Fast detection, visualization and assessment of change in areas of crisis or catastrophe are important requirements for the coordination and planning of help. Through the availability of new satellite and/or airborne sensors with very high spatial resolutions (e.g., WorldView, GeoEye), new remote sensing data are available for better detection, delineation and visualization of change. For automated change detection, a large number of algorithms have been proposed and developed. From previous studies, however, it is evident that to date no single algorithm has the potential to be a reliable change detector for all possible scenarios. This paper introduces the Combined Edge Segment Texture (CEST) analysis, a decision-tree based cooperative suite of algorithms for automated change detection that is especially designed for the new generation of satellites with very high spatial resolution. The method incorporates frequency-based filtering, texture analysis, and image segmentation techniques. For the frequency analysis, different band-pass filters can be applied to identify the relevant frequency information for change detection. After transforming the multitemporal images via a fast Fourier transform (FFT) and applying the most suitable band-pass filter, different methods are available to extract changed structures: differencing and correlation in the frequency domain, and correlation and edge detection in the spatial domain. Best results are obtained using edge extraction. For the texture analysis, different 'Haralick' parameters can be calculated (e.g., energy, correlation, contrast, inverse difference moment), with 'energy' so far providing the most accurate results. These algorithms are combined with a prior segmentation of the image data as well as with morphological operations for a final binary change result. A rule-based combination (CEST) of the change algorithms is applied to calculate the probability of change for a particular location. CEST was tested with high-resolution satellite images of the crisis area of Darfur (Sudan). CEST results are compared with a number of standard algorithms for automated change detection, such as image differencing, image ratioing, principal component analysis, the delta cue technique, and post-classification change detection. The new combined method shows superior results, with improvements in accuracy of between 15% and 45%.
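To make the texture ingredient concrete, the sketch below computes the 'energy' Haralick measure per window for two co-registered images and flags windows where it changes strongly. Images, window size, and threshold are placeholders, and scikit-image (>= 0.19) is assumed.

```python
# Sketch of window-wise Haralick 'energy' change flagging between two
# co-registered images. All inputs and thresholds are synthetic placeholders.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def window_energy(img, size=32):
    out = np.zeros((img.shape[0] // size, img.shape[1] // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            win = img[i*size:(i+1)*size, j*size:(j+1)*size]
            glcm = graycomatrix(win, [1], [0], levels=256,
                                symmetric=True, normed=True)
            out[i, j] = graycoprops(glcm, "energy")[0, 0]
    return out

rng = np.random.default_rng(0)
before = rng.integers(0, 256, (256, 256), dtype=np.uint8)
after = rng.integers(0, 256, (256, 256), dtype=np.uint8)

change = np.abs(window_energy(before) - window_energy(after)) > 0.01
print(f"{change.sum()} of {change.size} windows flagged as changed")
```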
2017-03-01
DOD Major Automated Information Systems: Improvements Can Be Made in Applying Leading Practices for Managing Risk and Testing. Highlights of GAO-17-322, a report to congressional committees, United States Government Accountability Office, March 2017.
Epileptic seizure detection in EEG signal with GModPCA and support vector machine.
Jaiswal, Abeg Kumar; Banka, Haider
2017-01-01
Epilepsy is one of the most common neurological disorders and is characterized by recurrent seizures. Electroencephalograms (EEGs) record neural activity and can be used to detect epilepsy. Visual inspection of an EEG signal for epileptic seizure detection is a time-consuming process and may lead to human error; therefore, a number of automated seizure detection frameworks have recently been proposed to replace these traditional methods. Feature extraction and classification are two important steps in these procedures. Feature extraction focuses on finding informative features that can be used for classification and correct decision-making; proposing effective feature extraction techniques for seizure detection is therefore of great significance. Principal Component Analysis (PCA) is a dimensionality reduction technique used in different fields of pattern recognition, including EEG signal classification. Global modular PCA (GModPCA) is a variation of PCA. In this paper, an effective framework combining GModPCA and a Support Vector Machine (SVM) is presented for epileptic seizure detection in EEG signals. Feature extraction is performed with GModPCA, whereas an SVM trained with a radial basis function kernel performs the classification between seizure and nonseizure EEG signals. Seven different experimental cases were conducted on the benchmark epilepsy EEG dataset, and system performance was evaluated using 10-fold cross-validation. In addition, we prove analytically that GModPCA has lower time and space complexity than PCA. The experimental results show that EEG signals have strong inter-sub-pattern correlations. GModPCA and SVM were able to achieve 100% accuracy for the classification between normal and epileptic signals, and the classification results of the proposed approach were better than those of several existing methods reported in the literature. This study suggests that GModPCA and SVM could be used for automated epileptic seizure detection in EEG signals.
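A sketch of the pipeline's shape under stated assumptions: ordinary PCA stands in for GModPCA purely to show the dimensionality-reduction + RBF-SVM + 10-fold cross-validation arrangement; the EEG segments are simulated.

```python
# Sketch: PCA (substituted for GModPCA) followed by an RBF-kernel SVM,
# evaluated with 10-fold cross-validation on simulated EEG segments.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4097))            # 200 EEG segments of 4097 samples
y = rng.integers(0, 2, size=200)            # 0 = normal, 1 = seizure

pipe = make_pipeline(StandardScaler(), PCA(n_components=30), SVC(kernel="rbf"))
print("10-fold accuracy:", cross_val_score(pipe, X, y, cv=10).mean())
```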
NASA Astrophysics Data System (ADS)
Okyay, U.; Glennie, C. L.; Khan, S.
2017-12-01
Owing to the advent of terrestrial laser scanners (TLS), high-density point cloud data have become increasingly available to the geoscience research community. Research groups have started producing their own point clouds for various applications, gradually shifting their emphasis from obtaining the data towards extracting more meaningful information from the point clouds. Extracting fracture properties from three-dimensional data in a (semi-)automated manner has been an active area of research in the geosciences. Several studies have developed processing algorithms for extracting planar surfaces only. In comparison, (semi-)automated identification of fracture traces at the outcrop scale, which could be used for mapping fracture distribution, has not been investigated as frequently. Understanding the spatial distribution and configuration of natural fractures is of particular importance, as they directly influence fluid flow through the host rock. Surface roughness, typically defined as the deviation of a natural surface from a reference datum, has become an important metric in geoscience research, especially with the increasing density and accuracy of point clouds. In the study presented herein, a surface roughness model was employed to identify fracture traces and their distribution on an ophiolite outcrop in Oman. Surface roughness calculations were performed using orthogonal distance regression over various grid intervals. The results demonstrate that surface roughness can identify outcrop-scale fracture traces from which fracture distribution and density maps can be generated. However, considering outcrop conditions and properties and the purpose of the application, the definition of an adequate grid interval for the surface roughness model and the selection of threshold values for the distribution maps are not straightforward and require user intervention and interpretation.
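A rough sketch of the grid-wise roughness idea (not the exact published model): fit a plane to the points in each grid cell by SVD, a form of orthogonal distance regression, and use the RMS orthogonal residual as that cell's roughness. The point cloud below is synthetic.

```python
# Grid-wise roughness sketch: per-cell plane fit via SVD and RMS orthogonal
# residual as the roughness value. The point cloud is synthetic.
import numpy as np

def cell_roughness(points):
    """RMS orthogonal distance of points from their best-fit plane."""
    centered = points - points.mean(axis=0)
    normal = np.linalg.svd(centered, full_matrices=False)[2][-1]  # plane normal
    return np.sqrt(np.mean((centered @ normal) ** 2))

rng = np.random.default_rng(0)
cloud = np.column_stack([rng.uniform(0, 10, 5000),
                         rng.uniform(0, 10, 5000),
                         rng.normal(0, 0.05, 5000)])   # near-planar patch

grid = 1.0                                             # grid interval (m)
ij = np.floor(cloud[:, :2] / grid).astype(int)
for cell in {tuple(k) for k in ij[:20]}:               # a few example cells
    pts = cloud[(ij == cell).all(axis=1)]
    if len(pts) >= 10:
        print(cell, round(cell_roughness(pts), 4))
```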
Automated solid-phase extraction workstations combined with quantitative bioanalytical LC/MS.
Huang, N H; Kagel, J R; Rossi, D T
1999-03-01
An automated solid-phase extraction workstation was used to develop, characterize and validate an LC/MS/MS method for quantifying a novel lipid-regulating drug in dog plasma. Method development was facilitated by workstation functions that allowed wash solvents of varying organic composition to be mixed and tested automatically. Precision estimates for this approach were within 9.8% relative standard deviation (RSD) across the calibration range. Accuracy for replicate determinations of quality controls was between -7.2% and +6.2% relative error (RE) over 5-1000 ng/ml. Recoveries were evaluated for a wide variety of wash solvents, elution solvents and sorbents; optimized recoveries were generally >95%. A sample throughput benchmark for the method was approximately 8 min per sample, and because of parallel sample processing, 100 samples were extracted in less than 120 min. The approach has proven useful with LC/MS/MS using a multiple reaction monitoring (MRM) approach.
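For reference, the two validation statistics quoted above are computed as follows; the replicate values and nominal concentration are made up.

```python
# Illustration of the precision (%RSD) and accuracy (%RE) statistics used in
# bioanalytical method validation, on invented replicate data.
import numpy as np

replicates = np.array([98.2, 102.5, 95.9, 101.1, 99.4])   # measured, ng/ml
nominal = 100.0                                            # nominal QC level, ng/ml

rsd = 100 * replicates.std(ddof=1) / replicates.mean()
re = 100 * (replicates.mean() - nominal) / nominal
print(f"precision: {rsd:.1f}% RSD, accuracy: {re:+.1f}% RE")
```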
Automated Solar Flare Detection and Feature Extraction in High-Resolution and Full-Disk Hα Images
NASA Astrophysics Data System (ADS)
Yang, Meng; Tian, Yu; Liu, Yangyi; Rao, Changhui
2018-05-01
In this article, an automated solar flare detection method applied to both full-disk and local high-resolution Hα images is proposed. An adaptive gray threshold and an area threshold are used to segment the flare region. Features of each detected flare event are extracted, e.g. the start, peak, and end times, the importance class, and the brightness class. Experimental results have verified that the proposed method obtains more stable and accurate segmentation results than previous works on full-disk images from Big Bear Solar Observatory (BBSO) and Kanzelhöhe Observatory for Solar and Environmental Research (KSO), and satisfactory segmentation results on high-resolution images from the Goode Solar Telescope (GST). Moreover, the extracted flare features correlate well with the data given by KSO. The method may enable more detailed statistical analyses of Hα solar flares.
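A simplified sketch of the two-threshold idea: an adaptive gray threshold selects bright pixels and an area threshold discards small regions. The Hα frame and the threshold settings are synthetic placeholders, not the paper's values.

```python
# Two-threshold flare segmentation sketch: adaptive gray threshold plus an
# area threshold on connected regions. The frame is synthetic.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
frame = rng.normal(1000, 50, (512, 512))
frame[200:220, 300:330] += 600                    # injected "flare" kernel

gray_thresh = frame.mean() + 5 * frame.std()      # adaptive gray threshold
labels, n = ndimage.label(frame > gray_thresh)
areas = ndimage.sum(np.ones_like(frame), labels, index=list(range(1, n + 1)))

AREA_THRESH = 100                                  # pixels
flares = [i + 1 for i, a in enumerate(areas) if a >= AREA_THRESH]
print("flare regions:", flares, "areas:", [int(a) for a in areas])
```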
Biomorphic networks: approach to invariant feature extraction and segmentation for ATR
NASA Astrophysics Data System (ADS)
Baek, Andrew; Farhat, Nabil H.
1998-10-01
Invariant features in two-dimensional binary images are extracted in a single-layer network of locally coupled spiking (pulsating) model neurons with a prescribed synapto-dendritic response. The feature vector for an image is represented as invariant structure in the aggregate histogram of interspike intervals, obtained by computing the time intervals between successive spikes produced by each neuron over a given period of time and combining such intervals from all neurons in the network into a histogram. Simulation results show that the feature vectors are more pattern-specific and more invariant under translation, rotation, and change in scale or intensity than those achieved in earlier work. We also describe an application of such networks to the segmentation of line (edge-enhanced or silhouette) images. The biomorphic spiking network's capabilities in segmentation and invariant feature extraction may, when combined, prove valuable in Automated Target Recognition (ATR) and other automated object recognition systems.
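The feature construction described above reduces, in outline, to pooling inter-spike intervals across all neurons and histogramming them into a fixed-length vector; the sketch below uses randomly generated spike times as stand-ins for the network's output.

```python
# Aggregate inter-spike-interval histogram as a feature vector, using random
# spike times in place of a simulated spiking network.
import numpy as np

rng = np.random.default_rng(0)
spike_trains = [np.sort(rng.uniform(0, 1.0, rng.integers(20, 40)))
                for _ in range(100)]                       # 100 model neurons

isis = np.concatenate([np.diff(t) for t in spike_trains])  # inter-spike intervals
feature_vector, _ = np.histogram(isis, bins=32, range=(0, 0.2), density=True)
print(feature_vector.round(2))
```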
Singha, Suman; Ressel, Rudolf
2016-11-15
Use of polarimetric SAR data for offshore pollution monitoring is relatively new and shows great potential for operational offshore platform monitoring. This paper describes the development of an automated oil spill detection chain for operational purposes based on C-band (RADARSAT-2) and X-band (TerraSAR-X) fully polarimetric images, wherein we use polarimetric features to characterize oil spills and look-alikes. A number of near-coincident TerraSAR-X and RADARSAT-2 images were acquired over offshore platforms. Ten polarimetric feature parameters were extracted from different types of oil and 'look-alike' spots and divided into training and validation datasets. The extracted features were then used to develop a pixel-based Artificial Neural Network classifier. Mutual information contents among the extracted features were assessed, and the feature parameters were ranked according to their ability to discriminate between oil spill and look-alike spots. Polarimetric features such as Scattering Diversity, Surface Scattering Fraction and Span proved to be most suitable for operational services. Copyright © 2016 Elsevier Ltd. All rights reserved.
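Not the operational chain, just an illustration of the two steps named in the abstract: ranking features by mutual information with the class label and training a small neural-network classifier on them. The data are simulated.

```python
# Mutual-information feature ranking followed by a small MLP classifier,
# on simulated pixel features (oil vs look-alike).
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))            # 10 polarimetric features per pixel
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 1, 5000) > 0).astype(int)

ranking = np.argsort(mutual_info_classif(X, y, random_state=0))[::-1]
print("feature ranking (best first):", ranking)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=500,
                    random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```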
CAMERA: An integrated strategy for compound spectra extraction and annotation of LC/MS data sets
Kuhl, Carsten; Tautenhahn, Ralf; Böttcher, Christoph; Larson, Tony R.; Neumann, Steffen
2013-01-01
Liquid chromatography coupled to mass spectrometry is routinely used for metabolomics experiments. In contrast to the fairly routine and automated data acquisition steps, subsequent compound annotation and identification require extensive manual analysis and thus form a major bottleneck in data interpretation. Here we present CAMERA, a Bioconductor package integrating algorithms to extract compound spectra, annotate isotope and adduct peaks, and propose the accurate compound mass even in highly complex data. To evaluate the algorithms, we compared the annotation of CAMERA against a manually defined annotation for a mixture of known compounds spiked into a complex matrix at different concentrations. CAMERA successfully extracted accurate masses for 89.7% and 90.3% of the annotatable compounds in positive and negative ion mode, respectively. Furthermore, we present a novel annotation approach that combines spectral information of data acquired in opposite ion modes to further improve the annotation rate. We demonstrate the utility of CAMERA in two different, easily adoptable plant metabolomics experiments, where the application of CAMERA drastically reduced the amount of manual analysis. PMID:22111785
The research and development of information services within the USSR, reported at the 3rd All-Union Conference on information retrieval systems and automated processing of scientific and technical information, is discussed.
Extraction of the number of peroxisomes in yeast cells by automated image analysis.
Niemistö, Antti; Selinummi, Jyrki; Saleem, Ramsey; Shmulevich, Ilya; Aitchison, John; Yli-Harja, Olli
2006-01-01
An automated image analysis method for extracting the number of peroxisomes in yeast cells is presented. Two images of the cell population are required for the method: a bright field microscope image from which the yeast cells are detected and the respective fluorescent image from which the number of peroxisomes in each cell is found. The segmentation of the cells is based on clustering the local mean-variance space. The watershed transformation is thereafter employed to separate cells that are clustered together. The peroxisomes are detected by thresholding the fluorescent image. The method is tested with several images of a budding yeast Saccharomyces cerevisiae population, and the results are compared with manually obtained results.
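A compact sketch of the counting step only (not the mean-variance clustering or watershed stages): given an already-labeled cell mask and a fluorescence channel, threshold the fluorescence and count spots per cell. The inputs are synthetic placeholders.

```python
# Count thresholded fluorescent spots per labeled cell region. Both the cell
# mask and fluorescence image are synthetic.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
cells = np.zeros((128, 128), dtype=int)
cells[10:50, 10:50] = 1                      # pretend segmented cell #1
cells[70:110, 60:100] = 2                    # pretend segmented cell #2

fluor = rng.normal(100, 5, cells.shape)
fluor[20:23, 20:23] += 200                   # peroxisome-like spots
fluor[30:33, 40:43] += 200
fluor[80:83, 70:73] += 200

spots, _ = ndimage.label(fluor > fluor.mean() + 6 * fluor.std())
for cell_id in (1, 2):
    n = len(np.unique(spots[(cells == cell_id) & (spots > 0)]))
    print(f"cell {cell_id}: {n} peroxisome(s)")
```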
Methods for automatically analyzing humpback song units.
Rickwood, Peter; Taylor, Andrew
2008-03-01
This paper presents mathematical techniques for automatically extracting and analyzing bioacoustic signals. Automatic techniques are described for isolation of target signals from background noise, extraction of features from target signals and unsupervised classification (clustering) of the target signals based on these features. The only user-provided input, other than raw sound, is an initial set of signal-processing and control parameters. Of particular note is that the number of signal categories is determined automatically. The techniques, applied to hydrophone recordings of humpback whales (Megaptera novaeangliae), produce promising initial results, suggesting that they may be of use in automated analysis of not only humpbacks, but possibly also in other bioacoustic settings where automated analysis is desirable.
Automated Extraction of Secondary Flow Features
NASA Technical Reports Server (NTRS)
Dorney, Suzanne M.; Haimes, Robert
2005-01-01
The use of Computational Fluid Dynamics (CFD) has become standard practice in the design and development of the major components used for air and space propulsion. To aid in the post-processing and analysis phase of CFD, many researchers now use automated feature extraction utilities. These tools can be used to detect the existence of such features as shocks, vortex cores, and separation and re-attachment lines. The existence of secondary flow is another feature of significant importance to CFD engineers. Although the concept of secondary flow is relatively well understood, there is no commonly accepted mathematical definition for it. This paper presents a definition for secondary flow and one approach for automatically detecting and visualizing it.
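One possible working definition consistent with the discussion above (not necessarily the authors' definition) treats secondary flow as the component of the velocity field perpendicular to a chosen primary-flow direction, as sketched below with random placeholder data.

```python
# Decompose a velocity field into through-flow and secondary (in-plane)
# components relative to an assumed primary-flow direction.
import numpy as np

velocity = np.random.rand(1000, 3)                 # velocity vectors at nodes
primary = np.array([1.0, 0.0, 0.0])                # assumed primary-flow direction
primary /= np.linalg.norm(primary)

axial = velocity @ primary                         # through-flow component
secondary = velocity - np.outer(axial, primary)    # secondary-flow component
print("mean secondary-flow magnitude:", np.linalg.norm(secondary, axis=1).mean())
```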