Sample records for data-driven objects

  1. Objective, Quantitative, Data-Driven Assessment of Chemical Probes.

    PubMed

    Antolin, Albert A; Tym, Joseph E; Komianou, Angeliki; Collins, Ian; Workman, Paul; Al-Lazikani, Bissan

    2018-02-15

Chemical probes are essential tools for understanding biological systems and for target validation, yet selecting probes for biomedical research is rarely based on objective assessment of all potential compounds. Here, we describe the Probe Miner: Chemical Probes Objective Assessment resource, capitalizing on the plethora of public medicinal chemistry data to empower quantitative, objective, data-driven evaluation of chemical probes. We assess >1.8 million compounds for their suitability as chemical tools against 2,220 human targets and dissect the biases and limitations encountered. Probe Miner represents a valuable resource to aid the identification of potential chemical probes, particularly when used alongside expert curation. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  2. Data-driven indexing mechanism for the recognition of polyhedral objects

    NASA Astrophysics Data System (ADS)

    McLean, Stewart; Horan, Peter; Caelli, Terry M.

    1992-02-01

    This paper is concerned with the problem of searching large model databases. To date, most object recognition systems have concentrated on the problem of matching using simple searching algorithms. This is quite acceptable when the number of object models is small. However, in the future, general purpose computer vision systems will be required to recognize hundreds or perhaps thousands of objects and, in such circumstances, efficient searching algorithms will be needed. The problem of searching a large model database is one which must be addressed if future computer vision systems are to be at all effective. In this paper we present a method we call data-driven feature-indexed hypothesis generation as one solution to the problem of searching large model databases.
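
    The feature-indexed hypothesis generation the record describes can be sketched as a hash index from local features to candidate models, followed by voting; a minimal sketch in Python (the model names and feature tuples below are invented for illustration and are not from the paper):

    ```python
    from collections import defaultdict

    def build_index(models):
        """Map each local feature key to the set of models containing it."""
        index = defaultdict(set)
        for name, features in models.items():
            for f in features:
                index[f].add(name)
        return index

    def hypothesize(index, observed_features):
        """Vote for candidate models; only top-voted models would be verified further."""
        votes = defaultdict(int)
        for f in observed_features:
            for name in index.get(f, ()):
                votes[name] += 1
        return sorted(votes, key=votes.get, reverse=True)

    # Toy model database of polyhedral objects, keyed by (feature type, angle).
    models = {
        "cube":    {("corner", 90), ("edge", 4)},
        "wedge":   {("corner", 90), ("corner", 45)},
        "pyramid": {("corner", 45), ("apex", 4)},
    }
    index = build_index(models)
    print(hypothesize(index, [("corner", 45), ("apex", 4)]))  # pyramid ranked first
    ```

    The point of the index is that matching cost grows with the number of observed features, not with the number of stored models, which is what makes large model databases searchable.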

  3. A metadata-driven approach to data repository design.

    PubMed

    Harvey, Matthew J; McLean, Andrew; Rzepa, Henry S

    2017-01-01

    The design and use of a metadata-driven data repository for research data management is described. Metadata is collected automatically during the submission process whenever possible and is registered with DataCite in accordance with their current metadata schema, in exchange for a persistent digital object identifier. Two examples of data preview are illustrated, including the demonstration of a method for integration with commercial software that confers rich domain-specific data analytics without introducing customisation into the repository itself.

  4. Object-Driven and Temporal Action Rules Mining

    ERIC Educational Resources Information Center

    Hajja, Ayman

    2013-01-01

    In this thesis, I present my complete research work in the field of action rules, more precisely object-driven and temporal action rules. The drive behind the introduction of object-driven and temporally based action rules is to bring forth an adapted approach to extract action rules from a subclass of systems that have a specific nature, in which…

  5. Data-Driven Districts.

    ERIC Educational Resources Information Center

    LaFee, Scott

    2002-01-01

    Describes the use of data-driven decision-making in four school districts: Plainfield Public Schools, Plainfield, New Jersey; Palo Alto Unified School District, Palo Alto, California; Francis Howell School District in eastern Missouri, northwest of St. Louis; and Rio Rancho Public Schools, near Albuquerque, New Mexico. Includes interviews with the…

  6. Zero curvature-surface driven small objects

    NASA Astrophysics Data System (ADS)

    Dou, Xiaoxiao; Li, Shanpeng; Liu, Jianlin

    2017-08-01

    In this study, we investigate the spontaneous migration of small objects driven by surface tension on a catenoid, formed by a layer of soap constrained by two rings. Although the average curvature of the catenoid is zero at each point, the small objects always migrate to the position near the ring. The force and energy analyses have been performed to uncover the mechanism, and it is found that the small objects distort the local shape of the liquid film, thus making the whole system energetically favorable. These findings provide some inspiration to design microfluidics, aquatic robotics, and miniature boats.

  7. Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application

    PubMed Central

    Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H

    2017-01-01

    Background The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. Objective The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Methods Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. Results The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. Conclusions This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities. PMID:28903894

  8. Sensor modeling and demonstration of a multi-object spectrometer for performance-driven sensing

    NASA Astrophysics Data System (ADS)

    Kerekes, John P.; Presnar, Michael D.; Fourspring, Kenneth D.; Ninkov, Zoran; Pogorzala, David R.; Raisanen, Alan D.; Rice, Andrew C.; Vasquez, Juan R.; Patel, Jeffrey P.; MacIntyre, Robert T.; Brown, Scott D.

    2009-05-01

    A novel multi-object spectrometer (MOS) is being explored for use as an adaptive performance-driven sensor that tracks moving targets. Developed originally for astronomical applications, the instrument utilizes an array of micromirrors to reflect light to a panchromatic imaging array. When an object of interest is detected the individual micromirrors imaging the object are tilted to reflect the light to a spectrometer to collect a full spectrum. This paper will present example sensor performance from empirical data collected in laboratory experiments, as well as our approach in designing optical and radiometric models of the MOS channels and the micromirror array. Simulation of moving vehicles in a highfidelity, hyperspectral scene is used to generate a dynamic video input for the adaptive sensor. Performance-driven algorithms for feature-aided target tracking and modality selection exploit multiple electromagnetic observables to track moving vehicle targets.

  9. Driving Ms. Data: Creating Data-Driven Possibilities

    ERIC Educational Resources Information Center

    Hoffman, Richard

    2005-01-01

This article describes how data-driven Web sites help schools and districts maximize their IT resources by making online content more "self-service" for users. It shows how to set up the capacity to create data-driven sites. By definition, a data-driven Web site is one in which the content comes from some back-end data source, such as a…

  10. KNMI DataLab experiences in serving data-driven innovations

    NASA Astrophysics Data System (ADS)

    Noteboom, Jan Willem; Sluiter, Raymond

    2016-04-01

Climate change research and innovations in weather forecasting rely more and more on (Big) data. Besides increasing data from traditional sources (such as observation networks, radars and satellites), the use of open data, crowd-sourced data and the Internet of Things (IoT) is emerging. To deploy these sources of data optimally in our services and products, KNMI has established a DataLab to serve data-driven innovations in collaboration with public and private sector partners. Big data management, data integration, data analytics including machine learning and data visualization techniques are playing an important role in the DataLab. Cross-domain data-driven innovations that arise from public-private collaborative projects and research programmes can be explored, experimented with, and/or piloted by the KNMI DataLab. Furthermore, advice can be requested on (Big) data techniques and data sources. In support of collaborative (Big) data science activities, scalable environments are offered with facilities for data integration, data analysis and visualization. In addition, Data Science expertise is provided directly or from a pool of internal and external experts. At the EGU conference, experiences gained and best practices in operating the KNMI DataLab to optimally serve data-driven innovations for weather and climate applications are presented.

  11. Asynchronous Data Retrieval from an Object-Oriented Database

    NASA Astrophysics Data System (ADS)

    Gilbert, Jonathan P.; Bic, Lubomir

    We present an object-oriented semantic database model which, similar to other object-oriented systems, combines the virtues of four concepts: the functional data model, a property inheritance hierarchy, abstract data types and message-driven computation. The main emphasis is on the last of these four concepts. We describe generic procedures that permit queries to be processed in a purely message-driven manner. A database is represented as a network of nodes and directed arcs, in which each node is a logical processing element, capable of communicating with other nodes by exchanging messages. This eliminates the need for shared memory and for centralized control during query processing. Hence, the model is suitable for implementation on a multiprocessor computer architecture, consisting of large numbers of loosely coupled processing elements.
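
    The purely message-driven query processing described above can be sketched as nodes that hold no shared state and communicate only through messages; a toy Python simulation (the network, arc labels, and query shape are invented for illustration, not taken from the paper):

    ```python
    from collections import deque

    class Node:
        """A logical processing element that communicates only via messages."""
        def __init__(self, name, value=None):
            self.name, self.value = name, value
            self.arcs = {}          # arc label -> target node name

    def run_query(nodes, start, path):
        """Follow a path of arc labels from `start`, purely by message passing.

        The mailbox stands in for inter-processor messages: no node ever
        reads another node's state directly, and no central controller
        coordinates the traversal.
        """
        mailbox = deque([(start, tuple(path))])
        while mailbox:
            name, remaining = mailbox.popleft()
            node = nodes[name]
            if not remaining:
                return node.value   # query answered at this node
            label, rest = remaining[0], remaining[1:]
            if label in node.arcs:
                mailbox.append((node.arcs[label], rest))
        return None                 # no node could continue the query

    nodes = {n: Node(n) for n in ("emp1", "dept1", "city1")}
    nodes["emp1"].arcs["works_in"] = "dept1"
    nodes["dept1"].arcs["located_in"] = "city1"
    nodes["city1"].value = "Irvine"
    print(run_query(nodes, "emp1", ["works_in", "located_in"]))  # Irvine
    ```

    Because each step touches only one node and its outgoing arcs, many such queries could proceed concurrently on loosely coupled processing elements, which is the property the abstract emphasizes.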

  12. Field Model: An Object-Oriented Data Model for Fields

    NASA Technical Reports Server (NTRS)

    Moran, Patrick J.

    2001-01-01

We present an extensible, object-oriented data model designed for field data entitled Field Model (FM). FM objects can represent a wide variety of fields, including fields of arbitrary dimension and node type. FM can also handle time-series data. FM achieves generality through carefully selected topological primitives and through an implementation that leverages the potential of templated C++. FM supports fields where the node values are paired with any cell type. Thus FM can represent data where the field nodes are paired with the vertices ("vertex-centered" data), fields where the nodes are paired with the D-dimensional cells in R(sup D) (often called "cell-centered" data), as well as fields where nodes are paired with edges or other cell types. FM is designed to effectively handle very large data sets; in particular FM employs a demand-driven evaluation strategy that works especially well with large field data. Finally, the interfaces developed for FM have the potential to effectively abstract field data based on adaptive meshes. We present initial results with a triangular adaptive grid in R(sup 2) and discuss how the same design abstractions would work equally well with other adaptive-grid variations, including meshes in R(sup 3).
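
    The demand-driven evaluation strategy mentioned in the record can be illustrated with a toy lazily evaluated field; the class and its API below are hypothetical stand-ins, not FM's actual templated C++ interface:

    ```python
    class LazyField:
        """Demand-driven field in the spirit of the abstract: node values are
        computed only when a node is actually requested, then cached, so a
        very large field never needs to be materialized in full."""
        def __init__(self, compute):
            self.compute = compute
            self.cache = {}
            self.evaluations = 0    # bookkeeping to show laziness

        def __getitem__(self, node):
            if node not in self.cache:
                self.evaluations += 1
                self.cache[node] = self.compute(node)
            return self.cache[node]

    # Vertex-centered toy field: value derived from the vertex coordinates.
    field = LazyField(lambda v: v[0] ** 2 + v[1] ** 2)
    print(field[(3, 4)], field[(3, 4)], field.evaluations)  # 25 25 1
    ```

    Only the requested vertex is ever evaluated, and the second access hits the cache; with cell-centered data the keys would simply be cell identifiers instead of vertex coordinates.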

  13. Data Driven Decision Making in the Social Studies

    ERIC Educational Resources Information Center

    Ediger, Marlow

    2010-01-01

    Data driven decision making emphasizes the importance of the teacher using objective sources of information in developing the social studies curriculum. Too frequently, decisions of teachers have been made based on routine and outdated methods of teaching. Valid and reliable tests used to secure results from pupil learning make for better…

  14. Temporal Data-Driven Sleep Scheduling and Spatial Data-Driven Anomaly Detection for Clustered Wireless Sensor Networks

    PubMed Central

    Li, Gang; He, Bin; Huang, Hongwei; Tang, Limin

    2016-01-01

    The spatial–temporal correlation is an important feature of sensor data in wireless sensor networks (WSNs). Most of the existing works based on the spatial–temporal correlation can be divided into two parts: redundancy reduction and anomaly detection. These two parts are pursued separately in existing works. In this work, the combination of temporal data-driven sleep scheduling (TDSS) and spatial data-driven anomaly detection is proposed, where TDSS can reduce data redundancy. The TDSS model is inspired by transmission control protocol (TCP) congestion control. Based on long and linear cluster structure in the tunnel monitoring system, cooperative TDSS and spatial data-driven anomaly detection are then proposed. To realize synchronous acquisition in the same ring for analyzing the situation of every ring, TDSS is implemented in a cooperative way in the cluster. To keep the precision of sensor data, spatial data-driven anomaly detection based on the spatial correlation and Kriging method is realized to generate an anomaly indicator. The experiment results show that cooperative TDSS can realize non-uniform sensing effectively to reduce the energy consumption. In addition, spatial data-driven anomaly detection is quite significant for maintaining and improving the precision of sensor data. PMID:27690035
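
    The abstract says TDSS is inspired by TCP congestion control but does not give the update rule; one plausible AIMD-style sketch (all parameter names and values below are invented, not the paper's model):

    ```python
    def next_sleep_interval(interval, change_detected,
                            t_min=1.0, t_max=60.0, step=2.0, backoff=0.5):
        """AIMD-style update loosely analogous to TCP congestion control:
        lengthen the sleep interval additively while readings are stable
        (i.e., redundant), and cut it multiplicatively when a significant
        change is sensed, so the node samples densely where it matters."""
        if change_detected:
            interval *= backoff      # sample faster after a change
        else:
            interval += step         # save energy while data is redundant
        return max(t_min, min(t_max, interval))

    interval = 10.0
    for changed in (False, False, True):
        interval = next_sleep_interval(interval, changed)
    print(interval)  # 10 -> 12 -> 14 -> 7.0
    ```

    The clamp to [t_min, t_max] keeps the node both responsive (never sleeping indefinitely) and bounded in energy use, mirroring how TCP bounds its congestion window.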

  15. Authoring Data-Driven Videos with DataClips.

    PubMed

    Amini, Fereshteh; Riche, Nathalie Henry; Lee, Bongshin; Monroy-Hernandez, Andres; Irani, Pourang

    2017-01-01

    Data videos, or short data-driven motion graphics, are an increasingly popular medium for storytelling. However, creating data videos is difficult as it involves pulling together a unique combination of skills. We introduce DataClips, an authoring tool aimed at lowering the barriers to crafting data videos. DataClips allows non-experts to assemble data-driven "clips" together to form longer sequences. We constructed the library of data clips by analyzing the composition of over 70 data videos produced by reputable sources such as The New York Times and The Guardian. We demonstrate that DataClips can reproduce over 90% of our data videos corpus. We also report on a qualitative study comparing the authoring process and outcome achieved by (1) non-experts using DataClips, and (2) experts using Adobe Illustrator and After Effects to create data-driven clips. Results indicated that non-experts are able to learn and use DataClips with a short training period. In the span of one hour, they were able to produce more videos than experts using a professional editing tool, and their clips were rated similarly by an independent audience.

  16. Making Data-Driven Decisions: Silent Reading

    ERIC Educational Resources Information Center

    Trudel, Heidi

    2007-01-01

    Due in part to conflicting opinions and research results, the practice of sustained silent reading (SSR) in schools has been questioned. After a frustrating experience with SSR, the author of this article began a data-driven decision-making process to gain new insights on how to structure silent reading in a classroom, including a comparison…

  17. Consistent data-driven computational mechanics

    NASA Astrophysics Data System (ADS)

    González, D.; Chinesta, F.; Cueto, E.

    2018-05-01

We present a novel method, within the realm of data-driven computational mechanics, to obtain reliable and thermodynamically sound simulations from experimental data. We thus avoid the need to fit any phenomenological model in the construction of the simulation model. This kind of technique opens unprecedented possibilities in the framework of data-driven application systems and, particularly, in the paradigm of Industry 4.0.

  18. Data-driven discovery of new Dirac semimetal materials

    NASA Astrophysics Data System (ADS)

    Yan, Qimin; Chen, Ru; Neaton, Jeffrey

In recent years, a significant amount of materials property data from high-throughput computations based on density functional theory (DFT) and the application of database technologies have enabled the rise of data-driven materials discovery. In this work, we initiate the extension of the data-driven materials discovery framework to the realm of topological semimetal materials and to accelerate the discovery of novel Dirac semimetals. We implement currently available workflows and develop new ones to data-mine the Materials Project database for novel Dirac semimetals with desirable band structures and symmetry protected topological properties. This data-driven effort relies on the successful development of several automatic data generation and analysis tools, including a workflow for the automatic identification of topological invariants and pattern recognition techniques to find specific features in a massive number of computed band structures. Utilizing this approach, we successfully identified more than 15 novel Dirac point and Dirac nodal line systems that have not been theoretically predicted or experimentally identified. This work is supported by the Materials Project Predictive Modeling Center through the U.S. Department of Energy, Office of Basic Energy Sciences, Materials Sciences and Engineering Division, under Contract No. DE-AC02-05CH11231.

  19. Data-Driven Hierarchical Structure Kernel for Multiscale Part-Based Object Recognition

    PubMed Central

    Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Zheng, Yuan F.

    2017-01-01

Detecting generic object categories in images and videos is a fundamental issue in computer vision. However, it faces the challenges from interclass and intraclass diversity, as well as distortions caused by viewpoints, poses, deformations, and so on. To handle object variations, this paper constructs a structure kernel and proposes a multiscale part-based model incorporating the discriminative power of kernels. The structure kernel would measure the resemblance of part-based objects in three aspects: 1) the global similarity term to measure the resemblance of the global visual appearance of relevant objects; 2) the part similarity term to measure the resemblance of the visual appearance of distinctive parts; and 3) the spatial similarity term to measure the resemblance of the spatial layout of parts. In essence, the deformation of parts in the structure kernel is penalized in a multiscale space with respect to horizontal displacement, vertical displacement, and scale difference. Part similarities are combined with different weights, which are optimized efficiently to maximize the intraclass similarities and minimize the interclass similarities by the normalized stochastic gradient ascent algorithm. In addition, the parameters of the structure kernel are learned during the training process with regard to the distribution of the data in a more discriminative way. With flexible part sizes on scale and displacement, it can be more robust to the intraclass variations, poses, and viewpoints. Theoretical analysis and experimental evaluations demonstrate that the proposed multiscale part-based representation model with structure kernel exhibits accurate and robust performance, and outperforms state-of-the-art object classification approaches. PMID:24808345
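
    The three-term structure of such a kernel can be sketched as a weighted sum of global, part, and spatial similarities; the descriptors, weights, and Gaussian displacement penalty below are illustrative assumptions, not the paper's exact formulation:

    ```python
    import math

    def structure_kernel(a, b, weights=(0.5, 0.3, 0.2), sigma=1.0):
        """Toy structure kernel over two part-based objects.

        Each object is (global_descriptor, part_descriptors, part_positions).
        The three terms mirror the abstract: global appearance, per-part
        appearance, and spatial layout (part displacement penalized by a
        Gaussian -- the stand-in here for the multiscale deformation penalty).
        """
        wg, wp, ws = weights
        dot = lambda u, v: sum(x * y for x, y in zip(u, v))
        k_global = dot(a[0], b[0])
        k_part = sum(dot(pa, pb) for pa, pb in zip(a[1], b[1]))
        k_spatial = sum(
            math.exp(-((xa - xb) ** 2 + (ya - yb) ** 2) / (2 * sigma ** 2))
            for (xa, ya), (xb, yb) in zip(a[2], b[2]))
        return wg * k_global + wp * k_part + ws * k_spatial

    # Toy object: one global descriptor, two parts with positions.
    obj = ([1.0, 0.0], [[1.0], [0.5]], [(0, 0), (1, 1)])
    print(structure_kernel(obj, obj))  # kernel value for identical objects
    ```

    In the paper the weights are learned (by normalized stochastic gradient ascent) rather than fixed as here; the sketch only shows how the three similarity terms combine.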

  20. Can data-driven benchmarks be used to set the goals of healthy people 2010?

    PubMed Central

    Allison, J; Kiefe, C I; Weissman, N W

    1999-01-01

    OBJECTIVES: Expert panels determined the public health goals of Healthy People 2000 subjectively. The present study examined whether data-driven benchmarks provide a better alternative. METHODS: We developed the "pared-mean" method to define from data the best achievable health care practices. We calculated the pared-mean benchmark for screening mammography from the 1994 National Health Interview Survey, using the metropolitan statistical area as the "provider" unit. Beginning with the best-performing provider and adding providers in descending sequence, we established the minimum provider subset that included at least 10% of all women surveyed on this question. The pared-mean benchmark is then the proportion of women in this subset who received mammography. RESULTS: The pared-mean benchmark for screening mammography was 71%, compared with the Healthy People 2000 goal of 60%. CONCLUSIONS: For Healthy People 2010, benchmarks derived from data reflecting the best available care provide viable alternatives to consensus-derived targets. We are currently pursuing additional refinements to the data-driven pared-mean benchmark approach. PMID:9987466
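
    The pared-mean procedure described in the abstract can be reproduced with toy numbers; the MSA figures below are invented, and only the procedure follows the text (rank providers by performance, take the best-performing subset covering at least 10% of those surveyed, report that subset's aggregate rate):

    ```python
    def pared_mean_benchmark(providers, fraction=0.10):
        """Pared-mean benchmark: rank providers by rate (descending), add
        providers until at least `fraction` of all patients are covered,
        and return the covered subset's aggregate screening rate."""
        total = sum(n for n, _ in providers.values())
        ranked = sorted(providers.values(),
                        key=lambda p: p[1] / p[0], reverse=True)
        patients = events = 0
        for n, k in ranked:
            patients += n
            events += k
            if patients >= fraction * total:
                break
        return events / patients

    # (n_women_surveyed, n_screened) per metropolitan statistical area -- toy data
    msas = {"A": (50, 45), "B": (40, 30), "C": (200, 120), "D": (100, 55)}
    print(round(pared_mean_benchmark(msas), 3))  # 0.9: the best MSA alone covers >=10%
    ```

    With these toy numbers the benchmark (0.90) sits well above the population-wide rate (about 0.64), which is the abstract's point: the benchmark reflects the best achievable care, not the average.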

  1. The influence of data-driven versus conceptually-driven processing on the development of PTSD-like symptoms.

    PubMed

    Kindt, Merel; van den Hout, Marcel; Arntz, Arnoud; Drost, Jolijn

    2008-12-01

    Ehlers and Clark [(2000). A cognitive model of posttraumatic stress disorder. Behaviour Research and Therapy, 38, 319-345] propose that a predominance of data-driven processing during the trauma predicts subsequent PTSD. We wondered whether, apart from data-driven encoding, sustained data-driven processing after the trauma is also crucial for the development of PTSD. Both hypotheses were tested in two analogue experiments. Experiment 1 demonstrated that relative to conceptually-driven processing (n=20), data-driven processing after the film (n=14), resulted in more intrusions. Experiment 2 demonstrated that relative to the neutral condition (n=24) and the data-driven encoding condition (n=24), conceptual encoding (n=25) reduced suppression of intrusions and a trend emerged for memory fragmentation. The difference between the two encoding styles was due to the beneficial effect of induced conceptual encoding and not to the detrimental effect of data-driven encoding. The data support the viability of the distinction between data-driven/conceptually-driven processing for the understanding of the development of PTSD.

  2. Efficiency of extracting stereo-driven object motions

    PubMed Central

    Jain, Anshul; Zaidi, Qasim

    2013-01-01

    Most living things and many nonliving things deform as they move, requiring observers to separate object motions from object deformations. When the object is partially occluded, the task becomes more difficult because it is not possible to use two-dimensional (2-D) contour correlations (Cohen, Jain, & Zaidi, 2010). That leaves dynamic depth matching across the unoccluded views as the main possibility. We examined the role of stereo cues in extracting motion of partially occluded and deforming three-dimensional (3-D) objects, simulated by disk-shaped random-dot stereograms set at randomly assigned depths and placed uniformly around a circle. The stereo-disparities of the disks were temporally oscillated to simulate clockwise or counterclockwise rotation of the global shape. To dynamically deform the global shape, random disparity perturbation was added to each disk's depth on each stimulus frame. At low perturbation, observers reported rotation directions consistent with the global shape, even against local motion cues, but performance deteriorated at high perturbation. Using 3-D global shape correlations, we formulated an optimal Bayesian discriminator for rotation direction. Based on rotation discrimination thresholds, human observers were 75% as efficient as the optimal model, demonstrating that global shapes derived from stereo cues facilitate inferences of object motions. To complement reports of stereo and motion integration in extrastriate cortex, our results suggest the possibilities that disparity selectivity and feature tracking are linked, or that global motion selective neurons can be driven purely from disparity cues. PMID:23325345

  3. Data-driven Science in Geochemistry & Petrology: Vision & Reality

    NASA Astrophysics Data System (ADS)

    Lehnert, K. A.; Ghiorso, M. S.; Spear, F. S.

    2013-12-01

Science in many fields is increasingly 'data-driven'. Though referred to as a 'new' Fourth Paradigm (Hey, 2009), data-driven science is not new, and examples are cited in the Geochemical Society's data policy, including the compilation of Dziewonski & Anderson (1981) that led to PREM, and Zindler & Hart (1986), who compiled mantle isotope data to present for the first time a comprehensive view of the Earth's mantle. Today, rapidly growing data volumes, ubiquity of data access, and new computational and information management technologies enable data-driven science at a radically advanced scale of speed, extent, flexibility, and inclusiveness, with the ability to seamlessly synthesize observations, experiments, theory, and computation, and to statistically mine data across disciplines, leading to more comprehensive, well informed, and high impact scientific advances. Are geochemists, petrologists, and volcanologists ready to participate in this revolution of the scientific process? In the past year, researchers from the VGP community and related disciplines have come together at several cyberinfrastructure related workshops, in part prompted by the EarthCube initiative of the US NSF, to evaluate the status of cyberinfrastructure in their field, to put forth key scientific challenges, and identify primary data and software needs to address these. Science scenarios developed by workshop participants that range from non-equilibrium experiments focusing on mass transport, chemical reactions, and phase transformations (J. Hammer) to defining the abundance of elements and isotopes in every voxel in the Earth (W. McDonough), demonstrate the potential of cyberinfrastructure enabled science, and define the vision of how data access, visualization, analysis, computation, and cross-domain interoperability can and should support future research in VGP. The primary obstacle for data-driven science in VGP remains the dearth of accessible, integrated data from lab and sensor

  4. Building Data-Driven Pathways From Routinely Collected Hospital Data: A Case Study on Prostate Cancer

    PubMed Central

    Clark, Jeremy; Cooper, Colin S; Mills, Robert; Rayward-Smith, Victor J; de la Iglesia, Beatriz

    2015-01-01

    Background Routinely collected data in hospitals is complex, typically heterogeneous, and scattered across multiple Hospital Information Systems (HIS). This big data, created as a byproduct of health care activities, has the potential to provide a better understanding of diseases, unearth hidden patterns, and improve services and cost. The extent and uses of such data rely on its quality, which is not consistently checked, nor fully understood. Nevertheless, using routine data for the construction of data-driven clinical pathways, describing processes and trends, is a key topic receiving increasing attention in the literature. Traditional algorithms do not cope well with unstructured processes or data, and do not produce clinically meaningful visualizations. Supporting systems that provide additional information, context, and quality assurance inspection are needed. Objective The objective of the study is to explore how routine hospital data can be used to develop data-driven pathways that describe the journeys that patients take through care, and their potential uses in biomedical research; it proposes a framework for the construction, quality assessment, and visualization of patient pathways for clinical studies and decision support using a case study on prostate cancer. Methods Data pertaining to prostate cancer patients were extracted from a large UK hospital from eight different HIS, validated, and complemented with information from the local cancer registry. Data-driven pathways were built for each of the 1904 patients and an expert knowledge base, containing rules on the prostate cancer biomarker, was used to assess the completeness and utility of the pathways for a specific clinical study. Software components were built to provide meaningful visualizations for the constructed pathways. Results The proposed framework and pathway formalism enable the summarization, visualization, and querying of complex patient-centric clinical information, as well as the

  5. Data-Driven and Expectation-Driven Discovery of Empirical Laws.

    DTIC Science & Technology

    1982-10-10

[Scanned DTIC record; the OCR abstract is only fragmentarily legible.] Legible abstract fragments: "...occurred in small integer proportions to each other. In 1809, Joseph Gay-Lussac found evidence for his law of combining volumes, which stated that...". Authors: Patrick W. Langley, Gary L. Bradshaw, Herbert A. Simon; The Robotics Institute, Carnegie-Mellon University, Pittsburgh, Pennsylvania. Report type: Interim Report, 2/82-10/82.

  6. Flow enhancement of deformable self-driven objects by countercurrent

    NASA Astrophysics Data System (ADS)

    Mashiko, Takashi; Fujiwara, Takashi

    2016-10-01

We report numerical simulations of the mixed flows of two groups of deformable self-driven objects. The objects belonging to group A (B) have drift coefficient D = DA (DB), where a positive (negative) value of D denotes a rightward (leftward) driving force. For co-current flows (DA, DB > 0), the result is rather intuitive: the net flow of one group (QA) increases if the driving force of the other group is stronger than its own driving force (i.e., DB > DA), and decreases otherwise (DB < DA). For countercurrent flows, in contrast, the net flow can be enhanced; this enhancement is observed only for deformable objects and results from the entanglement of objects, which in turn is caused by their deformability.

  7. Data flow machine for data driven computing

    DOEpatents

    Davidson, George S.; Grafe, Victor G.

    1995-01-01

A data flow computer and method of computing is disclosed which utilizes a data driven processor node architecture. The apparatus in a preferred embodiment includes a plurality of First-In-First-Out (FIFO) registers, a plurality of related data flow memories, and a processor. The processor makes the necessary calculations and includes a control unit to generate signals to enable the appropriate FIFO register receiving the result. In a particular embodiment, there are three FIFO registers per node: an input FIFO register to receive input information from an outside source and provide it to the data flow memories; an output FIFO register to provide output information from the processor to an outside recipient; and an internal FIFO register to provide information from the processor back to the data flow memories. The data flow memories are comprised of four commonly addressed memories. A parameter memory holds the A and B parameters used in the calculations; an opcode memory holds the instruction; a target memory holds the output address; and a tag memory contains status bits for each parameter. One status bit indicates whether the corresponding parameter is in the parameter memory and one status bit indicates whether the stored information in the corresponding data parameter is to be reused. The tag memory outputs a "fire" signal (signal R VALID) when all of the necessary information has been stored in the data flow memories, and thus when the instruction is ready to be fired to the processor.
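
    The fire condition at the heart of the patent can be modeled in software: an instruction fires only once its tag bits show that both parameters have arrived. A toy Python sketch (the opcode set and API are invented for illustration; the patent describes hardware, not software):

    ```python
    class DataFlowNode:
        """Toy model of the patented node: the parameter memory holds A and B,
        and the node 'fires' (cf. signal R VALID) only when both are present."""
        def __init__(self, opcode, target):
            self.opcode, self.target = opcode, target   # opcode/target memories
            self.params = {}                            # parameter memory
            self.ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

        def store(self, slot, value):
            """Write parameter "A" or "B"; return (target, result) on fire."""
            self.params[slot] = value
            if {"A", "B"} <= self.params.keys():        # fire condition met
                result = self.ops[self.opcode](self.params["A"], self.params["B"])
                self.params.clear()                     # parameters consumed
                return (self.target, result)
            return None                                 # still waiting for data

    node = DataFlowNode("add", target="node7")
    assert node.store("A", 2) is None   # not ready: B has not arrived yet
    print(node.store("B", 3))           # fires: ('node7', 5)
    ```

    The (target, result) pair corresponds to the output address held in the target memory: the result is routed onward as data, with no program counter driving the computation.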

  8. Data-Driven Haptic Modeling and Rendering of Viscoelastic and Frictional Responses of Deformable Objects.

    PubMed

    Yim, Sunghoon; Jeon, Seokhee; Choi, Seungmoon

    2016-01-01

    In this paper, we present an extended data-driven haptic rendering method capable of reproducing force responses during pushing and sliding interaction on a large surface area. The main part of the approach is a novel input variable set for the training of an interpolation model, which incorporates the position of a proxy - an imaginary contact point on the undeformed surface. This allows us to estimate friction in both sliding and sticking states in a unified framework. Estimating the proxy position is done in real-time based on simulation using a sliding yield surface - a surface defining a border between the sliding and sticking regions in the external force space. During modeling, the sliding yield surface is first identified via an automated palpation procedure. Then, through manual palpation on a target surface, input data and resultant force data are acquired. The data are used to build a radial basis interpolation model. During rendering, this input-output mapping interpolation model is used to estimate force responses in real-time in accordance with the interaction input. Physical performance evaluation demonstrates that our approach achieves reasonably high estimation accuracy. A user study also shows plausible perceptual realism under diverse and extensive exploration.
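
    A minimal sketch of the radial basis interpolation at the heart of this kind of method, assuming a Gaussian kernel and a fabricated two-dimensional input set (the paper's actual input variables include the proxy position, and the model is trained on recorded palpation data):

```python
# Toy Gaussian radial-basis-function (RBF) interpolation: solve for exact
# interpolation weights on recorded (input, force) samples, then predict
# force for new interaction inputs. All data values are fabricated.
import math

def gaussian(r, eps=1.0):
    return math.exp(-(eps * r) ** 2)

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def fit_rbf(inputs, forces, eps=1.0):
    # Solve Phi w = f by Gauss-Jordan elimination (fine for small models).
    n = len(inputs)
    A = [[gaussian(dist(inputs[i], inputs[j]), eps) for j in range(n)] + [forces[i]]
         for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col:
                A[r] = [rv - A[r][col] * cv for rv, cv in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

def predict(x, inputs, w, eps=1.0):
    return sum(wi * gaussian(dist(x, xi), eps) for wi, xi in zip(w, inputs))

# Fabricated training data: (penetration depth, tangential offset) -> force.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
f = [0.0, 2.0, 0.5, 2.8]
w = fit_rbf(X, f)
# The interpolant reproduces the training samples exactly.
print(round(predict((1.0, 0.0), X, w), 6))
```

    In a real data-driven renderer the same mapping is evaluated every haptic frame against the live interaction input.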

  9. Data flow machine for data driven computing

    DOEpatents

    Davidson, G.S.; Grafe, V.G.

    1988-07-22

    A data flow computer and method of computing is disclosed which utilizes a data driven processor node architecture. The apparatus in a preferred embodiment includes a plurality of First-In-First-Out (FIFO) registers, a plurality of related data flow memories, and a processor. The processor makes the necessary calculations and includes a control unit to generate signals to enable the appropriate FIFO register receiving the result. In a particular embodiment, there are three FIFO registers per node: an input FIFO register to receive input information from an outside source and provide it to the data flow memories; an output FIFO register to provide output information from the processor to an outside recipient; and an internal FIFO register to provide information from the processor back to the data flow memories. The data flow memories are comprised of four commonly addressed memories. A parameter memory holds the A and B parameters used in the calculations; an opcode memory holds the instruction; a target memory holds the output address; and a tag memory contains status bits for each parameter. One status bit indicates whether the corresponding parameter is in the parameter memory and one indicates whether the stored information in the corresponding parameter is to be reused. The tag memory outputs a "fire" signal (signal R VALID) when all of the necessary information has been stored in the data flow memories, and thus when the instruction is ready to be fired to the processor. 11 figs.

  10. The Data-Driven Approach to Spectroscopic Analyses

    NASA Astrophysics Data System (ADS)

    Ness, M.

    2018-01-01

    I review the data-driven approach to spectroscopy, The Cannon, a method for deriving fundamental diagnostics of galaxy formation, precise chemical compositions and stellar ages, across the many stellar surveys that are mapping the Milky Way. With The Cannon, the abundances and stellar parameters from the multitude of stellar surveys can be placed directly on the same scale, using stars in common between the surveys. Furthermore, the information that resides in the data can be fully extracted; this has resulted in higher-precision stellar parameters and abundances being delivered from spectroscopic data and has opened up new avenues in galactic archeology, for example in the determination of ages for red giant stars across the Galactic disk. Coupled with Gaia distances, proper motions, and derived orbit families, the stellar age and individual abundance information delivered at the precision obtained with the data-driven approach provides very strong constraints on the evolution and birthplace of stars in the Milky Way. I also review the role of data-driven spectroscopy as we enter the era where we have both the data and the tools to build the ultimate conglomerate of galactic information, and highlight further applications of data-driven models in the coming decade.

  11. A Data-Driven Approach to Interactive Visualization of Power Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhu, Jun

    Driven by emerging industry standards, electric utilities and grid coordination organizations are eager to seek advanced tools to assist grid operators in performing mission-critical tasks and enable them to make quick and accurate decisions. The emerging field of visual analytics holds tremendous promise for improving the business practices in today's electric power industry. The conducted investigation, however, has revealed that existing commercial power grid visualization tools rely heavily on human designers, hindering users' ability to discover. Additionally, for a large grid, it is very labor-intensive and costly to build and maintain the pre-designed visual displays. This project proposes a data-driven approach to overcome these common challenges. The proposed approach relies on developing powerful data manipulation algorithms to create visualizations based on the characteristics of empirically or mathematically derived data. The resulting visual presentations emphasize what the data is rather than how the data should be presented, thus fostering comprehension and discovery. Furthermore, the data-driven approach formulates visualizations on-the-fly. It does not require a visualization design stage, completely eliminating or significantly reducing the cost of building and maintaining visual displays. The research and development (R&D) conducted in this project is mainly divided into two phases. The first phase (Phases I & II) focused on developing data-driven techniques for visualization of the power grid and its operation. Various data-driven visualization techniques were investigated, including pattern recognition for auto-generation of one-line diagrams, fuzzy-model-based rich data visualization for situational awareness, etc. The R&D conducted during the second phase (Phase IIB) focused on enhancing the prototyped data-driven visualization tool based on the gathered requirements and use cases. The goal is to evolve the prototyped tool developed

  12. Data-driven medicinal chemistry in the era of big data.

    PubMed

    Lusher, Scott J; McGuire, Ross; van Schaik, René C; Nicholson, C David; de Vlieg, Jacob

    2014-07-01

    Science, and the way we undertake research, is changing. The increasing rate of data generation across all scientific disciplines is providing incredible opportunities for data-driven research, with the potential to transform our current practices. The exploitation of so-called 'big data' will enable us to undertake research projects never previously possible but should also stimulate a re-evaluation of all our data practices. Data-driven medicinal chemistry approaches have the potential to improve decision making in drug discovery projects, providing that all researchers embrace the role of 'data scientist' and uncover the meaningful relationships and patterns in available data. Copyright © 2013 Elsevier Ltd. All rights reserved.

  13. 3D Visual Data-Driven Spatiotemporal Deformations for Non-Rigid Object Grasping Using Robot Hands.

    PubMed

    Mateo, Carlos M; Gil, Pablo; Torres, Fernando

    2016-05-05

    Sensing techniques are important for solving problems of uncertainty inherent to intelligent grasping tasks. The main goal here is to present a visual sensing system based on range imaging technology for robot manipulation of non-rigid objects. Our proposal provides a suitable visual perception system for complex grasping tasks, supporting a robot controller when other sensor systems, such as tactile and force, are not able to obtain useful data relevant to the grasping manipulation task. In particular, a new visual approach based on RGBD data was implemented to help a robot controller carry out intelligent manipulation tasks with flexible objects. The proposed method supervises the interaction between the grasped object and the robot hand in order to avoid poor contact between the fingertips and an object when there is neither force nor pressure data. This new approach is also used to measure changes to the shape of an object's surfaces and so allows us to find deformations caused by inappropriate pressure being applied by the hand's fingers. Tests were carried out for grasping tasks involving several flexible household objects with a multi-fingered robot hand working in real time. Our approach generates pulses from the deformation detection method and sends an event message to the robot controller when surface deformation is detected. In comparison with other methods, the obtained results reveal that our visual pipeline does not require deformation models of objects and materials, and that the approach works well for both planar and 3D household objects in real time. In addition, our method does not depend on the pose of the robot hand, because the location of the reference system is computed from a recognition process of a pattern located at the robot forearm. The presented experiments demonstrate that the proposed method accomplishes good monitoring of grasping tasks with several objects and different grasping configurations in indoor environments.

  14. Protein-Protein Interface Predictions by Data-Driven Methods: A Review

    PubMed Central

    Xue, Li C; Dobbs, Drena; Bonvin, Alexandre M.J.J.; Honavar, Vasant

    2015-01-01

    Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction. PMID:26460190

  15. Data driven propulsion system weight prediction model

    NASA Astrophysics Data System (ADS)

    Gerth, Richard J.

    1994-10-01

    The objective of the research was to develop a method to predict the weight of paper engines, i.e., engines that are in the early stages of development. The impetus for the project was the Single Stage To Orbit (SSTO) project, where engineers need to evaluate alternative engine designs. Since the SSTO is a performance-driven project, the performance models for alternative designs were well understood. The next tradeoff is weight. Since it is known that engine weight varies with thrust level, a model is required that allows discrimination between engines that produce the same thrust. Above all, the model had to be rooted in data, with assumptions that could be justified based on the data. The general approach was to collect data on as many existing engines as possible and build a statistical model of engine weight as a function of various component performance parameters. This was considered a reasonable level at which to begin the project because the data would be readily available, and it would be at the level of most paper engines, prior to detailed component design.
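
    The simplest data-driven form of such a model is a power law, linear in log space; a sketch of the fitting machinery, with entirely fabricated engine numbers:

```python
# Ordinary least squares fit of log(weight) = a + b*log(thrust), the simplest
# statistical engine-weight model. All engine numbers below are fabricated.
import math

engines = [  # (thrust [kN], weight [kg]), illustrative values only
    (100.0, 900.0), (500.0, 3200.0), (1000.0, 5500.0), (2000.0, 9500.0),
]
xs = [math.log(t) for t, _ in engines]
ys = [math.log(w) for _, w in engines]
n = len(engines)
xbar, ybar = sum(xs) / n, sum(ys) / n
b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)
a = ybar - b * xbar

def predict_weight(thrust_kn):
    """Power-law weight estimate W = exp(a) * T**b."""
    return math.exp(a + b * math.log(thrust_kn))
```

    A single-regressor fit like this cannot discriminate between engines of equal thrust; that is exactly why the project added component performance parameters as further regressors.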

  16. Using Two Different Approaches to Assess Dietary Patterns: Hypothesis-Driven and Data-Driven Analysis.

    PubMed

    Previdelli, Ágatha Nogueira; de Andrade, Samantha Caesar; Fisberg, Regina Mara; Marchioni, Dirce Maria

    2016-09-23

    The use of dietary patterns to assess dietary intake has become increasingly common in nutritional epidemiology studies due to the complexity and multidimensionality of the diet. Currently, two main approaches have been widely used to assess dietary patterns: data-driven and hypothesis-driven analysis. Since the methods explore different angles of dietary intake, using both approaches simultaneously might yield complementary and useful information; thus, we aimed to use both approaches to gain knowledge of adolescents' dietary patterns. Food intake from a cross-sectional survey with 295 adolescents was assessed by 24 h dietary recall (24HR). In the hypothesis-driven analysis, based on the American National Cancer Institute method, the usual intake of Brazilian Healthy Eating Index Revised components was estimated. In the data-driven approach, the usual intake of foods/food groups was estimated by the Multiple Source Method. In the results, the hypothesis-driven analysis showed low scores for Whole grains, Total vegetables, Total fruit and Whole fruits, while in the data-driven analysis fruits and whole grains were not present in any pattern. High intakes of sodium, fats and sugars were observed in the hypothesis-driven analysis, with low total scores for the Sodium, Saturated fat and SoFAA (calories from solid fat, alcohol and added sugar) components in agreement, while the data-driven approach showed the intake of several foods/food groups rich in these nutrients, such as butter/margarine, cookies, chocolate powder, whole milk, cheese, processed meat/cold cuts and candies. In this study, using both approaches at the same time provided consistent and complementary information with regard to assessing overall dietary habits, which will be important in order to drive public health programs and improve their efficiency in monitoring and evaluating the dietary patterns of populations.

  17. 3D Visual Data-Driven Spatiotemporal Deformations for Non-Rigid Object Grasping Using Robot Hands

    PubMed Central

    Mateo, Carlos M.; Gil, Pablo; Torres, Fernando

    2016-01-01

    Sensing techniques are important for solving problems of uncertainty inherent to intelligent grasping tasks. The main goal here is to present a visual sensing system based on range imaging technology for robot manipulation of non-rigid objects. Our proposal provides a suitable visual perception system for complex grasping tasks, supporting a robot controller when other sensor systems, such as tactile and force, are not able to obtain useful data relevant to the grasping manipulation task. In particular, a new visual approach based on RGBD data was implemented to help a robot controller carry out intelligent manipulation tasks with flexible objects. The proposed method supervises the interaction between the grasped object and the robot hand in order to avoid poor contact between the fingertips and an object when there is neither force nor pressure data. This new approach is also used to measure changes to the shape of an object's surfaces and so allows us to find deformations caused by inappropriate pressure being applied by the hand's fingers. Tests were carried out for grasping tasks involving several flexible household objects with a multi-fingered robot hand working in real time. Our approach generates pulses from the deformation detection method and sends an event message to the robot controller when surface deformation is detected. In comparison with other methods, the obtained results reveal that our visual pipeline does not require deformation models of objects and materials, and that the approach works well for both planar and 3D household objects in real time. In addition, our method does not depend on the pose of the robot hand, because the location of the reference system is computed from a recognition process of a pattern located at the robot forearm. The presented experiments demonstrate that the proposed method accomplishes good monitoring of grasping tasks with several objects and different grasping configurations in indoor environments. PMID

  18. Data-Driven Instructional Leadership

    ERIC Educational Resources Information Center

    Blink, Rebecca

    2006-01-01

    With real-world examples from actual schools, this book illustrates how to nurture a culture of continuous improvement, meet the needs of individual students, foster an environment of high expectations, and meet the requirements of NCLB. Each component of the Data-Driven Instructional Leadership (DDIS) model represents several branches of…

  19. What Data for Data-Driven Learning?

    ERIC Educational Resources Information Center

    Boulton, Alex

    2012-01-01

    Corpora have multiple affordances, not least for use by teachers and learners of a foreign language (L2) in what has come to be known as "data-driven learning" or DDL. The corpus and concordance interface were originally conceived by and for linguists, so other users need to adopt the role of "language researcher" to make the most of them. Despite…

  20. The Structural Consequences of Big Data-Driven Education.

    PubMed

    Zeide, Elana

    2017-06-01

    Educators and commenters who evaluate big data-driven learning environments focus on specific questions: whether automated education platforms improve learning outcomes, invade student privacy, and promote equality. This article puts aside separate unresolved-and perhaps unresolvable-issues regarding the concrete effects of specific technologies. It instead examines how big data-driven tools alter the structure of schools' pedagogical decision-making, and, in doing so, change fundamental aspects of America's education enterprise. Technological mediation and data-driven decision-making have a particularly significant impact in learning environments because the education process primarily consists of dynamic information exchange. In this overview, I highlight three significant structural shifts that accompany school reliance on data-driven instructional platforms that perform core school functions: teaching, assessment, and credentialing. First, virtual learning environments create information technology infrastructures featuring constant data collection, continuous algorithmic assessment, and possibly infinite record retention. This undermines the traditional intellectual privacy and safety of classrooms. Second, these systems displace pedagogical decision-making from educators serving public interests to private, often for-profit, technology providers. They constrain teachers' academic autonomy, obscure student evaluation, and reduce parents' and students' ability to participate or challenge education decision-making. Third, big data-driven tools define what "counts" as education by mapping the concepts, creating the content, determining the metrics, and setting desired learning outcomes of instruction. These shifts cede important decision-making to private entities without public scrutiny or pedagogical examination. In contrast to the public and heated debates that accompany textbook choices, schools often adopt education technologies ad hoc. Given education

  1. A VO-Driven Astronomical Data Grid in China

    NASA Astrophysics Data System (ADS)

    Cui, C.; He, B.; Yang, Y.; Zhao, Y.

    2010-12-01

    With the implementation of many ambitious observation projects, including LAMOST, FAST, and the Antarctic observatory at Dome A, observational astronomy in China is stepping into a brand new era with an emerging data avalanche. In the era of e-Science, both these cutting-edge projects and traditional astronomy research need much more powerful data management, sharing, and interoperability. Based on the data-grid concept and taking advantage of IVOA interoperability technologies, China-VO is developing a VO-driven astronomical data grid environment to enable multi-wavelength science and large database science. In this paper, the latest progress and data flow of LAMOST, the architecture of the data grid, and its support for the VO are discussed.

  2. General Purpose Data-Driven Monitoring for Space Operations

    NASA Technical Reports Server (NTRS)

    Iverson, David L.; Martin, Rodney A.; Schwabacher, Mark A.; Spirkovska, Liljana; Taylor, William McCaa; Castle, Joseph P.; Mackey, Ryan M.

    2009-01-01

    As modern space propulsion and exploration systems improve in capability and efficiency, their designs are becoming increasingly sophisticated and complex. Determining the health state of these systems, using traditional parameter limit checking, model-based, or rule-based methods, is becoming more difficult as the number of sensors and component interactions grow. Data-driven monitoring techniques have been developed to address these issues by analyzing system operations data to automatically characterize normal system behavior. System health can be monitored by comparing real-time operating data with these nominal characterizations, providing detection of anomalous data signatures indicative of system faults or failures. The Inductive Monitoring System (IMS) is a data-driven system health monitoring software tool that has been successfully applied to several aerospace applications. IMS uses a data mining technique called clustering to analyze archived system data and characterize normal interactions between parameters. The scope of IMS based data-driven monitoring applications continues to expand with current development activities. Successful IMS deployment in the International Space Station (ISS) flight control room to monitor ISS attitude control systems has led to applications in other ISS flight control disciplines, such as thermal control. It has also generated interest in data-driven monitoring capability for Constellation, NASA's program to replace the Space Shuttle with new launch vehicles and spacecraft capable of returning astronauts to the moon, and then on to Mars. Several projects are currently underway to evaluate and mature the IMS technology and complementary tools for use in the Constellation program. These include an experiment on board the Air Force TacSat-3 satellite, and ground systems monitoring for NASA's Ares I-X and Ares I launch vehicles. 
The TacSat-3 Vehicle System Management (TVSM) project is a software experiment to integrate fault
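
    The clustering-based monitoring idea behind IMS can be caricatured in a few lines: learn cluster centers from archived nominal data, then flag live samples that fall far from every center. The data and the plain k-means below are stand-ins; the actual IMS uses its own clustering algorithm and per-cluster parameter bounds:

```python
# Toy data-driven health monitor: nominal archived data -> cluster centers;
# a live sample is anomalous if it is far from every learned center.
import math

def kmeans(data, k, iters=20):
    step = max(1, len(data) // k)
    centers = [data[i * step] for i in range(k)]   # deterministic spread init
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in data:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def anomalous(sample, centers, threshold):
    """A sample is anomalous if no learned cluster center is within threshold."""
    return min(math.dist(sample, c) for c in centers) > threshold

# Fabricated archived telemetry clustered around two nominal operating modes.
nominal = [(1 + 0.1 * math.sin(i), 5 + 0.1 * math.cos(i)) for i in range(50)] + \
          [(4 + 0.1 * math.sin(i), 1 + 0.1 * math.cos(i)) for i in range(50)]
centers = kmeans(nominal, k=2)
print(anomalous((1.0, 5.0), centers, threshold=0.5))    # near a nominal mode
print(anomalous((10.0, 10.0), centers, threshold=0.5))  # far from both modes
```

    The appeal for operations is that no failure model is required: anything sufficiently unlike the archived nominal behavior is flagged.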

  3. Data Driven Math Intervention: What the Numbers Say

    ERIC Educational Resources Information Center

    Martin, Anthony W.

    2013-01-01

    This study was designed to determine whether or not data driven math skills groups would be effective in increasing student academic achievement. From this topic three key questions arose: "Would the implementation of data driven math skills groups improve student academic achievement more than standard instruction as measured by the…

  4. Data-Driven School Administrator Behaviors and State Report Card Results

    ERIC Educational Resources Information Center

    Spencer, James A., Jr.

    2014-01-01

    The purpose of this study was to identify the principal behaviors that would define an instructional leader as being a data-driven school administrator and to assess current school administrators' levels of being data-driven. This research attempted to examine the relationship between the degree to which a principal was data-driven and the…

  5. A Perfect Time for Data Use: Using Data-Driven Decision Making to Inform Practice

    ERIC Educational Resources Information Center

    Mandinach, Ellen B.

    2012-01-01

    Data-driven decision making has become an essential component of educational practice across all levels, from chief state school officers to classroom teachers, and has received unprecedented attention in terms of policy and financial support. It was included as one of the four pillars in the American Recovery and Reinvestment Act (2009),…

  6. Paving the COWpath: data-driven design of pediatric order sets

    PubMed Central

    Zhang, Yiye; Padman, Rema; Levin, James E

    2014-01-01

    Objective Evidence indicates that users incur significant physical and cognitive costs in the use of order sets, a core feature of computerized provider order entry systems. This paper develops data-driven approaches for automating the construction of order sets that match closely with user preferences and workflow while minimizing physical and cognitive workload. Materials and methods We developed and tested optimization-based models embedded with clustering techniques using physical and cognitive click cost criteria. By judiciously learning from users’ actual actions, our methods identify items for constituting order sets that are relevant according to historical ordering data and grouped on the basis of order similarity and ordering time. We evaluated performance of the methods using 47 099 orders from the year 2011 for asthma, appendectomy and pneumonia management in a pediatric inpatient setting. Results In comparison with existing order sets, those developed using the new approach significantly reduce the physical and cognitive workload associated with usage by 14–52%. This approach is also capable of accommodating variations in clinical conditions that affect order set usage and development. Discussion There is a critical need to investigate the cognitive complexity imposed on users by complex clinical information systems, and to design their features according to ‘human factors’ best practices. Optimizing order set generation using cognitive cost criteria introduces a new approach that can potentially improve ordering efficiency, reduce unintended variations in order placement, and enhance patient safety. Conclusions We demonstrate that data-driven methods offer a promising approach for designing order sets that are generalizable, data-driven, condition-based, and up to date with current best practices. PMID:24674844
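
    A much-simplified sketch of the underlying idea, mining historical orders for frequent, co-occurring items to seed an order set, using fabricated encounters (the paper's method uses optimization models with explicit physical and cognitive click-cost criteria, not this bare frequency/similarity rule):

```python
# Toy order-set mining: keep items ordered in at least min_support of
# encounters, then pair items whose co-occurrence (Jaccard similarity)
# is high. Encounter data below are fabricated.
from itertools import combinations

encounters = [
    {"albuterol", "cbc", "cxr"},
    {"albuterol", "cbc", "steroids"},
    {"albuterol", "steroids", "cxr"},
    {"cbc", "bmp"},
]

def frequent_items(encounters, min_support=0.5):
    n = len(encounters)
    counts = {}
    for e in encounters:
        for item in e:
            counts[item] = counts.get(item, 0) + 1
    return {i for i, c in counts.items() if c / n >= min_support}

def jaccard(a, b, encounters):
    has_a = {i for i, e in enumerate(encounters) if a in e}
    has_b = {i for i, e in enumerate(encounters) if b in e}
    return len(has_a & has_b) / len(has_a | has_b)

items = sorted(frequent_items(encounters))
pairs = {(a, b): jaccard(a, b, encounters) for a, b in combinations(items, 2)}
order_set = [p for p, s in pairs.items() if s >= 0.5]
```

    Grouping by ordering time, as the paper does, would further split these candidates into admission-time versus later-in-stay order sets.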

  7. Design of a data-driven predictive controller for start-up process of AMT vehicles.

    PubMed

    Lu, Xiaohui; Chen, Hong; Wang, Ping; Gao, Bingzhao

    2011-12-01

    In this paper, a data-driven predictive controller is designed for the start-up process of vehicles with automated manual transmissions (AMTs). It is obtained directly from the input-output data of a driveline simulation model constructed by the commercial software AMESim. In order to obtain offset-free control for the reference input, the predictor equation is gained with incremental inputs and outputs. Because of the physical characteristics, the input and output constraints are considered explicitly in the problem formulation. The contradictory requirements of less friction losses and less driveline shock are included in the objective function. The designed controller is tested under nominal conditions and changed conditions. The simulation results show that, during the start-up process, the AMT clutch with the proposed controller works very well, and the process meets the control objectives: fast clutch lockup time, small friction losses, and the preservation of driver comfort, i.e., smooth acceleration of the vehicle. At the same time, the closed-loop system has the ability to reject uncertainties, such as the vehicle mass and road grade.

  8. Direct match data flow memory for data driven computing

    DOEpatents

    Davidson, George S.; Grafe, Victor Gerald

    1997-01-01

    A data flow computer and method of computing is disclosed which utilizes a data driven processor node architecture. The apparatus in a preferred embodiment includes a plurality of First-In-First-Out (FIFO) registers, a plurality of related data flow memories, and a processor. The processor makes the necessary calculations and includes a control unit to generate signals to enable the appropriate FIFO register receiving the result. In a particular embodiment, there are three FIFO registers per node: an input FIFO register to receive input information from an outside source and provide it to the data flow memories; an output FIFO register to provide output information from the processor to an outside recipient; and an internal FIFO register to provide information from the processor back to the data flow memories. The data flow memories are comprised of four commonly addressed memories. A parameter memory holds the A and B parameters used in the calculations; an opcode memory holds the instruction; a target memory holds the output address; and a tag memory contains status bits for each parameter. One status bit indicates whether the corresponding parameter is in the parameter memory and one indicates whether the stored information in the corresponding parameter is to be reused. The tag memory outputs a "fire" signal (signal R VALID) when all of the necessary information has been stored in the data flow memories, and thus when the instruction is ready to be fired to the processor.

  9. Data-Driven Hint Generation from Peer Debugging Solutions

    ERIC Educational Resources Information Center

    Liu, Zhongxiu

    2015-01-01

    Data-driven methods have been a successful approach to generating hints for programming problems. However, the majority of previous studies are focused on procedural hints that aim at moving students to the next closest state to the solution. In this paper, I propose a data-driven method to generate remedy hints for BOTS, a game that teaches…

  10. Automated object-based classification of topography from SRTM data

    PubMed Central

    Drăguţ, Lucian; Eisank, Clemens

    2012-01-01

    We introduce an object-based method to automatically classify topography from SRTM data. The new method relies on the concept of decomposing land-surface complexity into more homogeneous domains. An elevation layer is automatically segmented and classified at three scale levels that represent domains of complexity by using self-adaptive, data-driven techniques. For each domain, scales in the data are detected with the help of local variance and segmentation is performed at these appropriate scales. Objects resulting from segmentation are partitioned into sub-domains based on thresholds given by the mean values of elevation and standard deviation of elevation, respectively. Results reasonably resemble the patterns of existing global and regional classifications, displaying a level of detail close to manually drawn maps. Statistical evaluation indicates that most classes satisfy the regionalization requirements of maximizing internal homogeneity while minimizing external homogeneity. Most objects have boundaries matching natural discontinuities at the regional level. The method is simple and fully automated. The input data consist of only one layer, which does not need any pre-processing. Both segmentation and classification rely on only two parameters: elevation and standard deviation of elevation. The methodology is implemented as a customized process for the eCognition® software, available as an online download. The results are embedded in a web application with functionalities of visualization and download. PMID:22485060
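
    The two thresholds named in the abstract, the mean of elevation and the mean of its standard deviation, reduce to a simple per-object rule once segmentation statistics are available. A sketch with fabricated object statistics (the actual method applies this within segmented objects at multiple scales):

```python
# Per-object partition into four sub-domains using the two thresholds named
# in the abstract: mean elevation and standard deviation of elevation.
# Objects are fabricated (elevation_mean, elevation_std) pairs, e.g. taken
# from segment statistics.
import statistics

objects = [(120.0, 4.0), (950.0, 80.0), (300.0, 35.0), (1500.0, 10.0)]

elev_thr = statistics.mean(o[0] for o in objects)  # mean of object elevations
std_thr = statistics.mean(o[1] for o in objects)   # mean of object stds

def classify(mean_elev, std_elev):
    high = "high" if mean_elev > elev_thr else "low"
    rough = "rough" if std_elev > std_thr else "smooth"
    return f"{high}/{rough}"   # four sub-domains

labels = [classify(m, s) for m, s in objects]
```

    Only the two parameters from the abstract appear; everything else is derived from the data, which is the sense in which the classification is data-driven.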

  11. Automated object-based classification of topography from SRTM data

    NASA Astrophysics Data System (ADS)

    Drăguţ, Lucian; Eisank, Clemens

    2012-03-01

    We introduce an object-based method to automatically classify topography from SRTM data. The new method relies on the concept of decomposing land-surface complexity into more homogeneous domains. An elevation layer is automatically segmented and classified at three scale levels that represent domains of complexity by using self-adaptive, data-driven techniques. For each domain, scales in the data are detected with the help of local variance and segmentation is performed at these appropriate scales. Objects resulting from segmentation are partitioned into sub-domains based on thresholds given by the mean values of elevation and standard deviation of elevation, respectively. Results reasonably resemble the patterns of existing global and regional classifications, displaying a level of detail close to manually drawn maps. Statistical evaluation indicates that most classes satisfy the regionalization requirements of maximizing internal homogeneity while minimizing external homogeneity. Most objects have boundaries matching natural discontinuities at the regional level. The method is simple and fully automated. The input data consist of only one layer, which does not need any pre-processing. Both segmentation and classification rely on only two parameters: elevation and standard deviation of elevation. The methodology is implemented as a customized process for the eCognition® software, available as an online download. The results are embedded in a web application with functionalities of visualization and download.

  12. C-arm technique using distance driven method for nephrolithiasis and kidney stones detection

    NASA Astrophysics Data System (ADS)

    Malalla, Nuhad; Sun, Pengfei; Chen, Ying; Lipkin, Michael E.; Preminger, Glenn M.; Qin, Jun

    2016-04-01

    Distance-driven is a state-of-the-art method used for reconstruction in x-ray imaging. C-arm tomography is an x-ray imaging technique that provides three-dimensional information about the object by moving the C-shaped gantry around the patient. With its limited view angle, the C-arm system was investigated as a means of generating volumetric data of the object with low radiation dose and short examination time. This paper presents a new simulation study of two reconstruction methods based on the distance-driven approach: the simultaneous algebraic reconstruction technique (SART) and maximum-likelihood expectation maximization (MLEM). Distance-driven is an efficient method with low computational cost that is free of the artifacts seen with other methods such as ray-driven and pixel-driven approaches. Projection images of spherical objects were simulated with a virtual C-arm system over a total view angle of 40 degrees. Results show the ability of the limited-angle C-arm technique to generate three-dimensional images with distance-driven reconstruction.
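
    The SART update used in such studies can be sketched on a toy linear system standing in for the projection model; the distance-driven projector itself is not reproduced here, and the matrix `A`, data `b`, and iteration counts are invented for illustration.

```python
# Minimal SART sketch: x is updated by back-projecting the residual
# (b - A x), with residuals normalized by row sums and updates
# normalized by column sums (the classic SART weighting).
def sart(A, b, n_iter=500, lam=1.0):
    rows, cols = len(A), len(A[0])
    row_sum = [sum(A[i]) for i in range(rows)]
    col_sum = [sum(A[i][j] for i in range(rows)) for j in range(cols)]
    x = [0.0] * cols
    for _ in range(n_iter):
        # residuals scaled by row sums
        corr = []
        for i in range(rows):
            ax = sum(A[i][j] * x[j] for j in range(cols))
            corr.append((b[i] - ax) / row_sum[i])
        # back-project, scaled by column sums
        for j in range(cols):
            x[j] += lam * sum(A[i][j] * corr[i] for i in range(rows)) / col_sum[j]
    return x

A = [[1.0, 1.0], [1.0, 2.0]]   # toy "projection" matrix
b = [3.0, 5.0]                 # consistent data; exact solution is [1, 2]
x = sart(A, b)
```

    In a real limited-angle setup, `A` encodes the distance-driven footprint of each voxel on each detector cell.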

  13. Data-driven grasp synthesis using shape matching and task-based pruning.

    PubMed

    Li, Ying; Fu, Jiaxin L; Pollard, Nancy S

    2007-01-01

    Human grasps, especially whole-hand grasps, are difficult to animate because of the high number of degrees of freedom of the hand and the need for the hand to conform naturally to the object surface. Captured human motion data provides us with a rich source of examples of natural grasps. However, for each new object, we are faced with the problem of selecting the best grasp from the database and adapting it to that object. This paper presents a data-driven approach to grasp synthesis. We begin with a database of captured human grasps. To identify candidate grasps for a new object, we introduce a novel shape matching algorithm that matches hand shape to object shape by identifying collections of features having similar relative placements and surface normals. This step returns many grasp candidates, which are clustered and pruned by choosing the grasp best suited for the intended task. For pruning undesirable grasps, we develop an anatomically-based grasp quality measure specific to the human hand. Examples of grasp synthesis are shown for a variety of objects not present in the original database. This algorithm should be useful both as an animator tool for posing the hand and for automatic grasp synthesis in virtual environments.

  14. Data-driven non-Markovian closure models

    NASA Astrophysics Data System (ADS)

    Kondrashov, Dmitri; Chekroun, Mickaël D.; Ghil, Michael

    2015-03-01

    This paper has two interrelated foci: (i) obtaining stable and efficient data-driven closure models by using a multivariate time series of partial observations from a large-dimensional system; and (ii) comparing these closure models with the optimal closures predicted by the Mori-Zwanzig (MZ) formalism of statistical physics. Multilayer stochastic models (MSMs) are introduced as both a generalization and a time-continuous limit of existing multilevel, regression-based approaches to closure in a data-driven setting; these approaches include empirical model reduction (EMR), as well as more recent multi-layer modeling. It is shown that the multilayer structure of MSMs can provide a natural Markov approximation to the generalized Langevin equation (GLE) of the MZ formalism. A simple correlation-based stopping criterion for an EMR-MSM model is derived to assess how well it approximates the GLE solution. Sufficient conditions are derived on the structure of the nonlinear cross-interactions between the constitutive layers of a given MSM to guarantee the existence of a global random attractor. This existence ensures that no blow-up can occur for a broad class of MSM applications, a class that includes non-polynomial predictors and nonlinearities that do not necessarily preserve quadratic energy invariants. The EMR-MSM methodology is first applied to a conceptual, nonlinear, stochastic climate model of coupled slow and fast variables, in which only slow variables are observed. It is shown that the resulting closure model with energy-conserving nonlinearities efficiently captures the main statistical features of the slow variables, even when there is no formal scale separation and the fast variables are quite energetic. Second, an MSM is shown to successfully reproduce the statistics of a partially observed, generalized Lotka-Volterra model of population dynamics in its chaotic regime. The challenges here include the rarity of strange attractors in the model's parameter
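
    The multilevel-regression idea behind EMR/MSMs can be caricatured in a few lines: the observed variable is regressed on itself (main level), and the residuals are then modelled by a further autoregressive level. The full MSM formalism (GLE limit, random attractors, energy-conserving nonlinearities) is not reproduced; the AR(1) data and coefficients below are synthetic.

```python
# Toy two-level regression closure in the spirit of EMR/MSMs.
import random

random.seed(0)

# Synthetic "observed" slow variable: an AR(1) process with coeff 0.9.
x = [0.0]
for _ in range(5000):
    x.append(0.9 * x[-1] + random.gauss(0.0, 0.1))

def ar1_fit(series):
    """Least-squares coefficient of series[t+1] regressed on series[t]."""
    num = sum(series[t] * series[t + 1] for t in range(len(series) - 1))
    den = sum(v * v for v in series[:-1])
    return num / den

a_hat = ar1_fit(x)                                    # main level
resid = [x[t + 1] - a_hat * x[t] for t in range(len(x) - 1)]
b_hat = ar1_fit(resid)                                # residual (hidden) level
```

    Here the residual level is nearly white (`b_hat` close to zero) because the truth is AR(1); in a genuine closure problem the residual level absorbs memory effects of the unresolved variables.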

  15. Value Driven Information Processing and Fusion

    DTIC Science & Technology

    2016-03-01

    The objective of the project is to develop a general framework for value-driven decentralized information processing…, including: optimal data reduction in a network setting for decentralized inference with quantization constraints; and interactive fusion that allows queries and… A consensus approach allows a decentralized approach to achieve the optimal error exponent of the centralized counterpart, a conclusion that is signifi…

  16. Direct match data flow memory for data driven computing

    DOEpatents

    Davidson, G.S.; Grafe, V.G.

    1997-10-07

    A data flow computer and method of computing is disclosed which utilizes a data-driven processor node architecture. The apparatus in a preferred embodiment includes a plurality of First-In-First-Out (FIFO) registers, a plurality of related data flow memories, and a processor. The processor makes the necessary calculations and includes a control unit to generate signals to enable the appropriate FIFO register to receive the result. In a particular embodiment, there are three FIFO registers per node: an input FIFO register to receive input information from an outside source and provide it to the data flow memories; an output FIFO register to provide output information from the processor to an outside recipient; and an internal FIFO register to provide information from the processor back to the data flow memories. The data flow memories are comprised of four commonly addressed memories. A parameter memory holds the A and B parameters used in the calculations; an opcode memory holds the instruction; a target memory holds the output address; and a tag memory contains status bits for each parameter. One status bit indicates whether the corresponding parameter is present in the parameter memory, and one status bit indicates whether the stored information in the corresponding data parameter is to be reused. The tag memory outputs a ``fire`` signal (signal R VALID) when all of the necessary information has been stored in the data flow memories, and thus when the instruction is ready to be fired to the processor. 11 figs.
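
    The direct-match firing rule can be sketched in software: each instruction address holds two parameter slots plus tag state, and the instruction "fires" once both parameters have arrived. This is a behavioural caricature only; the hardware FIFOs, target memory, and reuse bits are simplified away, and the class and opcodes are invented for the example.

```python
# Software sketch of a tag-checked dataflow memory: a cell fires
# (cf. the R VALID signal) when both operands A and B are present.
class DataFlowMemory:
    def __init__(self):
        self.cells = {}   # addr -> {"a": ..., "b": ..., "op": ...}

    def store(self, addr, slot, value, opcode):
        cell = self.cells.setdefault(addr, {"a": None, "b": None, "op": opcode})
        cell[slot] = value
        if cell["a"] is not None and cell["b"] is not None:
            return self.fire(addr)     # all operands present: fire
        return None                    # still waiting for a match

    def fire(self, addr):
        cell = self.cells.pop(addr)    # consume the cell (no reuse bit here)
        ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
        return ops[cell["op"]](cell["a"], cell["b"])

mem = DataFlowMemory()
r1 = mem.store(0x10, "a", 3, "add")    # one operand only: no firing
r2 = mem.store(0x10, "b", 4, "add")    # both present: fires
```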

  17. Materials discovery guided by data-driven insights

    NASA Astrophysics Data System (ADS)

    Klintenberg, Mattias

    As computational power continues to grow, systematic computational exploration has become an important tool for materials discovery. In this presentation the Electronic Structure Project (ESP/ELSA) will be discussed and a number of examples presented that show some of the capabilities of a data-driven methodology for guiding materials discovery. These examples include topological insulators, detector materials and 2D materials. ESP/ELSA is an initiative that dates back to 2001 and today contains many tens of thousands of materials that have been investigated using a robust, high-accuracy electronic structure method (all-electron FP-LMTO), thus providing basic first-principles materials data for most inorganic compounds that have been structurally characterized. The website containing the ESP/ELSA data has to date been accessed from more than 4,000 unique computers from all around the world.

  18. Dynamic Data Driven Methods for Self-aware Aerospace Vehicles

    DTIC Science & Technology

    2015-04-08

    structural response model that incorporates multiple degradation or failure modes including damaged panel strength (BVID, thru-hole), damaged panel…stiffness (BVID, thru-hole), loose fastener, fretted fastener hole, and disbonded surface. • A new data-driven approach for the online updating of the flight…between the first and second plies. The panels were reinforced around the borders of the panel with through holes to simulate mounting the wing skins to

  19. Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application.

    PubMed

    Peissig, Peggy; Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H

    2017-09-13

    The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities. ©Peggy Peissig, Kelsey M Schwei, Christopher Kadolph, Joseph Finamore, Efrain Cancel, Catherine A McCarty, Asha Okorie, Kate L Thomas, Jennifer Allen Pacheco, Jyotishman Pathak, Stephen B Ellis, Joshua C Denny, Luke V Rasmussen, Gerard Tromp, Marc S Williams, Tamara R Vrabec, Murray H Brilliant. 
Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 13.09.2017.

  20. Scenario driven data modelling: a method for integrating diverse sources of data and data streams

    PubMed Central

    2011-01-01

    Background Biology is rapidly becoming a data intensive, data-driven science. It is essential that data is represented and connected in ways that best represent its full conceptual content and allows both automated integration and data-driven decision-making. Recent advancements in distributed multi-relational directed graphs, implemented in the form of the Semantic Web make it possible to deal with complicated heterogeneous data in new and interesting ways. Results This paper presents a new approach, scenario driven data modelling (SDDM), that integrates multi-relational directed graphs with data streams. SDDM can be applied to virtually any data integration challenge with widely divergent types of data and data streams. In this work, we explored integrating genetics data with reports from traditional media. SDDM was applied to the New Delhi metallo-beta-lactamase gene (NDM-1), an emerging global health threat. The SDDM process constructed a scenario, created a RDF multi-relational directed graph that linked diverse types of data to the Semantic Web, implemented RDF conversion tools (RDFizers) to bring content into the Semantic Web, identified data streams and analytical routines to analyse those streams, and identified user requirements and graph traversals to meet end-user requirements. Conclusions We provided an example where SDDM was applied to a complex data integration challenge. The process created a model of the emerging NDM-1 health threat, identified and filled gaps in that model, and constructed reliable software that monitored data streams based on the scenario derived multi-relational directed graph. The SDDM process significantly reduced the software requirements phase by letting the scenario and resulting multi-relational directed graph define what is possible and then set the scope of the user requirements. Approaches like SDDM will be critical to the future of data intensive, data-driven science because they automate the process of converting

  1. Data-driven Modelling for decision making under uncertainty

    NASA Astrophysics Data System (ADS)

    Angria S, Layla; Dwi Sari, Yunita; Zarlis, Muhammad; Tulus

    2018-01-01

    Decision making under uncertainty has become a lively topic of discussion in operations research, and many models have been presented to address it, one of which is data-driven modelling (DDM). The purpose of this paper is to extract and recognize patterns in data, and to find the best model for a decision-making problem under uncertainty, using a data-driven modelling approach with linear programming, linear and nonlinear differential equations, and a Bayesian approach. The model criteria are tested to determine the smallest error, and the model with the smallest error is selected as the best model to be used.
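
    The selection rule, fit several candidate models and keep the one with the smallest error, can be sketched as follows. The candidate models, the data, and the RMSE criterion are invented for illustration; the paper's own candidates are the programming, differential-equation, and Bayesian models listed above.

```python
# Sketch: data-driven model selection by smallest root-mean-square error.
import math

# Toy observations (t, y); the truth here is roughly quadratic.
data = [(0, 1.0), (1, 2.1), (2, 3.9), (3, 8.2), (4, 15.8)]

candidates = {
    "linear":    lambda t: 1.0 + 3.5 * t,
    "quadratic": lambda t: 1.0 + t ** 2,
}

def rmse(model):
    return math.sqrt(sum((model(t) - y) ** 2 for t, y in data) / len(data))

errors = {name: rmse(m) for name, m in candidates.items()}
best = min(errors, key=errors.get)   # model with the smallest error wins
```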

  2. The application of data mining and cloud computing techniques in data-driven models for structural health monitoring

    NASA Astrophysics Data System (ADS)

    Khazaeli, S.; Ravandi, A. G.; Banerji, S.; Bagchi, A.

    2016-04-01

    Recently, data-driven models for Structural Health Monitoring (SHM) have been of great interest among many researchers. In data-driven models, the sensed data are processed to determine the structural performance and evaluate the damages of an instrumented structure without necessitating the mathematical modeling of the structure. A framework of data-driven models for online assessment of the condition of a structure has been developed here. The developed framework is intended for automated evaluation of the monitoring data and structural performance by the Internet technology and resources. The main challenges in developing such framework include: (a) utilizing the sensor measurements to estimate and localize the induced damage in a structure by means of signal processing and data mining techniques, and (b) optimizing the computing and storage resources with the aid of cloud services. The main focus in this paper is to demonstrate the efficiency of the proposed framework for real-time damage detection of a multi-story shear-building structure in two damage scenarios (change in mass and stiffness) in various locations. Several features are extracted from the sensed data by signal processing techniques and statistical methods. Machine learning algorithms are deployed to select damage-sensitive features as well as classifying the data to trace the anomaly in the response of the structure. Here, the cloud computing resources from Amazon Web Services (AWS) have been used to implement the proposed framework.

  3. Data driven innovations in structural health monitoring

    NASA Astrophysics Data System (ADS)

    Rosales, M. J.; Liyanapathirana, R.

    2017-05-01

    At present, substantial investments are being allocated to civil infrastructures, which are considered valuable assets at a national or global scale. Structural Health Monitoring (SHM) is an indispensable tool required to ensure the performance and safety of these structures based on measured response parameters. The research to date on damage assessment has tended to focus on the utilization of wireless sensor networks (WSN), as they prove to be the best alternative to traditional visual inspections and their tethered or wired counterparts. Over the last decade, the structural health and behaviour of innumerable infrastructures have been measured and evaluated owing to several successful ventures in implementing these sensor networks. Various monitoring systems have the capability to rapidly transmit, measure, and store large volumes of data. The amount of data collected from these networks has eventually become unmanageable, which paved the way to other relevant issues such as data quality, relevance, re-use, and decision support. There is an increasing need to integrate new technologies in order to automate the evaluation processes as well as to enhance the objectivity of data assessment routines. This paper aims to identify feasible methodologies for the application of time-series analysis techniques to judiciously exploit the vast amount of readily available as well as upcoming data resources. It continues the momentum of a greater effort to collect and archive SHM approaches that will serve as data-driven innovations for the assessment of damage through efficient algorithms and data analytics.

  4. Transferring data objects: A focused Ada investigation

    NASA Technical Reports Server (NTRS)

    Legrand, Sue

    1988-01-01

    The use of the Ada language does not guarantee that data objects will be in the same form or have the same value after they have been stored or transferred to another system. There are too many possible variables in such things as the formats used and other protocol conditions. Differences may occur at many different levels of support. These include program level, object level, application level, and system level. A standard language is only one aspect of making a complex system completely homogeneous. Many components must be standardized and the various standards must be integrated. The principal issues in providing for interaction between systems are of exchanging files and data objects between systems which may not be compatible in terms of their host computer, operating system or other factors. A typical resolution of the problem of invalidating data involves at least a common external form, for data objects and for representing the relationships and attributes of data collections. Some of the issues dealing with the transfer of data are listed and consideration is given on how these issues may be handled in the Ada language.

  5. A data-driven approach for modeling post-fire debris-flow volumes and their uncertainty

    USGS Publications Warehouse

    Friedel, Michael J.

    2011-01-01

    This study demonstrates the novel application of genetic programming to evolve nonlinear post-fire debris-flow volume equations from variables associated with a data-driven conceptual model of the western United States. The search space is constrained using a multi-component objective function that simultaneously minimizes root-mean squared and unit errors for the evolution of fittest equations. An optimization technique is then used to estimate the limits of nonlinear prediction uncertainty associated with the debris-flow equations. In contrast to a published multiple linear regression three-variable equation, linking basin area with slopes greater or equal to 30 percent, burn severity characterized as area burned moderate plus high, and total storm rainfall, the data-driven approach discovers many nonlinear and several dimensionally consistent equations that are unbiased and have less prediction uncertainty. Of the nonlinear equations, the best performance (lowest prediction uncertainty) is achieved when using three variables: average basin slope, total burned area, and total storm rainfall. Further reduction in uncertainty is possible for the nonlinear equations when dimensional consistency is not a priority and by subsequently applying a gradient solver to the fittest solutions. The data-driven modeling approach can be applied to nonlinear multivariate problems in all fields of study.

  6. A data-driven approach to quality risk management

    PubMed Central

    Alemayehu, Demissie; Alvir, Jose; Levenstein, Marcia; Nickerson, David

    2013-01-01

    Aim: An effective clinical trial strategy to ensure patient safety as well as trial quality and efficiency involves an integrated approach, including prospective identification of risk factors, mitigation of the risks through proper study design and execution, and assessment of quality metrics in real-time. Such an integrated quality management plan may also be enhanced by using data-driven techniques to identify risk factors that are most relevant in predicting quality issues associated with a trial. In this paper, we illustrate such an approach using data collected from actual clinical trials. Materials and Methods: Several statistical methods were employed, including the Wilcoxon rank-sum test and logistic regression, to identify the presence of association between risk factors and the occurrence of quality issues, applied to data on quality of clinical trials sponsored by Pfizer. Results: Only a subset of the risk factors had a significant association with quality issues, and included: whether the study used placebo, whether the agent was a biologic, unusual packaging labels, complex dosing, and more than 25 planned procedures. Conclusion: Proper implementation of the strategy can help to optimize resource utilization without compromising trial integrity and patient safety. PMID:24312890
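
    The rank-sum screening step might look like the following sketch: a quality metric is compared between trials with and without a candidate risk factor using the normal approximation to the Wilcoxon rank-sum statistic. The counts are invented (the study's data are Pfizer's), ties are given arbitrary order, and no tie correction is applied.

```python
# Sketch: Wilcoxon rank-sum z-statistic for a candidate risk factor.
import math

with_factor    = [5, 7, 6, 9, 8]   # quality issues, factor present
without_factor = [1, 2, 3, 2, 4]   # quality issues, factor absent

def rank_sum_z(a, b):
    """Normal approximation to the rank-sum statistic (no tie correction)."""
    combined = sorted((v, i) for i, v in enumerate(a + b))
    ranks = {i: r for r, (v, i) in enumerate(combined, start=1)}
    ra = sum(ranks[i] for i in range(len(a)))     # rank sum of group a
    n1, n2 = len(a), len(b)
    mu = n1 * (n1 + n2 + 1) / 2                   # mean under the null
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (ra - mu) / sigma

z = rank_sum_z(with_factor, without_factor)
```

    In practice one would use a library routine (e.g. SciPy's rank-sum test) that also handles ties and returns a p-value.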

  7. A data-driven approach to quality risk management.

    PubMed

    Alemayehu, Demissie; Alvir, Jose; Levenstein, Marcia; Nickerson, David

    2013-10-01

    An effective clinical trial strategy to ensure patient safety as well as trial quality and efficiency involves an integrated approach, including prospective identification of risk factors, mitigation of the risks through proper study design and execution, and assessment of quality metrics in real-time. Such an integrated quality management plan may also be enhanced by using data-driven techniques to identify risk factors that are most relevant in predicting quality issues associated with a trial. In this paper, we illustrate such an approach using data collected from actual clinical trials. Several statistical methods were employed, including the Wilcoxon rank-sum test and logistic regression, to identify the presence of association between risk factors and the occurrence of quality issues, applied to data on quality of clinical trials sponsored by Pfizer. Only a subset of the risk factors had a significant association with quality issues, and included: whether the study used placebo, whether the agent was a biologic, unusual packaging labels, complex dosing, and more than 25 planned procedures. Proper implementation of the strategy can help to optimize resource utilization without compromising trial integrity and patient safety.

  8. Writing through Big Data: New Challenges and Possibilities for Data-Driven Arguments

    ERIC Educational Resources Information Center

    Beveridge, Aaron

    2017-01-01

    As multimodal writing continues to shift and expand in the era of Big Data, writing studies must confront the new challenges and possibilities emerging from data mining, data visualization, and data-driven arguments. Often collected under the broad banner of "data literacy," students' experiences of data visualization and data-driven…

  9. Mission Driven and Data Informed Leadership

    ERIC Educational Resources Information Center

    Holter, Anthony C.; Frabutt, James M.

    2012-01-01

    The contemporary challenges facing Catholic schools and Catholic school leaders are widely known. Effective and systemic solutions to these mounting challenges are less widely known or discussed. This article highlights the skills, knowledge, and dispositions associated with mission driven and data informed leadership--an orientation to school…

  10. Data-Driven Approaches to Empirical Discovery

    DTIC Science & Technology

    1988-10-31

    Keywords: empirical discovery; history of science; data-driven heuristics; numeric laws; theoretical terms; scope of laws. …to the normative side. Machine Discovery and the History of Science: The history of science studies the actual path followed by scientists over the

  11. DeDaL: Cytoscape 3 app for producing and morphing data-driven and structure-driven network layouts.

    PubMed

    Czerwinska, Urszula; Calzone, Laurence; Barillot, Emmanuel; Zinovyev, Andrei

    2015-08-14

    Visualization and analysis of molecular profiling data together with biological networks are able to provide new mechanistic insights into biological functions. Currently, it is possible to visualize high-throughput data on top of pre-defined network layouts, but they are not always adapted to a given data analysis task. A network layout based simultaneously on the network structure and the associated multidimensional data might be advantageous for data visualization and analysis in some cases. We developed a Cytoscape app, which allows constructing biological network layouts based on the data from molecular profiles imported as values of node attributes. DeDaL is a Cytoscape 3 app, which uses linear and non-linear algorithms of dimension reduction to produce data-driven network layouts based on multidimensional data (typically gene expression). DeDaL implements several data pre-processing and layout post-processing steps such as continuous morphing between two arbitrary network layouts and aligning one network layout with respect to another one by rotating and mirroring. The combination of all these functionalities facilitates the creation of insightful network layouts representing both structural network features and correlation patterns in multivariate data. We demonstrate the added value of applying DeDaL in several practical applications, including an example of a large protein-protein interaction network. DeDaL is a convenient tool for applying data dimensionality reduction methods and for designing insightful data displays based on data-driven layouts of biological networks, built within Cytoscape environment. DeDaL is freely available for downloading at http://bioinfo-out.curie.fr/projects/dedal/.
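
    The "data-driven layout" idea, placing each node at its coordinates in the first two principal components of its data profile, can be sketched as below. The PCA is a bare-bones power iteration and the expression profiles are invented; DeDaL itself offers much more (elastic maps, morphing between layouts, network smoothing).

```python
# Sketch: PCA-based data-driven node layout. Each node's 2-D position
# is its projection onto the first two principal components of its
# (centered) data profile.
def pca_2d(profiles, iters=200):
    n, d = len(profiles), len(profiles[0])
    means = [sum(p[j] for p in profiles) / n for j in range(d)]
    X = [[p[j] - means[j] for j in range(d)] for p in profiles]

    def project(v):                       # one application of X^T X
        Xv = [sum(row[j] * v[j] for j in range(d)) for row in X]
        return [sum(X[i][j] * Xv[i] for i in range(n)) for j in range(d)]

    def power_iter(deflate=None):
        v = [1.0] * d
        for _ in range(iters):
            w = project(v)
            if deflate:                   # project out the first component
                dot = sum(wi * di for wi, di in zip(w, deflate))
                w = [wi - dot * di for wi, di in zip(w, deflate)]
            norm = sum(wi * wi for wi in w) ** 0.5
            v = [wi / norm for wi in w]
        return v

    pc1 = power_iter()
    pc2 = power_iter(deflate=pc1)
    return {i: (sum(X[i][j] * pc1[j] for j in range(d)),
                sum(X[i][j] * pc2[j] for j in range(d))) for i in range(n)}

# Toy expression profiles for four nodes (two co-expressed pairs).
profiles = [[1.0, 0.9, 0.1], [0.9, 1.1, 0.0], [0.1, 0.0, 1.0], [0.0, 0.2, 1.1]]
layout = pca_2d(profiles)
```

    Co-expressed nodes land near each other along the first axis, which is the point of a data-driven layout.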

  12. A data driven control method for structure vibration suppression

    NASA Astrophysics Data System (ADS)

    Xie, Yangmin; Wang, Chao; Shi, Hang; Shi, Junwei

    2018-02-01

    High radio-frequency space applications have motivated continuous research on vibration suppression of large space structures in both academia and industry. This paper introduces a novel data-driven control method to suppress vibrations of flexible structures and experimentally validates its suppression performance. Unlike model-based control approaches, the data-driven control method designs a controller directly from input-output test data of the structure, without requiring parametric dynamics, and is hence free of system modeling. It utilizes the discrete frequency response obtained via spectral analysis and formulates a non-convex optimization problem to obtain optimized controller parameters for a predefined controller structure. The approach is then experimentally applied to an end-driven flexible beam-mass structure. The experimental results show that the presented method can achieve disturbance rejection competitive with a model-based mixed-sensitivity controller under the same design criterion, but with much lower order and less design effort, demonstrating that the proposed data-driven control is an effective approach for vibration suppression of flexible structures.
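
    The first step of such data-driven design, estimating the plant's frequency response directly from input-output test data, can be sketched as follows; the controller optimization itself is not reproduced. The one-pole "plant" and excitation frequency are invented stand-ins for the flexible structure.

```python
# Sketch: nonparametric frequency-response estimation from I/O data,
# by correlating input and output against the excitation frequency.
import cmath
import math

def freq_response_estimate(u, y, omega):
    """Complex gain Y/U at frequency omega (rad/sample) via DFT-like sums."""
    e = [cmath.exp(-1j * omega * k) for k in range(len(u))]
    U = sum(uk * ek for uk, ek in zip(u, e))
    Y = sum(yk * ek for yk, ek in zip(y, e))
    return Y / U

# Toy plant: y[k] = 0.5*y[k-1] + u[k] (one-pole low-pass), excited by
# a sinusoid with an integer number of cycles to avoid leakage.
omega = 2 * math.pi * 10 / 400
u = [math.sin(omega * k) for k in range(400)]
y, prev = [], 0.0
for uk in u:
    prev = 0.5 * prev + uk
    y.append(prev)

G_hat = freq_response_estimate(u, y, omega)
G_true = 1 / (1 - 0.5 * cmath.exp(-1j * omega))   # analytic response
```

    Repeating this over a grid of excitation frequencies yields the discrete frequency response that the controller optimization then consumes.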

  13. Retrospective data-driven respiratory gating for PET/CT

    NASA Astrophysics Data System (ADS)

    Schleyer, Paul J.; O'Doherty, Michael J.; Barrington, Sally F.; Marsden, Paul K.

    2009-04-01

    Respiratory motion can adversely affect both PET and CT acquisitions. Respiratory gating allows an acquisition to be divided into a series of motion-reduced bins according to the respiratory signal, which is typically hardware acquired. In order that the effects of motion can potentially be corrected for, we have developed a novel, automatic, data-driven gating method which retrospectively derives the respiratory signal from the acquired PET and CT data. PET data are acquired in listmode and analysed in sinogram space, and CT data are acquired in cine mode and analysed in image space. Spectral analysis is used to identify regions within the CT and PET data which are subject to respiratory motion, and the variation of counts within these regions is used to estimate the respiratory signal. Amplitude binning is then used to create motion-reduced PET and CT frames. The method was demonstrated with four patient datasets acquired on a 4-slice PET/CT system. To assess the accuracy of the data-derived respiratory signal, a hardware-based signal was acquired for comparison. Data-driven gating was successfully performed on PET and CT datasets for all four patients. Gated images demonstrated respiratory motion throughout the bin sequences for all PET and CT series, and image analysis and direct comparison of the traces derived from the data-driven method with the hardware-acquired traces indicated accurate recovery of the respiratory signal.
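
    The amplitude-binning step can be sketched as follows: a recovered respiratory trace is split into motion-reduced bins by amplitude range, so each bin collects data from a similar respiratory state. The spectral recovery of the trace from raw PET/CT counts is not reproduced; the surrogate trace, sampling rate, and bin count are invented for the example.

```python
# Sketch: amplitude binning of a respiratory trace into equal-width
# amplitude ranges (each bin = one motion-reduced gate).
import math

# Surrogate respiratory trace: ~0.25 Hz breathing sampled at 10 Hz.
t = [i * 0.1 for i in range(200)]
trace = [math.sin(2 * math.pi * 0.25 * ti) for ti in t]

def amplitude_bins(signal, n_bins):
    """Assign each sample to one of n_bins equal-width amplitude bins."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_bins
    out = []
    for v in signal:
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the max value
        out.append(b)
    return out

bins = amplitude_bins(trace, 4)
```

    PET events and CT slices are then reconstructed per bin, each bin containing data from a narrow amplitude range of the breathing cycle.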

  14. Application of Data-Driven Evidential Belief Functions to Prospectivity Mapping for Aquamarine-Bearing Pegmatites, Lundazi District, Zambia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carranza, E. J. M., E-mail: carranza@itc.nl; Woldai, T.; Chikambwe, E. M.

    A case application of data-driven estimation of evidential belief functions (EBFs) to prospectivity mapping in the Lundazi district (eastern Zambia) is demonstrated. The spatial data used to represent recognition criteria of prospectivity for aquamarine-bearing pegmatites include mapped granites, mapped faults/fractures, mapped shear zones, and radioelement concentration ratios derived from gridded airborne radiometric data. Data-driven estimates of EBFs take into account not only (a) the spatial association between an evidential map layer and target deposits but also (b) the spatial relationships between classes of evidence within an evidential map layer. Data-driven estimates of EBFs can indicate which spatial data provide positive or negative evidence of prospectivity. Data-driven estimates of EBFs of only the spatial data providing positive evidence of prospectivity were integrated according to Dempster's rule of combination. The map of integrated degrees of belief was used to delineate zones of relative degrees of prospectivity for aquamarine-bearing pegmatites. The predictive map has at least an 85% prediction rate and at least a 79% success rate in delineating training and validation deposits, respectively. The results illustrate the usefulness of data-driven estimation of EBFs in GIS-based predictive mapping of mineral prospectivity. The results also show the usefulness of EBFs in managing uncertainties associated with evidential maps.
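
    Dempster's rule of combination, the integration step named above, can be sketched on a minimal frame {prospective (P), non-prospective (N), uncertain (PN)}: mass products over intersecting hypotheses are summed and conflict is renormalized away. The mass values below are invented, not the paper's estimates.

```python
# Sketch: Dempster's rule for two mass assignments over {P, N, PN}.
def dempster(m1, m2):
    frames = {"P": {"P"}, "N": {"N"}, "PN": {"P", "N"}}
    out = {k: 0.0 for k in frames}
    conflict = 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            inter = frames[a] & frames[b]
            if not inter:
                conflict += va * vb          # contradictory evidence
            else:
                key = "PN" if inter == {"P", "N"} else next(iter(inter))
                out[key] += va * vb
    # renormalize by the non-conflicting mass
    return {k: v / (1 - conflict) for k, v in out.items()}

granite = {"P": 0.6, "N": 0.1, "PN": 0.3}   # belief from mapped granites
radiom  = {"P": 0.5, "N": 0.2, "PN": 0.3}   # belief from radiometrics
combined = dempster(granite, radiom)
```

    Two layers that both lean "prospective" reinforce each other: the combined belief in P exceeds either input.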

  15. A data-driven dynamics simulation framework for railway vehicles

    NASA Astrophysics Data System (ADS)

    Nie, Yinyu; Tang, Zhao; Liu, Fengjia; Chang, Jian; Zhang, Jianjun

    2018-03-01

    The finite element (FE) method is essential for simulating vehicle dynamics in fine detail, especially for train crash simulations. However, factors such as the complexity of meshes and the distortion involved in large deformations undermine its calculation efficiency. An alternative method, multi-body (MB) dynamics simulation, provides satisfactory time efficiency but limited accuracy when highly nonlinear dynamic processes are involved. To retain the advantages of both methods, this paper proposes a data-driven simulation framework for the dynamics simulation of railway vehicles. This framework uses machine learning techniques to extract nonlinear features from training data generated by FE simulations, so that specific mesh structures can be formulated by a surrogate element (or surrogate elements) to replace the original mechanical elements, and the dynamics simulation can be implemented by co-simulation with the surrogate element(s) embedded into an MB model. The framework consists of a series of techniques including data collection, feature extraction, training data sampling, surrogate element building, and model evaluation and selection. To verify the feasibility of this framework, we present two case studies, a vertical dynamics simulation and a longitudinal dynamics simulation, based on co-simulation with MATLAB/Simulink and Simpack, and a further comparison with a popular data-driven model (the Kriging model) is provided. The simulation results show that using a Legendre polynomial regression model to build surrogate elements can largely cut down the simulation time without sacrificing accuracy.
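
    The surrogate-element idea can be sketched in a few lines: responses of an expensive FE sub-model are sampled, and a Legendre polynomial regression is fitted so the multibody co-simulation can call the cheap surrogate instead. The "FE model" here is a stand-in analytic function, and the degree and sampling are invented for illustration.

```python
# Sketch: Legendre polynomial surrogate of an expensive response.
import numpy as np

def expensive_fe_response(x):
    """Stand-in for an FE force-deflection curve (not a real FE call)."""
    return 0.3 * x**3 - 0.5 * x + 0.1

# Sample the expensive model at training points in [-1, 1].
x_train = np.linspace(-1.0, 1.0, 21)
y_train = expensive_fe_response(x_train)

# Fit a degree-3 Legendre series: this is the surrogate element.
coeffs = np.polynomial.legendre.legfit(x_train, y_train, deg=3)
surrogate = lambda x: np.polynomial.legendre.legval(x, coeffs)

# Training error (exact here, since the target is itself a cubic).
err = float(np.max(np.abs(surrogate(x_train) - y_train)))
```

    In the framework above, such a fitted map replaces the FE mesh inside the MB model, trading a full FE solve for a polynomial evaluation at each co-simulation step.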

  16. Photometric Data from Non-Resolved Objects for Space Object Characterization and Improved Atmospheric Modeling

    NASA Astrophysics Data System (ADS)

    Linares, R.; Palmer, D.; Thompson, D.; Koller, J.

    2013-09-01

    impact assessment via improved physics-based modeling. As part of this effort, calibration satellite observations are used to dynamically calibrate the physics-based model and to improve its forecasting capability. The observations are collected from a variety of sources, including LANL's own Raven-class optical telescope. This system collects both astrometric and photometric data on space objects. The photometric data will be used to estimate the space objects' attitude and shape. Non-resolved photometric data have been studied by many as a mechanism for space object characterization. Photometry is the measurement of an object's flux or apparent brightness over a wavelength band. The temporal variation of photometric measurements is referred to as the photometric signature. The photometric optical signature of an object contains information about its shape, attitude, size, and material composition. This work focuses on processing the data collected with LANL's telescope in an effort to use photometric data to expand the number of space objects that can be used as calibration satellites. An unscented Kalman filter is used to estimate the attitude and angular velocity of the space object; both real-data and simulated-data scenarios are shown. A number of inactive space objects are used for the real-data examples, and good estimation results are shown.
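The unscented transform at the heart of such a filter propagates a Gaussian state through a nonlinear measurement model using sigma points. Below is a one-dimensional sketch with a hypothetical brightness-vs-spin-phase measurement; the measurement function and all numbers are illustrative, not the authors' model:

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=0.1, beta=2.0, kappa=0.0):
    """Propagate a Gaussian (mean, cov) through a nonlinearity f using
    the 2n+1 sigma points of the unscented transform."""
    n = len(mean)
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)
    sigma = np.vstack([mean, mean + S.T, mean - S.T])  # 2n+1 sigma points
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1 - alpha**2 + beta)
    y = np.array([f(p) for p in sigma])
    y_mean = wm @ y
    y_cov = sum(w * np.outer(yi - y_mean, yi - y_mean) for w, yi in zip(wc, y))
    return y_mean, y_cov

# Hypothetical photometric measurement: brightness vs. spin phase angle.
brightness = lambda th: np.array([1.0 + 0.5 * np.cos(th[0])])
m, P = unscented_transform(np.array([0.3]), np.array([[0.04]]), brightness)
```

Inside a full UKF the same transform runs once for the dynamics prediction and once for the measurement update.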

  17. Object-oriented model-driven control

    NASA Technical Reports Server (NTRS)

    Drysdale, A.; Mcroberts, M.; Sager, J.; Wheeler, R.

    1994-01-01

    A monitoring and control subsystem architecture has been developed that capitalizes on the use of model-driven monitoring and predictive control, knowledge-based data representation, and artificial reasoning in an operator support mode. We have developed an object-oriented model of a Controlled Ecological Life Support System (CELSS). The model, based on the NASA Kennedy Space Center CELSS breadboard data, tracks carbon, hydrogen, oxygen, carbon dioxide, and water. It estimates and tracks resource-related parameters such as mass, energy, and manpower, and measurements such as growing area required for balance. We are developing an interface with the breadboard systems that is compatible with artificial reasoning. Initial work is being done on the use of expert systems and user interface development. This paper presents an approach to defining universally applicable CELSS monitoring and control issues, and implementing appropriate monitoring and control capability for a particular instance: the KSC CELSS Breadboard Facility.

  18. Uncertainty-driven nuclear data evaluation including thermal (n,α) applied to 59Ni

    NASA Astrophysics Data System (ADS)

    Helgesson, P.; Sjöstrand, H.; Rochman, D.

    2017-11-01

    This paper presents a novel approach to the evaluation of nuclear data (ND), combining experimental data for thermal cross sections with resonance parameters and nuclear reaction modeling. The method involves sampling of various uncertain parameters, in particular uncertain components in experimental setups, and provides extensive covariance information, including consistent cross-channel correlations over the whole energy spectrum. The method is developed for, and applied to, 59Ni, but may be used as a whole, or in part, for other nuclides. 59Ni is particularly interesting since a substantial amount of 59Ni is produced in thermal nuclear reactors by neutron capture in 58Ni and since it has a non-threshold (n,α) cross section. Therefore, 59Ni gives a very important contribution to the helium production in stainless steel in a thermal reactor. However, current evaluated ND libraries contain old information for 59Ni, without any uncertainty information. The work includes a study of thermal cross section experiments and a novel combination of this experimental information, giving the full multivariate distribution of the thermal cross sections. In particular, the thermal (n,α) cross section is found to be 12.7 ± 0.7 b. This is consistent with, but yet different from, current established values. Further, the distribution of thermal cross sections is combined with reported resonance parameters, and with TENDL-2015 data, to provide full random ENDF files; all of this is done in a novel way, keeping uncertainties and correlations in mind. The random files are also condensed into one single ENDF file with covariance information, which is now part of a beta version of JEFF 3.3. Finally, the random ENDF files have been processed and used in an MCNP model to study the helium production in stainless steel. The increase in the (n,α) rate due to 59Ni compared to fresh stainless steel is found to be a factor of 5.2 at a certain time in the reactor vessel, with a relative

  19. Data-driven classification of bipolar I disorder from longitudinal course of mood.

    PubMed

    Cochran, A L; McInnis, M G; Forger, D B

    2016-10-11

    The Diagnostic and Statistical Manual of Mental Disorders (DSM) classification of bipolar disorder defines categories to reflect common understanding of mood symptoms rather than scientific evidence. This work aimed to determine whether bipolar I can be objectively classified from longitudinal mood data and whether the resulting classes have clinical associations. Bayesian nonparametric hierarchical models with latent classes and patient-specific models of mood are fit to data from Longitudinal Interval Follow-up Evaluations (LIFE) of bipolar I patients (N=209). Classes are tested for clinical associations. No classes are justified using the time course of DSM-IV mood states. Three classes are justified using the course of subsyndromal mood symptoms. Classes differed in attempted suicides (P=0.017), disability status (P=0.012) and chronicity of affective symptoms (P=0.009). Thus, bipolar I disorder can be objectively classified from mood course, and individuals in the resulting classes share clinical features. Data-driven classification from mood course could be used to enrich sample populations for pharmacological and etiological studies.

  20. Data-Driven Learning of Q-Matrix

    ERIC Educational Resources Information Center

    Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2012-01-01

    The recent surge of interests in cognitive assessment has led to developments of novel statistical models for diagnostic classification. Central to many such models is the well-known "Q"-matrix, which specifies the item-attribute relationships. This article proposes a data-driven approach to identification of the "Q"-matrix and estimation of…

  1. The Potential of Knowing More: A Review of Data-Driven Urban Water Management.

    PubMed

    Eggimann, Sven; Mutzner, Lena; Wani, Omar; Schneider, Mariane Yvonne; Spuhler, Dorothee; Moy de Vitry, Matthew; Beutler, Philipp; Maurer, Max

    2017-03-07

    The promise of collecting and utilizing large amounts of data has never been greater in the history of urban water management (UWM). This paper reviews several data-driven approaches which play a key role in bringing forward a sea change. It critically investigates whether data-driven UWM offers a promising foundation for addressing current challenges and supporting fundamental changes in UWM. We discuss the examples of better rain-data management, urban pluvial flood-risk management and forecasting, drinking water and sewer network operation and management, integrated design and management, increasing water productivity, wastewater-based epidemiology and on-site water and wastewater treatment. The accumulated evidence from literature points toward a future UWM that offers significant potential benefits thanks to increased collection and utilization of data. The findings show that data-driven UWM allows us to develop and apply novel methods, to optimize the efficiency of the current network-based approach, and to extend functionality of today's systems. However, generic challenges related to data-driven approaches (e.g., data processing, data availability, data quality, data costs) and the specific challenges of data-driven UWM need to be addressed, namely data access and ownership, current engineering practices and the difficulty of assessing the cost benefits of data-driven UWM.

  2. Target volume and artifact evaluation of a new data-driven 4D CT.

    PubMed

    Martin, Rachael; Pan, Tinsu

    Four-dimensional computed tomography (4D CT) is often used to define the internal gross target volume (IGTV) for radiation therapy of lung cancer. Traditionally, this technique requires the use of an external motion surrogate; however, a new, image-data-driven 4D CT has become available. This study aims to describe this data-driven 4D CT and compare target contours created with it to those created using standard 4D CT. Cine CT data of 35 patients undergoing stereotactic body radiation therapy were collected and sorted into phases using standard and data-driven 4D CT. IGTV contours were drawn using a semiautomated method on maximum intensity projection images of both 4D CT methods. Errors resulting from reproducibility of the method were characterized. A comparison of phase image artifacts was made using a normalized cross-correlation method that assigned a score from +1 (data-driven "better") to -1 (standard "better"). The volume difference between the data-driven and standard IGTVs was not significant (data-driven was 2.1 ± 1.0% smaller, P = .08). The Dice similarity coefficient showed good similarity between the contours (0.949 ± 0.006). The mean surface separation was 0.4 ± 0.1 mm and the Hausdorff distance was 3.1 ± 0.4 mm. An average artifact score of +0.37 indicated that the data-driven method had significantly fewer and/or less severe artifacts than the standard method (P = 1.5 × 10⁻⁵ for difference from 0). On average, the difference between IGTVs derived from data-driven and standard 4D CT was not clinically relevant or statistically significant, suggesting data-driven 4D CT can be used in place of standard 4D CT without adjustments to IGTVs. The relatively large differences in some patients were usually attributed to limitations in automatic contouring or differences in artifacts. Artifact reduction and setup simplicity suggest a clinical advantage to data-driven 4D CT. Published by Elsevier Inc.

  3. Combining Knowledge and Data Driven Insights for Identifying Risk Factors using Electronic Health Records

    PubMed Central

    Sun, Jimeng; Hu, Jianying; Luo, Dijun; Markatou, Marianthi; Wang, Fei; Edabollahi, Shahram; Steinhubl, Steven E.; Daar, Zahra; Stewart, Walter F.

    2012-01-01

    Background: The ability to identify the risk factors related to an adverse condition, e.g., heart failure (HF) diagnosis, is very important for improving care quality and reducing cost. Existing approaches for risk factor identification are either knowledge driven (from guidelines or literature) or data driven (from observational data). No existing method provides a model to effectively combine expert knowledge with data-driven insight for risk factor identification. Methods: We present a systematic approach to enhance known knowledge-based risk factors with additional potential risk factors derived from data. The core of our approach is a sparse regression model with regularization terms that correspond to both knowledge- and data-driven risk factors. Results: The approach is validated using a large dataset containing 4,644 heart failure cases and 45,981 controls. The outpatient electronic health records (EHRs) for these patients include diagnoses, medications, and lab results from 2003 to 2010. We demonstrate that the proposed method can identify complementary risk factors that are not among the existing known factors and can better predict the onset of HF. We quantitatively compare different sets of risk factors in the context of predicting onset of HF using the Area Under the ROC Curve (AUC) as the performance metric. The combined knowledge and data-driven risk factors significantly outperform knowledge-based risk factors alone. Furthermore, those additional risk factors are confirmed to be clinically meaningful by a cardiologist. Conclusion: We present a systematic framework for combining knowledge- and data-driven insights for risk factor identification. We demonstrate the power of this framework in the context of predicting onset of HF, where our approach can successfully identify intuitive and predictive risk factors beyond a set of known HF risk factors. PMID:23304365
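The core model, a sparse regression with separate regularization terms for knowledge-driven and data-driven factors, can be sketched as a ridge penalty on the expert-selected features plus an L1 penalty (applied via proximal gradient) on the candidate features. This is an illustrative reimplementation on synthetic data, not the authors' code:

```python
import numpy as np

def fit_combined(X_know, X_data, y, lam_know=0.01, lam_data=0.5,
                 lr=0.01, iters=2000):
    """Sparse model mixing expert-selected features (light L2 penalty,
    always retained) with candidate data-driven features (L1 penalty,
    most coefficients shrunk to exactly zero)."""
    X = np.hstack([X_know, X_data])
    n, p_know = X.shape[0], X_know.shape[1]
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y) / n
        grad[:p_know] += lam_know * w[:p_know]        # ridge on known factors
        w -= lr * grad
        # soft-threshold only the candidate (data-driven) coefficients
        tail = w[p_know:]
        w[p_know:] = np.sign(tail) * np.maximum(np.abs(tail) - lr * lam_data, 0.0)
    return w

rng = np.random.default_rng(1)
X_know = rng.normal(size=(300, 2))     # two "known" risk factors
X_data = rng.normal(size=(300, 20))    # twenty candidate factors
y = 2.0 * X_know[:, 0] + 1.5 * X_data[:, 3] + 0.1 * rng.normal(size=300)
w = fit_combined(X_know, X_data, y)
```

The L1 term zeroes out all candidate factors except the genuinely predictive one, while the known factors stay in the model regardless.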

  4. Social Capital in Data-Driven Community College Reform

    ERIC Educational Resources Information Center

    Kerrigan, Monica Reid

    2015-01-01

    The current rhetoric around using data to improve community college student outcomes with only limited research on data-driven decision-making (DDDM) within postsecondary education compels a more comprehensive understanding of colleges' capacity for using data to inform decisions. Based on an analysis of faculty and administrators' perceptions and…

  5. Optimally Distributed Kalman Filtering with Data-Driven Communication †

    PubMed Central

    Dormann, Katharina

    2018-01-01

    For multisensor data fusion, distributed state estimation techniques that enable a local processing of sensor data are the means of choice in order to minimize storage and communication costs. In particular, a distributed implementation of the optimal Kalman filter has recently been developed. A significant disadvantage of this algorithm is that the fusion center needs access to each node so as to compute a consistent state estimate, which requires full communication each time an estimate is requested. In this article, different extensions of the optimally distributed Kalman filter are proposed that employ data-driven transmission schemes in order to reduce communication expenses. As a first relaxation of the full-rate communication scheme, it can be shown that each node only has to transmit every second time step without endangering consistency of the fusion result. Also, two data-driven algorithms are introduced that even allow for lower transmission rates, and bounds are derived to guarantee consistent fusion results. Simulations demonstrate that the data-driven distributed filtering schemes can outperform a centralized Kalman filter that requires each measurement to be sent to the center node. PMID:29596392
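A simplified form of the data-driven transmission idea is a send-on-delta rule in a scalar Kalman filter: the node updates (and would transmit) only when the innovation is large, trading a little estimation error for much less communication. Thresholds and noise levels below are illustrative, and this sketch is far simpler than the consistency-preserving schemes of the article:

```python
import numpy as np

rng = np.random.default_rng(2)

# Scalar random-walk state tracked by one sensor node.
q, r, delta = 0.01, 0.04, 0.3   # process noise, measurement noise, send threshold
x_true, x_est, p_est = 0.0, 0.0, 1.0
sent = 0
errors = []
for _ in range(500):
    x_true += rng.normal(scale=np.sqrt(q))
    z = x_true + rng.normal(scale=np.sqrt(r))
    p_est += q                      # prediction step
    innovation = z - x_est
    # Data-driven rule: update/transmit only on a large innovation;
    # otherwise the prior prediction stands in for the measurement.
    if abs(innovation) > delta:
        sent += 1
        k = p_est / (p_est + r)
        x_est += k * innovation
        p_est *= (1 - k)
    errors.append((x_true - x_est) ** 2)

rate = sent / 500                   # fraction of steps with a transmission
rmse = np.sqrt(np.mean(errors))
```

A well-chosen threshold keeps the error bounded near the full-rate filter while cutting the transmission rate substantially.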

  6. Data-driven modelling of social forces and collective behaviour in zebrafish.

    PubMed

    Zienkiewicz, Adam K; Ladu, Fabrizio; Barton, David A W; Porfiri, Maurizio; Bernardo, Mario Di

    2018-04-14

    Zebrafish are rapidly emerging as a powerful model organism in hypothesis-driven studies targeting a number of functional and dysfunctional processes. Mathematical models of zebrafish behaviour can inform the design of experiments, through the unprecedented ability to perform pilot trials on a computer. At the same time, in-silico experiments could help refine the analysis of real data, by enabling the systematic investigation of key neurobehavioural factors. Here, we establish a data-driven model of zebrafish social interaction. Specifically, we derive a set of interaction rules to capture the primary response mechanisms which have been observed experimentally. Contrary to previous studies, we include dynamic speed regulation in addition to turning responses, which together provide attractive, repulsive and alignment interactions between individuals. The resulting multi-agent model provides a novel, bottom-up framework to describe both the spontaneous motion and individual-level interaction dynamics of zebrafish, inferred directly from experimental observations. Copyright © 2018 Elsevier Ltd. All rights reserved.

  7. Helioseismic and neutrino data-driven reconstruction of solar properties

    NASA Astrophysics Data System (ADS)

    Song, Ningqiang; Gonzalez-Garcia, M. C.; Villante, Francesco L.; Vinyoles, Nuria; Serenelli, Aldo

    2018-06-01

    In this work, we use Bayesian inference to quantitatively reconstruct the solar properties most relevant to the solar composition problem using as inputs the information provided by helioseismic and solar neutrino data. In particular, we use a Gaussian process to model the functional shape of the opacity uncertainty to gain flexibility and become as free as possible from prejudice in this regard. With these tools we first readdress the statistical significance of the solar composition problem. Furthermore, starting from a composition unbiased set of standard solar models (SSMs) we are able to statistically select those with solar chemical composition and other solar inputs which better describe the helioseismic and neutrino observations. In particular, we are able to reconstruct the solar opacity profile in a data-driven fashion, independently of any reference opacity tables, obtaining a 4 per cent uncertainty at the base of the convective envelope and 0.8 per cent at the solar core. When systematic uncertainties are included, results are 7.5 per cent and 2 per cent, respectively. In addition, we find that the values of most of the other inputs of the SSMs required to better describe the helioseismic and neutrino data are in good agreement with those adopted as the standard priors, with the exception of the astrophysical factor S11 and the microscopic diffusion rates, for which data suggests a 1 per cent and 30 per cent reduction, respectively. As an output of the study we derive the corresponding data-driven predictions for the solar neutrino fluxes.

  8. Direct match data flow machine apparatus and process for data driven computing

    DOEpatents

    Davidson, George S.; Grafe, Victor Gerald

    1997-01-01

    A data flow computer and method of computing is disclosed which utilizes a data-driven processor node architecture. The apparatus in a preferred embodiment includes a plurality of First-In-First-Out (FIFO) registers, a plurality of related data flow memories, and a processor. The processor makes the necessary calculations and includes a control unit to generate signals to enable the appropriate FIFO register receiving the result. In a particular embodiment, there are three FIFO registers per node: an input FIFO register to receive input information from an outside source and provide it to the data flow memories; an output FIFO register to provide output information from the processor to an outside recipient; and an internal FIFO register to provide information from the processor back to the data flow memories. The data flow memories are comprised of four commonly addressed memories. A parameter memory holds the A and B parameters used in the calculations; an opcode memory holds the instruction; a target memory holds the output address; and a tag memory contains status bits for each parameter. One status bit indicates whether the corresponding parameter is in the parameter memory, and another indicates whether the stored information in the corresponding data parameter is to be reused. The tag memory outputs a "fire" signal (signal R VALID) when all of the necessary information has been stored in the data flow memories, and thus when the instruction is ready to be fired to the processor.
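The firing rule described above, where an instruction fires only once all of its parameters and its opcode are present, can be sketched in a few lines. This is an abstract software analogue of the tag-memory behaviour, not the patented hardware design:

```python
class DataFlowNode:
    """Toy data-driven node: fires when opcode and both parameters arrive."""

    def __init__(self):
        self.param = {}      # analogue of the parameter memory (slots A, B)
        self.opcode = None   # analogue of the opcode memory

    def store(self, slot, value):
        """Store a parameter; return the analogue of the R VALID 'fire' signal."""
        self.param[slot] = value
        return self.ready()

    def ready(self):
        # Fire only when the opcode and both parameters are present.
        return self.opcode is not None and {"A", "B"} <= self.param.keys()

    def fire(self):
        assert self.ready()
        a, b = self.param["A"], self.param["B"]
        return {"add": a + b, "mul": a * b}[self.opcode]

node = DataFlowNode()
node.opcode = "add"
node.store("A", 3)           # not ready yet: B is missing
fired = node.store("B", 4)   # True: all operands present, instruction fires
result = node.fire()         # 3 + 4 = 7
```

Execution order is thus driven entirely by data arrival, not by a program counter.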

  9. Methods, systems and devices for detecting threatening objects and for classifying magnetic data

    DOEpatents

    Kotter, Dale K [Shelley, ID; Roybal, Lyle G [Idaho Falls, ID; Rohrbaugh, David T [Idaho Falls, ID; Spencer, David F [Idaho Falls, ID

    2012-01-24

    A method for detecting threatening objects in a security screening system. The method includes a step of classifying unique features of magnetic data as representing a threatening object. Another step includes acquiring magnetic data. Another step includes determining if the acquired magnetic data comprises a unique feature.

  10. Conditioning 3D object-based models to dense well data

    NASA Astrophysics Data System (ADS)

    Wang, Yimin C.; Pyrcz, Michael J.; Catuneanu, Octavian; Boisvert, Jeff B.

    2018-06-01

    Object-based stochastic simulation models are used to generate categorical variable models with a realistic representation of complicated reservoir heterogeneity. A limitation of object-based modeling is the difficulty of conditioning to dense data. One method to achieve data conditioning is to apply optimization techniques. Optimization algorithms can utilize an objective function measuring the conditioning level of each object while also considering the geological realism of the object. Here, an objective function is optimized with implicit filtering which considers constraints on object parameters. Thousands of objects conditioned to data are generated and stored in a database. A set of objects are selected with linear integer programming to generate the final realization and honor all well data, proportions and other desirable geological features. Although any parameterizable object can be considered, objects from fluvial reservoirs are used to illustrate the ability to simultaneously condition multiple types of geologic features. Channels, levees, crevasse splays and oxbow lakes are parameterized based on location, path, orientation and profile shapes. Functions mimicking natural river sinuosity are used for the centerline model. Channel stacking pattern constraints are also included to enhance the geological realism of object interactions. Spatial layout correlations between different types of objects are modeled. Three case studies demonstrate the flexibility of the proposed optimization-simulation method. These examples include multiple channels with high sinuosity, as well as fragmented channels affected by limited preservation. In all cases the proposed method reproduces input parameters for the object geometries and matches the dense well constraints. The proposed methodology expands the applicability of object-based simulation to complex and heterogeneous geological environments with dense sampling.

  11. Data-Driven H∞ Control for Nonlinear Distributed Parameter Systems.

    PubMed

    Luo, Biao; Huang, Tingwen; Wu, Huai-Ning; Yang, Xiong

    2015-11-01

    The data-driven H∞ control problem of nonlinear distributed parameter systems is considered in this paper. An off-policy learning method is developed to learn the H∞ control policy from real system data rather than the mathematical model. First, Karhunen-Loève decomposition is used to compute the empirical eigenfunctions, which are then employed to derive a reduced-order model (ROM) of the slow subsystem based on the singular perturbation theory. The H∞ control problem is reformulated based on the ROM, which can be transformed to solve the Hamilton-Jacobi-Isaacs (HJI) equation, theoretically. To learn the solution of the HJI equation from real system data, a data-driven off-policy learning approach is proposed based on the simultaneous policy update algorithm and its convergence is proved. For implementation purposes, a neural network (NN)-based action-critic structure is developed, where a critic NN and two action NNs are employed to approximate the value function, control, and disturbance policies, respectively. Subsequently, a least-square NN weight-tuning rule is derived with the method of weighted residuals. Finally, the developed data-driven off-policy learning approach is applied to a nonlinear diffusion-reaction process, and the obtained results demonstrate its effectiveness.

  12. Keys to success for data-driven decision making: Lessons from participatory monitoring and collaborative adaptive management

    USDA-ARS?s Scientific Manuscript database

    Recent years have witnessed a call for evidence-based decisions in conservation and natural resource management, including data-driven decision-making. Adaptive management (AM) is one prevalent model for integrating scientific data into decision-making, yet AM has faced numerous challenges and limit...

  13. Data-driven models of dominantly-inherited Alzheimer's disease progression.

    PubMed

    Oxtoby, Neil P; Young, Alexandra L; Cash, David M; Benzinger, Tammie L S; Fagan, Anne M; Morris, John C; Bateman, Randall J; Fox, Nick C; Schott, Jonathan M; Alexander, Daniel C

    2018-05-01

    See Li and Donohue (doi:10.1093/brain/awy089) for a scientific commentary on this article.Dominantly-inherited Alzheimer's disease is widely hoped to hold the key to developing interventions for sporadic late onset Alzheimer's disease. We use emerging techniques in generative data-driven disease progression modelling to characterize dominantly-inherited Alzheimer's disease progression with unprecedented resolution, and without relying upon familial estimates of years until symptom onset. We retrospectively analysed biomarker data from the sixth data freeze of the Dominantly Inherited Alzheimer Network observational study, including measures of amyloid proteins and neurofibrillary tangles in the brain, regional brain volumes and cortical thicknesses, brain glucose hypometabolism, and cognitive performance from the Mini-Mental State Examination (all adjusted for age, years of education, sex, and head size, as appropriate). Data included 338 participants with known mutation status (211 mutation carriers in three subtypes: 163 PSEN1, 17 PSEN2, and 31 APP) and a baseline visit (age 19-66; up to four visits each, 1.1 ± 1.9 years in duration; spanning 30 years before, to 21 years after, parental age of symptom onset). We used an event-based model to estimate sequences of biomarker changes from baseline data across disease subtypes (mutation groups), and a differential equation model to estimate biomarker trajectories from longitudinal data (up to 66 mutation carriers, all subtypes combined). The two models concur that biomarker abnormality proceeds as follows: amyloid deposition in cortical then subcortical regions (∼24 ± 11 years before onset); phosphorylated tau (17 ± 8 years), tau and amyloid-β changes in cerebrospinal fluid; neurodegeneration first in the putamen and nucleus accumbens (up to 6 ± 2 years); then cognitive decline (7 ± 6 years), cerebral hypometabolism (4 ± 4 years), and further regional neurodegeneration. Our models predicted symptom onset more

  14. A metadata schema for data objects in clinical research.

    PubMed

    Canham, Steve; Ohmann, Christian

    2016-11-24

    A large number of stakeholders have accepted the need for greater transparency in clinical research and, in the context of various initiatives and systems, have developed a diverse and expanding number of repositories for storing the data and documents created by clinical studies (collectively known as data objects). To make the best use of such resources, we assert that it is also necessary for stakeholders to agree and deploy a simple, consistent metadata scheme. The relevant data objects and their likely storage are described, and the requirements for metadata to support data sharing in clinical research are identified. Issues concerning persistent identifiers, for both studies and data objects, are explored. A scheme is proposed that is based on the DataCite standard, with extensions to cover the needs of clinical researchers, specifically to provide (a) study identification data, including links to clinical trial registries; (b) data object characteristics and identifiers; and (c) data covering location, ownership and access to the data object. The components of the metadata scheme are described. The metadata schema is proposed as a natural extension of a widely agreed standard to fill a gap not tackled by other standards related to clinical research (e.g., Clinical Data Interchange Standards Consortium, Biomedical Research Integrated Domain Group). The proposal could be integrated with, but is not dependent on, other moves to better structure data in clinical research.
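The three metadata groups (a)-(c) described above can be illustrated as a nested record. The field names and values below are illustrative stand-ins, not the schema actually proposed in the article:

```python
# Hypothetical sketch of the three metadata groups for a clinical data object:
# (a) study identification, (b) object characteristics, (c) location/access.
data_object_metadata = {
    "study": {                                     # (a) study identification
        "title": "Example hypertension trial",
        "registry_id": "NCT00000000",              # link to a trial registry
    },
    "object": {                                    # (b) data object characteristics
        "doi": "10.0000/example.doi",              # persistent identifier
        "type": "individual participant dataset",
    },
    "access": {                                    # (c) location, ownership, access
        "repository_url": "https://repository.example.org/record/1",
        "owner": "Example University",
        "access_type": "restricted, on request",
    },
}
```

In practice such a record would be serialized (e.g. as XML or JSON) alongside the DataCite core fields it extends.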

  15. Data-driven sensor placement from coherent fluid structures

    NASA Astrophysics Data System (ADS)

    Manohar, Krithika; Kaiser, Eurika; Brunton, Bingni W.; Kutz, J. Nathan; Brunton, Steven L.

    2017-11-01

    Optimal sensor placement is a central challenge in the prediction, estimation and control of fluid flows. We reinterpret sensor placement as optimizing discrete samples of coherent fluid structures for full state reconstruction. This permits a drastic reduction in the number of sensors required for faithful reconstruction, since complex fluid interactions can often be described by a small number of coherent structures. Our work optimizes point sensors using the pivoted matrix QR factorization to sample coherent structures directly computed from flow data. We apply this sampling technique in conjunction with various data-driven modal identification methods, including the proper orthogonal decomposition (POD) and dynamic mode decomposition (DMD). In contrast to POD-based sensors, DMD demonstrably enables the optimization of sensors for prediction in systems exhibiting multiple scales of dynamics. Finally, reconstruction accuracy from pivot sensors is shown to be competitive with sensors obtained using traditional computationally prohibitive optimization methods.
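The pivoted-QR sampling step can be sketched on synthetic snapshot data: POD modes are computed by SVD of the snapshot matrix, and QR with column pivoting on the transposed mode matrix selects the sensor rows. The two-structure flow below is a toy stand-in for real flow data:

```python
import numpy as np
from scipy.linalg import qr, lstsq

# Synthetic flow snapshots built from two spatial "coherent structures".
x = np.linspace(0, 2 * np.pi, 200)
t = np.linspace(0, 10, 80)
snapshots = np.array(
    [np.sin(x) * np.cos(2 * ti) + 0.5 * np.sin(3 * x) * np.sin(ti) for ti in t]
).T                                               # shape (n_points, n_snapshots)

# POD modes via SVD of the snapshot matrix.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
modes = U[:, :2]                                  # r = 2 dominant structures

# Pivoted QR on the mode rows selects r near-optimal point-sensor locations.
_, _, piv = qr(modes.T, pivoting=True)
sensors = piv[:2]

# Reconstruct a full snapshot from just its two sensor samples.
snap = snapshots[:, 5]
coeffs, *_ = lstsq(modes[sensors, :], snap[sensors])
recon = modes @ coeffs
err = np.linalg.norm(recon - snap) / np.linalg.norm(snap)
```

Because this toy flow lies exactly in the span of two structures, two pivot sensors reconstruct the full field almost perfectly; real flows need `r` large enough to capture the dominant energy.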

  16. Data-Driven Planning: Using Assessment in Strategic Planning

    ERIC Educational Resources Information Center

    Bresciani, Marilee J.

    2010-01-01

    Data-driven planning or evidence-based decision making represents nothing new in its concept. For years, business leaders have claimed they have implemented planning informed by data that have been strategically and systematically gathered. Within higher education and student affairs, there may be less evidence of the actual practice of…

  17. Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach.

    PubMed

    Motakis, E S; Nason, G P; Fryzlewicz, P; Rutter, G A

    2006-10-15

    Many standard statistical techniques are effective on data that are normally distributed with constant variance. Microarray data typically violate these assumptions since they come from non-Gaussian distributions with a non-trivial mean-variance relationship. Several methods have been proposed that transform microarray data to stabilize variance and draw its distribution towards the Gaussian. Some methods, such as log or generalized log, rely on an underlying model for the data. Others, such as the spread-versus-level plot, do not. We propose an alternative data-driven multiscale approach, called the Data-Driven Haar-Fisz for microarrays (DDHFm) with replicates. DDHFm has the advantage of being 'distribution-free' in the sense that no parametric model for the underlying microarray data is required to be specified or estimated; hence, DDHFm can be applied very generally, not just to microarray data. DDHFm achieves very good variance stabilization of microarray data with replicates and produces transformed intensities that are approximately normally distributed. Simulation studies show that it performs better than other existing methods. Application of DDHFm to real one-color cDNA data validates these results. The R package of the Data-Driven Haar-Fisz transform (DDHFm) for microarrays is available in Bioconductor and CRAN.
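A bare-bones Haar-Fisz transform can be sketched as follows. Where DDHFm estimates the mean-variance relationship from replicates, this sketch plugs in a known Poisson-like relation sd(μ) = √μ, so it illustrates the Haar-Fisz mechanics rather than the data-driven estimation itself:

```python
import numpy as np

def haar_fisz(v, sd_of_mean=np.sqrt):
    """Haar-Fisz transform for a vector whose length is a power of 2.
    DDHFm would estimate sd_of_mean from replicate data; here a known
    Poisson-like relation sd(mu) = sqrt(mu) stands in for that estimate."""
    s = v.astype(float)
    details = []
    while len(s) > 1:
        sm = (s[0::2] + s[1::2]) / 2.0            # Haar smooth coefficients
        d = (s[0::2] - s[1::2]) / 2.0             # Haar detail coefficients
        scale = sd_of_mean(np.maximum(sm, 1e-12))
        details.append(d / scale)                 # Fisz step: stabilise variance
        s = sm
    # Invert the Haar transform using the rescaled details.
    for d in reversed(details):
        out = np.empty(2 * len(s))
        out[0::2] = s + d
        out[1::2] = s - d
        s = out
    return s

rng = np.random.default_rng(3)
mu = np.repeat([5.0, 50.0], 512)                  # two intensity levels
data = rng.poisson(mu).astype(float)              # variance grows with mean
stabilised = haar_fisz(data)
```

After the transform, the low- and high-intensity segments have roughly equal variance, whereas the raw counts differ by about an order of magnitude.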

  18. Representing object oriented specifications and designs with extended data flow notations

    NASA Technical Reports Server (NTRS)

    Buser, Jon Franklin; Ward, Paul T.

    1988-01-01

    The issue of using extended data flow notations to document object oriented designs and specifications is discussed. Extended data flow notations, for the purposes here, refer to notations that are based on the rules of Yourdon/DeMarco data flow analysis. The extensions include additional notation for representing real-time systems as well as some proposed extensions specific to object oriented development. Some advantages of data flow notations are stated. How data flow diagrams are used to represent software objects are investigated. Some problem areas with regard to using data flow notations for object oriented development are noted. Some initial solutions to these problems are proposed.

  19. External radioactive markers for PET data-driven respiratory gating in positron emission tomography.

    PubMed

    Büther, Florian; Ernst, Iris; Hamill, James; Eich, Hans T; Schober, Otmar; Schäfers, Michael; Schäfers, Klaus P

    2013-04-01

    Respiratory gating is an established approach to overcoming respiration-induced image artefacts in PET. Of special interest in this respect are raw PET data-driven gating methods which do not require additional hardware to acquire respiratory signals during the scan. However, these methods rely heavily on the quality of the acquired PET data (statistical properties, data contrast, etc.). We therefore combined external radioactive markers with data-driven respiratory gating in PET/CT. The feasibility and accuracy of this approach were studied for [(18)F]FDG PET/CT imaging in patients with malignant liver and lung lesions. PET data from 30 patients with abdominal or thoracic [(18)F]FDG-positive lesions (primary tumours or metastases) were included in this prospective study. The patients underwent a 10-min list-mode PET scan with a single bed position following a standard clinical whole-body [(18)F]FDG PET/CT scan. During this scan, one to three radioactive point sources (either (22)Na or (18)F, 50-100 kBq) in a dedicated holder were attached to the patient's abdomen. The list mode data acquired were retrospectively analysed for respiratory signals using established data-driven gating approaches and additionally by tracking the motion of the point sources in sinogram space. Gated reconstructions were examined qualitatively, in terms of the amount of respiratory displacement, and with respect to changes in local image intensity in the gated images. The presence of the external markers did not affect whole-body PET/CT image quality. Tracking of the markers led to characteristic respiratory curves in all patients. Applying these curves for gated reconstructions resulted in images in which motion was well resolved. Quantitatively, the performance of the external marker-based approach was similar to that of the best intrinsic data-driven methods. 
Overall, there was a gain in measured tumour uptake from the nongated to the gated images, indicating successful removal of respiratory motion effects.

  20. Enabling Data-Driven Methodologies Across the Data Lifecycle and Ecosystem

    NASA Astrophysics Data System (ADS)

    Doyle, R. J.; Crichton, D.

    2017-12-01

    NASA has unlocked unprecedented scientific knowledge through exploration of the Earth, our solar system, and the larger universe. NASA is generating enormous amounts of data that are challenging traditional approaches to capturing, managing, analyzing and ultimately gaining scientific understanding from science data. New architectures, capabilities and methodologies are needed to span the entire observing system, from spacecraft to archive, while integrating data-driven discovery and analytic capabilities. NASA data have a definable lifecycle, from remote collection point to validated accessibility in multiple archives. Data challenges must be addressed across this lifecycle, to capture opportunities and avoid decisions that may limit or compromise what is achievable once data arrives at the archive. Data triage may be necessary when the collection capacity of the sensor or instrument overwhelms data transport or storage capacity. By migrating computational and analytic capability to the point of data collection, informed decisions can be made about which data to keep; in some cases, to close observational decision loops onboard, to enable attending to unexpected or transient phenomena. Along a different dimension than the data lifecycle, scientists and other end-users must work across an increasingly complex data ecosystem, where the range of relevant data is rarely owned by a single institution. To operate effectively, scalable data architectures and community-owned information models become essential. NASA's Planetary Data System is having success with this approach. Finally, there is the difficult challenge of reproducibility and trust. While data provenance techniques will be part of the solution, future interactive analytics environments must support an ability to provide a basis for a result: relevant data source and algorithms, uncertainty tracking, etc., to assure scientific integrity and to enable confident decision making. 
Advances in data science offer opportunities to address these challenges across the data lifecycle and ecosystem.

  1. Direct match data flow machine apparatus and process for data driven computing

    DOEpatents

    Davidson, G.S.; Grafe, V.G.

    1997-08-12

    A data flow computer and method of computing are disclosed which utilize a data driven processor node architecture. The apparatus in a preferred embodiment includes a plurality of First-In-First-Out (FIFO) registers, a plurality of related data flow memories, and a processor. The processor makes the necessary calculations and includes a control unit to generate signals to enable the appropriate FIFO register receiving the result. In a particular embodiment, there are three FIFO registers per node: an input FIFO register to receive input information from an outside source and provide it to the data flow memories; an output FIFO register to provide output information from the processor to an outside recipient; and an internal FIFO register to provide information from the processor back to the data flow memories. The data flow memories are comprised of four commonly addressed memories. A parameter memory holds the A and B parameters used in the calculations; an opcode memory holds the instruction; a target memory holds the output address; and a tag memory contains status bits for each parameter. One status bit indicates whether the corresponding parameter is in the parameter memory, and one indicates whether the stored information in the corresponding data parameter is to be reused. The tag memory outputs a "fire" signal (signal R VALID) when all of the necessary information has been stored in the data flow memories, and thus when the instruction is ready to be fired to the processor. 11 figs.

  2. Input variable selection for data-driven models of Coriolis flowmeters for two-phase flow measurement

    NASA Astrophysics Data System (ADS)

    Wang, Lijuan; Yan, Yong; Wang, Xue; Wang, Tao

    2017-03-01

    Input variable selection is an essential step in the development of data-driven models for environmental, biological and industrial applications. By eliminating irrelevant or redundant variables, input variable selection identifies a suitable subset of variables as the input of a model; it also simplifies the model structure and improves computational efficiency. This paper describes the procedures of input variable selection for data-driven models that measure liquid mass flowrate and gas volume fraction under two-phase flow conditions using Coriolis flowmeters. Three advanced input variable selection methods, partial mutual information (PMI), genetic algorithm-artificial neural network (GA-ANN) and tree-based iterative input selection (IIS), are applied in this study. Typical data-driven models incorporating support vector machines (SVM) are established individually based on the input candidates resulting from the selection methods. The validity of the selection outcomes is assessed through an output performance comparison of the SVM-based data-driven models and sensitivity analysis. The validation and analysis results suggest that the input variables selected by the PMI algorithm provide more effective information for the models to measure liquid mass flowrate, while the IIS algorithm provides fewer but more effective variables for the models to predict gas volume fraction.
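The mutual-information criterion underlying methods such as PMI can be illustrated with a simple histogram estimator that ranks candidate inputs by how much information they carry about the target. This is a generic numpy sketch, not the paper's PMI algorithm (which additionally removes already-selected information via partial residuals); the variable names and toy data are illustrative.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of mutual information between two
    continuous variables (nats); a stand-in for an MI criterion."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
target = rng.normal(size=5000)
candidates = {
    "relevant": target + 0.1 * rng.normal(size=5000),  # strongly informative
    "irrelevant": rng.normal(size=5000),               # pure noise
}
# Rank candidate inputs by their MI with the target quantity.
ranked = sorted(candidates,
                key=lambda k: mutual_information(candidates[k], target),
                reverse=True)
```

The informative candidate should rank first, which is the basic behaviour a selection criterion must deliver before model fitting.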

  3. Data-Driven Diffusion Of Innovations: Successes And Challenges In 3 Large-Scale Innovative Delivery Models

    PubMed Central

    Dorr, David A.; Cohen, Deborah J.; Adler-Milstein, Julia

    2018-01-01

    Failed diffusion of innovations may be linked to an inability to use and apply data, information, and knowledge to change perceptions of current practice and motivate change. Using qualitative and quantitative data from three large-scale health care delivery innovations—accountable care organizations, advanced primary care practice, and EvidenceNOW—we assessed where data-driven innovation is occurring and where challenges lie. We found that implementation of some technological components of innovation (for example, electronic health records) has occurred among health care organizations, but core functions needed to use data to drive innovation are lacking. Deficits include the inability to extract and aggregate data from the records; gaps in sharing data; and challenges in adopting advanced data functions, particularly those related to timely reporting of performance data. The unexpectedly high costs and burden incurred during implementation of the innovations have limited organizations’ ability to address these and other deficits. Solutions that could help speed progress in data-driven innovation include facilitating peer-to-peer technical assistance, providing tailored feedback reports to providers from data aggregators, and using practice facilitators skilled in using data technology for quality improvement to help practices transform. Policy efforts that promote these solutions may enable more rapid uptake of and successful participation in innovative delivery system reforms. PMID:29401031

  4. Data-Driven Diffusion Of Innovations: Successes And Challenges In 3 Large-Scale Innovative Delivery Models.

    PubMed

    Dorr, David A; Cohen, Deborah J; Adler-Milstein, Julia

    2018-02-01

    Failed diffusion of innovations may be linked to an inability to use and apply data, information, and knowledge to change perceptions of current practice and motivate change. Using qualitative and quantitative data from three large-scale health care delivery innovations-accountable care organizations, advanced primary care practice, and EvidenceNOW-we assessed where data-driven innovation is occurring and where challenges lie. We found that implementation of some technological components of innovation (for example, electronic health records) has occurred among health care organizations, but core functions needed to use data to drive innovation are lacking. Deficits include the inability to extract and aggregate data from the records; gaps in sharing data; and challenges in adopting advanced data functions, particularly those related to timely reporting of performance data. The unexpectedly high costs and burden incurred during implementation of the innovations have limited organizations' ability to address these and other deficits. Solutions that could help speed progress in data-driven innovation include facilitating peer-to-peer technical assistance, providing tailored feedback reports to providers from data aggregators, and using practice facilitators skilled in using data technology for quality improvement to help practices transform. Policy efforts that promote these solutions may enable more rapid uptake of and successful participation in innovative delivery system reforms.

  5. Examining Data-Driven Decision Making in Private/Religious Schools

    ERIC Educational Resources Information Center

    Hanks, Jason Edward

    2011-01-01

    The purpose of this study was to investigate non-mandated data-driven decision making in private/religious schools. The school culture support of data use, teacher use of data, leader facilitation of using data, and the availability of data were investigated in three schools. A quantitative survey research design was used to explore the research…

  6. Data-driven train set crash dynamics simulation

    NASA Astrophysics Data System (ADS)

    Tang, Zhao; Zhu, Yunrui; Nie, Yinyu; Guo, Shihui; Liu, Fengjia; Chang, Jian; Zhang, Jianjun

    2017-02-01

    Traditional finite element (FE) methods are computationally expensive for simulating train crashes. This high computational cost limits their direct application to investigating the dynamic behaviour of an entire train set for crashworthiness design and structural optimisation. Multi-body modelling, by contrast, is widely used because of its low computational cost, with a trade-off in accuracy. In this study, a data-driven train crash modelling method is proposed to improve the performance of multi-body dynamics simulation of a train set crash without increasing the computational burden. This is achieved with a parallel random forest algorithm, a machine learning approach that extracts useful patterns from force-displacement curves and predicts the force-displacement relation for a given collision condition from a collection of offline FE simulation data covering various collision conditions, namely different crash velocities in our analysis. Using the FE simulation results as a benchmark, we compared our method with traditional multi-body modelling methods; the results show that our data-driven method improves accuracy over traditional multi-body models in train crash simulation while running at the same level of efficiency.
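The surrogate-modelling step can be sketched as follows: train a random forest on (velocity, displacement) → force samples from offline simulations, then query it at an unseen collision condition. This assumes scikit-learn is available; the synthetic curves below are illustrative stand-ins for FE output, not the paper's data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for offline FE results: force-displacement
# curves at several crash velocities (illustrative numbers only).
rng = np.random.default_rng(1)
rows, forces = [], []
for v in [10.0, 15.0, 20.0, 25.0]:      # crash velocities, m/s
    d = np.linspace(0.0, 0.5, 100)      # displacement, m
    f = (1.0 + 0.05 * v) * 800 * d + 5 * rng.normal(size=d.size)  # force, kN
    rows.append(np.column_stack([np.full_like(d, v), d]))
    forces.append(f)
X = np.vstack(rows)
y = np.concatenate(forces)

# Learn the force-displacement relation across collision conditions.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Query at an unseen velocity, as a multi-body simulation would at runtime.
pred = model.predict([[18.0, 0.25]])[0]
```

At query time the trained forest replaces an expensive FE run with a fast lookup, which is the efficiency gain the abstract describes.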

  7. Object-oriented approach to fast display of electrophysiological data under MS-windows.

    PubMed

    Marion-Poll, F

    1995-12-01

    Microcomputers provide neuroscientists an alternative to a host of laboratory equipment to record and analyze electrophysiological data. Object-oriented programming tools bring an essential link between custom needs for data acquisition and analysis with general software packages. In this paper, we outline the layout of basic objects that display and manipulate electrophysiological data files. Visual inspection of the recordings is a basic requirement of any data analysis software. We present an approach that allows flexible and fast display of large data sets. This approach involves constructing an intermediate representation of the data in order to lower the number of actual points displayed while preserving the aspect of the data. The second group of objects is related to the management of lists of data files. Typical experiments designed to test the biological activity of pharmacological products include scores of files. Data manipulation and analysis are facilitated by creating multi-document objects that include the names of all experiment files. Implementation steps of both objects are described for an MS-Windows hosted application.
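The "intermediate representation" for fast display described above is commonly realized as min/max decimation: each horizontal-pixel bucket of the recording is reduced to its minimum and maximum, preserving the visual envelope of the trace. This is a generic sketch of that technique, not the paper's implementation; the function name is hypothetical.

```python
import numpy as np

def minmax_decimate(trace, width):
    """Reduce a long recording to 2*width points for display by
    keeping the min and max of each bucket, so spikes and the
    overall envelope survive the downsampling."""
    bins = np.array_split(np.asarray(trace, dtype=float), width)
    out = np.empty(2 * width)
    out[0::2] = [b.min() for b in bins]
    out[1::2] = [b.max() for b in bins]
    return out
```

For a screen 500 pixels wide, a 100,000-sample trace collapses to 1,000 plotted points while extrema are retained exactly.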

  8. Enhancing Extensive Reading with Data-Driven Learning

    ERIC Educational Resources Information Center

    Hadley, Gregory; Charles, Maggie

    2017-01-01

    This paper investigates using data-driven learning (DDL) as a means of stimulating greater lexicogrammatical knowledge and reading speed among lower proficiency learners in an extensive reading program. For 16 weekly 90-minute sessions, an experimental group (12 students) used DDL materials created from a corpus developed from the Oxford Bookworms…

  9. Data-driven approaches in the investigation of social perception

    PubMed Central

    Adolphs, Ralph; Nummenmaa, Lauri; Todorov, Alexander; Haxby, James V.

    2016-01-01

    The complexity of social perception poses a challenge to traditional approaches to understand its psychological and neurobiological underpinnings. Data-driven methods are particularly well suited to tackling the often high-dimensional nature of stimulus spaces and of neural representations that characterize social perception. Such methods are more exploratory, capitalize on rich and large datasets, and attempt to discover patterns often without strict hypothesis testing. We present four case studies here: behavioural studies on face judgements, two neuroimaging studies of movies, and eyetracking studies in autism. We conclude with suggestions for particular topics that seem ripe for data-driven approaches, as well as caveats and limitations. PMID:27069045

  10. Parallel object-oriented data mining system

    DOEpatents

    Kamath, Chandrika; Cantu-Paz, Erick

    2004-01-06

    A data mining system uncovers patterns, associations, anomalies and other statistically significant structures in data. Data files are read and displayed. Objects in the data files are identified. Relevant features for the objects are extracted. Patterns among the objects are recognized based upon the features. Data from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) sky survey was used to search for bent doubles. This test was conducted on data from the Very Large Array in New Mexico which seeks to locate a special type of quasar (radio-emitting stellar object) called bent doubles. The FIRST survey has generated more than 32,000 images of the sky to date. Each image is 7.1 megabytes, yielding more than 100 gigabytes of image data in the entire data set.

  11. Data-driven region-of-interest selection without inflating Type I error rate.

    PubMed

    Brooks, Joseph L; Zoumpoulaki, Alexia; Bowman, Howard

    2017-01-01

    In ERP and other large multidimensional neuroscience data sets, researchers often select regions of interest (ROIs) for analysis. The method of ROI selection can critically affect the conclusions of a study by causing the researcher to miss effects in the data or to detect spurious effects. In practice, to avoid inflating Type I error rate (i.e., false positives), ROIs are often based on a priori hypotheses or independent information. However, this can be insensitive to experiment-specific variations in effect location (e.g., latency shifts) reducing power to detect effects. Data-driven ROI selection, in contrast, is nonindependent and uses the data under analysis to determine ROI positions. Therefore, it has potential to select ROIs based on experiment-specific information and increase power for detecting effects. However, data-driven methods have been criticized because they can substantially inflate Type I error rate. Here, we demonstrate, using simulations of simple ERP experiments, that data-driven ROI selection can indeed be more powerful than a priori hypotheses or independent information. Furthermore, we show that data-driven ROI selection using the aggregate grand average from trials (AGAT), despite being based on the data at hand, can be safely used for ROI selection under many circumstances. However, when there is a noise difference between conditions, using the AGAT can inflate Type I error and should be avoided. We identify critical assumptions for use of the AGAT and provide a basis for researchers to use, and reviewers to assess, data-driven methods of ROI localization in ERP and other studies. © 2016 Society for Psychophysiological Research.
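The AGAT idea can be sketched on simulated ERP data: pool all trials across conditions, pick the ROI around the pooled peak (so the selection is blind to the condition contrast being tested), then compare conditions only within that window. The simulation parameters below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(200)                                   # sample index (e.g., ms)
effect = np.exp(-0.5 * ((t - 120) / 10.0) ** 2)      # component peaking at 120

# Simulated single-trial ERPs; condition B carries a larger amplitude.
trials_a = 1.0 * effect + rng.normal(0, 1, size=(60, 200))
trials_b = 1.5 * effect + rng.normal(0, 1, size=(60, 200))

# AGAT: grand average over ALL trials, pooling conditions, so the
# ROI choice does not favour either condition.
agat = np.concatenate([trials_a, trials_b]).mean(axis=0)
peak = int(np.argmax(agat))
roi = slice(max(peak - 10, 0), peak + 10)            # window around pooled peak

# Condition comparison restricted to the AGAT-selected ROI.
diff = trials_b[:, roi].mean() - trials_a[:, roi].mean()
```

Because both conditions contribute equally to the pooled average, the selected window tracks experiment-specific latency without biasing the subsequent contrast, which is the safety property the abstract argues for (absent a noise difference between conditions).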

  12. Audiologist-driven versus patient-driven fine tuning of hearing instruments.

    PubMed

    Boymans, Monique; Dreschler, Wouter A

    2012-03-01

    Two methods of fine tuning the initial settings of hearing aids were compared: an audiologist-driven approach using real-ear measurements, and a patient-driven approach using feedback from real-life situations. The patient-driven fine tuning was conducted with the Amplifit(®) II system using audiovisual clips. The audiologist-driven fine tuning was based on the NAL-NL1 prescription rule. Both settings were compared using the same hearing aids in two 6-week trial periods following a randomized, blinded cross-over design. After each trial period, the settings were evaluated by insertion-gain measurements. Performance was evaluated by speech tests in quiet, in noise, and with time-reversed speech, presented at 0° and with spatially separated sound sources. Subjective results were evaluated using extensive questionnaires and audiovisual video clips. A total of 73 participants were included. On average, higher gain values were found for the audiologist-driven settings than for the patient-driven settings, especially at 1000 and 2000 Hz. Better objective performance was obtained with the audiologist-driven settings for speech perception in quiet and with time-reversed speech. This was supported by better scores on a number of subjective judgments and in the subjective ratings of video clips. The perception of loud sounds was rated higher with the audiologist-driven settings than with the patient-driven settings, and the overall preference was in favor of the audiologist-driven settings for 67% of the participants.

  13. Data-Driven Information Extraction from Chinese Electronic Medical Records

    PubMed Central

    Zhao, Tianwan; Ge, Chen; Gao, Weiguo; Wei, Jia; Zhu, Kenny Q.

    2015-01-01

    Objective This study aims to propose a data-driven framework that takes unstructured free text narratives in Chinese Electronic Medical Records (EMRs) as input and converts them into structured time-event-description triples, where the description is either an elaboration or an outcome of the medical event. Materials and Methods Our framework uses a hybrid approach. It consists of constructing cross-domain core medical lexica, an unsupervised, iterative algorithm to accrue more accurate terms into the lexica, rules to address Chinese writing conventions and temporal descriptors, and a Support Vector Machine (SVM) algorithm that innovatively utilizes Normalized Google Distance (NGD) to estimate the correlation between medical events and their descriptions. Results The effectiveness of the framework was demonstrated with a dataset of 24,817 de-identified Chinese EMRs. The cross-domain medical lexica were capable of recognizing terms with an F1-score of 0.896. 98.5% of recorded medical events were linked to temporal descriptors. The NGD SVM description-event matching achieved an F1-score of 0.874. The end-to-end time-event-description extraction of our framework achieved an F1-score of 0.846. Discussion In terms of named entity recognition, the proposed framework outperforms state-of-the-art supervised learning algorithms (F1-score: 0.896 vs. 0.886). In event-description association, the NGD SVM is superior to SVM using only local context and semantic features (F1-score: 0.874 vs. 0.838). Conclusions The framework is data-driven, weakly supervised, and robust against the variations and noises that tend to occur in a large corpus. It addresses Chinese medical writing conventions and variations in writing styles through patterns used for discovering new terms and rules for updating the lexica. PMID:26295801
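The Normalized Google Distance used for event-description matching has a closed form over hit counts: NGD(x, y) = (max(log f(x), log f(y)) − log f(x, y)) / (log N − min(log f(x), log f(y))). A minimal sketch with illustrative toy counts (not the paper's corpus statistics):

```python
import math

def ngd(fx, fy, fxy, N):
    """Normalized Google Distance from counts: fx and fy are the hit
    counts of the two terms, fxy their co-occurrence count, N the
    (approximate) corpus size. Smaller means more related."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(N) - min(lx, ly))

# A medical event and its description co-occur often -> small distance;
# unrelated terms rarely co-occur -> large distance (toy numbers).
related = ngd(fx=9000, fy=8000, fxy=7000, N=10**8)
unrelated = ngd(fx=9000, fy=8000, fxy=5, N=10**8)
```

In the framework above, such distances feed the SVM as correlation features between candidate event-description pairs.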

  14. Safety analysis of proposed data-driven physiologic alarm parameters for hospitalized children.

    PubMed

    Goel, Veena V; Poole, Sarah F; Longhurst, Christopher A; Platchek, Terry S; Pageler, Natalie M; Sharek, Paul J; Palma, Jonathan P

    2016-12-01

    Modification of alarm limits is one approach to mitigating alarm fatigue. We aimed to create and validate heart rate (HR) and respiratory rate (RR) percentiles for hospitalized children, and analyze the safety of replacing current vital sign reference ranges with proposed data-driven, age-stratified 5th and 95th percentile values. In this retrospective cross-sectional study, nurse-charted HR and RR data from a training set of 7202 hospitalized children were used to develop percentile tables. We compared 5th and 95th percentile values with currently accepted reference ranges in a validation set of 2287 patients. We analyzed 148 rapid response team (RRT) and cardiorespiratory arrest (CRA) events over a 12-month period, using HR and RR values in the 12 hours prior to the event, to determine the proportion of patients with out-of-range vitals based upon reference versus data-driven limits. There were 24,045 (55.6%) fewer out-of-range measurements using data-driven vital sign limits. Overall, 144/148 RRT and CRA patients had out-of-range HR or RR values preceding the event using current limits, and 138/148 were abnormal using data-driven limits. Chart review of RRT and CRA patients with abnormal HR and RR per current limits considered normal by data-driven limits revealed that clinical status change was identified by other vital sign abnormalities or clinical context. A large proportion of vital signs in hospitalized children are outside presently used norms. Safety evaluation of data-driven limits suggests they are as safe as those currently used. Implementation of these parameters in physiologic monitors may mitigate alarm fatigue. Journal of Hospital Medicine 2015;11:817-823. © 2015 Society of Hospital Medicine.
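The data-driven limit construction reduces to computing age-stratified 5th and 95th percentiles over charted vitals. A minimal numpy sketch with simulated heart-rate distributions (illustrative parameters, not the study's data):

```python
import numpy as np

rng = np.random.default_rng(3)
# Simulated nurse-charted heart rates (bpm) for two age strata.
hr = {
    "infant": rng.normal(130, 15, size=5000),
    "school-age": rng.normal(90, 12, size=5000),
}

# Data-driven alarm limits: age-stratified 5th and 95th percentiles,
# candidates to replace fixed textbook reference ranges.
limits = {age: (np.percentile(v, 5), np.percentile(v, 95))
          for age, v in hr.items()}
lo, hi = limits["infant"]
```

By construction roughly 10% of charted values fall outside each stratum's limits, versus the 55.6% excess flagged by the fixed reference ranges in the study, which is the alarm-fatigue argument in quantitative form.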

  15. CEBS object model for systems biology data, SysBio-OM.

    PubMed

    Xirasagar, Sandhya; Gustafson, Scott; Merrick, B Alex; Tomer, Kenneth B; Stasiewicz, Stanley; Chan, Denny D; Yost, Kenneth J; Yates, John R; Sumner, Susan; Xiao, Nianqing; Waters, Michael D

    2004-09-01

    To promote a systems biology approach to understanding the biological effects of environmental stressors, the Chemical Effects in Biological Systems (CEBS) knowledge base is being developed to house data from multiple complex data streams in a systems friendly manner that will accommodate extensive querying from users. Unified data representation via a single object model will greatly aid in integrating data storage and management, and facilitate reuse of software to analyze and display data resulting from diverse differential expression or differential profile technologies. Data streams include, but are not limited to, gene expression analysis (transcriptomics), protein expression and protein-protein interaction analysis (proteomics) and changes in low molecular weight metabolite levels (metabolomics). To enable the integration of microarray gene expression, proteomics and metabolomics data in the CEBS system, we designed an object model, Systems Biology Object Model (SysBio-OM). The model is comprehensive and leverages other open source efforts, namely the MicroArray Gene Expression Object Model (MAGE-OM) and the Proteomics Experiment Data Repository (PEDRo) object model. SysBio-OM is designed by extending MAGE-OM to represent protein expression data elements (including those from PEDRo), protein-protein interaction and metabolomics data. SysBio-OM promotes the standardization of data representation and data quality by facilitating the capture of the minimum annotation required for an experiment. Such standardization refines the accuracy of data mining and interpretation. The open source SysBio-OM model, which can be implemented on varied computing platforms, is presented here. A Unified Modeling Language (UML) depiction of the entire SysBio-OM is available at http://cebs.niehs.nih.gov/SysBioOM/. The Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http

  16. Big Data: An Opportunity for Collaboration with Computer Scientists on Data-Driven Science

    NASA Astrophysics Data System (ADS)

    Baru, C.

    2014-12-01

    Big data technologies are evolving rapidly, driven by the need to manage ever increasing amounts of historical data; process relentless streams of human and machine-generated data; and integrate data of heterogeneous structure from extremely heterogeneous sources of information. Big data is inherently an application-driven problem. Developing the right technologies requires an understanding of the applications domain. An intriguing aspect of this phenomenon, though, is that the availability of the data itself enables new applications not previously conceived of. In this talk, we will discuss how the big data phenomenon creates an imperative for collaboration among domain scientists (in this case, geoscientists) and computer scientists. Domain scientists provide the application requirements as well as insights about the data involved, while computer scientists help assess whether problems can be solved with currently available technologies or require adaptation of existing technologies and/or development of new technologies. The synergy can create vibrant collaborations potentially leading to new science insights as well as development of new data technologies and systems. The area of interface between geosciences and computer science, also referred to as geoinformatics, is, we believe, a fertile area for interdisciplinary research.

  17. Data Based Instruction in Reading

    ERIC Educational Resources Information Center

    Ediger, Marlow

    2010-01-01

    Data based instruction has received much attention in educational literature. It relates well to measurement driven teaching and learning. Data may come from several sources including mandated tests, district wide testing, formative and summative evaluations, as well as teacher written tests. Objective information is intended for use in data based…

  18. Metadata-Driven SOA-Based Application for Facilitation of Real-Time Data Warehousing

    NASA Astrophysics Data System (ADS)

    Pintar, Damir; Vranić, Mihaela; Skočir, Zoran

    Service-oriented architecture (SOA) has already been widely recognized as an effective paradigm for achieving integration of diverse information systems. SOA-based applications can cross boundaries of platforms, operating systems and proprietary data standards, commonly through the use of Web Services technology. On the other hand, metadata is also commonly regarded as a potential integration tool, given that standardized metadata objects can provide useful information about the specifics of unknown information systems with which one has an interest in communicating, an approach commonly called "model-based integration". This paper presents the results of research regarding possible synergy between these two integration facilitators. This is accomplished with a vertical example of a metadata-driven SOA-based business process that provides ETL (Extraction, Transformation and Loading) and metadata services to a data warehousing system in need of real-time ETL support.

  19. Algebraic reasoning for the enhancement of data-driven building reconstructions

    NASA Astrophysics Data System (ADS)

    Meidow, Jochen; Hammer, Horst

    2016-04-01

    Data-driven approaches for the reconstruction of buildings feature the flexibility needed to capture objects of arbitrary shape. To recognize man-made structures, geometric relations such as orthogonality or parallelism have to be detected. These constraints are typically formulated as sets of multivariate polynomials. For the enforcement of the constraints within an adjustment process, a set of independent and consistent geometric constraints has to be determined. Gröbner bases are an ideal tool to identify such sets exactly. A complete workflow for geometric reasoning is presented to obtain boundary representations of solids based on given point clouds. The constraints are formulated in homogeneous coordinates, which results in simple polynomials suitable for the successful derivation of Gröbner bases for algebraic reasoning. Strategies for the reduction of the algebraic complexity are presented. To enforce the constraints, an adjustment model is introduced, which is able to cope with homogeneous coordinates along with their singular covariance matrices. The feasibility and the potential of the approach are demonstrated by the analysis of a real data set.
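The role of Gröbner bases in weeding out redundant constraints can be shown with a deliberately tiny example (assuming sympy is available; the constraints here are symbolic placeholders, far simpler than the homogeneous-coordinate polynomials of the paper): a candidate constraint that lies in the ideal generated by the others reduces to zero and must not be enforced independently.

```python
from sympy import symbols, groebner

a, b, c = symbols('a b c')

# Two independent constraints (illustrative): a = b and b = c.
G = groebner([a - b, b - c], a, b, c, order='lex')

# A candidate third constraint a = c is implied by the first two:
# it is a member of the ideal, so adding it would make the
# constraint set dependent.
redundant = G.contains(a - c)
independent = not G.contains(a + b)   # a + b is NOT implied
```

In the adjustment workflow, only constraints that do not reduce to zero modulo the basis are carried into the estimation, keeping the constraint set independent and consistent.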

  20. General Purpose Data-Driven Online System Health Monitoring with Applications to Space Operations

    NASA Technical Reports Server (NTRS)

    Iverson, David L.; Spirkovska, Lilly; Schwabacher, Mark

    2010-01-01

    Modern space transportation and ground support system designs are becoming increasingly sophisticated and complex. Determining the health state of these systems using traditional parameter limit checking, or model-based or rule-based methods is becoming more difficult as the number of sensors and component interactions grows. Data-driven monitoring techniques have been developed to address these issues by analyzing system operations data to automatically characterize normal system behavior. System health can be monitored by comparing real-time operating data with these nominal characterizations, providing detection of anomalous data signatures indicative of system faults, failures, or precursors of significant failures. The Inductive Monitoring System (IMS) is a general purpose, data-driven system health monitoring software tool that has been successfully applied to several aerospace applications and is under evaluation for anomaly detection in vehicle and ground equipment for next generation launch systems. After an introduction to IMS application development, we discuss these NASA online monitoring applications, including the integration of IMS with complementary model-based and rule-based methods. Although the examples presented in this paper are from space operations applications, IMS is a general-purpose health-monitoring tool that is also applicable to power generation and transmission system monitoring.
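The distance-to-nominal idea behind IMS can be sketched in a simplified form: characterize archived nominal telemetry as one or more clusters, derive a radius from the training data, and flag real-time samples that fall outside it. This is a one-cluster toy version under stated assumptions; IMS itself grows many clusters adaptively from multi-sensor data.

```python
import numpy as np

rng = np.random.default_rng(5)
# Archived nominal operations data for two sensors (illustrative).
nominal = rng.normal([0.0, 0.0], 0.1, size=(2000, 2))

# "Training": characterize normal behavior. One cluster suffices here;
# IMS learns many clusters covering distinct operating modes.
center = nominal.mean(axis=0)
radius = np.percentile(np.linalg.norm(nominal - center, axis=1), 99.5)

def anomalous(sample):
    """Flag a real-time telemetry sample whose distance to the
    nearest nominal cluster exceeds the learned radius."""
    return np.linalg.norm(np.asarray(sample) - center) > radius
```

Monitoring then reduces to a nearest-cluster distance check per incoming sample, which is what makes the approach cheap enough for online use.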

  1. Central Office Data-Driven Decision Making in Public Education

    ERIC Educational Resources Information Center

    Scheikl, Oskar F.

    2009-01-01

    Data-driven decision making has become part of the lexicon for educational reform efforts. Supported by the federal No Child Left Behind legislation, the use of data to inform educational decisions has become a common-place practice across the country. Using an online survey administered to central office data leaders in all Virginia public school…

  2. Data-Driven Decision Making--Not Just a Buzz Word

    ERIC Educational Resources Information Center

    Kadel, Rob

    2010-01-01

    In education, data-driven decision making is a buzz word that has come to mean collecting absolutely as much data as possible on everything from attendance to zero tolerance, and then having absolutely no idea what to do with it. Most educational organizations with a plethora of data usually call in a data miner, or evaluator, to make some sense…

  3. The Cannon: A data-driven approach to Stellar Label Determination

    NASA Astrophysics Data System (ADS)

    Ness, M.; Hogg, David W.; Rix, H.-W.; Ho, Anna. Y. Q.; Zasowski, G.

    2015-07-01

    New spectroscopic surveys offer the promise of stellar parameters and abundances (“stellar labels”) for hundreds of thousands of stars; this poses a formidable spectral modeling challenge. In many cases, there is a subset of reference objects for which the stellar labels are known with high(er) fidelity. We take advantage of this with The Cannon, a new data-driven approach for determining stellar labels from spectroscopic data. The Cannon learns from the “known” labels of reference stars how the continuum-normalized spectra depend on these labels by fitting a flexible model at each wavelength; then, The Cannon uses this model to derive labels for the remaining survey stars. We illustrate The Cannon by training the model on only 542 stars in 19 clusters as reference objects, with T_eff, log g, and [Fe/H] as the labels, and then applying it to the spectra of 55,000 stars from APOGEE DR10. The Cannon is very accurate. Its stellar labels compare well to the stars for which APOGEE pipeline (ASPCAP) labels are provided in DR10, with rms differences that are basically identical to the stated ASPCAP uncertainties. Beyond the reference labels, The Cannon makes no use of stellar models or line lists, but needs a set of reference objects that span label-space. The Cannon performs well at lower signal-to-noise, as it delivers comparably good labels even at one-ninth the APOGEE observing time. We discuss the limitations of The Cannon and its future potential, particularly, to bring different spectroscopic surveys onto a consistent scale of stellar labels.
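The training step (fit a flexible model of flux as a function of labels, independently at each wavelength) can be sketched as below. A quadratic-in-labels model matches The Cannon's published default choice, but the synthetic labels, pixel count, and noise-free fluxes are invented for illustration; this is not the published pipeline.

```python
# Sketch of The Cannon's training step: at every wavelength pixel,
# fit flux as a quadratic function of the reference stars' labels.
import numpy as np

def design_matrix(labels):
    """Quadratic expansion of the label vector: 1, linear, pairwise terms."""
    one = np.ones((len(labels), 1))
    quad = np.stack([labels[:, i] * labels[:, j]
                     for i in range(labels.shape[1])
                     for j in range(i, labels.shape[1])], axis=1)
    return np.hstack([one, labels, quad])

rng = np.random.default_rng(0)
ref_labels = rng.normal(size=(542, 3))      # e.g. T_eff, log g, [Fe/H] (scaled)
true_coeffs = rng.normal(size=(10, 100))    # 100 synthetic wavelength pixels
ref_flux = design_matrix(ref_labels) @ true_coeffs

# Least-squares fit of the per-wavelength coefficients from reference stars.
coeffs, *_ = np.linalg.lstsq(design_matrix(ref_labels), ref_flux, rcond=None)

# The fitted model reproduces the reference spectra.
print(np.abs(design_matrix(ref_labels) @ coeffs - ref_flux).max())
```

At test time the fit is inverted: survey-star labels are chosen to make the model spectrum match the observed spectrum, which is the second step the abstract describes.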

  4. An ISA-TAB-Nano based data collection framework to support data-driven modelling of nanotoxicology.

    PubMed

    Marchese Robinson, Richard L; Cronin, Mark T D; Richarz, Andrea-Nicole; Rallo, Robert

    2015-01-01

    Analysis of trends in nanotoxicology data and the development of data driven models for nanotoxicity is facilitated by the reporting of data using a standardised electronic format. ISA-TAB-Nano has been proposed as such a format. However, in order to build useful datasets according to this format, a variety of issues has to be addressed. These issues include questions regarding exactly which (meta)data to report and how to report them. The current article discusses some of the challenges associated with the use of ISA-TAB-Nano and presents a set of resources designed to facilitate the manual creation of ISA-TAB-Nano datasets from the nanotoxicology literature. These resources were developed within the context of the NanoPUZZLES EU project and include data collection templates, corresponding business rules that extend the generic ISA-TAB-Nano specification as well as Python code to facilitate parsing and integration of these datasets within other nanoinformatics resources. The use of these resources is illustrated by a "Toy Dataset" presented in the Supporting Information. The strengths and weaknesses of the resources are discussed along with possible future developments.

  5. An ISA-TAB-Nano based data collection framework to support data-driven modelling of nanotoxicology

    PubMed Central

    Marchese Robinson, Richard L; Richarz, Andrea-Nicole; Rallo, Robert

    2015-01-01

    Summary Analysis of trends in nanotoxicology data and the development of data driven models for nanotoxicity is facilitated by the reporting of data using a standardised electronic format. ISA-TAB-Nano has been proposed as such a format. However, in order to build useful datasets according to this format, a variety of issues has to be addressed. These issues include questions regarding exactly which (meta)data to report and how to report them. The current article discusses some of the challenges associated with the use of ISA-TAB-Nano and presents a set of resources designed to facilitate the manual creation of ISA-TAB-Nano datasets from the nanotoxicology literature. These resources were developed within the context of the NanoPUZZLES EU project and include data collection templates, corresponding business rules that extend the generic ISA-TAB-Nano specification as well as Python code to facilitate parsing and integration of these datasets within other nanoinformatics resources. The use of these resources is illustrated by a “Toy Dataset” presented in the Supporting Information. The strengths and weaknesses of the resources are discussed along with possible future developments. PMID:26665069

  6. Data-Driven Model Uncertainty Estimation in Hydrologic Data Assimilation

    NASA Astrophysics Data System (ADS)

    Pathiraja, S.; Moradkhani, H.; Marshall, L.; Sharma, A.; Geenens, G.

    2018-02-01

    The increasing availability of earth observations necessitates mathematical methods to optimally combine such data with hydrologic models. Several algorithms exist for such purposes, under the umbrella of data assimilation (DA). However, DA methods are often applied in a suboptimal fashion for complex real-world problems, due largely to several practical implementation issues. One such issue is error characterization, which is known to be critical for a successful assimilation. Mischaracterized errors lead to suboptimal forecasts, and in the worst case, to degraded estimates even compared to the no assimilation case. Model uncertainty characterization has received little attention relative to other aspects of DA science. Traditional methods rely on subjective, ad hoc tuning factors or parametric distribution assumptions that may not always be applicable. We propose a novel data-driven approach (named SDMU) to model uncertainty characterization for DA studies where (1) the system states are partially observed and (2) minimal prior knowledge of the model error processes is available, except that the errors display state dependence. It includes an approach for estimating the uncertainty in hidden model states, with the end goal of improving predictions of observed variables. The SDMU is therefore suited to DA studies where the observed variables are of primary interest. Its efficacy is demonstrated through a synthetic case study with low-dimensional chaotic dynamics and a real hydrologic experiment for one-day-ahead streamflow forecasting. In both experiments, the proposed method leads to substantial improvements in the hidden states and observed system outputs over a standard method involving perturbation with Gaussian noise.

  7. Personalized mortality prediction driven by electronic medical data and a patient similarity metric.

    PubMed

    Lee, Joon; Maslove, David M; Dubin, Joel A

    2015-01-01

    Clinical outcome prediction normally employs static, one-size-fits-all models that perform well for the average patient but are sub-optimal for individual patients with unique characteristics. In the era of digital healthcare, it is feasible to dynamically personalize decision support by identifying and analyzing similar past patients, in a way that is analogous to personalized product recommendation in e-commerce. Our objectives were: 1) to prove that analyzing only similar patients leads to better outcome prediction performance than analyzing all available patients, and 2) to characterize the trade-off between training data size and the degree of similarity between the training data and the index patient for whom prediction is to be made. We deployed a cosine-similarity-based patient similarity metric (PSM) to an intensive care unit (ICU) database to identify patients that are most similar to each patient and subsequently to custom-build 30-day mortality prediction models. Rich clinical and administrative data from the first day in the ICU from 17,152 adult ICU admissions were analyzed. The results confirmed that using data from only a small subset of most similar patients for training improves predictive performance in comparison with using data from all available patients. The results also showed that when too few similar patients are used for training, predictive performance degrades due to the effects of small sample sizes. Our PSM-based approach outperformed well-known ICU severity of illness scores. Although the improved prediction performance is achieved at the cost of increased computational burden, Big Data technologies can help realize personalized data-driven decision support at the point of care. The present study provides crucial empirical evidence for the promising potential of personalized data-driven decision support systems. With the increasing adoption of electronic medical record (EMR) systems, our novel medical data analytics contributes to
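The similarity-based patient selection at the heart of this approach can be sketched as follows. The cosine metric over a clinical feature vector follows the paper's description, while the data, feature count, and choice of k are invented for illustration.

```python
# Sketch of a cosine-similarity patient similarity metric (PSM):
# rank past patients by resemblance to an index patient, then keep
# only the top-k for training a personalized model.
import numpy as np

def cosine_similarity(a, b):
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def most_similar(index_patient, cohort, k):
    """Return row indices of the k cohort patients most similar to the index."""
    sims = np.array([cosine_similarity(index_patient, p) for p in cohort])
    return np.argsort(sims)[::-1][:k]

rng = np.random.default_rng(42)
cohort = rng.normal(size=(1000, 20))    # 20 synthetic clinical features each
index_patient = cohort[0] + rng.normal(scale=0.01, size=20)

top = most_similar(index_patient, cohort, k=50)
print(top[0])   # the nearly identical patient 0 should rank first
```

A mortality model trained only on `cohort[top]` then plays the role of the custom-built 30-day predictor described in the abstract, trading training-set size for similarity exactly as the study characterizes.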

  8. Interviewing Objects: Including Educational Technologies as Qualitative Research Participants

    ERIC Educational Resources Information Center

    Adams, Catherine A.; Thompson, Terrie Lynn

    2011-01-01

    This article argues the importance of including significant technologies-in-use as key qualitative research participants when studying today's digitally enhanced learning environments. We gather a set of eight heuristics to assist qualitative researchers in "interviewing" technologies-in-use (or other relevant objects), drawing on concrete…

  9. Data to Decisions: Creating a Culture of Model-Driven Drug Discovery.

    PubMed

    Brown, Frank K; Kopti, Farida; Chang, Charlie Zhenyu; Johnson, Scott A; Glick, Meir; Waller, Chris L

    2017-09-01

    Merck & Co., Inc., Kenilworth, NJ, USA, is undergoing a transformation in the way that it prosecutes R&D programs. Through the adoption of a "model-driven" culture, enhanced R&D productivity is anticipated, both in the form of decreased attrition at each stage of the process and by providing a rational framework for understanding and learning from the data generated along the way. This new approach focuses on the concept of a "Design Cycle" that makes use of all the data possible, internally and externally, to drive decision-making. These data can take the form of bioactivity, 3D structures, genomics, pathway, PK/PD, safety data, etc. Synthesis of high-quality data into models utilizing both well-established and cutting-edge methods has been shown to yield high confidence predictions to prioritize decision-making and efficiently reposition resources within R&D. The goal is to design an adaptive research operating plan that uses both modeled data and experiments, rather than just testing, to drive project decision-making. To support this emerging culture, an ambitious information management (IT) program has been initiated to implement a harmonized platform to facilitate the construction of cross-domain workflows to enable data-driven decision-making and the construction and validation of predictive models. These goals are achieved through depositing model-ready data, agile persona-driven access to data, a unified cross-domain predictive model lifecycle management platform, and support for flexible scientist-developed workflows that simplify data manipulation and consume model services. The end-to-end nature of the platform, in turn, not only supports but also drives the culture change by enabling scientists to apply predictive sciences throughout their work and over the lifetime of a project. 
This shift in mindset for both scientists and IT was driven by an early impactful demonstration of the potential benefits of the platform, in which expert-level early discovery

  10. Data-driven Analysis and Prediction of Arctic Sea Ice

    NASA Astrophysics Data System (ADS)

    Kondrashov, D. A.; Chekroun, M.; Ghil, M.; Yuan, X.; Ting, M.

    2015-12-01

    We present results of data-driven predictive analyses of sea ice over the main Arctic regions. Our approach relies on the Multilayer Stochastic Modeling (MSM) framework of Kondrashov, Chekroun and Ghil [Physica D, 2015] and it leads to prognostic models of sea ice concentration (SIC) anomalies on seasonal time scales. This approach is applied to monthly time series of leading principal components from the multivariate Empirical Orthogonal Function decomposition of SIC and selected climate variables over the Arctic. We evaluate the predictive skill of MSM models by performing retrospective forecasts with "no-look ahead" for up to 6 months ahead. It will be shown in particular that the memory effects included in our non-Markovian linear MSM models improve predictions of large-amplitude SIC anomalies in certain Arctic regions. Further improvements allowed by the MSM framework will adopt a nonlinear formulation, as well as alternative data-adaptive decompositions.

  11. Data-Driven Asthma Endotypes Defined from Blood Biomarker and Gene Expression Data

    PubMed Central

    George, Barbara Jane; Reif, David M.; Gallagher, Jane E.; Williams-DeVane, ClarLynda R.; Heidenfelder, Brooke L.; Hudgens, Edward E.; Jones, Wendell; Neas, Lucas; Hubal, Elaine A. Cohen; Edwards, Stephen W.

    2015-01-01

    The diagnosis and treatment of childhood asthma is complicated by its mechanistically distinct subtypes (endotypes) driven by genetic susceptibility and modulating environmental factors. Clinical biomarkers and blood gene expression were collected from a stratified, cross-sectional study of asthmatic and non-asthmatic children from Detroit, MI. This study describes four distinct asthma endotypes identified via a purely data-driven method. Our method was specifically designed to integrate blood gene expression and clinical biomarkers in a way that provides new mechanistic insights regarding the different asthma endotypes. For example, we describe metabolic syndrome-induced systemic inflammation as an associated factor in three of the four asthma endotypes. Context provided by the clinical biomarker data was essential in interpreting gene expression patterns and identifying putative endotypes, which emphasizes the importance of integrated approaches when studying complex disease etiologies. These synthesized patterns of gene expression and clinical markers from our research may lead to development of novel serum-based biomarker panels. PMID:25643280

  12. Impact of Data-driven Respiratory Gating in Clinical PET.

    PubMed

    Büther, Florian; Vehren, Thomas; Schäfers, Klaus P; Schäfers, Michael

    2016-10-01

    Purpose: To study the feasibility and impact of respiratory gating in positron emission tomographic (PET) imaging in a clinical trial comparing conventional hardware-based gating with a data-driven approach and to describe the distribution of determined parameters. Materials and Methods: This prospective study was approved by the ethics committee of the University Hospital of Münster (AZ 2014-217-f-N). Seventy-four patients suspected of having abdominal or thoracic fluorine 18 fluorodeoxyglucose (FDG)-positive lesions underwent clinical whole-body FDG PET/computed tomographic (CT) examinations. Respiratory gating was performed by using a pressure-sensitive belt system (belt gating [BG]) and an automatic data-driven approach (data-driven gating [DDG]). PET images were analyzed for lesion uptake, metabolic volumes, respiratory shifts of lesions, and diagnostic image quality. Results: Forty-eight patients had at least one lesion in the field of view, resulting in a total of 164 lesions analyzed (range of number of lesions per patient, one to 13). Both gating methods revealed respiratory shifts of lesions (4.4 mm ± 3.1 for BG vs 4.8 mm ± 3.6 for DDG, P = .76). Increase in uptake of the lesions compared with nongated values did not differ significantly between both methods (maximum standardized uptake value [SUVmax], +7% ± 13 for BG vs +8% ± 16 for DDG, P = .76). Similarly, gating significantly decreased metabolic lesion volumes with both methods (-6% ± 26 for BG vs -7% ± 21 for DDG, P = .44) compared with nongated reconstructions. Blinded reading revealed significant improvements in diagnostic image quality when using gating, without significant differences between the methods (DDG was judged to be inferior to BG in 22 cases, equal in 12 cases, and superior in 15 cases; P = .32). Conclusion: Respiratory gating increases diagnostic image quality and uptake values and decreases metabolic volumes compared with nongated acquisitions. Data-driven approaches are

  13. Environmental Data-Driven Inquiry and Exploration (EDDIE)- Water Focused Modules for interacting with Big Hydrologic Data

    NASA Astrophysics Data System (ADS)

    Meixner, T.; Gougis, R.; O'Reilly, C.; Klug, J.; Richardson, D.; Castendyk, D.; Carey, C.; Bader, N.; Stomberg, J.; Soule, D. C.

    2016-12-01

    High-frequency sensor data are driving a shift in the Earth and environmental sciences. The availability of high-frequency data creates an engagement opportunity for undergraduate students in primary research by using large, long-term, sensor-based data directly in the scientific curriculum. Project EDDIE (Environmental Data-Driven Inquiry & Exploration) has developed flexible classroom activity modules designed to meet a series of pedagogical goals that include (1) developing skills required to manipulate large datasets at different scales to conduct inquiry-based investigations; (2) developing students' reasoning about statistical variation; and (3) fostering accurate student conceptions about the nature of environmental science. The modules cover a wide range of topics, including lake physics and metabolism, stream discharge, water quality, soil respiration, seismology, and climate change. In this presentation we will focus on a sequence of modules of particular interest to hydrologists: stream discharge, water quality, and nutrient loading. Assessment results show that our modules are effective at making students more comfortable analyzing data and at improving their understanding of statistical concepts and their data analysis skills. This project is funded by an NSF TUES grant (NSF DEB 1245707).

  14. A predictive estimation method for carbon dioxide transport by data-driven modeling with a physically-based data model.

    PubMed

    Jeong, Jina; Park, Eungyu; Han, Weon Shik; Kim, Kue-Young; Jun, Seong-Chun; Choung, Sungwook; Yun, Seong-Taek; Oh, Junho; Kim, Hyun-Jun

    2017-11-01

    In this study, a data-driven method for predicting CO 2 leaks and associated concentrations from geological CO 2 sequestration is developed. Several candidate models are compared based on their reproducibility and predictive capability for CO 2 concentration measurements from the Environment Impact Evaluation Test (EIT) site in Korea. Based on the data mining results, a one-dimensional solution of the advective-dispersive equation for steady flow (i.e., Ogata-Banks solution) is found to be most representative for the test data, and this model is adopted as the data model for the developed method. In the validation step, the method is applied to estimate future CO 2 concentrations with the reference estimation by the Ogata-Banks solution, where a part of earlier data is used as the training dataset. From the analysis, it is found that the ensemble mean of multiple estimations based on the developed method shows high prediction accuracy relative to the reference estimation. In addition, the majority of the data to be predicted are included in the proposed quantile interval, which suggests adequate representation of the uncertainty by the developed method. Therefore, the incorporation of a reasonable physically-based data model enhances the prediction capability of the data-driven model. The proposed method is not confined to estimations of CO 2 concentration and may be applied to various real-time monitoring data from subsurface sites to develop automated control, management or decision-making systems. Copyright © 2017 Elsevier B.V. All rights reserved.
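The physically-based data model named above, the Ogata-Banks solution, is a closed-form expression for one-dimensional advection-dispersion of a continuous source under steady flow, and can be sketched directly. The parameter values below are illustrative, not the EIT-site calibration.

```python
# Ogata-Banks solution: relative concentration c/c0 at distance x and
# time t, for pore velocity v and dispersion coefficient D.
import math

def ogata_banks(x, t, v, D, c0=1.0):
    a = math.erfc((x - v * t) / (2.0 * math.sqrt(D * t)))
    b = math.exp(v * x / D) * math.erfc((x + v * t) / (2.0 * math.sqrt(D * t)))
    return 0.5 * c0 * (a + b)

# Breakthrough behaviour: concentration rises toward c0 as time grows.
early = ogata_banks(x=10.0, t=5.0, v=1.0, D=0.5)
late = ogata_banks(x=10.0, t=50.0, v=1.0, D=0.5)
print(early, late)
```

In the paper's method this curve supplies the reference estimation that the data-driven ensemble is trained against, anchoring the statistical model to transport physics.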

  15. A predictive estimation method for carbon dioxide transport by data-driven modeling with a physically-based data model

    NASA Astrophysics Data System (ADS)

    Jeong, Jina; Park, Eungyu; Han, Weon Shik; Kim, Kue-Young; Jun, Seong-Chun; Choung, Sungwook; Yun, Seong-Taek; Oh, Junho; Kim, Hyun-Jun

    2017-11-01

    In this study, a data-driven method for predicting CO2 leaks and associated concentrations from geological CO2 sequestration is developed. Several candidate models are compared based on their reproducibility and predictive capability for CO2 concentration measurements from the Environment Impact Evaluation Test (EIT) site in Korea. Based on the data mining results, a one-dimensional solution of the advective-dispersive equation for steady flow (i.e., Ogata-Banks solution) is found to be most representative for the test data, and this model is adopted as the data model for the developed method. In the validation step, the method is applied to estimate future CO2 concentrations with the reference estimation by the Ogata-Banks solution, where a part of earlier data is used as the training dataset. From the analysis, it is found that the ensemble mean of multiple estimations based on the developed method shows high prediction accuracy relative to the reference estimation. In addition, the majority of the data to be predicted are included in the proposed quantile interval, which suggests adequate representation of the uncertainty by the developed method. Therefore, the incorporation of a reasonable physically-based data model enhances the prediction capability of the data-driven model. The proposed method is not confined to estimations of CO2 concentration and may be applied to various real-time monitoring data from subsurface sites to develop automated control, management or decision-making systems.

  16. Technique for identifying, tracing, or tracking objects in image data

    DOEpatents

    Anderson, Robert J [Albuquerque, NM; Rothganger, Fredrick [Albuquerque, NM

    2012-08-28

    A technique for computer vision uses a polygon contour to trace an object. The technique includes rendering a polygon contour superimposed over a first frame of image data. The polygon contour is iteratively refined to more accurately trace the object within the first frame after each iteration. The refinement includes computing image energies along lengths of contour lines of the polygon contour and adjusting positions of the contour lines based at least in part on the image energies.
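The refinement loop described above (compute image energies near the contour, then adjust vertex positions toward them) can be reduced to a toy sketch. This is a generic active-contour-style move, not the patented implementation; the gradient-magnitude energy, 3x3 search window, and toy image are all assumptions.

```python
# Toy contour refinement: move each polygon vertex to the pixel of
# highest image energy (gradient magnitude) in its 3x3 neighbourhood.
import numpy as np

def refine_once(img, contour):
    gy, gx = np.gradient(img.astype(float))
    energy = np.hypot(gy, gx)          # edge energy: gradient magnitude
    h, w = img.shape
    out = []
    for y, x in contour:
        best = max(((energy[yy, xx], (yy, xx))
                    for yy in range(max(0, y - 1), min(h, y + 2))
                    for xx in range(max(0, x - 1), min(w, x + 2))),
                   key=lambda t: t[0])
        out.append(best[1])
    return out

# Toy image: bright square on a dark background; its edges carry high energy.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
contour = [(4, 9), (9, 4), (16, 9), (9, 16)]
print(refine_once(img, contour))
```

Iterating this step pulls the polygon onto the object's boundary, which is the tracing behaviour the patent abstract describes at a high level.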

  17. Data-driven non-linear elasticity: constitutive manifold construction and problem discretization

    NASA Astrophysics Data System (ADS)

    Ibañez, Ruben; Borzacchiello, Domenico; Aguado, Jose Vicente; Abisset-Chavanne, Emmanuelle; Cueto, Elias; Ladeveze, Pierre; Chinesta, Francisco

    2017-11-01

    The use of constitutive equations calibrated from data has been implemented into standard numerical solvers for successfully addressing a variety of problems encountered in simulation-based engineering sciences (SBES). However, complexity continues to increase due to the need for increasingly detailed models as well as the use of engineered materials. Data-driven simulation constitutes a potential change of paradigm in SBES. Standard simulation in computational mechanics is based on the use of two very different types of equations. The first one, of axiomatic character, is related to balance laws (momentum, mass, energy, …), whereas the second one consists of models that scientists have extracted from collected, either natural or synthetic, data. Data-driven (or data-intensive) simulation consists of directly linking experimental data to computers in order to perform numerical simulations. These simulations will employ laws, universally recognized as epistemic, while minimizing the need for explicit, often phenomenological, models. The main drawback of such an approach is the large amount of required data, some of them inaccessible to today's testing facilities. Such difficulty can be circumvented in many cases, and in any case alleviated, by considering complex tests, collecting as many data as possible and then using a data-driven inverse approach in order to generate the whole constitutive manifold from few complex experimental tests, as discussed in the present work.

  18. Data-Driven Decision-Making: Facilitating Teacher Use of Student Data to Inform Classroom Instruction

    ERIC Educational Resources Information Center

    Schifter, Catherine C.; Natarajan, Uma; Ketelhut, Diane Jass; Kirchgessner, Amanda

    2014-01-01

    Data-driven decision making is essential in K-12 education today, but teachers often do not know how to make use of extensive data sets. Research shows that teachers are not taught how to use extensive data (i.e., multiple data sets) to reflect on student progress or to differentiate instruction. This paper presents a process used in a National…

  19. Evaluating MODIS satellite versus terrestrial data driven productivity estimates in Austria

    NASA Astrophysics Data System (ADS)

    Petritsch, R.; Boisvenue, C.; Pietsch, S. A.; Hasenauer, H.; Running, S. W.

    2009-04-01

    Sensors, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) on NASA's Terra satellite, are developed for monitoring global and/or regional ecosystem fluxes like net primary production (NPP). Although these systems should allow us to assess carbon sequestration issues, forest management impacts, etc., relatively little is known about the consistency and accuracy of the resulting satellite-driven estimates versus production estimates driven from ground data. In this study we compare the following NPP estimation methods: (i) NPP estimates as derived from MODIS and available on the internet; (ii) estimates resulting from the off-line version of the MODIS algorithm; (iii) estimates using regional meteorological data within the offline algorithm; (iv) NPP estimates from a species-specific biogeochemical ecosystem model adapted for Alpine conditions; and (v) NPP estimates calculated from individual tree measurements. Single tree measurements were available from 624 forested sites across Austria, but only the data from 165 sample plots included all the necessary information for performing the comparison on plot level. To ensure independence of satellite-driven and ground-based predictions, only latitude and longitude for each site were used to obtain MODIS estimates. Along with the comparison of the different methods, we discuss problems like the differing dates of field campaigns (<1999) and acquisition of satellite images (2000-2005) or incompatible productivity definitions within the methods, and come up with a framework for combining terrestrial and satellite data based productivity estimates. On average, MODIS estimates agreed well with the output of the model's self-initialization (spin-up), and biomass increment calculated from tree measurements is not significantly different from model results; however, correlations between satellite-derived and terrestrial estimates are relatively poor. 
Considering the different scales as they are 9km² from MODIS and

  20. A data-driven, knowledge-based approach to biomarker discovery: application to circulating microRNA markers of colorectal cancer prognosis.

    PubMed

    Vafaee, Fatemeh; Diakos, Connie; Kirschner, Michaela B; Reid, Glen; Michael, Michael Z; Horvath, Lisa G; Alinejad-Rokny, Hamid; Cheng, Zhangkai Jason; Kuncic, Zdenka; Clarke, Stephen

    2018-01-01

    Recent advances in high-throughput technologies have provided an unprecedented opportunity to identify molecular markers of disease processes. This plethora of complex-omics data has simultaneously complicated the problem of extracting meaningful molecular signatures and opened up new opportunities for more sophisticated integrative and holistic approaches. In this era, effective integration of data-driven and knowledge-based approaches for biomarker identification has been recognised as key to improving the identification of high-performance biomarkers, and necessary for translational applications. Here, we have evaluated the role of circulating microRNA as a means of predicting the prognosis of patients with colorectal cancer, which is the second leading cause of cancer-related death worldwide. We have developed a multi-objective optimisation method that effectively integrates a data-driven approach with the knowledge obtained from the microRNA-mediated regulatory network to identify robust plasma microRNA signatures which are reliable in terms of predictive power as well as functional relevance. The proposed multi-objective framework has the capacity to adjust for conflicting biomarker objectives and to incorporate heterogeneous information facilitating systems approaches to biomarker discovery. We have found a prognostic signature of colorectal cancer comprising 11 circulating microRNAs. The identified signature predicts the patients' survival outcome and targets pathways underlying colorectal cancer progression. The altered expression of the identified microRNAs was confirmed in an independent public data set of plasma samples of patients in early stage vs advanced colorectal cancer. Furthermore, the generality of the proposed method was demonstrated across three publicly available miRNA data sets associated with biomarker studies in other diseases.

  1. Data-Driven Exercises for Chemistry: A New Digital Collection

    ERIC Educational Resources Information Center

    Grubbs, W. Tandy

    2007-01-01

    The article presents a new digital collection of data-driven exercises for teaching chemistry. Such exercises are expected to help students think in a more scientific manner.

  2. Defining datasets and creating data dictionaries for quality improvement and research in chronic disease using routinely collected data: an ontology-driven approach.

    PubMed

    de Lusignan, Simon; Liaw, Siaw-Teng; Michalakidis, Georgios; Jones, Simon

    2011-01-01

    The burden of chronic disease is increasing, and research and quality improvement will be less effective if case finding strategies are suboptimal. To describe an ontology-driven approach to case finding in chronic disease and how this approach can be used to create a data dictionary and make the codes used in case finding transparent. A five-step process: (1) identifying a reference coding system or terminology; (2) using an ontology-driven approach to identify cases; (3) developing metadata that can be used to identify the extracted data; (4) mapping the extracted data to the reference terminology; and (5) creating the data dictionary. Hypertension is presented as an exemplar. A patient with hypertension can be represented by a range of codes including diagnostic, history and administrative. Metadata can link the coding system and data extraction queries to the correct data mapping and translation tool, which then maps it to the equivalent code in the reference terminology. The code extracted, the term, its domain and subdomain, and the name of the data extraction query can then be automatically grouped and published online as a readily searchable data dictionary. An exemplar is available online at: www.clininf.eu/qickd-data-dictionary.html. Adopting an ontology-driven approach to case finding could improve the quality of disease registers and of research based on routine data. It would offer considerable advantages over using limited datasets to define cases. This approach should be considered by those involved in research and quality improvement projects which utilise routine data.
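Steps (3)-(5) of the process, metadata on extracted codes, mapping to a reference terminology, and grouping into a dictionary, can be sketched in a few lines. The codes, terms, and query names below are invented examples in the spirit of the hypertension exemplar, not the paper's actual mappings.

```python
# Hypothetical sketch of building a data dictionary from extracted codes.
reference_map = {   # step 4: local code -> reference-terminology code
    "G20..": "38341003",   # illustrative diagnostic code for hypertension
    "662..": "38341003",   # illustrative administrative/monitoring code
}

extracted = [       # step 3: extracted data with identifying metadata
    {"code": "G20..", "term": "Essential hypertension",
     "domain": "diagnosis", "query": "htn_diag_v1"},
    {"code": "662..", "term": "Hypertension monitoring",
     "domain": "administration", "query": "htn_admin_v1"},
]

def build_dictionary(rows, mapping):
    """Step 5: group extracted rows by reference code into a dictionary."""
    out = {}
    for row in rows:
        ref = mapping[row["code"]]
        out.setdefault(ref, []).append(
            {k: row[k] for k in ("code", "term", "domain", "query")})
    return out

print(build_dictionary(extracted, reference_map))
```

The resulting structure is what would be published as the searchable online dictionary: every reference concept lists the local codes, domains, and extraction queries that contribute to it, making case finding transparent.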

  3. Data-Driven Modeling and Prediction of Arctic Sea Ice

    NASA Astrophysics Data System (ADS)

    Kondrashov, Dmitri; Chekroun, Mickael; Ghil, Michael

    2016-04-01

    We present results of data-driven predictive analyses of sea ice over the main Arctic regions. Our approach relies on the Multilayer Stochastic Modeling (MSM) framework of Kondrashov, Chekroun and Ghil [Physica D, 2015] and it leads to probabilistic prognostic models of sea ice concentration (SIC) anomalies on seasonal time scales. This approach is applied to monthly time series of state-of-the-art data-adaptive decompositions of SIC and selected climate variables over the Arctic. We evaluate the predictive skill of MSM models by performing retrospective forecasts with "no-look ahead" for up to 6-months ahead. It will be shown in particular that the memory effects included intrinsically in the formulation of our non-Markovian MSM models allow for improvements of the prediction skill of large-amplitude SIC anomalies in certain Arctic regions on the one hand, and of September Sea Ice Extent, on the other. Further improvements allowed by the MSM framework will adopt a nonlinear formulation and explore next-generation data-adaptive decompositions, namely modification of Principal Oscillation Patterns (POPs) and rotated Multichannel Singular Spectrum Analysis (M-SSA).
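
    The role of memory effects can be illustrated with a toy example: fitting autoregressive models of increasing order to a synthetic anomaly series by least squares. This is only a sketch of the memory idea, not the multilayer stochastic modeling (MSM) framework itself; the series and coefficients are invented.

```python
import random

# Toy illustration of adding memory to a data-driven model: fit AR(1) vs AR(2)
# to a synthetic "anomaly" series by ordinary least squares. The true process
# has a lag-2 term, so the model with more memory should fit better.

random.seed(0)
n = 2000
x = [0.0, 0.0]
for t in range(2, n):
    x.append(0.6 * x[-1] - 0.3 * x[-2] + random.gauss(0, 1))

def fit_ar(series, p):
    """Least-squares AR(p) fit via normal equations and Gaussian elimination."""
    rows = [[series[t - 1 - j] for j in range(p)] for t in range(p, len(series))]
    y = series[p:]
    # augmented normal equations: A^T A a = A^T y
    m = [[sum(r[i] * r[j] for r in rows) for j in range(p)] +
         [sum(r[i] * yt for r, yt in zip(rows, y))] for i in range(p)]
    for c in range(p):                       # forward elimination
        for c2 in range(c + 1, p):
            f = m[c2][c] / m[c][c]
            m[c2] = [a - f * b for a, b in zip(m[c2], m[c])]
    coef = [0.0] * p                         # back substitution
    for c in range(p - 1, -1, -1):
        coef[c] = (m[c][p] - sum(m[c][j] * coef[j] for j in range(c + 1, p))) / m[c][c]
    return coef

def rmse(series, coef):
    p = len(coef)
    errs = [series[t] - sum(coef[j] * series[t - 1 - j] for j in range(p))
            for t in range(p, len(series))]
    return (sum(e * e for e in errs) / len(errs)) ** 0.5

ar1 = fit_ar(x, 1)
ar2 = fit_ar(x, 2)
print(rmse(x, ar2) < rmse(x, ar1))  # extra memory lowers the one-step error here
```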

  4. Data-Intensive Science Meets Inquiry-Driven Pedagogy: Interactive Big Data Exploration, Threshold Concepts, and Liminality

    NASA Astrophysics Data System (ADS)

    Ramachandran, R.; Nair, U. S.; Word, A.

    2014-12-01

    Threshold concepts in any discipline are the core concepts an individual must understand in order to master a discipline. By their very nature, these concepts are troublesome, irreversible, integrative, bounded, discursive, and reconstitutive. Although grasping threshold concepts can be extremely challenging for each learner as s/he moves through stages of cognitive development relative to a given discipline, the learner's grasp of these concepts determines the extent to which s/he is prepared to work competently and creatively within the field itself. The movement of individuals from a state of ignorance of these core concepts to one of mastery occurs not along a linear path but in iterative cycles of knowledge creation and adjustment in liminal spaces - conceptual spaces through which learners move from the vaguest awareness of concepts to mastery, accompanied by understanding of their relevance, connectivity, and usefulness relative to questions and constructs in a given discipline. With the explosive growth of data available in atmospheric science, driven largely by satellite Earth observations and high-resolution numerical simulations, paradigms such as that of data-intensive science have emerged. These paradigm shifts are based on the growing realization that current infrastructure, tools and processes will not allow us to analyze and fully utilize the complex and voluminous data that is being gathered. In this emerging paradigm, the scientific discovery process is driven by knowledge extracted from large volumes of data. In this presentation, we contend that this paradigm naturally lends to inquiry-driven pedagogy where knowledge is discovered through inductive engagement with large volumes of data rather than reached through traditional, deductive, hypothesis-driven analyses. In particular, data-intensive techniques married with an inductive methodology allow for exploration on a scale that is not possible in the traditional classroom with its typical

  5. Data driven approaches vs. qualitative approaches in climate change impact and vulnerability assessment.

    NASA Astrophysics Data System (ADS)

    Zebisch, Marc; Schneiderbauer, Stefan; Petitta, Marcello

    2015-04-01

    In the last decade the scope of climate change science has broadened significantly. Fifteen years ago the focus was mainly on understanding climate change, providing climate change scenarios, and giving ideas about potential climate change impacts. Today, adaptation to climate change has become an increasingly important field of politics, and one role of science is to inform and support this process. Climate change science therefore no longer relies only on data-driven approaches (such as climate or climate impact models) but progressively applies qualitative approaches as well, including opinion and expertise acquired through interactive processes with local stakeholders and decision makers. Furthermore, climate change science is facing the challenge of normative questions, such as 'how important is a decrease of yield in a developed country where agriculture represents only 3% of the GDP and the supply of agricultural products is strongly linked to global markets and less dependent on local production?'. In this talk we will present examples from various applied research and consultancy projects on climate change vulnerability, ranging from data-driven methods (e.g., remote sensing and modelling) to semi-quantitative and qualitative assessment approaches. Furthermore, we will discuss bottlenecks, pitfalls, and opportunities in transferring climate change science to policy- and decision-maker-oriented climate services.

  6. DATA QUALITY OBJECTIVES AND MEASUREMENT QUALITY OBJECTIVES FOR RESEARCH PROJECTS

    EPA Science Inventory

    The paper provides assistance with systematic planning using measurement quality objectives to those working on research projects. These performance criteria are more familiar to researchers than data quality objectives because they are more closely associated with the measuremen...

  7. Developing Annotation Solutions for Online Data Driven Learning

    ERIC Educational Resources Information Center

    Perez-Paredes, Pascual; Alcaraz-Calero, Jose M.

    2009-01-01

    Although "annotation" is a widely-researched topic in Corpus Linguistics (CL), its potential role in Data Driven Learning (DDL) has not been addressed in depth by Foreign Language Teaching (FLT) practitioners. Furthermore, most of the research in the use of DDL methods pays little attention to annotation in the design and implementation…

  8. Data-driven methods towards learning the highly nonlinear inverse kinematics of tendon-driven surgical manipulators.

    PubMed

    Xu, Wenjun; Chen, Jie; Lau, Henry Y K; Ren, Hongliang

    2017-09-01

    Accurate motion control of flexible surgical manipulators is crucial in tissue manipulation tasks. The tendon-driven serpentine manipulator (TSM) is one of the most widely adopted flexible mechanisms in minimally invasive surgery because of its enhanced maneuverability in tortuous environments. TSMs, however, exhibit high nonlinearities, and conventional analytical kinematic models are insufficient to achieve high accuracy. To account for the system nonlinearities, we applied a data-driven approach to encode the system's inverse kinematics. Three regression methods, extreme learning machine (ELM), Gaussian mixture regression (GMR), and K-nearest neighbors regression (KNNR), were implemented to learn a nonlinear mapping from the robot's 3D position states to the control inputs. The performance of the three algorithms was evaluated in both simulation and physical trajectory tracking experiments. KNNR performed the best in the tracking experiments, with the lowest RMSE of 2.1275 mm. The proposed inverse kinematics learning methods provide an alternative and efficient way to accurately model the tendon-driven flexible manipulator. Copyright © 2016 John Wiley & Sons, Ltd.
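
    The best-performing method, KNNR, is simple enough to sketch directly: predict the control input for a target position by averaging the control inputs of the k nearest training positions. The toy forward map below stands in for the real tendon-driven manipulator and is invented for illustration.

```python
import math

# Hedged sketch of K-nearest-neighbors regression for inverse kinematics:
# given training pairs (tip position -> control input), predict the control
# for a new position by averaging the k closest training outputs.

def forward(u):
    # toy nonlinear "forward kinematics": controls (u1, u2) -> 3D tip position
    return (math.sin(u[0]), math.cos(u[0]) * u[1], u[1] ** 2)

train = [(a / 10.0, b / 10.0) for a in range(11) for b in range(11)]
data = [(forward(u), u) for u in train]   # (position, control) training pairs

def knn_ik(pos, k=3):
    ranked = sorted(data, key=lambda d: sum((p - q) ** 2 for p, q in zip(d[0], pos)))
    nearest = [u for _, u in ranked[:k]]
    return tuple(sum(c) / k for c in zip(*nearest))

target = forward((0.55, 0.35))            # where we want the tip to go
u_hat = knn_ik(target)                    # controls recovered without a model
err = max(abs(a - b) for a, b in zip(u_hat, (0.55, 0.35)))
print(err < 0.1)  # recovered controls are close to the true ones
```

    Because the method never inverts the forward map analytically, it absorbs whatever nonlinearity the training data contains, which is the appeal cited in the abstract.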

  9. On Mixed Data and Event Driven Design for Adaptive-Critic-Based Nonlinear $H_{\\infty}$ Control.

    PubMed

    Wang, Ding; Mu, Chaoxu; Liu, Derong; Ma, Hongwen

    2018-04-01

    In this paper, based on the adaptive critic learning technique, the control for a class of unknown nonlinear dynamic systems is investigated by adopting a mixed data- and event-driven design approach. The nonlinear control problem is formulated as a two-player zero-sum differential game, and the adaptive critic method is employed to cope with the data-based optimization. The novelty lies in combining the data-driven learning identifier with the event-driven design formulation to develop the adaptive critic controller and thereby accomplish the nonlinear control. The event-driven optimal control law and the time-driven worst-case disturbance law are approximated by constructing and tuning a critic neural network. Applying the event-driven feedback control, the closed-loop system is built with stability analysis. Simulation studies are conducted to verify the theoretical results and illustrate the control performance. It is significant to observe that the present research provides a new avenue for integrating data-based control and event-triggering mechanisms into establishing advanced adaptive critic systems.

  10. Nonstationary EO/IR Clutter Suppression and Dim Object Tracking

    NASA Astrophysics Data System (ADS)

    Tartakovsky, A.; Brown, A.; Brown, J.

    2010-09-01

    We develop and evaluate the performance of advanced algorithms that provide significantly improved capabilities for automated detection and tracking of ballistic and flying dim objects in the presence of highly structured intense clutter. Applications include ballistic missile early warning, midcourse tracking, trajectory prediction, and resident space object detection and tracking. The set of algorithms includes, in particular, adaptive spatiotemporal clutter estimation-suppression and nonlinear filtering-based multiple-object track-before-detect. These algorithms are suitable for integration into geostationary, highly elliptical, or low earth orbit scanning or staring sensor suites, and are based on data-driven processing that adapts to real-world clutter backgrounds, including celestial, earth limb, or terrestrial clutter. In many scenarios of interest, e.g., for highly elliptic and, especially, low earth orbits, the resulting clutter is highly nonstationary, providing a significant challenge for clutter suppression to or below sensor noise levels, which is essential for dim object detection and tracking. We demonstrate the success of the developed algorithms using semi-synthetic and real data. In particular, our algorithms are shown to be capable of detecting and tracking point objects with signal-to-clutter levels down to 1/1000 and signal-to-noise levels down to 1/4.
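
    The clutter-suppression step can be illustrated with a toy residual filter: estimate each pixel's background from its neighborhood median and subtract it, so a dim point object stands out even when the structured clutter is brighter than the object. The synthetic frame below is far simpler than the adaptive spatiotemporal estimator described above.

```python
# Hedged toy sketch of clutter suppression for dim-object detection:
# subtract a local background estimate (neighborhood median, excluding the
# pixel itself) so a point target beats clutter that is brighter than it.

W = H = 15
frame = [[0.5 * (x + y) for x in range(W)] for y in range(H)]  # smooth clutter ramp
frame[7][7] += 2.0  # dim object: raw value 9.0, while the corner clutter reaches 14.0

def local_median(img, y, x, r=2):
    vals = [img[j][i]
            for j in range(max(0, y - r), min(H, y + r + 1))
            for i in range(max(0, x - r), min(W, x + r + 1))
            if (j, i) != (y, x)]
    vals.sort()
    return vals[len(vals) // 2]

residual = [[frame[y][x] - local_median(frame, y, x) for x in range(W)]
            for y in range(H)]
peak = max((residual[y][x], (y, x)) for y in range(H) for x in range(W))
print(peak[1])  # → (7, 7): the target is found although the clutter outshines it
```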

  11. USACM Thematic Workshop On Uncertainty Quantification And Data-Driven Modeling.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stewart, James R.

    The USACM Thematic Workshop on Uncertainty Quantification and Data-Driven Modeling was held on March 23-24, 2017, in Austin, TX. The organizers of the technical program were James R. Stewart of Sandia National Laboratories and Krishna Garikipati of University of Michigan. The administrative organizer was Ruth Hengst, who serves as Program Coordinator for the USACM. The organization of this workshop was coordinated through the USACM Technical Thrust Area on Uncertainty Quantification and Probabilistic Analysis. The workshop website (http://uqpm2017.usacm.org) includes the presentation agenda as well as links to several of the presentation slides (permission to access the presentations was granted by each of those speakers, respectively). Herein, this final report contains the complete workshop program that includes the presentation agenda, the presentation abstracts, and the list of posters.

  12. Data-driven identification of potential Zika virus vectors

    PubMed Central

    Evans, Michelle V; Dallas, Tad A; Han, Barbara A; Murdock, Courtney C; Drake, John M

    2017-01-01

    Zika is an emerging virus whose rapid spread is of great public health concern. Knowledge about transmission remains incomplete, especially concerning potential transmission in geographic areas in which it has not yet been introduced. To identify unknown vectors of Zika, we developed a data-driven model linking vector species and the Zika virus via vector-virus trait combinations that confer a propensity toward associations in an ecological network connecting flaviviruses and their mosquito vectors. Our model predicts that thirty-five species may be able to transmit the virus, seven of which are found in the continental United States, including Culex quinquefasciatus and Cx. pipiens. We suggest that empirical studies prioritize these species to confirm predictions of vector competence, enabling the correct identification of populations at risk for transmission within the United States. DOI: http://dx.doi.org/10.7554/eLife.22053.001 PMID:28244371

  13. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models.

    PubMed

    Henriques, David; Villaverde, Alejandro F; Rocha, Miguel; Saez-Rodriguez, Julio; Banga, Julio R

    2017-02-01

    Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equation) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). For this task, SELDOM's ensemble prediction is not only consistently better than predictions
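
    The scaffold-inference step, in which candidate interactions are scored by mutual information between node activities, can be sketched on toy binary data. The snippet below is a hedged illustration with invented activities, not the SELDOM implementation.

```python
import math
import random

# Hedged toy sketch of mutual-information edge scoring: a pair of nodes whose
# discretized activities share information scores higher than an unrelated pair.

def mutual_info(xs, ys):
    n = len(xs)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pxy = sum(1 for x, y in zip(xs, ys) if x == a and y == b) / n
            px = sum(1 for x in xs if x == a) / n
            py = sum(1 for y in ys if y == b) / n
            if pxy > 0:
                mi += pxy * math.log(pxy / (px * py))
    return mi

# node B mostly copies node A (a real interaction); node C is independent noise
random.seed(3)
A = [random.randint(0, 1) for _ in range(500)]
B = [x if random.random() < 0.9 else 1 - x for x in A]
C = [random.randint(0, 1) for _ in range(500)]

print(mutual_info(A, B) > mutual_info(A, C))  # the A-B edge scores higher
```

    Thresholding such scores yields the static interaction scaffold on which the dynamic logic-based models are then built.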

  14. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models

    PubMed Central

    Henriques, David; Villaverde, Alejandro F.; Banga, Julio R.

    2017-01-01

    Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equation) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). For this task, SELDOM’s ensemble prediction is not only consistently better than predictions

  15. Contrasting analytical and data-driven frameworks for radiogenomic modeling of normal tissue toxicities in prostate cancer.

    PubMed

    Coates, James; Jeyaseelan, Asha K; Ybarra, Norma; David, Marc; Faria, Sergio; Souhami, Luis; Cury, Fabio; Duclos, Marie; El Naqa, Issam

    2015-04-01

    We explore analytical and data-driven approaches to investigate the integration of genetic variations (single nucleotide polymorphisms [SNPs] and copy number variations [CNVs]) with dosimetric and clinical variables in modeling radiation-induced rectal bleeding (RB) and erectile dysfunction (ED) in prostate cancer patients. Sixty-two patients who underwent curative hypofractionated radiotherapy (66 Gy in 22 fractions) between 2002 and 2010 were retrospectively genotyped for CNV and SNP rs5489 in the xrcc1 DNA repair gene. Fifty-four patients had full dosimetric profiles. Two parallel modeling approaches were compared to assess the risk of severe RB (Grade ⩾3) and ED (Grade ⩾1): maximum-likelihood-estimated generalized Lyman-Kutcher-Burman (LKB) modeling and logistic regression. Statistical resampling based on cross-validation was used to evaluate model predictive power and generalizability to unseen data. Integration of the biological variables xrcc1 CNV and SNP improved the fit of the RB and ED analytical and data-driven models. Cross-validation of the generalized LKB models yielded increases in classification performance of 27.4% for RB and 14.6% for ED when xrcc1 CNV and SNP were included, respectively. Biological variables added to logistic regression modeling improved classification performance over standard dosimetric models by 33.5% for RB and 21.2% for ED models. As a proof-of-concept, we demonstrated that the combination of genetic and dosimetric variables can provide significant improvement in NTCP prediction using analytical and data-driven approaches. The improvement in prediction performance was more pronounced in the data-driven approaches. Moreover, we have shown that CNVs, in addition to SNPs, may be useful structural genetic variants in predicting radiation toxicities. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
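
    For readers unfamiliar with the LKB formalism referenced above, the standard formulation summarizes a dose-volume histogram into a generalized equivalent uniform dose (gEUD) and passes it through a probit link. The parameter and dose values in this sketch are illustrative only, not the fitted values from the study.

```python
import math

# Hedged sketch of the standard Lyman-Kutcher-Burman (LKB) NTCP formulation:
# gEUD = (sum_i v_i * D_i^(1/n))^n, then NTCP = Phi((gEUD - TD50) / (m * TD50)).
# All parameter values below are invented for illustration.

def geud(dose_bins, volume_fractions, n):
    return sum(v * d ** (1.0 / n) for d, v in zip(dose_bins, volume_fractions)) ** n

def ntcp_lkb(eud, td50, m):
    t = (eud - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))  # standard normal CDF

doses = [20.0, 40.0, 60.0]   # Gy per DVH bin (toy histogram)
vols = [0.5, 0.3, 0.2]       # fractional volumes receiving each dose
eud = geud(doses, vols, n=0.1)       # small n: gEUD tracks the hottest bins
p = ntcp_lkb(eud, td50=80.0, m=0.15)
print(0.0 < p < 1.0)
```

    The study's extension adds genetic variables (xrcc1 CNV and SNP status) alongside the dosimetric summary when estimating these parameters by maximum likelihood.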

  16. Exploring Techniques of Developing Writing Skill in IELTS Preparatory Courses: A Data-Driven Study

    ERIC Educational Resources Information Center

    Ostovar-Namaghi, Seyyed Ali; Safaee, Seyyed Esmail

    2017-01-01

    Being driven by the hypothetico-deductive mode of inquiry, previous studies have tested the effectiveness of theory-driven interventions under controlled experimental conditions to come up with universally applicable generalizations. To make a case in the opposite direction, this data-driven study aims at uncovering techniques and strategies…

  17. Personalized Mortality Prediction Driven by Electronic Medical Data and a Patient Similarity Metric

    PubMed Central

    Lee, Joon; Maslove, David M.; Dubin, Joel A.

    2015-01-01

    Background Clinical outcome prediction normally employs static, one-size-fits-all models that perform well for the average patient but are sub-optimal for individual patients with unique characteristics. In the era of digital healthcare, it is feasible to dynamically personalize decision support by identifying and analyzing similar past patients, in a way that is analogous to personalized product recommendation in e-commerce. Our objectives were: 1) to prove that analyzing only similar patients leads to better outcome prediction performance than analyzing all available patients, and 2) to characterize the trade-off between training data size and the degree of similarity between the training data and the index patient for whom prediction is to be made. Methods and Findings We deployed a cosine-similarity-based patient similarity metric (PSM) to an intensive care unit (ICU) database to identify patients that are most similar to each patient and subsequently to custom-build 30-day mortality prediction models. Rich clinical and administrative data from the first day in the ICU from 17,152 adult ICU admissions were analyzed. The results confirmed that using data from only a small subset of most similar patients for training improves predictive performance in comparison with using data from all available patients. The results also showed that when too few similar patients are used for training, predictive performance degrades due to the effects of small sample sizes. Our PSM-based approach outperformed well-known ICU severity of illness scores. Although the improved prediction performance is achieved at the cost of increased computational burden, Big Data technologies can help realize personalized data-driven decision support at the point of care. Conclusions The present study provides crucial empirical evidence for the promising potential of personalized data-driven decision support systems. With the increasing adoption of electronic medical record (EMR) systems, our
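
    The core of the PSM approach can be sketched in a few lines: rank past patients by cosine similarity to the index patient's feature vector and train (here, simply average outcomes) on only the top-k. The feature vectors and labels below are invented toy records, not ICU data.

```python
import math

# Hedged sketch of a cosine-similarity patient similarity metric (PSM):
# keep only the k most similar past patients as training data for the
# index patient's prediction. Features: [age, comorbidity flag, heart rate].

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

cohort = [            # (feature vector, 30-day mortality label) -- toy records
    ([70, 1, 120], 1),
    ([68, 1, 118], 1),
    ([30, 0, 80], 0),
    ([32, 0, 82], 0),
    ([71, 1, 125], 1),
]

index_patient = [69, 1, 119]

k = 3
similar = sorted(cohort, key=lambda r: cosine(r[0], index_patient), reverse=True)[:k]
risk = sum(label for _, label in similar) / k   # "model" = mean outcome of neighbors
print(risk)  # → 1.0: the three most similar patients all died
```

    Choosing k is the trade-off the abstract describes: too few similar patients and the estimate is noisy, too many and dissimilar patients dilute it.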

  18. Asynchronous Data-Driven Classification of Weapon Systems

    DTIC Science & Technology

    2009-10-01

    Jin, Xin; Mukherjee, Kushal; Gupta, Shalabh; Ray, Asok; Phoha, Shashi; Damarla, Thyagaraju. Cited references in the record excerpt include: A. Ray, "Symbolic dynamic analysis of complex systems for anomaly detection," Signal Processing, vol. 84, no. 7, pp. 1115-1130, July 2004; and S. Gupta and A. Ray, "Symbolic dynamic filtering for data-driven pattern recognition," Pattern Recognition: Theory and Application.

  19. Architectural Strategies for Enabling Data-Driven Science at Scale

    NASA Astrophysics Data System (ADS)

    Crichton, D. J.; Law, E. S.; Doyle, R. J.; Little, M. M.

    2017-12-01

    The presentation will describe architectural strategies, including a 2015-2016 NASA AIST Study on Big Data, for evolving scientific research towards massively distributed data-driven discovery. It will include example use cases across earth science, planetary science, and other disciplines.

  20. Dynamically adaptive data-driven simulation of extreme hydrological flows

    NASA Astrophysics Data System (ADS)

    Kumar Jain, Pushkar; Mandli, Kyle; Hoteit, Ibrahim; Knio, Omar; Dawson, Clint

    2018-02-01

    Hydrological hazards such as storm surges, tsunamis, and rainfall-induced flooding are physically complex events that are costly in loss of human life and economic productivity. Many such disasters could be mitigated through improved emergency evacuation in real-time and through the development of resilient infrastructure based on knowledge of how systems respond to extreme events. Data-driven computational modeling is a critical technology underpinning these efforts. This investigation focuses on the novel combination of methodologies in forward simulation and data assimilation. The forward geophysical model utilizes adaptive mesh refinement (AMR), a process by which a computational mesh can adapt in time and space based on the current state of a simulation. The forward solution is combined with ensemble based data assimilation methods, whereby observations from an event are assimilated into the forward simulation to improve the veracity of the solution, or used to invert for uncertain physical parameters. The novelty in our approach is the tight two-way coupling of AMR and ensemble filtering techniques. The technology is tested using actual data from the Chile tsunami event of February 27, 2010. These advances offer the promise of significantly transforming data-driven, real-time modeling of hydrological hazards, with potentially broader applications in other science domains.
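
    The ensemble-filtering half of the coupling can be illustrated with a minimal scalar example using the standard ensemble Kalman gain K = P/(P + R); a single scalar state stands in for the full adaptively refined solution, and all numbers are synthetic.

```python
import random

# Hedged toy sketch of ensemble-based assimilation: nudge a forecast ensemble
# toward an observation using the ensemble Kalman gain K = P / (P + R).
# A scalar state stands in for the full AMR solution field.

random.seed(1)
truth = 2.0
obs_var = 0.1                                   # R: observation error variance
obs = truth + random.gauss(0, obs_var ** 0.5)

# forecast ensemble scattered around a biased prior mean of ~1.0
ensemble = [1.0 + random.gauss(0, 0.5) for _ in range(200)]

mean = sum(ensemble) / len(ensemble)
var = sum((x - mean) ** 2 for x in ensemble) / (len(ensemble) - 1)   # P
gain = var / (var + obs_var)                                          # K

# perturbed-observation EnKF update, member by member
analysis = [x + gain * (obs + random.gauss(0, obs_var ** 0.5) - x) for x in ensemble]
a_mean = sum(analysis) / len(analysis)

print(abs(a_mean - truth) < abs(mean - truth))  # analysis moves toward truth
```

    In the actual system the same pull toward observations is applied to the evolving AMR solution, and the updated state in turn influences where the mesh refines next.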

  1. Program objectives for the National Water Data Exchange (NAWDEX) for fiscal year 1979

    USGS Publications Warehouse

    Edwards, Melvin D.

    1978-01-01

    This report describes the program objectives of the National Water Data Exchange (NAWDEX) for Fiscal Year 1979. These objectives include NAWDEX membership, program administration, management, and coordination, NAWDEX services, identification of sources of water data, indexing of water data, programs and systems documentation, recommended methods for the handling and exchange of water data, training, and technical assistance to NAWDEX members. (Woodard-USGS)

  2. Identifying Data-Driven Instructional Systems. Research to Practice Brief

    ERIC Educational Resources Information Center

    Lawrence, K. S.

    2016-01-01

    The study summarized in this research to practice brief, "Creating data-driven instructional systems in school: The new instructional leadership," Halverson, R., Grigg, J., Pritchett, R., & Thomas, C. (2015), "Journal of School Leadership," 25. 447-481, investigated whether student outcome improvements were linked to the…

  3. Using Data-Driven Model-Brain Mappings to Constrain Formal Models of Cognition

    PubMed Central

    Borst, Jelmer P.; Nijboer, Menno; Taatgen, Niels A.; van Rijn, Hedderik; Anderson, John R.

    2015-01-01

    In this paper we propose a method to create data-driven mappings from components of cognitive models to brain regions. Cognitive models are notoriously hard to evaluate, especially based on behavioral measures alone. Neuroimaging data can provide additional constraints, but this requires a mapping from model components to brain regions. Although such mappings can be based on the experience of the modeler or on a reading of the literature, a formal method is preferred to prevent researcher-based biases. In this paper we used model-based fMRI analysis to create a data-driven model-brain mapping for five modules of the ACT-R cognitive architecture. We then validated this mapping by applying it to two new datasets with associated models. The new mapping was at least as powerful as an existing mapping that was based on the literature, and indicated where the models were supported by the data and where they have to be improved. We conclude that data-driven model-brain mappings can provide strong constraints on cognitive models, and that model-based fMRI is a suitable way to create such mappings. PMID:25747601
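
    A data-driven model-brain mapping of the kind described above can be caricatured by correlating each model module's predicted activity profile with each region's signal and assigning the module to its best-matching region. The profiles below are invented toy series; real model-based fMRI analysis additionally convolves module demand with a hemodynamic response function before fitting.

```python
# Hedged toy sketch of a data-driven model-brain mapping: correlate a model
# module's predicted activity with each candidate region's signal and map the
# module to the best-matching region. All signals are synthetic.

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

module_demand = [0, 1, 1, 0, 1, 0, 0, 1]   # when the model module is active
regions = {
    "regionA": [0.1, 0.9, 1.1, 0.2, 0.8, 0.1, 0.0, 1.0],  # tracks the module
    "regionB": [1.0, 0.1, 0.2, 0.9, 0.1, 1.1, 0.9, 0.2],  # anti-correlated
}

mapping = max(regions, key=lambda r: corr(module_demand, regions[r]))
print(mapping)  # → regionA: the module maps onto its best-matching region
```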

  4. Network Model-Assisted Inference from Respondent-Driven Sampling Data.

    PubMed

    Gile, Krista J; Handcock, Mark S

    2015-06-01

    Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population.
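
    As a point of comparison for the model-assisted estimator, a standard design-based baseline for RDS data weights each respondent inversely to their reported network degree, since well-connected people are more likely to be recruited over the social network. The degrees and outcomes below are invented toy data.

```python
# Hedged sketch of a standard inverse-degree-weighted baseline estimator for
# respondent-driven sampling: down-weight high-degree respondents, who are
# over-represented by link-tracing. Toy data, not a real RDS sample.

samples = [  # (reported network degree, binary outcome of interest)
    (10, 1), (2, 0), (5, 1), (20, 1), (4, 0), (2, 0),
]

num = sum(y / d for d, y in samples)
den = sum(1 / d for d, y in samples)
estimate = num / den                       # weighted prevalence estimate
naive = sum(y for _, y in samples) / len(samples)
print(estimate < naive)  # high-degree positives are down-weighted here
```

    In this toy sample the positives all have high degrees, so the weighted estimate (0.219) falls well below the naive sample mean (0.5), illustrating why unweighted summaries of RDS data can mislead.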

  5. Action-Driven Visual Object Tracking With Deep Reinforcement Learning.

    PubMed

    Yun, Sangdoo; Choi, Jongwon; Yoo, Youngjoon; Yun, Kimin; Choi, Jin Young

    2018-06-01

    In this paper, we propose an efficient visual tracker, which directly captures a bounding box containing the target object in a video by means of sequential actions learned using deep neural networks. The proposed deep neural network to control tracking actions is pretrained using various training video sequences and fine-tuned during actual tracking for online adaptation to a change of target and background. The pretraining is done by utilizing deep reinforcement learning (RL) as well as supervised learning. The use of RL enables even partially labeled data to be successfully utilized for semisupervised learning. Through the evaluation of the object tracking benchmark data set, the proposed tracker is validated to achieve a competitive performance at three times the speed of existing deep network-based trackers. The fast version of the proposed method, which operates in real time on graphics processing unit, outperforms the state-of-the-art real-time trackers with an accuracy improvement of more than 8%.

  6. Limited angle CT reconstruction by simultaneous spatial and Radon domain regularization based on TV and data-driven tight frame

    NASA Astrophysics Data System (ADS)

    Zhang, Wenkun; Zhang, Hanming; Wang, Linyuan; Cai, Ailong; Li, Lei; Yan, Bin

    2018-02-01

    Limited angle computed tomography (CT) reconstruction is widely performed in medical diagnosis and industrial testing because of the size of objects, engine/armor inspection requirements, and limited scan flexibility. Limited angle reconstruction necessitates the use of optimization-based methods that utilize additional sparse priors. However, most conventional methods exploit sparsity priors of the spatial domain only. When a CT projection suffers from serious data deficiency or various noises, obtaining reconstruction images that meet the quality requirement becomes difficult and challenging. To solve this problem, this paper develops an adaptive reconstruction method for the limited angle CT problem. The proposed method simultaneously uses a spatial and Radon domain regularization model based on total variation (TV) and a data-driven tight frame. The data-driven tight frame, derived from wavelet transforms, aims to exploit sparsity priors of the sinogram in the Radon domain. Unlike existing works that utilize a pre-constructed sparse transformation, the framelets of the data-driven regularization model can be adaptively learned from the latest projection data during iterative reconstruction to provide optimal sparse approximations for a given sinogram. At the same time, an effective alternating direction method is designed to solve the simultaneous spatial and Radon domain regularization model. Experiments on both simulated and real data demonstrate that the proposed algorithm performs better in artifact suppression and detail preservation than algorithms using only a spatial-domain regularization model. Quantitative evaluations of the results also indicate that the proposed algorithm, with its learning strategy, performs better than dual-domain algorithms without a learned regularization model

  7. Task-Driven Optimization of Fluence Field and Regularization for Model-Based Iterative Reconstruction in Computed Tomography.

    PubMed

    Gang, Grace J; Siewerdsen, Jeffrey H; Stayman, J Webster

    2017-12-01

    This paper presents a joint optimization of dynamic fluence field modulation (FFM) and regularization in quadratic penalized-likelihood reconstruction that maximizes a task-based imaging performance metric. We adopted a task-driven imaging framework for prospective designs of the imaging parameters. A maxi-min objective function was adopted to maximize the minimum detectability index throughout the image. The optimization algorithm alternates between FFM (represented by low-dimensional basis functions) and local regularization (including the regularization strength and directional penalty weights). The task-driven approach was compared with three FFM strategies commonly proposed for FBP reconstruction (as well as a task-driven TCM strategy) for a discrimination task in an abdomen phantom. The task-driven FFM assigned more fluence to less attenuating anteroposterior views and yielded approximately constant fluence behind the object. The optimal regularization was almost uniform throughout the image. Furthermore, the task-driven FFM strategy redistributed fluence across detector elements in order to prescribe more fluence to the more attenuating central region of the phantom. Compared with all other strategies, the task-driven FFM strategy not only improved the minimum detectability by at least 17.8% but also yielded higher detectability over a large area inside the object. The optimal FFM was highly dependent on the amount of regularization, indicating the importance of a joint optimization. Sample reconstructions of simulated data generally support the performance estimates based on the computed detectability. The improvements in detectability show the potential of the task-driven imaging framework to improve imaging performance at a fixed dose, or, equivalently, to provide a similar level of performance at reduced dose.

  8. Toward a more data-driven supervision of collegiate counseling centers.

    PubMed

    Varlotta, Lori E

    2012-01-01

    Hearing the national call for higher education accountability, the author of this tripartite article urges university administrators to move towards a more data-driven approach to counseling center supervision. Toward that end, the author first examines a key factor--perceived increase in student pathology--that appears to shape budget and staffing decisions in many university centers. Second, she reviews the emerging but conflicting research of clinician-scholars who are trying to empirically verify or refute that perception; their conflicting results suggest that no study alone should be used as the "final word" in evidence-based decision-making. Third, the author delineates the campus-specific data that should be gathered to guide staffing and budgeting decisions on each campus. She concludes by reminding readers that data-driven decisions can and should foster high-quality care that is concurrently efficient, effective, and in sync with the needs of a particular university and student body.

  9. Object recognition and pose estimation of planar objects from range data

    NASA Technical Reports Server (NTRS)

    Pendleton, Thomas W.; Chien, Chiun Hong; Littlefield, Mark L.; Magee, Michael

    1994-01-01

    The Extravehicular Activity Helper/Retriever (EVAHR) is a robotic device currently under development at the NASA Johnson Space Center that is designed to fetch objects or to assist in retrieving an astronaut who may have become inadvertently de-tethered. The EVAHR will be required to exhibit a high degree of intelligent autonomous operation and will base much of its reasoning upon information obtained from one or more three-dimensional sensors that it will carry and control. At the highest level of visual cognition and reasoning, the EVAHR will be required to detect objects, recognize them, and estimate their spatial orientation and location. The recognition phase and estimation of spatial pose will depend on the ability of the vision system to reliably extract geometric features of the objects such as whether the surface topologies observed are planar or curved and the spatial relationships between the component surfaces. In order to achieve these tasks, three-dimensional sensing of the operational environment and objects in the environment will therefore be essential. One of the sensors being considered to provide image data for object recognition and pose estimation is a phase-shift laser scanner. The characteristics of the data provided by this scanner have been studied and algorithms have been developed for segmenting range images into planar surfaces, extracting basic features such as surface area, and recognizing the object based on the characteristics of extracted features. Also, an approach has been developed for estimating the spatial orientation and location of the recognized object based on orientations of extracted planes and their intersection points. 
This paper presents some of the algorithms that have been developed for the purpose of recognizing and estimating the pose of objects as viewed by the laser scanner, and characterizes the desirability and utility of these algorithms within the context of the scanner itself, considering data quality and

  10. Electrically Driven Liquid Film Boiling Experiment

    NASA Technical Reports Server (NTRS)

    Didion, Jeffrey R.

    2016-01-01

    This presentation covers the science background and ground-based results that form the basis of the Electrically Driven Liquid Film Boiling Experiment, an ISS experiment that is manifested for 2021. Objective: Characterize the effects of gravity on the interaction of electric and flow fields in the presence of phase change, specifically pertaining to: a) the effects of microgravity on the electrically generated two-phase flow; b) the effects of microgravity on electrically driven liquid film boiling (including extreme heat fluxes). Electro-wetting of the boiling section will repel the bubbles away from the heated surface in a microgravity environment. Relevance/Impact: Provides a phenomenological foundation for the development of electric-field-based two-phase thermal management systems leveraging EHD, permitting optimization of heat transfer surface-area-to-volume ratios as well as achievement of high heat transfer coefficients, thus resulting in system mass and volume savings. EHD replaces buoyancy- or flow-driven bubble removal from the heated surface. Development Approach: Conduct preliminary experiments in low-gravity and ground-based facilities to refine the technique and obtain preliminary data for model development. The ISS environment is required to characterize the electro-wetting effect on nucleate boiling and CHF in the absence of gravity. The experiment will operate in the FIR and is designed for autonomous operation.

  11. Data-driven Climate Modeling and Prediction

    NASA Astrophysics Data System (ADS)

    Kondrashov, D. A.; Chekroun, M.

    2016-12-01

    Global climate models aim to simulate a broad range of spatio-temporal scales of climate variability with a state vector having many millions of degrees of freedom. On the other hand, while detailed weather prediction out to a few days requires high numerical resolution, it is fairly clear that a major fraction of large-scale climate variability can be predicted in a much lower-dimensional phase space. Low-dimensional models can simulate and predict this fraction of climate variability, provided they are able to account for linear and nonlinear interactions between the modes representing large scales of climate dynamics, as well as their interactions with a much larger number of modes representing fast and small scales. This presentation will highlight several new applications of the Multilayered Stochastic Modeling (MSM) framework [Kondrashov, Chekroun and Ghil, 2015], which has abundantly proven its efficiency in the modeling and real-time forecasting of various climate phenomena. MSM is a data-driven inverse modeling technique that aims to obtain a low-order nonlinear system of prognostic equations driven by stochastic forcing, and estimates both the dynamical operator and the properties of the driving noise from multivariate time series of observations or a high-end model's simulation. MSM leads to a system of stochastic differential equations (SDEs) involving hidden (auxiliary) variables of fast-small scales ranked by layers, which interact with the macroscopic (observed) variables of large-slow scales to model the dynamics of the latter, and thus convey memory effects. New MSM climate applications focus on the development of computationally efficient low-order models by using data-adaptive decomposition methods that convey memory effects by time-embedding techniques, such as Multichannel Singular Spectrum Analysis (M-SSA) [Ghil et al. 2002] and the recently developed Data-Adaptive Harmonic (DAH) decomposition method [Chekroun and Kondrashov, 2016]. 
In particular, new results

  12. NASA Reverb: Standards-Driven Earth Science Data and Service Discovery

    NASA Astrophysics Data System (ADS)

    Cechini, M. F.; Mitchell, A.; Pilone, D.

    2011-12-01

    NASA's Earth Observing System Data and Information System (EOSDIS) is a core capability in NASA's Earth Science Data Systems Program. NASA's EOS ClearingHOuse (ECHO) is a metadata catalog for the EOSDIS, providing a centralized catalog of data products and a registry of related data services. Working closely with the EOSDIS community, the ECHO team identified a need to develop the next-generation EOS data and service discovery tool. This development effort relied on the following principles: + Metadata-Driven User Interface - Users should be presented with data and service discovery capabilities based on dynamic processing of metadata describing the targeted data. + Integrated Data & Service Discovery - Users should be able to discover data and associated data services that facilitate their research objectives. + Leverage Common Standards - Users should be able to discover and invoke services that utilize common interface standards. Metadata plays a vital role in facilitating data discovery and access. As data providers enhance their metadata, more advanced search capabilities become available, enriching a user's search experience. Maturing metadata formats such as ISO 19115 provide the necessary depth of metadata that facilitates advanced data discovery capabilities. Data discovery and access is not limited simply to the retrieval of data granules, but is growing into the more complex discovery of data services. These services include, but are not limited to, services facilitating additional data discovery, subsetting, reformatting, and re-projecting. The discovery and invocation of these data services is made significantly simpler through the use of consistent and interoperable standards. By adopting a common standard, a single standard-specific adapter can be used to communicate with multiple services implementing that protocol. The emergence of metadata standards such as ISO 19119 plays a similarly important role in discovery as the 19115 standard

  13. Microwave Driven Actuators Power Allocation and Distribution

    NASA Technical Reports Server (NTRS)

    Forbes, Timothy; Song, Kyo D.

    2000-01-01

    Design, fabrication, and test of a power allocation and distribution (PAD) network for microwave-driven actuators are presented in this paper. A circuit that collects power from a rectenna array, amplifies it, and distributes it to the actuators was designed and fabricated for space application in an actuator array driven by microwaves. A P-SPICE model was constructed initially for data reduction purposes and was followed by a working real-world model. A voltage up-converter (VUC) is used to amplify the voltage from each individual rectenna. Testing yielded a 26:1 voltage amplification ratio, with an input voltage of 9 volts and a measured output voltage of 230 VDC. Future work includes the miniaturization of the circuitry, the use of microwave remote control, and voltage amplification technology for each voltage source. The objective of this work is to develop a model system that will collect DC voltage from an array of rectennas and propagate the voltage to an array of actuators.

  14. Data-Driven Leadership: Determining Your Indicators and Building Your Dashboards

    ERIC Educational Resources Information Center

    Copeland, Mo

    2016-01-01

    For years, schools have tended to approach budgets with some basic assumptions and aspirations and general wish lists but with scant data to drive the budget conversation. Suppose there were a better way? What if the conversation started with a review of the last five to ten years of data on three key mission- and strategy-driven indicators:…

  15. 3-D Object Recognition from Point Cloud Data

    NASA Astrophysics Data System (ADS)

    Smith, W.; Walker, A. S.; Zhang, B.

    2011-09-01

    The market for real-time 3-D mapping includes not only traditional geospatial applications but also navigation of unmanned autonomous vehicles (UAVs). Massively parallel processes such as graphics processing unit (GPU) computing make real-time 3-D object recognition and mapping achievable. Geospatial technologies such as digital photogrammetry and GIS offer advanced capabilities to produce 2-D and 3-D static maps using UAV data. The goal is to develop real-time UAV navigation through increased automation. It is challenging for a computer to identify a 3-D object such as a car, a tree or a house, yet automatic 3-D object recognition is essential to increasing the productivity of geospatial data such as 3-D city site models. In the past three decades, researchers have used radiometric properties to identify objects in digital imagery with limited success, because these properties vary considerably from image to image. Consequently, our team has developed software that recognizes certain types of 3-D objects within 3-D point clouds. Although our software is developed for modeling, simulation and visualization, it has the potential to be valuable in robotics and UAV applications. The locations and shapes of 3-D objects such as buildings and trees are easily recognizable by a human from a brief glance at a representation of a point cloud such as terrain-shaded relief. The algorithms to extract these objects have been developed and require only the point cloud and minimal human inputs such as a set of limits on building size and a request to turn on a squaring option. The algorithms use both digital surface model (DSM) and digital elevation model (DEM), so software has also been developed to derive the latter from the former. The process continues through the following steps: identify and group 3-D object points into regions; separate buildings and houses from trees; trace region boundaries; regularize and simplify boundary polygons; construct complex roofs. 
Several case
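    The DSM/DEM step described above can be sketched directly: subtracting bare-earth elevations (DEM) from surface elevations (DSM) leaves a normalized height at each grid cell, and thresholding that height flags candidate object points such as buildings and trees. The grids, threshold, and function name below are illustrative assumptions, not the authors' implementation.

    ```python
    def object_candidates(dsm, dem, min_height=2.0):
        """Flag grid cells whose surface rises at least min_height above
        bare earth (normalized DSM = DSM - DEM), a common first step in
        separating 3-D objects such as buildings and trees from terrain."""
        return [
            [(s - e) >= min_height for s, e in zip(dsm_row, dem_row)]
            for dsm_row, dem_row in zip(dsm, dem)
        ]

    # Toy 2x4 grids (metres): one building-like block rises ~5 m above ground.
    dsm = [[101.0, 105.2, 105.1, 100.8],
           [100.9, 105.3, 105.0, 100.7]]
    dem = [[100.7, 100.2, 100.1, 100.5],
           [100.6, 100.3, 100.0, 100.4]]
    print(object_candidates(dsm, dem))
    # [[False, True, True, False], [False, True, True, False]]
    ```

    The flagged cells would then feed the grouping and boundary-tracing steps listed in the abstract.
    
    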

  16. A Novel Data-Driven Approach to Preoperative Mapping of Functional Cortex Using Resting-State Functional Magnetic Resonance Imaging

    PubMed Central

    Mitchell, Timothy J.; Hacker, Carl D.; Breshears, Jonathan D.; Szrama, Nick P.; Sharma, Mohit; Bundy, David T.; Pahwa, Mrinal; Corbetta, Maurizio; Snyder, Abraham Z.; Shimony, Joshua S.

    2013-01-01

    BACKGROUND: Recent findings associated with resting-state cortical networks have provided insight into the brain's organizational structure. In addition to their neuroscientific implications, the networks identified by resting-state functional magnetic resonance imaging (rs-fMRI) may prove useful for clinical brain mapping. OBJECTIVE: To demonstrate that a data-driven approach to analyze resting-state networks (RSNs) is useful in identifying regions classically understood to be eloquent cortex as well as other functional networks. METHODS: This study included 6 patients undergoing surgical treatment for intractable epilepsy and 7 patients undergoing tumor resection. rs-fMRI data were obtained before surgery and 7 canonical RSNs were identified by an artificial neural network algorithm. Of these 7, the motor and language networks were then compared with electrocortical stimulation (ECS) as the gold standard in the epilepsy patients. The sensitivity and specificity for identifying these eloquent sites were calculated at varying thresholds, which yielded receiver-operating characteristic (ROC) curves and their associated area under the curve (AUC). RSNs were plotted in the tumor patients to observe RSN distortions in altered anatomy. RESULTS: The algorithm robustly identified all networks in all patients, including those with distorted anatomy. When all ECS-positive sites were considered for motor and language, rs-fMRI had AUCs of 0.80 and 0.64, respectively. When the ECS-positive sites were analyzed pairwise, rs-fMRI had AUCs of 0.89 and 0.76 for motor and language, respectively. CONCLUSION: A data-driven approach to rs-fMRI may be a new and efficient method for preoperative localization of numerous functional brain regions. ABBREVIATIONS: AUC, area under the curve; BA, Brodmann area; BOLD, blood oxygen level dependent; ECS, electrocortical stimulation; fMRI, functional magnetic resonance imaging; ICA, independent component analysis; MLP, multilayer perceptron; MP
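    The AUC figures quoted above come from sweeping a threshold over the rs-fMRI network maps and tracing an ROC curve against the ECS gold standard. The final step is a trapezoidal integration over the (FPR, TPR) operating points; the function and data below are an illustrative sketch, not the study's pipeline.

    ```python
    def roc_auc(points):
        """Trapezoidal area under an ROC curve given (FPR, TPR) points."""
        pts = sorted(points)
        return sum((x1 - x0) * (y0 + y1) / 2.0
                   for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

    # Hypothetical operating points from varying the network-map threshold;
    # (0, 0) and (1, 1) anchor the curve.
    points = [(0.0, 0.0), (0.1, 0.55), (0.3, 0.80), (0.6, 0.95), (1.0, 1.0)]
    print(round(roc_auc(points), 3))  # prints 0.815
    ```

    A curve hugging the top-left corner pushes the AUC toward 1.0; chance performance gives 0.5.
    
    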

  17. The search for structure - Object classification in large data sets. [for astronomers

    NASA Technical Reports Server (NTRS)

    Kurtz, Michael J.

    1988-01-01

    Research concerning object classification schemes is reviewed, focusing on large data sets. Classification techniques are discussed, including syntactic and decision-theoretic methods, fuzzy techniques, and stochastic and fuzzy grammars. Consideration is given to the automation of MK classification (Morgan and Keenan, 1973) and other problems associated with the classification of spectra. In addition, the classification of galaxies is examined, including the problems of systematic errors, blended objects, galaxy types, and galaxy clusters.

  18. Reproducibility of data-driven dietary patterns in two groups of adult Spanish women from different studies.

    PubMed

    Castelló, Adela; Lope, Virginia; Vioque, Jesús; Santamariña, Carmen; Pedraz-Pingarrón, Carmen; Abad, Soledad; Ederra, Maria; Salas-Trejo, Dolores; Vidal, Carmen; Sánchez-Contador, Carmen; Aragonés, Nuria; Pérez-Gómez, Beatriz; Pollán, Marina

    2016-08-01

    The objective of the present study was to assess the reproducibility of data-driven dietary patterns in different samples extracted from similar populations. Dietary patterns were extracted by applying principal component analyses to the dietary information collected from a sample of 3550 women recruited from seven screening centres belonging to the Spanish breast cancer (BC) screening network (Determinants of Mammographic Density in Spain (DDM-Spain) study). The resulting patterns were compared with three dietary patterns obtained from a previous Spanish case-control study on female BC (Epidemiological study of the Spanish group for breast cancer research (GEICAM: grupo Español de investigación en cáncer de mama)) using the dietary intake data of 973 healthy participants. The level of agreement between patterns was determined using both the congruence coefficient (CC) between the pattern loadings (considering patterns with a CC≥0·85 as fairly similar) and the linear correlation between patterns scores (considering as fairly similar those patterns with a statistically significant correlation). The conclusions reached with both methods were compared. This is the first study exploring the reproducibility of data-driven patterns from two studies and the first using the CC to determine pattern similarity. We were able to reproduce the EpiGEICAM Western pattern in the DDM-Spain sample (CC=0·90). However, the reproducibility of the Prudent (CC=0·76) and Mediterranean (CC=0·77) patterns was not as good. The linear correlation between pattern scores was statistically significant in all cases, highlighting its arbitrariness for determining pattern similarity. We conclude that the reproducibility of widely prevalent dietary patterns is better than the reproducibility of more population-specific patterns. More methodological studies are needed to establish an objective measurement and threshold to determine pattern similarity.
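    The congruence coefficient used above to compare pattern loadings is simply the cosine between the two loading vectors; because the loadings are not mean-centred (unlike Pearson correlation), the two similarity criteria can disagree, as the abstract notes. A minimal sketch with invented loadings:

    ```python
    from math import sqrt

    def congruence_coefficient(a, b):
        """Tucker's congruence coefficient between two factor-loading
        vectors: their dot product divided by the product of their norms.
        No mean-centring, so it is sensitive to sign and absolute level."""
        num = sum(x * y for x, y in zip(a, b))
        den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
        return num / den

    # Invented loadings of a "Western"-style pattern in two samples.
    sample_1 = [0.61, 0.55, -0.20, 0.48, 0.07]
    sample_2 = [0.58, 0.60, -0.15, 0.42, 0.10]
    cc = congruence_coefficient(sample_1, sample_2)
    print(round(cc, 2))  # close to 1, above the 0.85 "fairly similar" cutoff
    ```

    Identical loading vectors give CC = 1; the 0.85 threshold is the similarity criterion used in the study.
    
    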

  19. MASCOT HTML and XML parser: an implementation of a novel object model for protein identification data.

    PubMed

    Yang, Chunguang G; Granite, Stephen J; Van Eyk, Jennifer E; Winslow, Raimond L

    2006-11-01

    Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.

  20. Data-driven optimal binning for respiratory motion management in PET.

    PubMed

    Kesner, Adam L; Meier, Joseph G; Burckhardt, Darrell D; Schwartz, Jazmin; Lynch, David A

    2018-01-01

    Respiratory gating has been used in PET imaging to reduce the amount of image blurring caused by patient motion. Optimal binning is an approach for using the motion-characterized data by binning it into a single, easy to understand/use, optimal bin. To date, optimal binning protocols have utilized externally driven motion characterization strategies that have been tuned with population-derived assumptions and parameters. In this work, we are proposing a new strategy with which to characterize motion directly from a patient's gated scan, and use that signal to create a patient/instance-specific optimal bin image. Two hundred and nineteen phase-gated FDG PET scans, acquired using data-driven gating as described previously, were used as the input for this study. For each scan, a phase-amplitude motion characterization was generated and normalized using principal component analysis. A patient-specific "optimal bin" window was derived using this characterization, via methods that mirror traditional optimal window binning strategies. The resulting optimal bin images were validated by correlating quantitative and qualitative measurements in the population of PET scans. In 53% (n = 115) of the image population, the optimal bin was determined to include 100% of the image statistics. In the remaining images, the optimal binning windows averaged 60% of the statistics and ranged between 20% and 90%. Tuning the algorithm, through a single acceptance window parameter, allowed for adjustments of the algorithm's performance in the population toward conservation of motion or reduced noise, enabling users to incorporate their definition of optimal. In the population of images that were deemed appropriate for segregation, average lesion SUV max were 7.9, 8.5, and 9.0 for nongated images, optimal bin, and gated images, respectively. The Pearson correlation of FWHM measurements between optimal bin images and gated images was better than with nongated images, 0.89 and 0
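    One plausible reading of the patient-specific optimal-bin selection is a search over contiguous amplitude windows: retain the largest share of counts (image statistics) whose amplitude span stays inside an acceptance window. The sketch below is a hedged illustration of that idea under those assumptions, not the authors' algorithm; the histogram and acceptance parameter are invented.

    ```python
    def optimal_bin_window(counts, amplitudes, acceptance):
        """Return the contiguous run of amplitude bins maximizing retained
        counts, subject to the bins spanning at most `acceptance` in
        amplitude (i.e., bounded residual motion in the final image)."""
        best_span, best_counts = (0, 0), -1
        for i in range(len(counts)):
            for j in range(i, len(counts)):
                if amplitudes[j] - amplitudes[i] > acceptance:
                    break
                kept = sum(counts[i:j + 1])
                if kept > best_counts:
                    best_span, best_counts = (i, j), kept
        return best_span, best_counts

    # Invented respiratory histogram: counts per amplitude bin (bin centres in mm).
    counts = [5, 20, 30, 25, 10]
    amplitudes = [0.0, 2.0, 4.0, 6.0, 8.0]
    span, kept = optimal_bin_window(counts, amplitudes, acceptance=4.0)
    print(span, kept)  # bins 1-3 retain 75 of 90 counts (~83% of statistics)
    ```

    Widening the acceptance window keeps more statistics (less noise) at the cost of more residual motion, which mirrors the tuning trade-off the abstract describes.
    
    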

  1. Network Model-Assisted Inference from Respondent-Driven Sampling Data

    PubMed Central

    Gile, Krista J.; Handcock, Mark S.

    2015-01-01

    Summary: Respondent-Driven Sampling is a widely used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population. PMID:26640328

  2. A data-driven prediction method for fast-slow systems

    NASA Astrophysics Data System (ADS)

    Groth, Andreas; Chekroun, Mickael; Kondrashov, Dmitri; Ghil, Michael

    2016-04-01

    In this work, we present a prediction method for processes that exhibit a mixture of variability on slow and fast scales. The method relies on combining empirical model reduction (EMR) with singular spectrum analysis (SSA). EMR is a data-driven methodology for constructing stochastic low-dimensional models that account for nonlinearity and serial correlation in the estimated noise, while SSA provides a decomposition of the complex dynamics into low-order components that capture spatio-temporal behavior on different time scales. Our study focuses on the data-driven modeling of partial observations from dynamical systems that exhibit power spectra with broad peaks. The main result in this talk is that the combination of SSA pre-filtering with EMR modeling improves, under certain circumstances, the modeling and prediction skill of such a system, as compared to a standard EMR prediction based on raw data. Specifically, it is the separation into "fast" and "slow" temporal scales by the SSA pre-filtering that achieves the improvement. We show, in particular, that the resulting EMR-SSA emulators help predict intermittent behavior such as rapid transitions between specific regions of the system's phase space. This capability of the EMR-SSA prediction will be demonstrated on two low-dimensional models: the Rössler system and a Lotka-Volterra model for interspecies competition. In either case, the chaotic dynamics is produced through a Shilnikov-type mechanism and we argue that the latter seems to be an important ingredient for the good prediction skills of EMR-SSA emulators. Shilnikov-type behavior has been shown to arise in various complex geophysical fluid models, such as baroclinic quasi-geostrophic flows in the mid-latitude atmosphere and wind-driven double-gyre ocean circulation models. This pervasiveness of the Shilnikov mechanism of fast-slow transition opens interesting perspectives for the extension of the proposed EMR-SSA approach to more realistic situations.
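    SSA pre-filtering starts from a delay embedding of the scalar series into a trajectory (Hankel) matrix; an SVD of that matrix then yields the components from which the "slow" and "fast" scales are separated before EMR modeling. The embedding step itself is easy to sketch (the window length and series below are illustrative):

    ```python
    def trajectory_matrix(series, window):
        """Delay-embed a scalar time series into SSA's trajectory (Hankel)
        matrix: row k holds series[k : k + window]. The SVD of this matrix
        gives the empirical components used for SSA pre-filtering."""
        n_rows = len(series) - window + 1
        return [series[k:k + window] for k in range(n_rows)]

    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(trajectory_matrix(x, 3))
    # [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    ```

    The window length sets the longest period SSA can resolve, so it is chosen to straddle the slow/fast scale separation of interest.
    
    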

  3. Effects of a Data-Driven District-Level Reform Model

    ERIC Educational Resources Information Center

    Slavin, Robert E.; Holmes, GwenCarol; Madden, Nancy A.; Chamberlain, Anne; Cheung, Alan

    2010-01-01

    Despite a quarter-century of reform, US schools serving students in poverty continue to lag far behind other schools. There are proven programs, but these are not widely used. This large-scale experiment evaluated a district-level reform model created by the Center for Data-Driven Reform in Education (CDDRE). The CDDRE model provided consultation…

  4. Data-driven Inference and Investigation of Thermosphere Dynamics and Variations

    NASA Astrophysics Data System (ADS)

    Mehta, P. M.; Linares, R.

    2017-12-01

    This paper presents a methodology for data-driven inference and investigation of thermosphere dynamics and variations. The approach uses data-driven modal analysis to extract the most energetic modes of variation for neutral thermospheric species using proper orthogonal decomposition, where the time-independent modes or basis represent the dynamics and the time-dependent coefficients or amplitudes represent the model parameters. The data-driven modal analysis approach combined with sparse, discrete observations is used to infer amplitudes for the dynamic modes and to calibrate the energy content of the system. In this work, two different data types, namely the number density measurements from TIMED/GUVI and the mass density measurements from CHAMP/GRACE, are simultaneously ingested for an accurate and self-consistent specification of the thermosphere. The assimilation process is achieved with a non-linear least squares solver and allows estimation/tuning of the model parameters or amplitudes rather than the driver. In this work, we use the Naval Research Lab's MSIS model to derive the most energetic modes for six different species: He, O, N2, O2, H, and N. We examine the dominant drivers of variations for helium in MSIS and observe that seasonal latitudinal variation accounts for about 80% of the dynamic energy, with a strong preference of helium for the winter hemisphere. We also observe enhanced helium presence near the poles at GRACE altitudes during periods of low solar activity (Feb 2007), as previously deduced. We will also examine the storm-time response of helium derived from observations. The results are expected to be useful in tuning/calibration of the physics-based models.
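    The "estimate amplitudes, not drivers" step reduces to a linear least-squares fit: with the POD modes held fixed, sparse observations y are matched to Phi·a and the normal equations are solved for the amplitudes a. A two-mode sketch with synthetic modes and observations (everything below is illustrative, not the paper's solver):

    ```python
    def infer_amplitudes(mode1, mode2, obs):
        """Solve the 2x2 normal equations (Phi^T Phi) a = Phi^T y for the
        amplitudes of two fixed, time-independent POD modes, given
        observations sampled at the same grid points."""
        g11 = sum(p * p for p in mode1)
        g12 = sum(p * q for p, q in zip(mode1, mode2))
        g22 = sum(q * q for q in mode2)
        b1 = sum(p * y for p, y in zip(mode1, obs))
        b2 = sum(q * y for q, y in zip(mode2, obs))
        det = g11 * g22 - g12 * g12   # modes must be linearly independent
        return ((g22 * b1 - g12 * b2) / det,
                (g11 * b2 - g12 * b1) / det)

    # Synthetic check: observations built as 2.0*mode1 + 0.5*mode2 should
    # return amplitudes (2.0, 0.5).
    mode1 = [1.0, 0.0, 1.0, 0.0]
    mode2 = [0.0, 1.0, 0.0, 1.0]
    obs = [2.0 * p + 0.5 * q for p, q in zip(mode1, mode2)]
    print(infer_amplitudes(mode1, mode2, obs))  # (2.0, 0.5)
    ```

    In the paper's setting the observations are the GUVI/CHAMP/GRACE samples and the modes come from an MSIS-derived POD basis; the fit there is nonlinear, but the amplitude-estimation idea is the same.
    
    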

  5. Neural dynamics of object-based multifocal visual spatial attention and priming: Object cueing, useful-field-of-view, and crowding

    PubMed Central

    Foley, Nicholas C.; Grossberg, Stephen; Mingolla, Ennio

    2015-01-01

    How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how “attentional shrouds” are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia. A new explanation of

  6. Neural dynamics of object-based multifocal visual spatial attention and priming: object cueing, useful-field-of-view, and crowding.

    PubMed

    Foley, Nicholas C; Grossberg, Stephen; Mingolla, Ennio

    2012-08-01

    How are spatial and object attention coordinated to achieve rapid object learning and recognition during eye movement search? How do prefrontal priming and parietal spatial mechanisms interact to determine the reaction time costs of intra-object attention shifts, inter-object attention shifts, and shifts between visible objects and covertly cued locations? What factors underlie individual differences in the timing and frequency of such attentional shifts? How do transient and sustained spatial attentional mechanisms work and interact? How can volition, mediated via the basal ganglia, influence the span of spatial attention? A neural model is developed of how spatial attention in the where cortical stream coordinates view-invariant object category learning in the what cortical stream under free viewing conditions. The model simulates psychological data about the dynamics of covert attention priming and switching requiring multifocal attention without eye movements. The model predicts how "attentional shrouds" are formed when surface representations in cortical area V4 resonate with spatial attention in posterior parietal cortex (PPC) and prefrontal cortex (PFC), while shrouds compete among themselves for dominance. Winning shrouds support invariant object category learning, and active surface-shroud resonances support conscious surface perception and recognition. Attentive competition between multiple objects and cues simulates reaction-time data from the two-object cueing paradigm. The relative strength of sustained surface-driven and fast-transient motion-driven spatial attention controls individual differences in reaction time for invalid cues. Competition between surface-driven attentional shrouds controls individual differences in detection rate of peripheral targets in useful-field-of-view tasks. The model proposes how the strength of competition can be mediated, through learning or momentary changes in volition, by the basal ganglia. A new explanation of

  7. Teacher Perceptions and Use of Data-Driven Instruction: A Qualitative Study

    ERIC Educational Resources Information Center

    Melucci, Laura

    2013-01-01

    The purpose of this study was to determine how teacher perceptions of data and use of data-driven instruction affect student performance in English language arts (ELA). This study investigated teachers' perceptions of using data in the classroom and what supports they need to do so. The goal of the research was to increase the level of knowledge…

  8. A Middle School Principal's and Teachers' Perceptions of Leadership Practices in Data-Driven Decision Making

    ERIC Educational Resources Information Center

    Godreau Cimma, Kelly L.

    2011-01-01

    The purpose of this qualitative case study was to describe one Connecticut middle school's voluntary implementation of a data-driven decision making process in order to improve student academic performance. Data-driven decision making is a component of Connecticut's accountability system to assist schools in meeting the requirements of the No…

  9. A data-driven approach for quality assessment of radiologic interpretations.

    PubMed

    Hsu, William; Han, Simon X; Arnold, Corey W; Bui, Alex At; Enzmann, Dieter R

    2016-04-01

    Given the increasing emphasis on delivering high-quality, cost-efficient healthcare, improved methodologies are needed to measure the accuracy and utility of ordered diagnostic examinations in achieving the appropriate diagnosis. Here, we present a data-driven approach for performing automated quality assessment of radiologic interpretations using other clinical information (e.g., pathology) as a reference standard for individual radiologists, subspecialty sections, imaging modalities, and entire departments. Downstream diagnostic conclusions from the electronic medical record are utilized as "truth" to which upstream diagnoses generated by radiology are compared. The described system automatically extracts and compares patient medical data to characterize concordance between clinical sources. Initial results are presented in the context of breast imaging, matching 18,101 radiologic interpretations with 301 pathology diagnoses and achieving a precision and recall of 84% and 92%, respectively. The presented data-driven method highlights the challenges of integrating multiple data sources and the application of information extraction tools to facilitate healthcare quality improvement. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
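The precision and recall reported here follow the standard information-retrieval definitions. As a minimal illustration of the scoring (the counts below are hypothetical, not the study's data, and this is not the paper's extraction pipeline):

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Standard precision/recall from concordance counts.

    precision = TP / (TP + FP): fraction of radiology-pathology matches
    the system reports that are correct.
    recall    = TP / (TP + FN): fraction of true matches the system finds.
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts for illustration only:
p, r = precision_recall(true_positives=84, false_positives=16, false_negatives=8)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.84, recall=0.91
```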

  10. Data-Driven Decision Making in Practice: The NCAA Injury Surveillance System

    ERIC Educational Resources Information Center

    Klossner, David; Corlette, Jill; Agel, Julie; Marshall, Stephen W.

    2009-01-01

    Putting data-driven decision making into practice requires the use of consistent and reliable data that are easily accessible. The systematic collection and maintenance of accurate information is an important component in developing policy and evaluating outcomes. Since 1982, the National Collegiate Athletic Association (NCAA) has been collecting…

  11. Data driven CAN node reliability assessment for manufacturing system

    NASA Astrophysics Data System (ADS)

    Zhang, Leiming; Yuan, Yong; Lei, Yong

    2017-01-01

    The reliability of the Controller Area Network (CAN) is critical to the performance and safety of the system. However, direct bus-off time assessment tools are lacking in practice due to inaccessibility of the node information and the complexity of the node interactions upon errors. In order to measure the mean time to bus-off (MTTB) of all the nodes, a novel data-driven node bus-off time assessment method for CAN network is proposed by directly using network error information. First, the corresponding network error event sequence for each node is constructed using multiple-layer network error information. Then, the generalized zero inflated Poisson process (GZIP) model is established for each node based on the error event sequence. Finally, the stochastic model is constructed to predict the MTTB of the node. The accelerated case studies with different error injection rates are conducted on a laboratory network to demonstrate the proposed method, where the network errors are generated by a computer-controlled error injection system. Experiment results show that the MTTB of nodes predicted by the proposed method agree well with observations in the case studies. The proposed data-driven node time to bus-off assessment method for CAN networks can successfully predict the MTTB of nodes by directly using network error event data.
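The paper's GZIP model and its fitting procedure are not reproduced here; as a rough sketch of the underlying idea only, a plain (non-generalized) zero-inflated Poisson error process can be simulated to estimate a node's mean time to bus-off. The parameter names, the fixed error budget standing in for the CAN error-counter logic, and all numeric values are illustrative assumptions:

```python
import math
import random

def time_to_bus_off(p_zero, lam, error_budget, max_steps=100000, rng=None):
    """Simulate time steps until accumulated errors reach the bus-off budget.

    Per step, the error count is zero-inflated Poisson: with probability
    p_zero the step is error-free; otherwise the count is Poisson(lam)."""
    rng = rng or random.Random(0)
    errors = 0
    for t in range(1, max_steps + 1):
        if rng.random() >= p_zero:
            # Poisson draw via Knuth's multiplication method
            threshold, k, prod = math.exp(-lam), 0, 1.0
            while prod > threshold:
                k += 1
                prod *= rng.random()
            errors += max(k - 1, 0)
        if errors >= error_budget:
            return t
    return max_steps  # censored: budget never reached within the horizon

def mean_time_to_bus_off(n_runs=300, seed=42, **params):
    """Monte-Carlo MTTB estimate: average of independent simulated runs."""
    master = random.Random(seed)
    runs = [time_to_bus_off(rng=random.Random(master.random()), **params)
            for _ in range(n_runs)]
    return sum(runs) / len(runs)
```

As expected, raising the error-injection rate (lower `p_zero`) shortens the estimated MTTB, which mirrors the accelerated case studies described above.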

  12. A data-driven weighting scheme for multivariate phenotypic endpoints recapitulates zebrafish developmental cascades

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Guozhu, E-mail: gzhang6@ncsu.edu

    Zebrafish have become a key alternative model for studying health effects of environmental stressors, partly due to their genetic similarity to humans, fast generation time, and the efficiency of generating high-dimensional systematic data. Studies aiming to characterize adverse health effects in zebrafish typically include several phenotypic measurements (endpoints). While there is a solid biomedical basis for capturing a comprehensive set of endpoints, making summary judgments regarding health effects requires thoughtful integration across endpoints. Here, we introduce a Bayesian method to quantify the informativeness of 17 distinct zebrafish endpoints as a data-driven weighting scheme for a multi-endpoint summary measure, called weighted Aggregate Entropy (wAggE). We implement wAggE using high-throughput screening (HTS) data from zebrafish exposed to five concentrations of all 1060 ToxCast chemicals. Our results show that our empirical weighting scheme provides better performance in terms of the Receiver Operating Characteristic (ROC) curve for identifying significant morphological effects and improves robustness over traditional curve-fitting approaches. From a biological perspective, our results suggest that developmental cascade effects triggered by chemical exposure can be recapitulated by analyzing the relationships among endpoints. Thus, wAggE offers a powerful approach for analysis of multivariate phenotypes that can reveal underlying etiological processes. - Highlights: • Introduced a data-driven weighting scheme for multiple phenotypic endpoints. • Weighted Aggregate Entropy (wAggE) implies differential importance of endpoints. • Endpoint relationships reveal developmental cascade effects triggered by exposure. • wAggE is generalizable to multi-endpoint data of different shapes and scales.
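The paper's weighting is Bayesian and specific to its 17 endpoints; as a generic sketch of the entropy-driven idea only (the normalization choice below is an illustrative assumption, not the published scheme), per-endpoint Shannon entropies can serve as weights so that constant, uninformative endpoints contribute nothing to the aggregate:

```python
import math
from collections import Counter

def shannon_entropy(outcomes):
    """Shannon entropy (bits) of a list of discrete endpoint outcomes."""
    n = len(outcomes)
    return -sum((c / n) * math.log2(c / n) for c in Counter(outcomes).values())

def weighted_aggregate_entropy(endpoints):
    """endpoints: dict mapping endpoint name -> list of discrete outcomes
    (e.g. 0 = normal, 1 = affected) across exposed embryos.

    Each endpoint's weight is its entropy normalized across endpoints,
    so a constant (zero-entropy) endpoint receives zero weight."""
    entropies = {name: shannon_entropy(obs) for name, obs in endpoints.items()}
    total = sum(entropies.values())
    if total == 0:
        return 0.0, {name: 0.0 for name in endpoints}
    weights = {name: e / total for name, e in entropies.items()}
    aggregate = sum(weights[name] * entropies[name] for name in endpoints)
    return aggregate, weights
```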

  13. Data-Driven Healthcare: Challenges and Opportunities for Interactive Visualization.

    PubMed

    Gotz, David; Borland, David

    2016-01-01

    The healthcare industry's widespread digitization efforts are reshaping one of the largest sectors of the world's economy. This transformation is enabling systems that promise to use ever-improving data-driven evidence to help doctors make more precise diagnoses, institutions identify at risk patients for intervention, clinicians develop more personalized treatment plans, and researchers better understand medical outcomes within complex patient populations. Given the scale and complexity of the data required to achieve these goals, advanced data visualization tools have the potential to play a critical role. This article reviews a number of visualization challenges unique to the healthcare discipline.

  14. User-driven product data manager system design

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1995-03-01

    With the infusion of information technologies into product development and production processes, effective management of product data is becoming essential to modern production enterprises. When an enterprise-wide Product Data Manager (PDM) is implemented, PDM designers must satisfy the requirements of individual users with different job functions and requirements, as well as the requirements of the enterprise as a whole. Concern must also be shown for the interrelationships between information, methods for retrieving archival information and integration of the PDM into the product development process. This paper describes a user-driven approach applied to PDM design for an agile manufacturing pilot project at Sandia National Laboratories that has been successful in achieving a much faster design-to-production process for a precision electromechanical surety device.

  15. A Case Study of Applying Object-Relational Persistence in Astronomy Data Archiving

    NASA Astrophysics Data System (ADS)

    Yao, S. S.; Hiriart, R.; Barg, I.; Warner, P.; Gasson, D.

    2005-12-01

    The NOAO Science Archive (NSA) team is developing a comprehensive domain model to capture the science data in the archive. Java and an object model derived from the domain model well address the application layer of the archive system. However, since RDBMS is the best proven technology for data management, the challenge is the paradigm mismatch between the object and the relational models. Transparent object-relational mapping (ORM) persistence is a successful solution to this challenge. In the data modeling and persistence implementation of NSA, we are using Hibernate, a well-accepted ORM tool, to bridge the object model in the business tier and the relational model in the database tier. Thus, the database is isolated from the Java application. The application queries directly on objects using a DBMS-independent object-oriented query API, which frees the application developers from the low level JDBC and SQL so that they can focus on the domain logic. We present the detailed design of the NSA R3 (Release 3) data model and object-relational persistence, including mapping, retrieving and caching. Persistence layer optimization and performance tuning will be analyzed. The system is being built on J2EE, so the integration of Hibernate into the EJB container and the transaction management are also explored.

  16. Integrative Systems Biology for Data Driven Knowledge Discovery

    PubMed Central

    Greene, Casey S.; Troyanskaya, Olga G.

    2015-01-01

    Integrative systems biology is an approach that brings together diverse high throughput experiments and databases to gain new insights into biological processes or systems at molecular through physiological levels. These approaches rely on diverse high-throughput experimental techniques that generate heterogeneous data by assaying varying aspects of complex biological processes. Computational approaches are necessary to provide an integrative view of these experimental results and enable data-driven knowledge discovery. Hypotheses generated from these approaches can direct definitive molecular experiments in a cost effective manner. Using integrative systems biology approaches, we can leverage existing biological knowledge and large-scale data to improve our understanding of yet unknown components of a system of interest and how its malfunction leads to disease. PMID:21044756

  17. Evaluation of respondent-driven sampling.

    PubMed

    McCreesh, Nicky; Frost, Simon D W; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda N; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G

    2012-01-01

    Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data. Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion. Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling method, and caution is required
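The "RDS estimates" evaluated above come from statistical inference methods that reweight recruits by network degree. One widely used such method is the Volz-Heckathorn (RDS-II) estimator; a minimal sketch of it (not the exact software or estimator set used in the study):

```python
def rds_ii_estimate(outcomes, degrees):
    """Volz-Heckathorn (RDS-II) population-proportion estimate.

    outcomes: 0/1 indicator per recruit (e.g. HIV-positive);
    degrees:  each recruit's reported network size. Well-connected people
    are oversampled by link-tracing recruitment, so each recruit is
    weighted by the inverse of their degree."""
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")
    inv = [1.0 / d for d in degrees]
    return sum(y * w for y, w in zip(outcomes, inv)) / sum(inv)
```

With equal degrees the estimate reduces to the raw sample proportion; when members of one group report larger networks, the estimate shifts away from the raw proportion to compensate for their oversampling.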

  18. Evaluation of Respondent-Driven Sampling

    PubMed Central

    McCreesh, Nicky; Frost, Simon; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda Ndagire; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G

    2012-01-01

    Background Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex-workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total-population data. Methods Total-population data on age, tribe, religion, socioeconomic status, sexual activity and HIV status were available on a population of 2402 male household-heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, employing current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). Results We recruited 927 household-heads. Full and small RDS samples were largely representative of the total population, but both samples under-represented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven-sampling statistical-inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven-sampling bootstrap 95% confidence intervals included the population proportion. Conclusions Respondent-driven sampling produced a generally representative sample of this well-connected non-hidden population. However, current respondent-driven-sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. 
Respondent-driven sampling should be regarded as a (potentially superior) form of convenience

  19. Out of place, out of mind: Schema-driven false memory effects for object-location bindings.

    PubMed

    Lew, Adina R; Howe, Mark L

    2017-03-01

    Events consist of diverse elements, each processed in specialized neocortical networks, with temporal lobe memory systems binding these elements to form coherent event memories. We provide a novel theoretical analysis of an unexplored consequence of the independence of memory systems for elements and their bindings, 1 that raises the paradoxical prediction that schema-driven false memories can act solely on the binding of event elements despite the superior retrieval of individual elements. This is because if 2, or more, schema-relevant elements are bound together in unexpected conjunctions, the unexpected conjunction will increase attention during encoding to both the elements and their bindings, but only the bindings will receive competition with evoked schema-expected bindings. We test our model by examining memory for object-location bindings in recognition (Study 1) and recall (Studies 2 and 3) tasks. After studying schema-relevant objects in unexpected locations (e.g., pan on a stool in a kitchen scene), participants who then viewed these objects in expected locations (e.g., pan on stove) at test were more likely to falsely remember this object-location pairing as correct, compared with participants that viewed a different unexpected object-location pairing (e.g., pan on floor). In recall, participants were more likely to correctly remember individual schema-relevant objects originally viewed in unexpected, as opposed to expected locations, but were then more likely to misplace these items in the original room scene to expected places, relative to control schema-irrelevant objects. Our theoretical analysis and novel paradigm provide a tool for investigating memory distortions acting on binding processes. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  20. MOPED 2.5—An Integrated Multi-Omics Resource: Multi-Omics Profiling Expression Database Now Includes Transcriptomics Data

    PubMed Central

    Montague, Elizabeth; Stanberry, Larissa; Higdon, Roger; Janko, Imre; Lee, Elaine; Anderson, Nathaniel; Choiniere, John; Stewart, Elizabeth; Yandl, Gregory; Broomall, William; Kolker, Natali

    2014-01-01

    Abstract Multi-omics data-driven scientific discovery crucially rests on high-throughput technologies and data sharing. Currently, data are scattered across single omics repositories, stored in varying raw and processed formats, and are often accompanied by limited or no metadata. The Multi-Omics Profiling Expression Database (MOPED, http://moped.proteinspire.org) version 2.5 is a freely accessible multi-omics expression database. Continual improvement and expansion of MOPED is driven by feedback from the Life Sciences Community. In order to meet the emergent need for an integrated multi-omics data resource, MOPED 2.5 now includes gene relative expression data in addition to protein absolute and relative expression data from over 250 large-scale experiments. To facilitate accurate integration of experiments and increase reproducibility, MOPED provides extensive metadata through the Data-Enabled Life Sciences Alliance (DELSA Global, http://delsaglobal.org) metadata checklist. MOPED 2.5 has greatly increased the number of proteomics absolute and relative expression records to over 500,000, in addition to adding more than four million transcriptomics relative expression records. MOPED has an intuitive user interface with tabs for querying different types of omics expression data and new tools for data visualization. Summary information including expression data, pathway mappings, and direct connection between proteins and genes can be viewed on Protein and Gene Details pages. These connections in MOPED provide a context for multi-omics expression data exploration. Researchers are encouraged to submit omics data which will be consistently processed into expression summaries. MOPED as a multi-omics data resource is a pivotal public database, interdisciplinary knowledge resource, and platform for multi-omics understanding. PMID:24910945

  1. Revising the `Henry Problem' of density-driven groundwater flow: A review of historic Biscayne aquifer data.

    NASA Astrophysics Data System (ADS)

    Weyer, K. U.

    2016-12-01

    Coastal groundwater flow investigations at the Cutler site of the Biscayne Bay south of Miami, Florida, gave rise to the dominating concept of density-driven flow of sea water into coastal aquifers indicated as a saltwater wedge. Within that wedge convection type return flow of seawater and a dispersion zone were concluded by Cooper et al. (1964, USGS Water Supply Paper 1613-C) to be the cause of the Biscayne aquifer `sea water wedge'. This conclusion was merely based on the chloride distribution within the aquifer and on an analytical model concept assuming convection flow within a confined aquifer without taking non-chemical field data into consideration. This concept was later labelled the `Henry Problem', which any numerical variable density flow program has to be able to simulate to be considered acceptable. Revisiting the above summarizing publication with its record of piezometric field data (heads) showed that the so-called sea water wedge was actually caused by discharging deep saline groundwater driven by gravitational flow and not by denser sea water. Density-driven flow of seawater into the aquifer was not found reflected in the head measurements for low and high tide conditions which had been taken contemporaneously with the chloride measurements. These head measurements had not been included in the flow interpretation. The very same head measurements indicated a clear dividing line between shallow local fresh groundwater flow and saline deep groundwater flow without the existence of a dispersion zone or a convection cell. The Biscayne situation emphasizes the need for any chemical interpretation of flow pattern to be backed up by head data as energy indicators of flow fields. At the Biscayne site, density-driven flow of seawater did not and does not exist. Instead this site and the Florida coast line in general are the end points of local fresh and regional saline groundwater flow systems driven by gravity forces and not by density differences.
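Using heads as energy indicators across waters of different salinity requires converting measured point-water heads to equivalent freshwater heads before comparing them. The standard correction (Lusczynski's formula, a textbook relation rather than anything stated in this abstract) is h_f = z + (rho / rho_f)(h_p - z):

```python
def freshwater_head(point_water_head, elevation, density, rho_fresh=1000.0):
    """Equivalent freshwater head from a point-water head measurement.

    point_water_head: head h_p measured in a piezometer containing water
                      of the given density [m];
    elevation:        screen elevation z of the piezometer [m];
    density:          in-situ water density rho [kg/m^3]."""
    return elevation + (density / rho_fresh) * (point_water_head - elevation)
```

For fresh water the correction is an identity; for seawater (roughly 1025 kg/m^3) the same measured head converts to a higher freshwater head, which is why uncorrected head comparisons across a salinity interface can misstate the flow direction.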

  2. The Role of Community-Driven Data Curation for Enterprises

    NASA Astrophysics Data System (ADS)

    Curry, Edward; Freitas, Andre; O'Riáin, Sean

    With increased utilization of data within their operational and strategic processes, enterprises need to ensure data quality and accuracy. Data curation is a process that can ensure the quality of data and its fitness for use. Traditional approaches to curation are struggling with increased data volumes, and near real-time demands for curated data. In response, curation teams have turned to community crowd-sourcing and semi-automated metadata tools for assistance. This chapter provides an overview of data curation, discusses the business motivations for curating data and investigates the role of community-based data curation, focusing on internal communities and pre-competitive data collaborations. The chapter is supported by case studies from Wikipedia, The New York Times, Thomson Reuters, Protein Data Bank and ChemSpider upon which best practices for both social and technical aspects of community-driven data curation are described.

  3. The Role of Guided Induction in Paper-Based Data-Driven Learning

    ERIC Educational Resources Information Center

    Smart, Jonathan

    2014-01-01

    This study examines the role of guided induction as an instructional approach in paper-based data-driven learning (DDL) in the context of an ESL grammar course during an intensive English program at an American public university. Specifically, it examines whether corpus-informed grammar instruction is more effective through inductive, data-driven…

  4. APPLICATION OF DATA QUALITY OBJECTIVES AND MEASUREMENT QUALITY OBJECTIVES TO RESEARCH PROJECTS

    EPA Science Inventory

    The paper assists systematic planning for research projects. It presents planning concepts in terms that have some utility for researchers. For example, measurement quality objectives are more familiar to researchers than data quality objectives because these quality criteria are...

  5. Data-Driven Identification of Risk Factors of Patient Satisfaction at a Large Urban Academic Medical Center.

    PubMed

    Li, Li; Lee, Nathan J; Glicksberg, Benjamin S; Radbill, Brian D; Dudley, Joel T

    2016-01-01

    The Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey is the first publicly reported nationwide survey to evaluate and compare hospitals. Increasing patient satisfaction is an important goal as it aims to achieve a more effective and efficient healthcare delivery system. In this study, we develop and apply an integrative, data-driven approach to identify clinical risk factors that associate with patient satisfaction outcomes. We included 1,771 unique adult patients who completed the HCAHPS survey and were discharged from the inpatient Medicine service from 2010 to 2012. We collected 266 clinical features including patient demographics, lab measurements, medications, disease categories, and procedures. We developed and applied a data-driven approach to identify risk factors that associate with patient satisfaction outcomes. We identify 102 significant risk factors associated with 18 surveyed questions. The most significantly recurrent clinical risk factors were: self-evaluation of health, education level, Asian, White, treatment in the BMT oncology division, and being prescribed a new medication. Patients who were prescribed pregabalin were less satisfied particularly in relation to communication with nurses and pain management. Explanation of medication usage was associated with communication with nurses (q = 0.001); however, explanation of medication side effects was associated with communication with doctors (q = 0.003). Overall hospital rating was associated with hospital environment, communication with doctors, and communication about medicines. However, patient likelihood to recommend the hospital was associated with hospital environment, communication about medicines, pain management, and communication with nurses. Our study identified a number of putatively novel clinical risk factors for patient satisfaction that suggest new opportunities to better understand and manage patient satisfaction. Hospitals can use a data-driven approach to
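The q-values quoted above (e.g. q = 0.001) are false-discovery-rate-adjusted p-values; such adjustments are commonly computed with the Benjamini-Hochberg procedure, though the abstract does not state which procedure the authors used. A minimal implementation of the standard adjustment:

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg FDR adjustment; returns q-values in input order.

    For p-values sorted ascending, q_(i) = min over j >= i of
    p_(j) * n / j, capped at 1 by the running minimum."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues
```

For example, testing 266 clinical features for association with each survey question produces one p-value per feature; adjusting that whole vector controls the expected fraction of false discoveries among the factors declared significant.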

  6. Data-driven Modeling of Metal-oxide Sensors with Dynamic Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Gosangi, Rakesh; Gutierrez-Osuna, Ricardo

    2011-09-01

    We present a data-driven probabilistic framework to model the transient response of MOX sensors modulated with a sequence of voltage steps. Analytical models of MOX sensors are usually built based on the physico-chemical properties of the sensing materials. Although building these models provides an insight into the sensor behavior, they also require a thorough understanding of the underlying operating principles. Here we propose a data-driven approach to characterize the dynamical relationship between sensor inputs and outputs. Namely, we use dynamic Bayesian networks (DBNs), probabilistic models that represent temporal relations between a set of random variables. We identify a set of control variables that influence the sensor responses, create a graphical representation that captures the causal relations between these variables, and finally train the model with experimental data. We validated the approach on experimental data in terms of predictive accuracy and classification performance. Our results show that DBNs can accurately predict the dynamic response of MOX sensors, as well as capture the discriminatory information present in the sensor transients.

  7. qPortal: A platform for data-driven biomedical research.

    PubMed

    Mohr, Christopher; Friedrich, Andreas; Wojnar, David; Kenar, Erhan; Polatkan, Aydin Can; Codrea, Marius Cosmin; Czemmel, Stefan; Kohlbacher, Oliver; Nahnsen, Sven

    2018-01-01

    Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. It has become common practice to make extensive use of high-throughput technologies that produce large amounts of heterogeneous data. In addition to the ever-improving accuracy, methods are getting faster and cheaper, resulting in a steadily increasing need for scalable data management and easily accessible means of analysis. We present qPortal, a platform providing users with an intuitive way to manage and analyze quantitative biological data. The backend leverages a variety of concepts and technologies, such as relational databases, data stores, data models and means of data transfer, as well as front-end solutions to give users access to data management and easy-to-use analysis options. Users are empowered to conduct their experiments from the experimental design to the visualization of their results through the platform. Here, we illustrate the feature-rich portal by simulating a biomedical study based on publicly available data. We demonstrate the software's strength in supporting the entire project life cycle. The software supports the project design and registration, empowers users to do all-digital project management and finally provides means to perform analysis. We compare our approach to Galaxy, one of the most widely used scientific workflow and analysis platforms in computational biology. Application of both systems to a small case study shows the differences between a data-driven approach (qPortal) and a workflow-driven approach (Galaxy). qPortal, a one-stop-shop solution for biomedical projects offers up-to-date analysis pipelines, quality control workflows, and visualization tools. Through intensive user interactions, appropriate data models have been developed. These models build the foundation of our biological data management system and provide possibilities to annotate data, query metadata for statistics and future re-analysis on

  8. Program objectives for the National Water Data Exchange (NAWDEX) for fiscal year 1978

    USGS Publications Warehouse

    Edwards, Melvin D.

    1977-01-01

    This report presents the program objectives for the National Water Data Exchange (Nawdex) for Fiscal Year 1978, October 1, 1977 to September 30, 1978. Objectives covered include Nawdex membership, membership participation, Nawdex services, identification of sources of water data, the indexing of water data, systems development and implementation, training, recommended standards for the handling and exchange of water data, and program management. The report provides advance information on Nawdex activities, thereby allowing the activities to be better integrated into the planning and operation of programs of member organizations. (Woodard-USGS)

  9. Data-Intensive Science meets Inquiry-Driven Pedagogy: Interactive Big Data Exploration, Threshold Concepts, and Liminality

    NASA Technical Reports Server (NTRS)

    Ramachandran, Rahul; Word, Andrea; Nair, Udasysankar

    2014-01-01

    Threshold concepts in any discipline are the core concepts an individual must understand in order to master a discipline. By their very nature, these concepts are troublesome, irreversible, integrative, bounded, discursive, and reconstitutive. Although grasping threshold concepts can be extremely challenging for each learner as s/he moves through stages of cognitive development relative to a given discipline, the learner's grasp of these concepts determines the extent to which s/he is prepared to work competently and creatively within the field itself. The movement of individuals from a state of ignorance of these core concepts to one of mastery occurs not along a linear path but in iterative cycles of knowledge creation and adjustment in liminal spaces - conceptual spaces through which learners move from the vaguest awareness of concepts to mastery, accompanied by understanding of their relevance, connectivity, and usefulness relative to questions and constructs in a given discipline. For example, challenges in the teaching and learning of atmospheric science can be traced to threshold concepts in fluid dynamics. In particular, Dynamic Meteorology is one of the most challenging courses for graduate students and undergraduates majoring in Atmospheric Science. Dynamic Meteorology introduces threshold concepts - those that prove troublesome for the majority of students but that are essential, associated with fundamental relationships between forces and motion in the atmosphere and requiring the application of basic classical statics, dynamics, and thermodynamic principles to the three dimensionally varying atmospheric structure. With the explosive growth of data available in atmospheric science, driven largely by satellite Earth observations and high-resolution numerical simulations, paradigms such as that of data-intensive science have emerged.
These paradigm shifts are based on the growing realization that current infrastructure, tools and processes will not allow

  10. Modifying the Sleep Treatment Education Program for Students to include technology use (STEPS-TECH): Intervention effects on objective and subjective sleep outcomes.

    PubMed

    Barber, Larissa K; Cucalon, Maria S

    2017-12-01

    University students often have sleep issues that arise from poor sleep hygiene practices and technology use patterns. Yet, technology-related behaviors are often neglected in sleep hygiene education. This study examined whether the Sleep Treatment Education Program for Students-modified to include information regarding managing technology use (STEPS-TECH)-helps improve both subjective and objective sleep outcomes among university students. Results of an experimental study among 78 university students showed improvements in objective indicators of sleep quantity (total sleep time) and sleep quality (less awakenings) during the subsequent week for students in the STEPS-TECH intervention group compared to a control group. Exploratory analyses indicated that effects were driven by improvements in weekend days immediately following the intervention. There were also no intervention effects on subjective sleep quality or quantity outcomes. In terms of self-reported behavioral responses to educational content in the intervention, there were no group differences in sleep hygiene practices or technology use before bedtime. However, the intervention group reported less technology use during sleep periods than the control group. These preliminary findings suggest that STEPS-TECH may be a useful educational tool to help improve objective sleep and reduce technology use during sleep periods among university students. Copyright © 2017 John Wiley & Sons, Ltd.

  11. Data-Driven Anomaly Detection Performance for the Ares I-X Ground Diagnostic Prototype

    NASA Technical Reports Server (NTRS)

    Martin, Rodney A.; Schwabacher, Mark A.; Matthews, Bryan L.

    2010-01-01

    In this paper, we will assess the performance of a data-driven anomaly detection algorithm, the Inductive Monitoring System (IMS), which can be used to detect simulated Thrust Vector Control (TVC) system failures. However, the ability of IMS to detect these failures in a true operational setting may be related to the realistic nature of how they are simulated. As such, we will investigate both a low fidelity and high fidelity approach to simulating such failures, with the latter based upon the underlying physics. Furthermore, the ability of IMS to detect anomalies that were previously unknown and not previously simulated will be studied in earnest, as well as apparent deficiencies or misapplications that result from using the data-driven paradigm. Our conclusions indicate that robust detection performance of simulated failures using IMS is not appreciably affected by the use of a high fidelity simulation. However, we have found that the inclusion of a data-driven algorithm such as IMS into a suite of deployable health management technologies does add significant value.
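
The flavor of an IMS-style monitor can be conveyed in a short sketch: cluster nominal training vectors into hyper-boxes, then score a new vector by its distance to the nearest box, flagging large distances as anomalous. The class, data, and thresholds below are an illustrative toy, not the NASA IMS implementation.

```python
# Minimal sketch of an IMS-style monitor: learn hyper-box clusters from
# nominal data, then score new vectors by distance to the nearest box.
# Class name, data, and thresholds are invented for illustration.

class HyperBoxMonitor:
    def __init__(self, merge_dist=1.0):
        self.merge_dist = merge_dist
        self.boxes = []  # each box: (mins, maxs), one bound per dimension

    def _dist_to_box(self, box, x):
        mins, maxs = box
        # Euclidean distance from x to the box (0 if x lies inside it).
        return sum(max(lo - v, 0.0, v - hi) ** 2
                   for v, lo, hi in zip(x, mins, maxs)) ** 0.5

    def train(self, nominal):
        for x in nominal:
            if self.boxes:
                best = min(self.boxes, key=lambda b: self._dist_to_box(b, x))
                if self._dist_to_box(best, x) <= self.merge_dist:
                    # Grow the nearest box to enclose the new nominal point.
                    best[0][:] = [min(lo, v) for lo, v in zip(best[0], x)]
                    best[1][:] = [max(hi, v) for hi, v in zip(best[1], x)]
                    continue
            self.boxes.append(([*x], [*x]))

    def score(self, x):
        # 0.0 means "inside some nominal box"; larger means more anomalous.
        return min(self._dist_to_box(b, x) for b in self.boxes)

monitor = HyperBoxMonitor(merge_dist=2.0)
monitor.train([(10.0, 1.0), (11.0, 1.2), (10.5, 0.9)])
print(monitor.score((10.6, 1.0)))        # inside the learned box -> 0.0
print(monitor.score((25.0, 5.0)) > 3.0)  # far outside -> clearly anomalous
```

As in the paper's setting, such a monitor can only flag departures from the nominal envelope it was trained on, which is why the realism of the simulated failures matters.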

  12. Data-driven subtypes of major depressive disorder: a systematic review

    PubMed Central

    2012-01-01

    Background According to current classification systems, patients with major depressive disorder (MDD) may have very different combinations of symptoms. This symptomatic diversity hinders the progress of research into the causal mechanisms and treatment allocation. Theoretically founded subtypes of depression such as atypical, psychotic, and melancholic depression have limited clinical applicability. Data-driven analyses of symptom dimensions or subtypes of depression are scarce. In this systematic review, we examine the evidence for the existence of data-driven symptomatic subtypes of depression. Methods We undertook a systematic literature search of MEDLINE, PsycINFO and Embase in May 2012. We included studies analyzing the depression criteria of the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV) of adults with MDD in latent variable analyses. Results In total, 1176 articles were retrieved, of which 20 satisfied the inclusion criteria. These reports described a total of 34 latent variable analyses: 6 confirmatory factor analyses, 6 exploratory factor analyses, 12 principal component analyses, and 10 latent class analyses. The latent class techniques distinguished 2 to 5 classes, which mainly reflected subgroups with different overall severity: 62 of 71 significant differences on symptom level were congruent with a latent class solution reflecting severity. The latent class techniques did not consistently identify specific symptom clusters. Latent factor techniques mostly found a factor explaining the variance in the symptoms depressed mood and interest loss (11 of 13 analyses), often complemented by psychomotor retardation or fatigue (8 of 11 analyses). However, differences in found factors and classes were substantial. Conclusions The studies performed to date do not provide conclusive evidence for the existence of depressive symptom dimensions or symptomatic subtypes. The wide diversity of identified factors and classes might

  13. Pareto-Optimal Multi-objective Inversion of Geophysical Data

    NASA Astrophysics Data System (ADS)

    Schnaidt, Sebastian; Conway, Dennis; Krieger, Lars; Heinson, Graham

    2018-01-01

    In the process of modelling geophysical properties, jointly inverting different data sets can greatly improve model results, provided that the data sets are compatible, i.e., sensitive to similar features. Such a joint inversion requires a relationship between the different data sets, which can either be analytic or structural. Classically, the joint problem is expressed as a scalar objective function that combines the misfit functions of multiple data sets and a joint term which accounts for the assumed connection between the data sets. This approach suffers from two major disadvantages: first, it can be difficult to assess the compatibility of the data sets, and second, the aggregation of misfit terms introduces a weighting of the data sets. We present a Pareto-optimal multi-objective joint inversion approach based on an existing genetic algorithm. The algorithm treats each data set as a separate objective, avoiding forced weighting and generating curves of the trade-off between the different objectives. These curves are analysed by their shape and evolution to evaluate data set compatibility. Furthermore, the statistical analysis of the generated solution population provides valuable estimates of model uncertainty.
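
The core notion of Pareto optimality for a two-objective inversion can be sketched in a few lines: each candidate model carries one misfit per data set, and a candidate survives only if no other candidate dominates it (better-or-equal on both misfits, strictly better on at least one). The misfit values below are invented.

```python
# Sketch of Pareto-optimality for a two-objective joint inversion:
# each candidate model has one misfit per data set; the Pareto front
# keeps only the non-dominated candidates. Misfit values are made up.

def dominates(a, b):
    """True if misfit vector a dominates b (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(candidates):
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

# (misfit_dataset_1, misfit_dataset_2) for five candidate models
misfits = [(1.0, 9.0), (2.0, 4.0), (3.0, 3.5), (5.0, 1.0), (6.0, 2.0)]
front = pareto_front(misfits)
print(front)  # [(1.0, 9.0), (2.0, 4.0), (3.0, 3.5), (5.0, 1.0)]
```

In the genetic-algorithm setting described in the abstract, this dominance test is what drives selection, and the shape of the resulting trade-off curve is then used to judge data-set compatibility.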

  14. A 3D interactive multi-object segmentation tool using local robust statistics driven active contours.

    PubMed

    Gao, Yi; Kikinis, Ron; Bouix, Sylvain; Shenton, Martha; Tannenbaum, Allen

    2012-08-01

    Extracting anatomically and functionally significant structures is one of the important tasks for both the theoretical study of medical image analysis and the clinical and practical community. In the past, much work has been dedicated only to algorithmic development. Nevertheless, for clinical end users, a well-designed algorithm with interactive software is necessary for an algorithm to be utilized in their daily work. Furthermore, the software is better open-sourced in order to be used and validated not only by the authors but also by the entire community. Therefore, the contribution of the present work is twofold: first, we propose a new robust-statistics-based conformal metric and the conformal-area-driven multiple active contour framework, to simultaneously extract multiple targets from MR and CT medical imagery in 3D. Second, an open-source, graphically interactive 3D segmentation tool based on the aforementioned contour evolution is implemented and is publicly available for end users on multiple platforms. In using this software for the segmentation task, the process is initiated by user-drawn strokes (seeds) in the target region of the image. Then, local robust statistics are used to describe the object features, and such features are learned adaptively from the seeds under a non-parametric estimation scheme. Subsequently, several active contours evolve simultaneously, with their interactions motivated by the principles of action and reaction: this not only guarantees mutual exclusiveness among the contours, but also removes the reliance on the assumption that the multiple objects fill the entire image domain, which was tacitly or explicitly assumed in many previous works. In doing so, the contours interact and converge to equilibrium at the desired positions of the multiple objects.
Furthermore, with the aim of not only validating the algorithm and the software, but also demonstrating how the tool is to be used, we provide

  15. A 3D Interactive Multi-object Segmentation Tool using Local Robust Statistics Driven Active Contours

    PubMed Central

    Gao, Yi; Kikinis, Ron; Bouix, Sylvain; Shenton, Martha; Tannenbaum, Allen

    2012-01-01

    Extracting anatomically and functionally significant structures is one of the important tasks for both the theoretical study of medical image analysis and the clinical and practical community. In the past, much work has been dedicated only to algorithmic development. Nevertheless, for clinical end users, a well-designed algorithm with interactive software is necessary for an algorithm to be utilized in their daily work. Furthermore, the software is better open-sourced in order to be used and validated not only by the authors but also by the entire community. Therefore, the contribution of the present work is twofold: first, we propose a new robust-statistics-based conformal metric and the conformal-area-driven multiple active contour framework, to simultaneously extract multiple targets from MR and CT medical imagery in 3D. Second, an open-source, graphically interactive 3D segmentation tool based on the aforementioned contour evolution is implemented and is publicly available for end users on multiple platforms. In using this software for the segmentation task, the process is initiated by user-drawn strokes (seeds) in the target region of the image. Then, local robust statistics are used to describe the object features, and such features are learned adaptively from the seeds under a non-parametric estimation scheme. Subsequently, several active contours evolve simultaneously, with their interactions motivated by the principles of action and reaction: this not only guarantees mutual exclusiveness among the contours, but also removes the reliance on the assumption that the multiple objects fill the entire image domain, which was tacitly or explicitly assumed in many previous works. In doing so, the contours interact and converge to equilibrium at the desired positions of the multiple objects. Furthermore, with the aim of not only validating the algorithm and the software, but also demonstrating how the tool is to be used, we

  16. Systems and Methods for Fabricating Objects Including Amorphous Metal Using Techniques Akin to Additive Manufacturing

    NASA Technical Reports Server (NTRS)

    Hofmann, Douglas (Inventor)

    2017-01-01

    Systems and methods in accordance with embodiments of the invention fabricate objects including amorphous metals using techniques akin to additive manufacturing. In one embodiment, a method of fabricating an object that includes an amorphous metal includes: applying a first layer of molten metallic alloy to a surface; cooling the first layer of molten metallic alloy such that it solidifies and thereby forms a first layer including amorphous metal; subsequently applying at least one layer of molten metallic alloy onto a layer including amorphous metal; cooling each subsequently applied layer of molten metallic alloy such that it solidifies and thereby forms a layer including amorphous metal prior to the application of any adjacent layer of molten metallic alloy; where the aggregate of the solidified layers including amorphous metal forms a desired shape in the object to be fabricated; and removing at least the first layer including amorphous metal from the surface.

  17. A Model-Driven, Science Data Product Registration Service

    NASA Astrophysics Data System (ADS)

    Hardman, S.; Ramirez, P.; Hughes, J. S.; Joyner, R.; Cayanan, M.; Lee, H.; Crichton, D. J.

    2011-12-01

    The Planetary Data System (PDS) has undertaken an effort to overhaul the PDS data architecture (including the data model, data structures, data dictionary, etc.) and to deploy an upgraded software system (including data services, distributed data catalog, etc.) that fully embraces the PDS federation as an integrated system while taking advantage of modern innovations in information technology (including networking capabilities, processing speeds, and software breakthroughs). A core component of this new system is the Registry Service that will provide functionality for tracking, auditing, locating, and maintaining artifacts within the system. These artifacts can range from data files and label files, schemas, dictionary definitions for objects and elements, documents, services, etc. This service offers a single reference implementation of the registry capabilities detailed in the Consultative Committee for Space Data Systems (CCSDS) Registry Reference Model White Book. The CCSDS Reference Model in turn relies heavily on the Electronic Business using eXtensible Markup Language (ebXML) standards for registry services and the registry information model, managed by the OASIS consortium. Registries are pervasive components in most information systems. For example, data dictionaries, service registries, LDAP directory services, and even databases provide registry-like services. These all include an account of informational items that are used in large-scale information systems ranging from data values such as names and codes, to vocabularies, services and software components. The problem is that many of these registry-like services were designed with their own data models associated with the specific type of artifact they track. Additionally these services each have their own specific interface for interacting with the service. 
This Registry Service implements the data model specified in the ebXML Registry Information Model (RIM) specification that supports the various

  18. Data-driven event-by-event respiratory motion correction using TOF PET list-mode centroid of distribution

    NASA Astrophysics Data System (ADS)

    Ren, Silin; Jin, Xiao; Chan, Chung; Jian, Yiqiang; Mulnix, Tim; Liu, Chi; E Carson, Richard

    2017-06-01

    Data-driven respiratory gating techniques were developed to correct for respiratory motion in PET studies, without the help of external motion tracking systems. Due to the greatly increased image noise in gated reconstructions, it is desirable to develop a data-driven event-by-event respiratory motion correction method. In this study, using the Centroid-of-distribution (COD) algorithm, we established a data-driven event-by-event respiratory motion correction technique using TOF PET list-mode data, and investigated its performance by comparing with an external system-based correction method. Ten human scans with the pancreatic β-cell tracer 18F-FP-(+)-DTBZ were employed. Data-driven respiratory motions in superior-inferior (SI) and anterior-posterior (AP) directions were first determined by computing the centroid of all radioactive events during each short time frame with further processing. The Anzai belt system was employed to record respiratory motion in all studies. COD traces in both SI and AP directions were first compared with Anzai traces by computing the Pearson correlation coefficients. Then, respiratory gated reconstructions based on either COD or Anzai traces were performed to evaluate their relative performance in capturing respiratory motion. Finally, based on correlations of displacements of organ locations in all directions and COD information, continuous 3D internal organ motion in SI and AP directions was calculated based on COD traces to guide event-by-event respiratory motion correction in the MOLAR reconstruction framework. Continuous respiratory correction results based on COD were compared with that based on Anzai, and without motion correction. Data-driven COD traces showed a good correlation with Anzai in both SI and AP directions for the majority of studies, with correlation coefficients ranging from 63% to 89%. Based on the determined respiratory displacements of pancreas between end-expiration and end-inspiration from gated

  19. Data-driven event-by-event respiratory motion correction using TOF PET list-mode centroid of distribution.

    PubMed

    Ren, Silin; Jin, Xiao; Chan, Chung; Jian, Yiqiang; Mulnix, Tim; Liu, Chi; Carson, Richard E

    2017-06-21

    Data-driven respiratory gating techniques were developed to correct for respiratory motion in PET studies, without the help of external motion tracking systems. Due to the greatly increased image noise in gated reconstructions, it is desirable to develop a data-driven event-by-event respiratory motion correction method. In this study, using the Centroid-of-distribution (COD) algorithm, we established a data-driven event-by-event respiratory motion correction technique using TOF PET list-mode data, and investigated its performance by comparing with an external system-based correction method. Ten human scans with the pancreatic β-cell tracer 18F-FP-(+)-DTBZ were employed. Data-driven respiratory motions in superior-inferior (SI) and anterior-posterior (AP) directions were first determined by computing the centroid of all radioactive events during each short time frame with further processing. The Anzai belt system was employed to record respiratory motion in all studies. COD traces in both SI and AP directions were first compared with Anzai traces by computing the Pearson correlation coefficients. Then, respiratory gated reconstructions based on either COD or Anzai traces were performed to evaluate their relative performance in capturing respiratory motion. Finally, based on correlations of displacements of organ locations in all directions and COD information, continuous 3D internal organ motion in SI and AP directions was calculated based on COD traces to guide event-by-event respiratory motion correction in the MOLAR reconstruction framework. Continuous respiratory correction results based on COD were compared with those based on Anzai, and without motion correction. Data-driven COD traces showed a good correlation with Anzai in both SI and AP directions for the majority of studies, with correlation coefficients ranging from 63% to 89%. Based on the determined respiratory displacements of pancreas between end-expiration and end-inspiration from gated
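
The centroid-of-distribution idea in records 18 and 19 reduces, in one dimension, to a few lines: average the axial coordinates of the events in each short time frame to obtain a surrogate respiratory trace, then compare it to the external belt trace with a Pearson correlation. The frames below are synthetic stand-ins for PET list-mode data, not real measurements.

```python
# Sketch of the centroid-of-distribution (COD) idea: per short time frame,
# the centroid of event coordinates gives a 1D surrogate respiratory trace,
# which is then compared to an external (belt) trace by Pearson correlation.
# The event data here are synthetic, not real PET list-mode data.
from statistics import mean

def cod_trace(frames):
    """frames: list of frames, each a list of axial (SI) event coordinates."""
    return [mean(events) for events in frames]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Synthetic frames whose centroid drifts up and back (breathing-like motion).
frames = [[0.5, 1.5], [1.5, 2.5], [2.5, 3.5], [1.5, 2.5], [0.5, 1.5]]
anzai = [1.0, 2.0, 3.0, 2.0, 1.0]  # external belt trace of the same motion
trace = cod_trace(frames)
print(trace)                            # [1.0, 2.0, 3.0, 2.0, 1.0]
print(round(pearson(trace, anzai), 3))  # 1.0
```

A COD trace that correlates poorly with the belt would flag a study in which the data-driven surrogate cannot safely replace the external sensor.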

  20. C++, object-oriented programming, and astronomical data models

    NASA Technical Reports Server (NTRS)

    Farris, A.

    1992-01-01

    Contemporary astronomy is characterized by increasingly complex instruments and observational techniques, higher data collection rates, and large data archives, placing severe stress on software analysis systems. The object-oriented paradigm represents a significant new approach to software design and implementation that holds great promise for dealing with this increased complexity. The basic concepts of this approach will be characterized in contrast to more traditional procedure-oriented approaches. The fundamental features of object-oriented programming will be discussed from a C++ programming language perspective, using examples familiar to astronomers. This discussion will focus on objects, classes and their relevance to the data type system; the principle of information hiding; and the use of inheritance to implement generalization/specialization relationships. Drawing on the object-oriented approach, features of a new database model to support astronomical data analysis will be presented.

  1. Spectral types for objects in the Kiso survey. IV - Data for 81 stars

    NASA Technical Reports Server (NTRS)

    Wegner, Gary; Mcmahan, Robert K.

    1988-01-01

    Spectroscopy and spectral types for 81 ultraviolet-excess objects found in the Kiso Schmidt-camera survey are reported. The data were secured with the McGraw-Hill 1.3 m telescope at 8-A resolution, covering the wavelength interval 4000-7200 A using the Mark II spectrograph. Descriptions of the spectra of some of the more peculiar objects found in this sample are given; they include 14 sub-dwarfs, 23 definite DA white dwarfs (including a magnetic one), one DQ white dwarf, eight quasars and emission-line objects, and a new composite DA + dM system. More spectroscopy of the new cataclysmic variable KUV 01584-0939 and a possibly related object is also described.

  2. Autonomous Soil Assessment System: A Data-Driven Approach to Planetary Mobility Hazard Detection

    NASA Astrophysics Data System (ADS)

    Raimalwala, K.; Faragalli, M.; Reid, E.

    2018-04-01

    The Autonomous Soil Assessment System predicts mobility hazards for rovers. Its development and performance are presented, with focus on its data-driven models, machine learning algorithms, and real-time sensor data fusion for predictive analytics.

  3. Data-Driven Learning: Taking the Computer out of the Equation

    ERIC Educational Resources Information Center

    Boulton, Alex

    2010-01-01

    Despite considerable research interest, data-driven learning (DDL) has not become part of mainstream teaching practice. It may be that technical aspects are too daunting for teachers and students, but there seems to be no reason why DDL in its early stages should not eliminate the computer from the equation by using prepared materials on…

  4. The Use of Linking Adverbials in Academic Essays by Non-Native Writers: How Data-Driven Learning Can Help

    ERIC Educational Resources Information Center

    Garner, James Robert

    2013-01-01

    Over the past several decades, the TESOL community has seen an increased interest in the use of data-driven learning (DDL) approaches. Most studies of DDL have focused on the acquisition of vocabulary items, including a wide range of information necessary for their correct usage. One type of vocabulary that has yet to be properly investigated has…

  5. A Data-Driven Approach to Reverse Engineering Customer Engagement Models: Towards Functional Constructs

    PubMed Central

    de Vries, Natalie Jane; Carlson, Jamie; Moscato, Pablo

    2014-01-01

    Online consumer behavior in general, and online customer engagement with brands in particular, has become a major focus of research activity, fuelled by the exponential increase of interactive functions of the internet and social media platforms and applications. Current research in this area is mostly hypothesis-driven, and much debate about the concept of Customer Engagement and its related constructs persists in the literature. In this paper, we aim to propose a novel methodology for reverse engineering a consumer behavior model for online customer engagement, based on a computational and data-driven perspective. This methodology could be generalized and prove useful for future research in the fields of consumer behaviors using questionnaire data or studies investigating other types of human behaviors. The method we propose contains five main stages: symbolic regression analysis, graph building, community detection, evaluation of results and, finally, investigation of directed cycles and common feedback loops. The ‘communities’ of questionnaire items that emerge from our community detection method form possible ‘functional constructs’ inferred from data rather than assumed from literature and theory. Our results show consistent partitioning of questionnaire items into such ‘functional constructs’, suggesting the method proposed here could be adopted as a new data-driven way of human behavior modeling. PMID:25036766

  6. A data-driven approach to reverse engineering customer engagement models: towards functional constructs.

    PubMed

    de Vries, Natalie Jane; Carlson, Jamie; Moscato, Pablo

    2014-01-01

    Online consumer behavior in general, and online customer engagement with brands in particular, has become a major focus of research activity, fuelled by the exponential increase of interactive functions of the internet and social media platforms and applications. Current research in this area is mostly hypothesis-driven, and much debate about the concept of Customer Engagement and its related constructs persists in the literature. In this paper, we aim to propose a novel methodology for reverse engineering a consumer behavior model for online customer engagement, based on a computational and data-driven perspective. This methodology could be generalized and prove useful for future research in the fields of consumer behaviors using questionnaire data or studies investigating other types of human behaviors. The method we propose contains five main stages: symbolic regression analysis, graph building, community detection, evaluation of results and, finally, investigation of directed cycles and common feedback loops. The 'communities' of questionnaire items that emerge from our community detection method form possible 'functional constructs' inferred from data rather than assumed from literature and theory. Our results show consistent partitioning of questionnaire items into such 'functional constructs', suggesting the method proposed here could be adopted as a new data-driven way of human behavior modeling.
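
Of the five stages listed in records 5 and 6, the community-detection step is the easiest to sketch: link questionnaire items whose pairwise association exceeds a threshold, and read each connected component as a candidate "functional construct". The item names and association weights below are invented, and connected components are a deliberately crude stand-in for the paper's actual community-detection method.

```python
# Sketch of the community-detection stage: questionnaire items whose
# pairwise association exceeds a threshold are linked in a graph, and
# each connected component is read as a candidate "functional construct".
# Item names and weights are invented for illustration.
from collections import defaultdict

def communities(items, assoc, threshold=0.5):
    graph = defaultdict(set)
    for (a, b), w in assoc.items():
        if w >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    seen, groups = set(), []
    for item in items:
        if item in seen:
            continue
        stack, group = [item], set()  # depth-first flood fill
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node])
        seen |= group
        groups.append(sorted(group))
    return groups

items = ["likes_brand", "shares_posts", "comments", "price_sensitive", "seeks_deals"]
assoc = {("likes_brand", "shares_posts"): 0.8,
         ("shares_posts", "comments"): 0.7,
         ("price_sensitive", "seeks_deals"): 0.9,
         ("comments", "price_sensitive"): 0.2}  # too weak to link the clusters
print(communities(items, assoc))
# [['comments', 'likes_brand', 'shares_posts'], ['price_sensitive', 'seeks_deals']]
```

Here the two components suggest an "engagement" construct and a "deal-seeking" construct inferred from the data rather than assumed from theory.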

  7. Data-free and data-driven spectral perturbations for RANS UQ

    NASA Astrophysics Data System (ADS)

    Edeling, Wouter; Mishra, Aashwin; Iaccarino, Gianluca

    2017-11-01

    Despite recent developments in high-fidelity turbulent flow simulations, RANS modeling is still vastly used by industry, due to its inherent low cost. Since accuracy is a concern in RANS modeling, model-form UQ is an essential tool for assessing the impacts of this uncertainty on quantities of interest. Applying the spectral decomposition to the modeled Reynolds-Stress Tensor (RST) allows for the introduction of decoupled perturbations into the baseline intensity (kinetic energy), shape (eigenvalues), and orientation (eigenvectors). This constitutes a natural methodology to evaluate the model-form uncertainty associated with different aspects of RST modeling. In a predictive setting, one frequently encounters an absence of any relevant reference data. To make data-free predictions with quantified uncertainty, we employ physical bounds to define maximum spectral perturbations a priori. When propagated, these perturbations yield intervals of engineering utility. High-fidelity data opens up the possibility of inferring a distribution of uncertainty by means of various data-driven machine-learning techniques. We will demonstrate our framework on a number of flow problems where RANS models are prone to failure. This research was partially supported by the Defense Advanced Research Projects Agency under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo), and the DOE PSAAP-II program.

  8. Data-driven system to predict academic grades and dropout

    PubMed Central

    Rovira, Sergi; Puertas, Eloi

    2017-01-01

    Nowadays, the role of a tutor is more important than ever to prevent student dropout and improve academic performance. This work proposes a data-driven system to extract relevant information hidden in student academic data and, thus, help tutors offer their pupils more proactive personal guidance. In particular, our system, based on machine learning techniques, makes predictions of dropout intention and course grades of students, as well as personalized course recommendations. Moreover, we present different visualizations which help in the interpretation of the results. In the experimental validation, we show that the system obtains promising results with data from the degree studies in Law, Computer Science and Mathematics of the Universitat de Barcelona. PMID:28196078

  9. Data-driven system to predict academic grades and dropout.

    PubMed

    Rovira, Sergi; Puertas, Eloi; Igual, Laura

    2017-01-01

    Nowadays, the role of a tutor is more important than ever to prevent student dropout and improve academic performance. This work proposes a data-driven system to extract relevant information hidden in student academic data and, thus, help tutors offer their pupils more proactive personal guidance. In particular, our system, based on machine learning techniques, makes predictions of dropout intention and course grades of students, as well as personalized course recommendations. Moreover, we present different visualizations which help in the interpretation of the results. In the experimental validation, we show that the system obtains promising results with data from the degree studies in Law, Computer Science and Mathematics of the Universitat de Barcelona.

  10. A data driven model for dune morphodynamics

    NASA Astrophysics Data System (ADS)

    Palmsten, M.; Brodie, K.; Spore, N.

    2016-12-01

    Dune morphology results from a number of competing feedbacks between wave, Aeolian, and biologic processes. Only now are conceptual and numerical models for dunes beginning to incorporate all aspects of the processes driving morphodynamics. Drawing on a 35-year record of observations of dune morphology and forcing conditions at the Army Corps of Engineers Field Research Facility (FRF) at Duck, NC, USA, we hypothesize that local dune morphology results from the competition between dune growth during dry windy periods and erosion during storms. We test our hypothesis by developing a data driven model using a Bayesian network to hindcast dune-crest elevation change, dune position change, and shoreline position change. Model inputs include a description of dune morphology from dune-crest elevation, dune-base elevation, dune width, and beach width. Wave forcing and the effect of moisture is parameterized in terms of the maximum total water level and period that waves impact the dunes, along with precipitation. Aeolian forcing is parameterized in terms of maximum wind speed, direction and period that wind exceeds a critical value for sediment transport. We test the sensitivity of our model to forcing parameters and hindcast the 35-year record of dune morphodynamics at the FRF. We also discuss the role of vegetation on dune morphologic differences observed at the FRF.
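
A drastically simplified, data-driven hindcast in the spirit of record 10 can be built from a conditional frequency table: count historical dune responses for each combination of forcing conditions, then report the empirical outcome probabilities. The observations below are invented, and a real Bayesian network (as used at the FRF) would model many more variables and their dependencies.

```python
# Toy data-driven hindcast: estimate a conditional probability table
# P(dune_change | wind, wave_impact) from a historical record, then report
# the empirical posterior for a given forcing combination. The observations
# are invented; a real Bayesian network models many more variables.
from collections import Counter, defaultdict

# (wind, wave_impact, observed dune-crest response), one row per season
history = [("windy", "low", "growth"), ("windy", "low", "growth"),
           ("calm", "low", "stable"), ("windy", "high", "erosion"),
           ("calm", "high", "erosion"), ("windy", "low", "stable")]

table = defaultdict(Counter)
for wind, waves, outcome in history:
    table[(wind, waves)][outcome] += 1

def hindcast(wind, waves):
    counts = table[(wind, waves)]
    total = sum(counts.values())
    # Empirical outcome probabilities for this forcing combination.
    return {o: n / total for o, n in counts.items()}

print(hindcast("windy", "low"))   # growth is most likely (2 of 3 seasons)
print(hindcast("calm", "high"))   # only erosion was ever observed
```

The competition the abstract hypothesizes (growth during dry windy periods versus erosion during storms) shows up directly as the dependence of the outcome distribution on the forcing pair.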

  11. Data-Driven Model Reduction and Transfer Operator Approximation

    NASA Astrophysics Data System (ADS)

    Klus, Stefan; Nüske, Feliks; Koltai, Péter; Wu, Hao; Kevrekidis, Ioannis; Schütte, Christof; Noé, Frank

    2018-06-01

    In this review paper, we will present different data-driven dimension reduction techniques for dynamical systems that are based on transfer operator theory as well as methods to approximate transfer operators and their eigenvalues, eigenfunctions, and eigenmodes. The goal is to point out similarities and differences between methods developed independently by the dynamical systems, fluid dynamics, and molecular dynamics communities such as time-lagged independent component analysis, dynamic mode decomposition, and their respective generalizations. As a result, extensions and best practices developed for one particular method can be carried over to other related methods.

  12. Data-Driven Learning of Q-Matrix

    PubMed Central

    Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2013-01-01

    The recent surge of interests in cognitive assessment has led to developments of novel statistical models for diagnostic classification. Central to many such models is the well-known Q-matrix, which specifies the item–attribute relationships. This article proposes a data-driven approach to identification of the Q-matrix and estimation of related model parameters. A key ingredient is a flexible T-matrix that relates the Q-matrix to response patterns. The flexibility of the T-matrix allows the construction of a natural criterion function as well as a computationally amenable algorithm. Simulations results are presented to demonstrate usefulness and applicability of the proposed method. Extension to handling of the Q-matrix with partial information is presented. The proposed method also provides a platform on which important statistical issues, such as hypothesis testing and model selection, may be formally addressed. PMID:23926363

  13. Real World Data Driven Evolution of Volvo Cars’ Side Impact Protection Systems and their Effectiveness

    PubMed Central

    Jakobsson, Lotta; Lindman, Magdalena; Svanberg, Bo; Carlsson, Henrik

    2010-01-01

    This study analyses the outcome of two decades of continuously improved occupant protection for front-seat near-side occupants in side impacts, based on a working process driven by real-world data. The effectiveness of four generations of improved side impact protection is calculated using data from the statistical accident database of Volvo Cars in Sweden. Generation I includes vehicles with a new structural and interior concept (SIPS). Generation II includes vehicles with structural improvements and a new chest airbag (SIPSbag). Generation III includes vehicles with further improved SIPS and SIPSbag as well as a new head-protecting Inflatable Curtain (IC). Generation IV includes the most recent vehicles with further improvements to all of these systems plus advanced sensors and seat belt pretensioner activation. Compared to baseline vehicles, generation I vehicles reduce MAIS2+ injuries by 54%, generation II by 61%, and generation III by 72%. For generation IV, effectiveness figures cannot be calculated because of the lack of MAIS2+ injuries. Continuously improved performance is also seen when the AIS2+ pelvis, abdomen, chest and head injuries are studied separately. By applying the same real-world-data-driven working process, future improvements, and possibly new passive as well as active safety systems, will be developed with the aim of further improving protection for near-side occupants in side impacts. PMID:21050597
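    Effectiveness figures of this kind are relative injury-rate reductions versus a baseline fleet. A minimal sketch, with hypothetical rates chosen only to reproduce a 54% figure (these are not the paper's underlying data):

    ```python
    def effectiveness(rate_generation, rate_baseline):
        """Relative reduction (in %) of an injury rate versus baseline vehicles."""
        return 100.0 * (1.0 - rate_generation / rate_baseline)

    # Hypothetical MAIS2+ injured-occupants-per-crash rates, for illustration only
    print(round(effectiveness(0.115, 0.25)))  # 54
    print(effectiveness(0.25, 0.25))          # 0.0 (no improvement over baseline)
    ```

    This also shows why a Generation IV figure cannot be computed: with zero MAIS2+ injuries observed, the estimated rate ratio is zero-over-something with no meaningful confidence bound.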

  14. Data-based virtual unmodeled dynamics driven multivariable nonlinear adaptive switching control.

    PubMed

    Chai, Tianyou; Zhang, Yajun; Wang, Hong; Su, Chun-Yi; Sun, Jing

    2011-12-01

    For a complex industrial system, its multivariable and nonlinear nature generally makes it very difficult, if not impossible, to obtain an accurate model, especially when the model structure is unknown. Control of this class of complex systems is difficult to handle with traditional controller designs around their operating points. This paper, however, explores the concepts of the controller-driven model and virtual unmodeled dynamics to propose a new design framework. The design consists of two controllers with distinct functions. First, using input and output data, a self-tuning controller is constructed based on a linear controller-driven model. Then the output signals of the controller-driven model are compared with the true outputs of the system to produce the so-called virtual unmodeled dynamics. Based on a compensator for the virtual unmodeled dynamics, a second controller based on a nonlinear controller-driven model is proposed. The two controllers are integrated by an adaptive switching control algorithm to take advantage of their complementary features: one offers stabilization and the other provides improved performance. Conditions for the stability and convergence of the closed-loop system are analyzed. Both simulation and experimental tests on a heavily coupled nonlinear twin-tank system are carried out to confirm the effectiveness of the proposed method.
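    The switching idea can be caricatured in a toy loop (this is a sketch of the concept, not the paper's algorithm): a linear model runs alongside the plant, the mismatch between them defines the virtual unmodeled dynamics, and a second controller cancels that mismatch whenever it grows large. Plant coefficients, gains, and the threshold below are all invented.

    ```python
    def simulate(compensate, steps=80, setpoint=1.0):
        """Track a setpoint on a first-order plant whose quadratic term is
        unmodeled; optionally switch in a compensating nonlinear controller."""
        y = u = 0.0
        for _ in range(steps):
            y_new = 0.7 * y + 0.3 * u - 0.2 * y * y   # true plant, nonlinear term unmodeled
            v = y_new - (0.7 * y + 0.3 * u)           # virtual unmodeled dynamics (one step)
            y = y_new
            u = 2.0 * (setpoint - y)                  # linear self-tuning-style controller
            if compensate and abs(v) > 0.05:
                u -= v / 0.3                          # nonlinear controller: cancel v
        return abs(setpoint - y)                      # steady-state tracking error

    print(simulate(True) < simulate(False))  # True: switching reduces the error
    ```

    The threshold plays the role of the switching logic: when the linear model tracks well, the simpler controller acts alone; when the mismatch is influential, the compensating controller takes over.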

  15. A data driven nonlinear stochastic model for blood glucose dynamics.

    PubMed

    Zhang, Yan; Holt, Tim A; Khovanova, Natalia

    2016-03-01

    The development of adequate mathematical models for blood glucose dynamics may improve early diagnosis and control of diabetes mellitus (DM). We have developed a stochastic nonlinear second order differential equation to describe the response of blood glucose concentration to food intake using continuous glucose monitoring (CGM) data. A variational Bayesian learning scheme was applied to define the number and values of the system's parameters by iterative optimisation of free energy. The model has the minimal order and number of parameters needed to successfully describe blood glucose dynamics in people with and without DM. The model accounts for the nonlinearity and stochasticity of the underlying glucose-insulin dynamic process. Being data-driven, it takes full advantage of available CGM data and, at the same time, reflects the intrinsic characteristics of the glucose-insulin system without detailed knowledge of the physiological mechanisms. We have shown that the dynamics of some postprandial blood glucose excursions can be described by a reduced (linear) model, previously seen in the literature. A comprehensive analysis demonstrates that deterministic system parameters belong to different ranges for people with diabetes and controls. Implications for clinical practice are discussed. This is the first study introducing a continuous data-driven nonlinear stochastic model capable of describing both DM and non-DM profiles. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
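    A damped second-order stochastic differential equation of the general kind described can be integrated with Euler-Maruyama; the form and every parameter value below are illustrative stand-ins, not the paper's fitted model.

    ```python
    import math, random

    def glucose_response(meal, steps=600, dt=0.01, zeta=0.5, omega=1.0,
                         sigma=0.05, seed=1):
        """Euler-Maruyama integration of a damped second-order SDE,
            g'' + 2*zeta*omega*g' + omega**2 * g = meal(t) + noise,
        where g is the glucose deviation from baseline (illustrative units)."""
        rng = random.Random(seed)
        g = dg = 0.0
        trace = []
        for k in range(steps):
            ddg = meal(k * dt) - 2.0 * zeta * omega * dg - omega**2 * g
            dg += ddg * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            g += dg * dt
            trace.append(g)
        return trace

    # A brief carbohydrate input just after t = 0 yields a rise-and-decay excursion
    trace = glucose_response(lambda t: 5.0 if t < 0.5 else 0.0)
    ```

    With the damping term dominating, the excursion decays back toward baseline, which is the qualitative behavior the reduced linear model captures; the noise term is what distinguishes the stochastic formulation from a deterministic fit.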

  16. Adaptive Impact-Driven Detection of Silent Data Corruption for HPC Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Di, Sheng; Cappello, Franck

    For exascale HPC applications, silent data corruption (SDC) is one of the most dangerous problems because there is no indication that there are errors during the execution. We propose an adaptive impact-driven method that can detect SDCs dynamically. The key contributions are threefold. (1) We carefully characterize 18 real-world HPC applications and discuss the runtime data features, as well as the impact of SDCs on their execution results. (2) We propose an impact-driven detection model that does not blindly improve the prediction accuracy, but instead detects only influential SDCs to guarantee user-acceptable execution results. (3) Our solution can adapt to dynamic prediction errors based on local runtime data and can automatically tune detection ranges to guarantee low false-alarm rates. Experiments show that our detector can detect 80-99.99% of SDCs with a false alarm rate of less than 1% of iterations in most cases. The memory cost and detection overhead are reduced to 15% and 6.3%, respectively, for a large majority of applications.
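    The impact-driven idea (predict each value from its recent history and flag only deviations large enough to matter, with a detection range that adapts to normal prediction errors) can be sketched as a one-step extrapolating detector. The extrapolation order, margins, and data below are illustrative, not the paper's tuned detector.

    ```python
    def detect_sdc(series, base_margin=0.1):
        """Flag step k if the value falls outside [pred - r, pred + r], where
        pred is a last-two-points linear extrapolation and the radius r adapts
        to recent prediction errors."""
        flags = []
        radius = base_margin
        for k in range(2, len(series)):
            pred = 2 * series[k - 1] - series[k - 2]   # linear extrapolation
            err = abs(series[k] - pred)
            if err > radius:
                flags.append(k)                         # suspected corruption
            else:
                radius = max(base_margin, 2 * err)      # adapt range to normal errors
        return flags

    smooth = [0.1 * k for k in range(20)]
    smooth[12] += 5.0                                   # inject a corruption
    print(detect_sdc(smooth))  # [12, 13, 14]: the corruption plus its extrapolation shadow
    ```

    Because the extrapolation itself uses the corrupted point, the two following steps are also flagged; a production detector would replace a flagged value with its prediction before moving on.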

  17. Data-driven diagnostics of terrestrial carbon dynamics over North America

    Treesearch

    Jingfeng Xiao; Scott V. Ollinger; Steve Frolking; George C. Hurtt; David Y. Hollinger; Kenneth J. Davis; Yude Pan; Xiaoyang Zhang; Feng Deng; Jiquan Chen; Dennis D. Baldocchi; Bevery E. Law; M. Altaf Arain; Ankur R. Desai; Andrew D. Richardson; Ge Sun; Brian Amiro; Hank Margolis; Lianhong Gu; Russell L. Scott; Peter D. Blanken; Andrew E. Suyker

    2014-01-01

    The exchange of carbon dioxide is a key measure of ecosystem metabolism and a critical intersection between the terrestrial biosphere and the Earth's climate. Despite the general agreement that the terrestrial ecosystems in North America provide a sizeable carbon sink, the size and distribution of the sink remain uncertain. We use a data-driven approach to upscale...

  18. Parameterized data-driven fuzzy model based optimal control of a semi-batch reactor.

    PubMed

    Kamesh, Reddi; Rani, K Yamuna

    2016-09-01

    A parameterized data-driven fuzzy (PDDF) model structure is proposed for semi-batch processes, and its application to optimal control is illustrated. The orthonormally parameterized input trajectories, initial states and process parameters are the inputs to the model, which predicts the output trajectories in terms of Fourier coefficients. Fuzzy rules are formulated based on the signs of a linear data-driven model, while the defuzzification step incorporates a linear regression model to shift from the input domain to the output domain. The fuzzy model is employed to formulate an optimal control problem for single-rate as well as multi-rate systems. A simulation study on a multivariable semi-batch reactor system reveals that the proposed PDDF modeling approach captures the nonlinear and time-varying behavior inherent in the semi-batch system fairly accurately. The results of operating-trajectory optimization using the proposed model are comparable to those obtained using the exact first-principles model, and comparable to or better than results based on a parameterized data-driven artificial neural network model. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  19. Data Science and its Relationship to Big Data and Data-Driven Decision Making.

    PubMed

    Provost, Foster; Fawcett, Tom

    2013-03-01

    Companies have realized they need to hire data scientists, academic institutions are scrambling to put together data-science programs, and publications are touting data science as a hot, even "sexy," career choice. However, there is confusion about what exactly data science is, and this confusion could lead to disillusionment as the concept diffuses into meaningless buzz. In this article, we argue that there are good reasons why it has been hard to pin down exactly what data science is. One reason is that data science is intricately intertwined with other concepts of growing importance, such as big data and data-driven decision making. Another reason is the natural tendency to associate what a practitioner does with the definition of the practitioner's field; this can result in overlooking the fundamentals of the field. We believe that trying to define the boundaries of data science precisely is not of the utmost importance. We can debate the boundaries of the field in an academic setting, but in order for data science to serve business effectively, it is important (i) to understand its relationships to other important related concepts, and (ii) to begin to identify the fundamental principles underlying data science. Once we embrace (ii), we can much better understand and explain exactly what data science has to offer. Furthermore, only once we embrace (ii) should we be comfortable calling it data science. In this article, we present a perspective that addresses all these concepts. We close by offering, as examples, a partial list of fundamental principles underlying data science.

  20. Solid object visualization of 3D ultrasound data

    NASA Astrophysics Data System (ADS)

    Nelson, Thomas R.; Bailey, Michael J.

    2000-04-01

    Visualization of volumetric medical data is challenging. Rapid-prototyping (RP) equipment that produces solid prototype models of computer-generated structures is directly applicable to visualization of medical anatomic data. The purpose of this study was to develop methods for transferring 3D Ultrasound (3DUS) data to RP equipment for visualization of patient anatomy. 3DUS data were acquired using research and clinical scanning systems. Scaling information was preserved, and the data were segmented using threshold and local operators to extract features of interest, converted from voxel raster-coordinate format to a set of polygons representing an iso-surface, and transferred to the RP machine to create a solid 3D object. Fabrication required 30 to 60 minutes depending on object size and complexity. After creation, the model could be touched and viewed. A '3D visualization hardcopy device' has advantages for conveying spatial relations compared to visualization on computer display systems. The hardcopy model may be used for teaching or therapy planning. Objects may be produced at the exact dimensions of the original object or scaled up (or down) to match the viewer's reference frame more closely. RP models represent a useful means of communicating important information in a tangible fashion to patients and physicians.

  1. Synchronization of autonomous objects in discrete event simulation

    NASA Technical Reports Server (NTRS)

    Rogers, Ralph V.

    1990-01-01

    Autonomous objects in event-driven discrete event simulation offer the potential to combine the freedom of unrestricted movement and positional accuracy through Euclidean space of time-driven models with the computational efficiency of event-driven simulation. The principal challenge to autonomous object implementation is object synchronization. The concept of a spatial blackboard is offered as a potential methodology for synchronization. The issues facing implementation of a spatial blackboard are outlined and discussed.

  2. Model Driven Engineering

    NASA Astrophysics Data System (ADS)

    Gaševic, Dragan; Djuric, Dragan; Devedžic, Vladan

    A relevant initiative from the software engineering community called Model Driven Engineering (MDE) is being developed in parallel with the Semantic Web (Mellor et al. 2003a). The MDE approach to software development suggests that one should first develop a model of the system under study, which is then transformed into the real thing (i.e., an executable software entity). The most important research initiative in this area is the Model Driven Architecture (MDA), which is being developed under the umbrella of the Object Management Group (OMG). This chapter describes the basic concepts of this software engineering effort.

  3. Extraction and classification of 3D objects from volumetric CT data

    NASA Astrophysics Data System (ADS)

    Song, Samuel M.; Kwon, Junghyun; Ely, Austin; Enyeart, John; Johnson, Chad; Lee, Jongkyu; Kim, Namho; Boyd, Douglas P.

    2016-05-01

    We propose an Automatic Threat Detection (ATD) algorithm for Explosive Detection Systems (EDS) using our multi-stage Segmentation and Carving (SC) approach followed by a Support Vector Machine (SVM) classifier. The multi-stage SC step extracts all suspect 3-D objects. A feature vector is then constructed for each extracted object, and the feature vector is classified by an SVM previously trained using a set of ground-truth threat and benign objects. The trained SVM classifier has been shown to be effective in classifying different types of threat materials. The proposed ATD algorithm robustly deals with CT data that are prone to artifacts due to scatter and beam hardening, as well as other systematic idiosyncrasies of the CT data. Furthermore, the proposed ATD algorithm is amenable to including newly emerging threat materials as well as to accommodating data from newly developing sensor technologies. Efficacy of the proposed ATD algorithm with the SVM classifier is demonstrated by the Receiver Operating Characteristic (ROC) curve, which relates Probability of Detection (PD) to Probability of False Alarm (PFA). Tests performed using CT data of passenger bags show excellent performance characteristics.

  4. An evaluation of data-driven motion estimation in comparison to the usage of external-surrogates in cardiac SPECT imaging

    PubMed Central

    Mukherjee, Joyeeta Mitra; Hutton, Brian F; Johnson, Karen L; Pretorius, P Hendrik; King, Michael A

    2014-01-01

    Motion estimation methods in single photon emission computed tomography (SPECT) can be classified into methods that depend on just the emission data (data-driven) and those that use some other source of information, such as an external surrogate. The surrogate-based methods estimate the motion exhibited externally, which may not correlate exactly with the movement of organs inside the body. The accuracy of data-driven strategies, on the other hand, is affected by the type and timing of motion occurrence during acquisition, the source distribution, and various degrading factors such as attenuation, scatter, and system spatial resolution. The goal of this paper is to investigate the performance of two data-driven motion estimation schemes based on the rigid-body registration of projections of motion-transformed source distributions to the acquired projection data for cardiac SPECT studies. Comparison is also made of six intensity-based registration metrics to an external surrogate-based method. In the data-driven schemes, a partially reconstructed heart is used as the initial source distribution. The partially reconstructed heart has inaccuracies due to limited-angle artifacts resulting from using only a part of the SPECT projections acquired while the patient maintained the same pose. The performance of different cost functions in quantifying consistency with the SPECT projection data in the data-driven schemes was compared for clinically realistic patient motion occurring as discrete pose changes, one or two times during acquisition. The six intensity-based metrics studied were mean-squared difference (MSD), mutual information (MI), normalized mutual information (NMI), pattern intensity (PI), normalized cross-correlation (NCC) and entropy of the difference (EDI). Quantitative and qualitative analysis of the performance is reported using Monte-Carlo simulations of a realistic heart phantom including degradation factors such as attenuation, scatter, and system spatial resolution.
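    Two of the six registration metrics are easy to state concretely. The sketch below implements mean-squared difference and normalized cross-correlation for arbitrary image arrays (mutual information additionally requires joint intensity histograms); the reference image is a toy array, not SPECT data.

    ```python
    import numpy as np

    def msd(a, b):
        """Mean-squared difference between two images/projections (0 = identical)."""
        return float(np.mean((a - b) ** 2))

    def ncc(a, b):
        """Normalized cross-correlation: 1.0 means perfectly linearly related."""
        a0, b0 = a - a.mean(), b - b.mean()
        return float((a0 * b0).sum() / np.sqrt((a0**2).sum() * (b0**2).sum()))

    ref = np.arange(16.0).reshape(4, 4)
    print(msd(ref, ref), ncc(ref, 2 * ref + 1))  # 0.0 1.0
    ```

    The contrast between the two is the practical point: MSD penalizes any intensity difference, while NCC is invariant to affine intensity changes, which matters when the partially reconstructed heart has different intensity scaling than the acquired projections.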

  5. Fault Detection for Nonlinear Process With Deterministic Disturbances: A Just-In-Time Learning Based Data Driven Method.

    PubMed

    Yin, Shen; Gao, Huijun; Qiu, Jianbin; Kaynak, Okyay

    2017-11-01

    Data-driven fault detection plays an important role in industrial systems due to its applicability when physical models are unknown. In fault detection, disturbances must be taken into account as an inherent characteristic of processes. Nevertheless, fault detection for nonlinear processes with deterministic disturbances still receives little attention, especially in the data-driven field. To solve this problem, a just-in-time learning-based data-driven (JITL-DD) fault detection method for nonlinear processes with deterministic disturbances is proposed in this paper. JITL-DD employs a JITL scheme for process description with local model structures to cope with process dynamics and nonlinearity. The proposed method provides a data-driven fault detection solution for nonlinear processes with deterministic disturbances, and offers inherent online adaptation and high fault detection accuracy. Two nonlinear systems, i.e., a numerical example and a sewage treatment process benchmark, are employed to show the effectiveness of the proposed method.
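    The just-in-time learning core is simple to sketch: rather than fitting one global model, select the stored samples nearest to the current query and fit a local model on the spot. Everything below is illustrative (the paper's JITL-DD adds a residual-based fault detection layer on top of such local models).

    ```python
    import numpy as np

    def jitl_predict(X, y, query, k=8):
        """Fit a local affine model on the k nearest stored samples and predict."""
        idx = np.argsort(np.linalg.norm(X - query, axis=1))[:k]
        A = np.hstack([X[idx], np.ones((k, 1))])       # local affine regressors
        coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
        return float(np.append(query, 1.0) @ coef)

    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, (200, 2))
    y = np.sin(X[:, 0]) + X[:, 1] ** 2                 # stand-in nonlinear process
    query = np.array([0.2, 0.3])
    residual = abs(jitl_predict(X, y, query) - (np.sin(0.2) + 0.09))
    # In a fault detector, a residual far above the normal level flags a fault
    ```

    Because the model is rebuilt at every query from recent data, adaptation to slow process drift comes for free, which is the "inherent online adaptation" the abstract refers to.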

  6. Investigation and Development of Data-Driven D-Region Model for HF Systems Impacts

    NASA Technical Reports Server (NTRS)

    Eccles, J. V.; Rice, D.; Sojka, J. J.; Hunsucker, R. D.

    2002-01-01

    Space Environment Corporation (SEC) and RP Consultants (RPC) are to develop and validate a weather-capable D-region model for making High Frequency (HF) absorption predictions in support of the HF communications and radar communities. The weather-capable model will assimilate solar and earth-space observations from NASA satellites. The model will account for solar-induced impacts on HF absorption, including X-rays, Solar Proton Events (SPEs), and auroral precipitation. The work plan includes: 1. Optimize the D-region model to quickly obtain ion and electron densities for proper HF absorption calculations. 2. Develop indices-driven modules for D-region ionization sources for low, mid, and high latitudes, including X-rays, cosmic rays, auroral precipitation, and solar protons. (Note: solar spectrum and auroral modules already exist.) 3. Set up low-cost monitors of existing HF beacons and add one single-frequency beacon. 4. Use the PENEX HF-link database with HF monitor data to validate the D-region/HF absorption model using climatological ionization drivers. 5. Develop algorithms to assimilate NASA satellite data of solar, interplanetary, and auroral observations into the ionization source modules. 6. Use PENEX HF-link and HF-beacon data for skill-score comparison of the assimilation versus climatological D-region/HF absorption model. Only some satellites are available for the PENEX time period; thus, HF-beacon data are necessary. 7. Use the HF beacon monitors to develop HF-link data assimilation algorithms for regional improvement of the D-region/HF absorption model.

  7. Data-Driven Design of Intelligent Wireless Networks: An Overview and Tutorial.

    PubMed

    Kulin, Merima; Fortuna, Carolina; De Poorter, Eli; Deschrijver, Dirk; Moerman, Ingrid

    2016-06-01

    Data science or "data-driven research" is a research approach that uses real-life data to gain insight into the behavior of systems. It enables the analysis of small and simple as well as large and more complex systems in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware-specific influences. These interactions can lead to a difference between real-world functioning and design-time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; (v) provides the reader with the necessary datasets and scripts to go through the tutorial steps themselves.

  8. Data-Driven Design of Intelligent Wireless Networks: An Overview and Tutorial

    PubMed Central

    Kulin, Merima; Fortuna, Carolina; De Poorter, Eli; Deschrijver, Dirk; Moerman, Ingrid

    2016-01-01

    Data science or “data-driven research” is a research approach that uses real-life data to gain insight into the behavior of systems. It enables the analysis of small and simple as well as large and more complex systems in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware-specific influences. These interactions can lead to a difference between real-world functioning and design-time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; (v) provides the reader with the necessary datasets and scripts to go through the tutorial steps themselves. PMID:27258286

  9. An object-oriented approach to nested data parallelism

    NASA Technical Reports Server (NTRS)

    Sheffler, Thomas J.; Chatterjee, Siddhartha

    1994-01-01

    This paper describes an implementation technique for integrating nested data parallelism into an object-oriented language. Data-parallel programming employs sets of data called 'collections' and expresses parallelism as operations performed over the elements of a collection. When the elements of a collection are themselves collections, there is the possibility of 'nested data parallelism.' However, few current programming languages support nested data parallelism. In an object-oriented framework, a collection is a single object. Its type defines the parallel operations that may be applied to it. Our goal is to design and build an object-oriented data-parallel programming environment supporting nested data parallelism. Our initial approach is built upon three fundamental additions to C++. We add new parallel base types by implementing them as classes, and add a new parallel collection type called a 'vector' that is implemented as a template. Only one new language feature is introduced: the 'foreach' construct, which is the basis for exploiting elementwise parallelism over collections. The strength of the method lies in the compilation strategy, which translates nested data-parallel C++ into ordinary C++. Extracting the potential parallelism in nested 'foreach' constructs is called 'flattening' nested parallelism. We show how to flatten 'foreach' constructs using a simple program transformation. Our prototype system produces vector code which has been successfully run on workstations, a CM-2, and a CM-5.
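    The flattening idea can be illustrated with a segmented representation: a nested collection becomes one flat value array plus segment lengths, so an inner 'foreach' maps onto a single elementwise pass and an outer reduction onto a segmented reduction. The snippet uses NumPy rather than the paper's C++, and the toy data is invented.

    ```python
    import numpy as np

    # A "vector of vectors" stored flat with per-segment lengths
    nested = [[1, 2, 3], [4], [5, 6]]
    values = np.array([x for seg in nested for x in seg])
    lengths = np.array([len(seg) for seg in nested])

    # Inner foreach (square every element) becomes one flat elementwise operation
    squared = values ** 2

    # Outer foreach (sum each inner vector) becomes a segmented reduction
    ends = np.cumsum(lengths)
    starts = np.concatenate([[0], ends[:-1]])
    sums = np.add.reduceat(squared, starts)
    print(sums.tolist())  # [14, 16, 61]
    ```

    Edge cases such as empty inner collections need extra bookkeeping around `reduceat`, which is exactly the kind of detail a flattening compiler handles automatically.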

  10. Toward Data-Driven Design of Educational Courses: A Feasibility Study

    ERIC Educational Resources Information Center

    Agrawal, Rakesh; Golshan, Behzad; Papalexakis, Evangelos

    2016-01-01

    A study plan is the choice of concepts and the organization and sequencing of the concepts to be covered in an educational course. While a good study plan is essential for the success of any course offering, the design of study plans currently remains largely a manual task. We present a novel data-driven method, which given a list of concepts can…

  11. Data-Driven Modeling and Rendering of Force Responses from Elastic Tool Deformation

    PubMed Central

    Rakhmatov, Ruslan; Ogay, Tatyana; Jeon, Seokhee

    2018-01-01

    This article presents a new data-driven model design for rendering force responses from elastic tool deformation. The new design incorporates a six-dimensional input describing the initial position of the contact as well as the state of the tool deformation. The input-output relationship of the model was represented by a radial basis function network, which was optimized based on training data collected from real tool-surface contact. Since the input space of the model is represented in the local coordinate system of a tool, the model is independent of recording and rendering devices and can be easily deployed to an existing simulator. The model also supports complex interactions, such as self-collisions and multi-contact collisions. In order to assess the proposed data-driven model, we built a custom data acquisition setup and developed a proof-of-concept rendering simulator. The simulator was evaluated through numerical and psychophysical experiments with four different real tools. The numerical evaluation demonstrated the soundness of the proposed model, while the user study showed the force feedback of the proposed simulator to be realistic. PMID:29342964
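    Fitting a radial basis function network to recorded input-output pairs reduces to a linear least-squares problem once the centers and widths are fixed. The sketch below uses a 2-D toy input and an invented target function standing in for the recorded force response (the paper's model uses a six-dimensional input).

    ```python
    import numpy as np

    def gaussian_design(X, centers, width):
        """Matrix of Gaussian RBF activations, Phi[i, c] = exp(-|x_i - c|^2 / 2w^2)."""
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * width**2))

    def fit_rbf(X, y, centers, width):
        """Least-squares weights for y ≈ Phi(X) @ w."""
        w, *_ = np.linalg.lstsq(gaussian_design(X, centers, width), y, rcond=None)
        return w

    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 1.0, (300, 2))                # toy contact states
    y = np.exp(-5.0 * ((X[:, 0] - 0.5) ** 2 + (X[:, 1] - 0.5) ** 2))  # stand-in force
    centers = np.array([[i / 4.0, j / 4.0] for i in range(5) for j in range(5)])
    w = fit_rbf(X, y, centers, width=0.3)
    fit_error = np.abs(gaussian_design(X, centers, 0.3) @ w - y).max()
    ```

    At rendering time, evaluating the network is a single matrix-vector product per frame, which is what makes this kind of model fast enough for haptic update rates.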

  12. NextGEOSS project: A user-driven approach to build an Earth Observations Data Hub

    NASA Astrophysics Data System (ADS)

    Percivall, G.; Voidrot, M. F.; Bye, B. L.; De Lathouwer, B.; Catarino, N.; Concalves, P.; Kraft, C.; Grosso, N.; Meyer-Arnek, J.; Mueller, A.; Goor, E.

    2017-12-01

    Several initiatives and projects contribute to supporting the Group on Earth Observations' (GEO) global priorities, including support for the UN 2030 Agenda for Sustainable Development, the Paris Agreement on climate change, and the Sendai Framework for Disaster Risk Reduction. Running until 2020, the NextGEOSS project evolves the European vision of user-driven GEOSS data exploitation for innovation and business, relying on three main pillars: engaging communities of practice, delivering technological advancements, and advocating the use of GEOSS. These three pillars support the creation and deployment of Earth-observation-based innovative research activities and commercial services. In this presentation we will emphasise how the NextGEOSS project uses a pilot-driven approach to ramp up and consolidate the system in a pragmatic way, integrating the complexity of the existing global ecosystem, leveraging previous investments, adding new cloud technologies and resources, and engaging the diverse communities to address all types of Sustainable Development Goals (SDGs). A set of 10 initial pilots has been defined by the project partners to address the main challenges and to contribute as soon as possible to SDGs associated with Food Sustainability, Biodiversity, Space and Security, Cold Regions, Air Pollution, Disaster Risk Reduction, Territorial Planning, and Energy. In 2018 and 2019 the project team will work on two new series of Architecture Implementation Pilots (AIP-10 and AIP-11), open worldwide, to increase discoverability, accessibility, and usability of data, with a strong user-centric approach for innovative GEOSS-powered applications for multiple societal areas. All initiatives with an interest in and need of Earth observations (data, processes, models, ...) are welcome to participate in these pilot initiatives. NextGEOSS is a H2020 Research and Development project funded by the European Community under grant agreement 730329.

  13. Mechanisms of object recognition: what we have learned from pigeons

    PubMed Central

    Soto, Fabian A.; Wasserman, Edward A.

    2014-01-01

    Behavioral studies of object recognition in pigeons have been conducted for 50 years, yielding a large body of data. Recent work has been directed toward synthesizing this evidence and understanding the visual, associative, and cognitive mechanisms that are involved. The outcome is that pigeons are likely to be the non-primate species for which the computational mechanisms of object recognition are best understood. Here, we review this research and suggest that a core set of mechanisms for object recognition might be present in all vertebrates, including pigeons and people, making pigeons an excellent candidate model to study the neural mechanisms of object recognition. Behavioral and computational evidence suggests that error-driven learning participates in object category learning by pigeons and people, and recent neuroscientific research suggests that the basal ganglia, which are homologous in these species, may implement error-driven learning of stimulus-response associations. Furthermore, learning of abstract category representations can be observed in pigeons and other vertebrates. Finally, there is evidence that feedforward visual processing, a central mechanism in models of object recognition in the primate ventral stream, plays a role in object recognition by pigeons. We also highlight differences between pigeons and people in object recognition abilities, and propose candidate adaptive specializations which may explain them, such as holistic face processing and rule-based category learning in primates. From a modern comparative perspective, such specializations are to be expected regardless of the model species under study. The fact that we have a good idea of which aspects of object recognition differ in people and pigeons should be seen as an advantage over other animal models. From this perspective, we suggest that there is much to learn about human object recognition from studying the “simple” brains of pigeons. PMID:25352784

  14. Data-driven parameterization of the generalized Langevin equation

    DOE PAGES

    Lei, Huan; Baker, Nathan A.; Li, Xiantao

    2016-11-29

    We present a data-driven approach to determine the memory kernel and random noise of the generalized Langevin equation. To facilitate practical implementations, we parameterize the kernel function in the Laplace domain by a rational function, with coefficients directly linked to the equilibrium statistics of the coarse-grained variables. Further, we show that such an approximation can be constructed to arbitrarily high order. Within these approximations, the generalized Langevin dynamics can be embedded in an extended stochastic model without memory. We demonstrate how to introduce the stochastic noise so that the fluctuation-dissipation theorem is exactly satisfied.
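
    The memoryless embedding this record describes can be illustrated with a minimal sketch. The example assumes a single exponential kernel K(t) = (gamma/tau)·exp(-t/tau), for which one auxiliary variable suffices; the paper's rational-function parameterization generalizes this to higher order.

```python
import random

# Extended, memoryless form of a GLE with exponential kernel (illustrative):
#   dv/dt = z
#   dz/dt = -(gamma/tau) v - z/tau + sqrt(2 kT gamma)/tau * xi(t)
# The auxiliary variable z carries both the memory force and the colored
# noise, so the fluctuation-dissipation theorem holds by construction.
def simulate_gle(gamma=1.0, tau=0.5, kT=1.0, dt=0.01, steps=200_000, seed=1):
    rng = random.Random(seed)
    sigma = (2.0 * kT * gamma) ** 0.5 / tau   # noise amplitude on z
    v, z = 0.0, 0.0
    vsq = []
    for n in range(steps):
        v, z = (v + z * dt,
                z + (-(gamma / tau) * v - z / tau) * dt
                  + sigma * dt ** 0.5 * rng.gauss(0.0, 1.0))
        if n > steps // 10:                    # discard burn-in
            vsq.append(v * v)
    return sum(vsq) / len(vsq)                 # estimate of <v^2>

mean_v2 = simulate_gle()  # approaches kT = 1 at equilibrium
```

    With the noise amplitude on z chosen as sqrt(2·kT·gamma)/tau, the stationary velocity variance recovers kT, which is exactly the fluctuation-dissipation condition the paper enforces.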

  15. An object-oriented data reduction system in Fortran

    NASA Technical Reports Server (NTRS)

    Bailey, J.

    1992-01-01

    A data reduction system for the AAO two-degree field project is being developed using an object-oriented approach. Rather than use an object-oriented language (such as C++) the system is written in Fortran and makes extensive use of existing subroutine libraries provided by the UK Starlink project. Objects are created using the extensible N-dimensional Data Format (NDF) which itself is based on the Hierarchical Data System (HDS). The software consists of a class library, with each class corresponding to a Fortran subroutine with a standard calling sequence. The methods of the classes provide operations on NDF objects at a similar level of functionality to the applications of conventional data reduction systems. However, because they are provided as callable subroutines, they can be used as building blocks for more specialist applications. The class library is not dependent on a particular software environment, though it can be used effectively in ADAM applications. It can also be used from standalone Fortran programs. It is intended to develop a graphical user interface for use with the class library to form the 2dF data reduction system.

  16. Ability Grouping and Differentiated Instruction in an Era of Data-Driven Decision Making

    ERIC Educational Resources Information Center

    Park, Vicki; Datnow, Amanda

    2017-01-01

    Despite data-driven decision making being a ubiquitous part of policy and school reform efforts, little is known about how teachers use data for instructional decision making. Drawing on data from a qualitative case study of four elementary schools, we examine the logic and patterns of teacher decision making about differentiation and ability…

  17. Data-driven modeling, control and tools for cyber-physical energy systems

    NASA Astrophysics Data System (ADS)

    Behl, Madhur

    Energy systems are experiencing a gradual but substantial change in moving away from being non-interactive and manually-controlled systems to utilizing tight integration of both cyber (computation, communications, and control) and physical representations guided by first principles based models, at all scales and levels. Furthermore, peak power reduction programs like demand response (DR) are becoming increasingly important as the volatility on the grid continues to increase due to regulation, integration of renewables and extreme weather conditions. In order to shield themselves from the risk of price volatility, end-user electricity consumers must monitor electricity prices and be flexible in the ways they choose to use electricity. This requires the use of control-oriented predictive models of an energy system's dynamics and energy consumption. Such models are needed for understanding and improving the overall energy efficiency and operating costs. However, learning dynamical models using grey/white box approaches is very cost and time prohibitive since it often requires significant financial investments in retrofitting the system with several sensors and hiring domain experts for building the model. We present the use of data-driven methods for making model capture easy and efficient for cyber-physical energy systems. We develop Model-IQ, a methodology for analysis of uncertainty propagation for building inverse modeling and controls. Given a grey-box model structure and real input data from a temporary set of sensors, Model-IQ evaluates the effect of the uncertainty propagation from sensor data to model accuracy and to closed-loop control performance. We also developed a statistical method to quantify the bias in the sensor measurement and to determine near optimal sensor placement and density for accurate data collection for model training and control. Using a real building test-bed, we show how performing an uncertainty analysis can reveal trends about

  18. City Connects Prompts Data-Driven Action in Community Schools in the Bronx

    ERIC Educational Resources Information Center

    Haywoode, Alyssa

    2018-01-01

    Community schools have a long history of helping students succeed in school by addressing the problems they face outside of school. But without specific data on students and the full range of their needs, community schools cannot be as effective as they would like to be. Driven by the desire to make more data-informed decisions, the Children's Aid…

  19. BMI cyberworkstation: enabling dynamic data-driven brain-machine interface research through cyberinfrastructure.

    PubMed

    Zhao, Ming; Rattanatamrong, Prapaporn; DiGiovanna, Jack; Mahmoudi, Babak; Figueiredo, Renato J; Sanchez, Justin C; Príncipe, José C; Fortes, José A B

    2008-01-01

    Dynamic data-driven brain-machine interfaces (DDDBMI) have great potential to advance the understanding of neural systems and improve the design of brain-inspired rehabilitative systems. This paper presents a novel cyberinfrastructure that couples in vivo neurophysiology experimentation with massive computational resources to provide seamless and efficient support of DDDBMI research. Closed-loop experiments can be conducted with in vivo data acquisition, reliable network transfer, parallel model computation, and real-time robot control. Behavioral experiments with live animals are supported with real-time guarantees. Offline studies can be performed with various configurations for extensive analysis and training. A Web-based portal is also provided to allow users to conveniently interact with the cyberinfrastructure, conducting both experimentation and analysis. New motor control models are developed based on this approach, which include recursive least square based (RLS) and reinforcement learning based (RLBMI) algorithms. The results from an online RLBMI experiment show that the cyberinfrastructure can successfully support DDDBMI experiments and meet the desired real-time requirements.

  20. Effectiveness of User- and Expert-Driven Web-based Hypertension Programs: an RCT.

    PubMed

    Liu, Sam; Brooks, Dina; Thomas, Scott G; Eysenbach, Gunther; Nolan, Robert P

    2018-04-01

    The effectiveness of self-guided Internet-based lifestyle counseling (e-counseling) varies, depending on treatment protocol. Two dominant procedures in e-counseling are expert- and user-driven. The influence of these procedures on hypertension management remains unclear. The objective was to assess whether blood pressure improved with expert-driven or user-driven e-counseling over control intervention in patients with hypertension over a 4-month period. This study used a three-parallel group, double-blind randomized controlled design. In Toronto, Canada, 128 participants (aged 35-74 years) with hypertension were recruited. Participants were recruited using online and poster advertisements. Data collection took place between June 2012 and June 2014. Data were analyzed from October 2014 to December 2016. Controls received a weekly e-mail newsletter regarding hypertension management. The expert-driven group was prescribed a weekly exercise and diet plan (e.g., increase 1,000 steps/day this week). The user-driven group received weekly e-mail, which allowed participants to choose their intervention goals (e.g., [1] feel more confident to change my lifestyle, or [2] self-help tips for exercise or a heart healthy diet). Primary outcome was systolic blood pressure measured at baseline and 4-month follow-up. Secondary outcomes included cholesterol, 10-year Framingham cardiovascular risk, daily steps, and dietary habits. Expert-driven groups showed a greater systolic blood pressure decrease than controls at follow-up (expert-driven versus control: -7.5 mmHg, 95% CI= -12.5, -2.6, p=0.01). Systolic blood pressure reduction did not significantly differ between user- and expert-driven. Expert-driven compared with controls also showed a significant improvement in pulse pressure, cholesterol, and Framingham risk score. The expert-driven intervention was significantly more effective than both user-driven and control groups in increasing daily steps and fruit intake. It may be

  1. Pareto fronts for multiobjective optimization design on materials data

    NASA Astrophysics Data System (ADS)

    Gopakumar, Abhijith; Balachandran, Prasanna; Gubernatis, James E.; Lookman, Turab

    Optimizing multiple properties simultaneously is vital in materials design. Here we apply information-driven, statistical optimization strategies blended with machine learning methods, to address multi-objective optimization tasks on materials data. These strategies aim to find the Pareto front consisting of non-dominated data points from a set of candidate compounds with known characteristics. The objective is to find the Pareto front in as few additional measurements or calculations as possible. We show how exploration of the data space to find the front is achieved by using uncertainties in predictions from regression models. We test our proposed design strategies on multiple, independent data sets including those from computations as well as experiments. These include data sets for MAX phases, piezoelectrics and multicomponent alloys.
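
    The non-dominated set referred to above can be computed directly once all candidate properties are known; a minimal sketch (pure Python, assuming every objective is minimized) is:

```python
def pareto_front(points):
    """Return the non-dominated points (minimizing every objective).

    A point p is dominated if some other point q is no worse in every
    objective and strictly better in at least one.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and all(qi <= pi for qi, pi in zip(q, p))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front

# Five candidate compounds, two properties to minimize:
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 6)]
front = pareto_front(candidates)  # (3, 4) and (5, 6) are dominated
```

    The design strategies in this record avoid such exhaustive evaluation: regression-model uncertainties decide which candidate to measure or compute next, so the front is located with few additional measurements.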

  2. Data-driven integration of genome-scale regulatory and metabolic network models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Imam, Saheed; Schauble, Sascha; Brooks, Aaron N.

    Microbes are diverse and extremely versatile organisms that play vital roles in all ecological niches. Understanding and harnessing microbial systems will be key to the sustainability of our planet. One approach to improving our knowledge of microbial processes is through data-driven and mechanism-informed computational modeling. Individual models of biological networks (such as metabolism, transcription, and signaling) have played pivotal roles in driving microbial research through the years. These networks, however, are highly interconnected and function in concert, a fact that has led to the development of a variety of approaches aimed at simulating the integrated functions of two or more network types. Though the task of integrating these different models is fraught with new challenges, the large amounts of high-throughput data sets being generated, and the algorithms being developed, mean that the time is at hand for concerted efforts to build integrated regulatory-metabolic networks in a data-driven fashion. Lastly, in this perspective, we review current approaches for constructing integrated regulatory-metabolic models and outline new strategies for future development of these network models for any microbial system.

  3. Data-driven integration of genome-scale regulatory and metabolic network models

    DOE PAGES

    Imam, Saheed; Schauble, Sascha; Brooks, Aaron N.; ...

    2015-05-05

    Microbes are diverse and extremely versatile organisms that play vital roles in all ecological niches. Understanding and harnessing microbial systems will be key to the sustainability of our planet. One approach to improving our knowledge of microbial processes is through data-driven and mechanism-informed computational modeling. Individual models of biological networks (such as metabolism, transcription, and signaling) have played pivotal roles in driving microbial research through the years. These networks, however, are highly interconnected and function in concert, a fact that has led to the development of a variety of approaches aimed at simulating the integrated functions of two or more network types. Though the task of integrating these different models is fraught with new challenges, the large amounts of high-throughput data sets being generated, and the algorithms being developed, mean that the time is at hand for concerted efforts to build integrated regulatory-metabolic networks in a data-driven fashion. Lastly, in this perspective, we review current approaches for constructing integrated regulatory-metabolic models and outline new strategies for future development of these network models for any microbial system.

  4. Mentat: An object-oriented macro data flow system

    NASA Technical Reports Server (NTRS)

    Grimshaw, Andrew S.; Liu, Jane W. S.

    1988-01-01

    Mentat, an object-oriented macro data flow system designed to facilitate parallelism in distributed systems, is presented. The macro data flow model is a model of computation similar to the data flow model with two principal differences: the computational complexity of the actors is much greater than in traditional data flow systems, and there are persistent actors that maintain state information between executions. Mentat is a system that combines the object-oriented programming paradigm and the macro data flow model of computation. Mentat programs use a dynamic structure called a future list to represent the future of computations.
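
    The future-list idea, using a value only once the producing actor has executed, can be sketched with standard futures (illustrative Python, not Mentat's actual interface):

```python
from concurrent.futures import ThreadPoolExecutor

def scale(xs, k):                 # a coarse-grained "actor"
    return [k * x for x in xs]

def add(a, b):                    # actor consuming two upstream results
    return [x + y for x, y in zip(a, b)]

with ThreadPoolExecutor() as pool:
    f1 = pool.submit(scale, [1, 2, 3], 2)   # independent actors fire in parallel
    f2 = pool.submit(scale, [1, 2, 3], 3)
    # Downstream actor blocks on its "future list" (f1, f2) only when needed.
    f3 = pool.submit(lambda: add(f1.result(), f2.result()))
    result = f3.result()
```

    As in macro data flow, each actor does substantial work per firing, and the dependency graph, not an explicit schedule, determines execution order.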

  5. An Overview of the Object Protocol Model (OPM) and the OPM Data Management Tools.

    ERIC Educational Resources Information Center

    Chen, I-Min A.; Markowitz, Victor M.

    1995-01-01

    Discussion of database management tools for scientific information focuses on the Object Protocol Model (OPM) and data management tools based on OPM. Topics include the need for new constructs for modeling scientific experiments, modeling object structures and experiments in OPM, queries and updates, and developing scientific database applications…

  6. Integrating Cognitive Linguistics Insights into Data-Driven Learning: Teaching Vertical Prepositions

    ERIC Educational Resources Information Center

    Kilimci, Abdurrahman

    2017-01-01

    The present study investigates the impact of the integration of the Cognitive Linguistics (CL) pedagogy into Data-driven learning (DDL) on the learners' acquisition of two sets of English spatial prepositions of verticality, "over/under" and "above/below." The study followed a quasi-experimental design with a control and an…

  7. A framework for the automated data-driven constitutive characterization of composites

    Treesearch

    J.G. Michopoulos; John Hermanson; T. Furukawa; A. Iliopoulos

    2010-01-01

    We present advances on the development of a mechatronically and algorithmically automated framework for the data-driven identification of constitutive material models based on energy density considerations. These models can capture both the linear and nonlinear constitutive response of multiaxially loaded composite materials in a manner that accounts for progressive...

  8. Application of Digital Object Identifiers to data sets at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC)

    NASA Astrophysics Data System (ADS)

    Vollmer, B.; Ostrenga, D.; Johnson, J. E.; Savtchenko, A. K.; Shen, S.; Teng, W. L.; Wei, J. C.

    2013-12-01

    Digital Object Identifiers (DOIs) are applied to selected data sets at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC). The DOI system provides an Internet resolution service for unique and persistent identifiers of digital objects. Products assigned DOIs include data from the NASA MEaSUREs Program, the Earth Observing System (EOS) Aqua Atmospheric Infrared Sounder (AIRS) and EOS Aura High Resolution Dynamics Limb Sounder (HIRDLS). DOIs are acquired and registered through EZID, California Digital Library and DataCite. GES DISC hosts a data set landing page associated with each DOI containing information on and access to the data including a recommended data citation when using the product in research or applications. This work includes participation with the earth science community (e.g., Earth Science Information Partners (ESIP) Federation) and the NASA Earth Science Data and Information System (ESDIS) Project to identify, establish and implement best practices for assigning DOIs and managing supporting information, including metadata, for earth science data sets. Future work includes (1) coordination with NASA mission Science Teams and other data providers on the assignment of DOIs for other GES DISC data holdings, particularly for future missions such as Orbiting Carbon Observatory-2 and -3 (OCO-2, OCO-3) and projects (MEaSUREs 2012), (2) construction of landing pages that are both human and machine readable, and (3) pursuing the linking of data and publications with tools such as the Thomson Reuters Data Citation Index.

  9. Data-driven modeling reveals cell behaviors controlling self-organization during Myxococcus xanthus development

    PubMed Central

    Cotter, Christopher R.; Schüttler, Heinz-Bernd; Igoshin, Oleg A.; Shimkets, Lawrence J.

    2017-01-01

    Collective cell movement is critical to the emergent properties of many multicellular systems, including microbial self-organization in biofilms, embryogenesis, wound healing, and cancer metastasis. However, even the best-studied systems lack a complete picture of how diverse physical and chemical cues act upon individual cells to ensure coordinated multicellular behavior. Known for its social developmental cycle, the bacterium Myxococcus xanthus uses coordinated movement to generate three-dimensional aggregates called fruiting bodies. Despite extensive progress in identifying genes controlling fruiting body development, cell behaviors and cell–cell communication mechanisms that mediate aggregation are largely unknown. We developed an approach to examine emergent behaviors that couples fluorescent cell tracking with data-driven models. A unique feature of this approach is the ability to identify cell behaviors affecting the observed aggregation dynamics without full knowledge of the underlying biological mechanisms. The fluorescent cell tracking revealed large deviations in the behavior of individual cells. Our modeling method indicated that decreased cell motility inside the aggregates, a biased walk toward aggregate centroids, and alignment among neighboring cells in a radial direction to the nearest aggregate are behaviors that enhance aggregation dynamics. Our modeling method also revealed that aggregation is generally robust to perturbations in these behaviors and identified possible compensatory mechanisms. The resulting approach of directly combining behavior quantification with data-driven simulations can be applied to more complex systems of collective cell movement without prior knowledge of the cellular machinery and behavioral cues. PMID:28533367

  10. Examining Data Driven Decision Making via Formative Assessment: A Confluence of Technology, Data Interpretation Heuristics and Curricular Policy

    ERIC Educational Resources Information Center

    Swan, Gerry; Mazur, Joan

    2011-01-01

    Although the term data-driven decision making (DDDM) is relatively new (Moss, 2007), the underlying concept of DDDM is not. For example, the practices of formative assessment and computer-managed instruction have historically involved the use of student performance data to guide what happens next in the instructional sequence (Morrison, Kemp, &…

  11. The Orion GN and C Data-Driven Flight Software Architecture for Automated Sequencing and Fault Recovery

    NASA Technical Reports Server (NTRS)

    King, Ellis; Hart, Jeremy; Odegard, Ryan

    2010-01-01

    The Orion Crew Exploration Vehicle (CEV) is being designed to include significantly more automation capability than either the Space Shuttle or the International Space Station (ISS). In particular, the vehicle flight software has requirements to accommodate increasingly automated missions throughout all phases of flight. A data-driven flight software architecture will provide an evolvable automation capability to sequence through Guidance, Navigation & Control (GN&C) flight software modes and configurations while maintaining the required flexibility and human control over the automation. This flexibility is a key aspect needed to address the maturation of operational concepts, to permit ground and crew operators to gain trust in the system and mitigate unpredictability in human spaceflight. To allow for mission flexibility and reconfigurability, a data-driven approach is being taken to load the mission event plan as well as the flight software artifacts associated with the GN&C subsystem. A database of GN&C level sequencing data is presented which manages and tracks the mission specific and algorithm parameters to provide a capability to schedule GN&C events within mission segments. The flight software data schema for performing automated mission sequencing is presented with a concept of operations for interactions with ground and onboard crew members. A prototype architecture for fault identification, isolation and recovery interactions with the automation software is presented and discussed as a forward work item.
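
    The data-driven sequencing concept, with mission-specific transitions held in loadable data rather than compiled code, can be sketched as follows (mode and event names here are hypothetical, not Orion's actual mission segments):

```python
# Hypothetical mission-event table: editing this data, not the flight code,
# changes the GN&C mode sequence.
SEQUENCE = {
    ("COAST", "entry_interface"): "ENTRY",
    ("ENTRY", "chute_deploy_altitude"): "DESCENT",
    ("DESCENT", "touchdown"): "LANDED",
}

def step(mode, event):
    """Advance the GN&C mode; unrecognized events hold the current mode."""
    return SEQUENCE.get((mode, event), mode)
```

    Ground or crew overrides fit the same pattern as additional events in the table, which is one way such an architecture preserves human control over the automation.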

  12. On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models

    NASA Astrophysics Data System (ADS)

    Zheng, Feifei; Maier, Holger R.; Wu, Wenyan; Dandy, Graeme C.; Gupta, Hoshin V.; Zhang, Tuqiao

    2018-02-01

    Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.
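
    The allocation issue can be made concrete with a small sketch. The scheme below (the names and the quantile stratification are illustrative assumptions, not the four formal splitting methods the paper tests) contrasts a purely random allocation with one that forces both subsets to span a skewed distribution:

```python
import random

def random_split(values, frac_cal, rng):
    """Allocate a fraction frac_cal to calibration, the rest to evaluation."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    n_cal = int(len(shuffled) * frac_cal)
    return shuffled[:n_cal], shuffled[n_cal:]

def stratified_split(values, frac_cal, n_strata, rng):
    """Split inside quantile strata so both subsets span low and high flows."""
    ranked = sorted(values)
    size = len(ranked) // n_strata
    cal, ev = [], []
    for i in range(n_strata):
        hi = (i + 1) * size if i < n_strata - 1 else len(ranked)
        c, e = random_split(ranked[i * size:hi], frac_cal, rng)
        cal += c
        ev += e
    return cal, ev

rng = random.Random(42)
flows = [rng.lognormvariate(0.0, 1.5) for _ in range(500)]  # skewed "streamflow"
cal, ev = stratified_split(flows, 0.6, n_strata=5, rng=rng)
```

    For highly skewed runoff records, a purely random split can leave the evaluation subset without any high-flow events; the stratified variant guarantees both subsets sample every quantile band, which is one way the evaluation-performance bias discussed above can be controlled.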

  13. Data-driven Applications for the Sun-Earth System

    NASA Astrophysics Data System (ADS)

    Kondrashov, D. A.

    2016-12-01

    Advances in observational and data mining techniques allow extracting information from the large volume of Sun-Earth observational data that can be assimilated into first principles physical models. However, equations governing Sun-Earth phenomena are typically nonlinear, complex, and high-dimensional. The high computational demand of solving the full governing equations over a large range of scales precludes the use of a variety of useful assimilative tools that rely on applied mathematical and statistical techniques for quantifying uncertainty and predictability. Effective use of such tools requires the development of computationally efficient methods to facilitate fusion of data with models. This presentation will provide an overview of various existing as well as newly developed data-driven techniques adopted from atmospheric and oceanic sciences that proved to be useful for space physics applications, such as computationally efficient implementation of Kalman Filter in radiation belts modeling, solar wind gap-filling by Singular Spectrum Analysis, and low-rank procedure for assimilation of low-altitude ionospheric magnetic perturbations into the Lyon-Fedder-Mobarry (LFM) global magnetospheric model. Reduced-order non-Markovian inverse modeling and novel data-adaptive decompositions of Sun-Earth datasets will be also demonstrated.

  14. Dynamic data driven bidirectional reflectance distribution function measurement system

    NASA Astrophysics Data System (ADS)

    Nauyoks, Stephen E.; Freda, Sam; Marciniak, Michael A.

    2014-09-01

    The bidirectional reflectance distribution function (BRDF) is a fitted distribution function that defines the scatter of light off a surface. The BRDF is dependent on the directions of both the incident and scattered light. Because of the vastness of the measurement space of all possible incident and reflected directions, the calculation of BRDF is usually performed using a minimal amount of measured data. This may lead to poor fits and uncertainty in certain regions of incidence or reflection. A dynamic data-driven application system (DDDAS) is a concept that uses an algorithm on collected data to influence the collection space of future data acquisition. The authors propose a DDD-BRDF algorithm that fits BRDF data as it is being acquired and uses on-the-fly fittings of various BRDF models to adjust the potential measurement space. In doing so, the aim is to find the model that best fits a surface, and the best global fit of the BRDF, with a minimum amount of collection space.
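
    The measurement loop can be sketched in one dimension (the lobe model, parameter grids, and angles below are illustrative assumptions, not the BRDF models used by the authors): fit the candidate models to the data in hand, then steer the next measurement to the angle where the fits disagree most.

```python
import math

def gauss_lobe(theta, amp, width):
    """Toy one-parameter-family reflectance lobe (illustrative model)."""
    return amp * math.exp(-(theta / width) ** 2)

def fit(model, angles, values, param_grid):
    """Brute-force least squares over a small parameter grid."""
    best, best_err = None, float("inf")
    for params in param_grid:
        err = sum((model(t, *params) - v) ** 2 for t, v in zip(angles, values))
        if err < best_err:
            best, best_err = params, err
    return best

def next_angle(candidates, fitted_params, model):
    """DDDAS step: measure next where the current fits disagree the most."""
    def spread(theta):
        preds = [model(theta, *p) for p in fitted_params]
        return max(preds) - min(preds)
    return max(candidates, key=spread)

measured_angles = [0.0, 0.2, 0.4]
measured = [gauss_lobe(t, 1.0, 0.5) for t in measured_angles]  # synthetic data
grid = [(1.0, 0.5), (0.5, 0.5), (1.0, 1.0), (0.8, 0.7)]
best_fit = fit(gauss_lobe, measured_angles, measured, grid)
probe = next_angle([0.1, 0.5, 1.0, 1.5], [(1.0, 0.5), (1.0, 1.0)], gauss_lobe)
```

    Concentrating measurements where candidate models disagree is what lets such a system discriminate between models while keeping the collection space small.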

  15. Retesting the Limits of Data-Driven Learning: Feedback and Error Correction

    ERIC Educational Resources Information Center

    Crosthwaite, Peter

    2017-01-01

    An increasing number of studies have looked at the value of corpus-based data-driven learning (DDL) for second language (L2) written error correction, with generally positive results. However, a potential conundrum for language teachers involved in the process is how to provide feedback on students' written production for DDL. The study looks at…

  16. Neo: an object model for handling electrophysiology data in multiple formats

    PubMed Central

    Garcia, Samuel; Guarino, Domenico; Jaillet, Florent; Jennings, Todd; Pröpper, Robert; Rautenberg, Philipp L.; Rodgers, Chris C.; Sobolev, Andrey; Wachtler, Thomas; Yger, Pierre; Davison, Andrew P.

    2014-01-01

    Neuroscientists use many different software tools to acquire, analyze and visualize electrophysiological signals. However, incompatible data models and file formats make it difficult to exchange data between these tools. This reduces scientific productivity, renders potentially useful analysis methods inaccessible and impedes collaboration between labs. A common representation of the core data would improve interoperability and facilitate data-sharing. To that end, we propose here a language-independent object model, named “Neo,” suitable for representing data acquired from electroencephalographic, intracellular, or extracellular recordings, or generated from simulations. As a concrete instantiation of this object model we have developed an open source implementation in the Python programming language. In addition to representing electrophysiology data in memory for the purposes of analysis and visualization, the Python implementation provides a set of input/output (IO) modules for reading/writing the data from/to a variety of commonly used file formats. Support is included for formats produced by most of the major manufacturers of electrophysiology recording equipment and also for more generic formats such as MATLAB. Data representation and data analysis are conceptually separate: it is easier to write robust analysis code if it is focused on analysis and relies on an underlying package to handle data representation. For that reason, and also to be as lightweight as possible, the Neo object model and the associated Python package are deliberately limited to representation of data, with no functions for data analysis or visualization. Software for neurophysiology data analysis and visualization built on top of Neo automatically gains the benefits of interoperability, easier data sharing and automatic format conversion; there is already a burgeoning ecosystem of such tools. We intend that Neo should become the standard basis for Python tools in neurophysiology

  17. Neo: an object model for handling electrophysiology data in multiple formats.

    PubMed

    Garcia, Samuel; Guarino, Domenico; Jaillet, Florent; Jennings, Todd; Pröpper, Robert; Rautenberg, Philipp L; Rodgers, Chris C; Sobolev, Andrey; Wachtler, Thomas; Yger, Pierre; Davison, Andrew P

    2014-01-01

    Neuroscientists use many different software tools to acquire, analyze and visualize electrophysiological signals. However, incompatible data models and file formats make it difficult to exchange data between these tools. This reduces scientific productivity, renders potentially useful analysis methods inaccessible and impedes collaboration between labs. A common representation of the core data would improve interoperability and facilitate data-sharing. To that end, we propose here a language-independent object model, named "Neo," suitable for representing data acquired from electroencephalographic, intracellular, or extracellular recordings, or generated from simulations. As a concrete instantiation of this object model we have developed an open source implementation in the Python programming language. In addition to representing electrophysiology data in memory for the purposes of analysis and visualization, the Python implementation provides a set of input/output (IO) modules for reading/writing the data from/to a variety of commonly used file formats. Support is included for formats produced by most of the major manufacturers of electrophysiology recording equipment and also for more generic formats such as MATLAB. Data representation and data analysis are conceptually separate: it is easier to write robust analysis code if it is focused on analysis and relies on an underlying package to handle data representation. For that reason, and also to be as lightweight as possible, the Neo object model and the associated Python package are deliberately limited to representation of data, with no functions for data analysis or visualization. Software for neurophysiology data analysis and visualization built on top of Neo automatically gains the benefits of interoperability, easier data sharing and automatic format conversion; there is already a burgeoning ecosystem of such tools. We intend that Neo should become the standard basis for Python tools in neurophysiology.
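
    The containment hierarchy that both Neo records describe can be shown in miniature (a simplified illustration only; Neo's real signal classes are array-based and carry physical units):

```python
class AnalogSignal:
    """A regularly sampled signal plus the metadata needed to interpret it."""
    def __init__(self, samples, sampling_rate_hz, units):
        self.samples = list(samples)
        self.sampling_rate_hz = sampling_rate_hz
        self.units = units

    def duration_s(self):
        return len(self.samples) / self.sampling_rate_hz

class Segment:
    """One recording period: data objects sharing a common clock."""
    def __init__(self, name):
        self.name = name
        self.analogsignals = []

class Block:
    """Top-level container, e.g. one experimental session."""
    def __init__(self, name):
        self.name = name
        self.segments = []

# Representation only -- no analysis or plotting methods, mirroring Neo's
# deliberate restriction of the object model to data representation.
session = Block("session-01")
trial = Segment("trial-1")
trial.analogsignals.append(AnalogSignal(range(20_000), 20_000.0, "mV"))
session.segments.append(trial)
```

    Keeping the object model free of analysis code is the design choice the abstract argues for: analysis and visualization tools built on top of a shared representation gain interoperability and format conversion without coupling to any one file format.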

  18. Double-driven shield capacitive type proximity sensor

    NASA Technical Reports Server (NTRS)

    Vranish, John M. (Inventor)

    1993-01-01

    A capacitance-type proximity sensor comprising a capacitance-type sensor, a capacitance-type reference, and two independent and mutually opposing driven shields respectively adjacent to the sensor and reference and which are coupled in an electrical bridge circuit configuration and driven by a single frequency crystal controlled oscillator is presented. The bridge circuit additionally includes a pair of fixed electrical impedance elements which form adjacent arms of the bridge and which comprise either a pair of precision resistances or capacitors. Detection of bridge unbalance provides an indication of the mutual proximity between an object and the sensor. Drift compensation is also utilized to improve performance and thus increase sensor range and sensitivity.

  19. Data-Driven Engineering of Social Dynamics: Pattern Matching and Profit Maximization

    PubMed Central

    Peng, Huan-Kai; Lee, Hao-Chih; Pan, Jia-Yu; Marculescu, Radu

    2016-01-01

    In this paper, we define a new problem related to social media, namely, the data-driven engineering of social dynamics. More precisely, given a set of observations from the past, we aim at finding the best short-term intervention that can lead to predefined long-term outcomes. Toward this end, we propose a general formulation that covers two useful engineering tasks as special cases, namely, pattern matching and profit maximization. By incorporating a deep learning model, we derive a solution using convex relaxation and quadratic-programming transformation. Moreover, we propose a data-driven evaluation method in place of the expensive field experiments. Using a Twitter dataset, we demonstrate the effectiveness of our dynamics engineering approach for both pattern matching and profit maximization, and study the multifaceted interplay among several important factors of dynamics engineering, such as solution validity, pattern-matching accuracy, and intervention cost. Finally, the method we propose is general enough to work with multi-dimensional time series, so it can potentially be used in many other applications. PMID:26771830

  20. Data-Driven Engineering of Social Dynamics: Pattern Matching and Profit Maximization.

    PubMed

    Peng, Huan-Kai; Lee, Hao-Chih; Pan, Jia-Yu; Marculescu, Radu

    2016-01-01

    In this paper, we define a new problem related to social media, namely, the data-driven engineering of social dynamics. More precisely, given a set of observations from the past, we aim at finding the best short-term intervention that can lead to predefined long-term outcomes. Toward this end, we propose a general formulation that covers two useful engineering tasks as special cases, namely, pattern matching and profit maximization. By incorporating a deep learning model, we derive a solution using convex relaxation and quadratic-programming transformation. Moreover, we propose a data-driven evaluation method in place of the expensive field experiments. Using a Twitter dataset, we demonstrate the effectiveness of our dynamics engineering approach for both pattern matching and profit maximization, and study the multifaceted interplay among several important factors of dynamics engineering, such as solution validity, pattern-matching accuracy, and intervention cost. Finally, the method we propose is general enough to work with multi-dimensional time series, so it can potentially be used in many other applications.

  1. ClinData Express – A Metadata Driven Clinical Research Data Management System for Secondary Use of Clinical Data

    PubMed Central

    Li, Zuofeng; Wen, Jingran; Zhang, Xiaoyan; Wu, Chunxiao; Li, Zuogao; Liu, Lei

    2012-01-01

    To ease the secondary use of clinical data in clinical research, we introduce a metadata-driven, web-based clinical data management system named ClinData Express. ClinData Express is made up of two parts: 1) m-designer, a standalone program for metadata definition; and 2) a web-based data warehouse system for data management. With ClinData Express, all the researchers need to do is define the metadata and data model in m-designer; the web interface for data collection and the database for data storage are then generated automatically. The standards used in the system and the data-export module ensure data reuse. The system has been tested on seven disease data collections in Chinese and one form from dbGaP. Its flexibility gives the system great potential for use in clinical research. The system is available at http://code.google.com/p/clindataexpress. PMID:23304327

  2. Two-Dimensional Study of Mass Outflow from Central Gravitational Astrophysical Object. Analytical 2-D solutions for thermo-radiatively driven stellar winds.

    NASA Astrophysics Data System (ADS)

    Kakouris, A.

    \\phi \\propto \\sin\\mu \\theta / R ) where \\mu is a parameter and R the radial distance. Using these assumptions we derive fully analytical (only a Simpson integration is needed) 2-D solutions of four types (with velocity maximum either along the equator or the polar axis of the central astrophysical object). One of them (named as solution in Range I) exhibits suitable features for stellar wind interpretation with velocity maximum along the equator because the outflow starts subsonic at the stellar surface and terminates supersonic at infinity. The other solutions are subsonic (breeze) or they could be examined only as inflows. The Range I solution is applied to real astrophysical objects. Moreover, the thermally driven 2 - D solutions are extended including the radiative force due to the absorption of the stellar light in the fluid. So, the 2-D solutions represent thermally and radiatively driven flows. The assumptions for the radiative force inclusion are that the radiative acceleration is radial and it is a function of radial distance solely (i.e. it is independent of the velocity). The first radiatively driven wind model was presented in 1975 by Castor, Abbott & Klein and was applied to O5f main sequence stars. In order to describe the radiative origin of the massive winds from early and late spectral type stars, the radiative force is separated into its continuum, thick lines and thin lines parts. The mechanism of the continuous absorption is the Thomson scattering of the photons by the free plasma electrons and it is always present. If the line contribution corresponds to the thick absorption spectral lines the model is named as 'thick line driven' otherwise the atmosphere is thought 'optically thin'. In this Thesis we consider an optically thin atmosphere and in this case the radiative force is written as a power law of distance (Chen & Marlborough 1994, Lamers 1986). 
Moreover, we examine the exponential dependence of the radiative acceleration upon the radial distance

  3. Function-driven discovery of disease genes in zebrafish using an integrated genomics big data resource.

    PubMed

    Shim, Hongseok; Kim, Ji Hyun; Kim, Chan Yeong; Hwang, Sohyun; Kim, Hyojin; Yang, Sunmo; Lee, Ji Eun; Lee, Insuk

    2016-11-16

    Whole exome sequencing (WES) accelerates disease gene discovery using rare genetic variants, but further statistical and functional evidence is required to avoid false discovery. To complement variant-driven disease gene discovery, here we present function-driven disease gene discovery in zebrafish (Danio rerio), a promising human disease model owing to its high anatomical and genomic similarity to humans. To facilitate zebrafish-based function-driven disease gene discovery, we developed a genome-scale co-functional network of zebrafish genes, DanioNet (www.inetbio.org/danionet), constructed by Bayesian integration of genomics big data. Rigorous statistical assessment confirmed the high prediction capacity of DanioNet for a wide variety of human diseases. To demonstrate the feasibility of function-driven disease gene discovery using DanioNet, we predicted genes for ciliopathies and performed experimental validation for eight candidate genes. We also validated the existence of heterozygous rare variants in the candidate genes of individuals with ciliopathies yet not in controls derived from the UK10K consortium, suggesting that these variants are potentially involved in enhancing the risk of ciliopathies. These results showed that integrated genomics big data for a model animal of disease can expand our opportunities for harnessing WES data in disease gene discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. An architecture for a continuous, user-driven, and data-driven application of clinical guidelines and its evaluation.

    PubMed

    Shalom, Erez; Shahar, Yuval; Lunenfeld, Eitan

    2016-02-01

    Design, implement, and evaluate a new architecture for realistic continuous guideline (GL)-based decision support, based on a series of requirements that we have identified, such as support for continuous care, for multiple task types, and for data-driven and user-driven modes. We designed and implemented a new continuous GL-based support architecture, PICARD, which accesses a temporal reasoning engine and provides several different types of application interfaces. We present the new architecture in detail in the current paper. To evaluate the architecture, we first performed a technical evaluation of the PICARD architecture, using 19 simulated scenarios in the preeclampsia/toxemia domain. We then performed a functional evaluation with the help of two domain experts, by generating patient records that simulate 60 decision points from six clinical guideline-based scenarios, lasting from two days to four weeks. Finally, 36 clinicians made manual decisions in half of the scenarios, and had access to the automated GL-based support in the other half. The measures used in all three experiments were correctness and completeness of the decisions relative to the GL. Mean correctness and completeness in the technical evaluation were 1±0.0 and 0.96±0.03 respectively. The functional evaluation produced only a few minor comments from the two experts, mostly regarding the output's style; otherwise the system's recommendations were validated. In the clinically oriented evaluation, the 36 clinicians manually applied approximately 41% of the GL's recommended actions. Completeness increased to approximately 93% when using PICARD. Manual correctness was approximately 94.5%, and remained similar when using PICARD; but while 68% of the manual decisions included correct but redundant actions, only 3% of the actions included in decisions made when using PICARD were redundant. The PICARD architecture is technically feasible and is functionally valid, and addresses the realistic

  5. Towards Data-Driven Simulations of Wildfire Spread using Ensemble-based Data Assimilation

    NASA Astrophysics Data System (ADS)

    Rochoux, M. C.; Bart, J.; Ricci, S. M.; Cuenot, B.; Trouvé, A.; Duchaine, F.; Morel, T.

    2012-12-01

    Real-time prediction of a propagating wildfire remains a challenging task because the problem involves both multiple physics and multiple scales. The propagation speed of wildfires, also called the rate of spread (ROS), is determined by complex interactions between pyrolysis, combustion and flow dynamics, and atmospheric dynamics occurring at vegetation, topographical and meteorological scales. Current operational fire spread models are mainly based on a semi-empirical parameterization of the ROS in terms of vegetation, topographical and meteorological properties. For a fire spread simulation to be predictive and compatible with operational applications, the uncertainty in the ROS model must be reduced. As recent progress in remote sensing technology provides new ways to monitor the fire front position, a promising approach to overcoming the difficulties found in wildfire spread simulations is to integrate fire modeling and fire sensing technologies using data assimilation (DA). For this purpose we have developed a prototype data-driven wildfire spread simulator that provides optimal estimates of poorly known model parameters [*]. The data-driven simulation capability is adapted for more realistic wildfire spread: it considers a regional-scale fire spread model that is informed by observations of the fire front location. An Ensemble Kalman Filter (EnKF) algorithm based on a parallel computing platform (OpenPALM) was implemented in order to perform multi-parameter sequential estimation, in which wind magnitude and direction are estimated in addition to vegetation properties (see attached figure). The EnKF algorithm tracks a small-scale grassland fire experiment well and properly accounts for the sensitivity of the simulation outcomes to the control parameters. In conclusion, data assimilation is shown to be a promising approach to more accurately forecasting time-varying wildfire spread conditions as new airborne-like observations of
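
    The EnKF parameter-estimation step described in the abstract can be sketched in a few lines. This is a minimal, illustrative single-parameter filter with a toy forward model and invented numbers, not the OpenPALM-based simulator the authors describe:

```python
import random

random.seed(0)

def forward(ros, t=10.0):
    """Toy fire model: front position after time t for spread rate ros."""
    return ros * t

def enkf_update(ensemble, obs, obs_var):
    """One stochastic EnKF analysis step for a scalar parameter."""
    preds = [forward(m) for m in ensemble]
    m_bar = sum(ensemble) / len(ensemble)
    p_bar = sum(preds) / len(preds)
    cov_mp = sum((m - m_bar) * (p - p_bar)
                 for m, p in zip(ensemble, preds)) / (len(ensemble) - 1)
    var_p = sum((p - p_bar) ** 2 for p in preds) / (len(preds) - 1)
    gain = cov_mp / (var_p + obs_var)          # Kalman gain
    # perturb the observation for each member (stochastic EnKF)
    return [m + gain * (obs + random.gauss(0, obs_var ** 0.5) - forward(m))
            for m in ensemble]

true_ros = 0.5                                  # unknown to the filter
ensemble = [random.uniform(0.1, 1.0) for _ in range(50)]
obs = forward(true_ros)                         # observed front position
for _ in range(5):
    ensemble = enkf_update(ensemble, obs, obs_var=0.01)
print(sum(ensemble) / len(ensemble))            # close to the true 0.5
```

    Repeated analysis steps pull the ensemble mean toward the parameter value consistent with the observed front position, which is the mechanism the abstract exploits for wind and vegetation parameters.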

  6. Deploying Object Oriented Data Technology to the Planetary Data System

    NASA Technical Reports Server (NTRS)

    Kelly, S.; Crichton, D.; Hughes, J. S.

    2003-01-01

    How do you provide more than 350 scientists and researchers access to data from every instrument in Odyssey when the data is curated across half a dozen institutions and in different formats and is too big to mail on a CD-ROM anymore? The Planetary Data System (PDS) faced this exact question. The solution was to use a metadata-based middleware framework developed by the Object Oriented Data Technology task at NASA's Jet Propulsion Laboratory. Using OODT, PDS provided - for the first time ever - data from all mission instruments through a single system immediately upon data delivery.

  7. A Web-Based Data-Querying Tool Based on Ontology-Driven Methodology and Flowchart-Based Model

    PubMed Central

    Ping, Xiao-Ou; Chung, Yufang; Liang, Ja-Der; Yang, Pei-Ming; Huang, Guan-Tarn; Lai, Feipei

    2013-01-01

    Background Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving the knowledge from such large volumes of clinical data. Objective The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. Methods The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, the clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data, thereby, were employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in the experiments with varying numbers of patients. Results In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. The accuracy of the three queries (ie, “degree of liver damage,” “degree of liver damage

  8. Evaluating data-driven causal inference techniques in noisy physical and ecological systems

    NASA Astrophysics Data System (ADS)

    Tennant, C.; Larsen, L.

    2016-12-01

    Causal inference from observational time series challenges traditional approaches for understanding processes and offers exciting opportunities to gain new understanding of complex systems where nonlinearity, delayed forcing, and emergent behavior are common. We present a formal evaluation of the performance of convergent cross-mapping (CCM) and transfer entropy (TE) for data-driven causal inference under real-world conditions. CCM is based on nonlinear state-space reconstruction, and causality is determined by the convergence of prediction skill with an increasing number of observations of the system. TE measures the reduction in uncertainty based on transition probabilities of a pair of time-lagged variables. With TE, causal inference is based on asymmetry in information flow between the variables. Observational data and numerical simulations from a number of classical physical and ecological systems: atmospheric convection (the Lorenz system), species competition (patch-tournaments), and long-term climate change (Vostok ice core) were used to evaluate the ability of CCM and TE to infer causal relationships as data series become increasingly corrupted by observational (instrument-driven) or process (model- or stochastic-driven) noise. While both techniques show promise for causal inference, TE appears to be applicable to a wider range of systems, especially when the data series are of sufficient length to reliably estimate transition probabilities of system components. Both techniques also show a clear effect of observational noise on causal inference. For example, CCM exhibits a negative logarithmic decline in prediction skill as the noise level of the system increases. Changes in TE strongly depend on noise type and which variable the noise was added to. The ability of CCM and TE to detect driving influences suggests that their application to physical and ecological systems could be transformative for understanding driving mechanisms as Earth systems undergo change.
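
    Transfer entropy as described above can be computed directly from empirical transition probabilities. Below is a minimal history-length-1 estimator for discrete series; it is illustrative only, and the study's actual estimator settings (history lengths, discretization) may differ:

```python
import random
from collections import Counter
from math import log2

def transfer_entropy(x, y):
    """TE from y to x (bits) over discrete series, history length 1:
    sum over states of p(x1, x, y) * log2( p(x1 | x, y) / p(x1 | x) )."""
    triples = Counter(zip(x[1:], x[:-1], y[:-1]))
    pairs_xy = Counter(zip(x[:-1], y[:-1]))
    pairs_x1x = Counter(zip(x[1:], x[:-1]))
    singles_x = Counter(x[:-1])
    n = len(x) - 1
    te = 0.0
    for (x1, xp, yp), c in triples.items():
        p_joint = c / n                                   # p(x1, x, y)
        p_x1_given_xy = c / pairs_xy[(xp, yp)]            # p(x1 | x, y)
        p_x1_given_x = pairs_x1x[(x1, xp)] / singles_x[xp]  # p(x1 | x)
        te += p_joint * log2(p_x1_given_xy / p_x1_given_x)
    return te

random.seed(1)
y = [random.randint(0, 1) for _ in range(5000)]
x = [0] + y[:-1]                     # x is driven by y with a one-step lag
print(transfer_entropy(x, y) > 0.9)  # strong information flow y -> x: True
print(transfer_entropy(y, x) < 0.05) # negligible flow x -> y: True
```

    The asymmetry between the two directions is exactly the signature used for causal inference; with short series the small positive estimation bias visible in the reverse direction is why the abstract stresses having enough data to estimate transition probabilities reliably.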

  9. Exchanging large data object in multi-agent systems

    NASA Astrophysics Data System (ADS)

    Al-Yaseen, Wathiq Laftah; Othman, Zulaiha Ali; Nazri, Mohd Zakree Ahmad

    2016-08-01

    One of the Business Intelligence solutions currently in use is the Multi-Agent System (MAS). Communication is one of the most important elements in MAS, especially for exchanging large low-level data between physically distributed agents. The Agent Communication Language in JADE has been offered as a secure method for sending data, whereby the data are defined as an object. However, the object cannot be used to send data to another agent in a different location. Therefore, the aim of this paper was to propose a method for exchanging large low-level data as an object by creating a proxy agent, known as a Delivery Agent, which temporarily imitates the Receiver Agent. The results showed that the proposed method is able to send large-sized data. The experiments were conducted using 16 datasets ranging from 100,000 to 7 million instances. For the proposed method, the RAM and CPU of the machine hosting the Receiver Agent had to be slightly increased, but the latency was not significantly different from that of the Java Socket method (non-agent-based and less secure). With such results, it was concluded that the proposed method can be used to securely send large data between agents.

  10. Data-Driven Neural Network Model for Robust Reconstruction of Automobile Casting

    NASA Astrophysics Data System (ADS)

    Lin, Jinhua; Wang, Yanjie; Li, Xin; Wang, Lu

    2017-09-01

    In computer vision systems, robustly reconstructing the complex 3D geometries of automobile castings is a challenging task. 3D scanning data are usually corrupted by noise and the scanning resolution is low; these effects normally lead to incomplete matching and drift. In order to solve these problems, a data-driven local geometric learning model is proposed to achieve robust reconstruction of automobile castings. To relieve the interference of sensor noise and to be compatible with incomplete scanning data, a 3D convolutional neural network is established to match the local geometric features of automobile castings. The proposed neural network combines the geometric feature representation with a correlation metric function to robustly match local correspondences. We use the truncated distance field (TDF) around each key point to represent the 3D surface of the casting geometry, so that the model can be directly embedded into 3D space to learn the geometric feature representation. Finally, training labels are automatically generated for deep learning based on an existing RGB-D reconstruction algorithm, which yields the same global key-matching descriptor. The experimental results show that the matching accuracy of our network is 92.2% for automobile castings, and the closed-loop rate is about 74.0% when the matching tolerance threshold τ is 0.2. The matching descriptors performed well and retained 81.6% matching accuracy at 95% closed loop. For sparse casting geometries with initial matching failure, the 3D matching object can be reconstructed robustly by training the key descriptors. Our method performs 3D reconstruction robustly for complex automobile castings.

  11. Slic Superpixels for Object Delineation from Uav Data

    NASA Astrophysics Data System (ADS)

    Crommelinck, S.; Bennett, R.; Gerke, M.; Koeva, M. N.; Yang, M. Y.; Vosselman, G.

    2017-08-01

    Unmanned aerial vehicles (UAV) are increasingly investigated with regard to their potential to create and update (cadastral) maps. UAVs provide a flexible and low-cost platform for high-resolution data, from which object outlines can be accurately delineated. This delineation could be automated with image analysis methods to improve existing mapping procedures that are cost-, time- and labor-intensive and of limited reproducibility. This study investigates a superpixel approach, namely simple linear iterative clustering (SLIC), in terms of its applicability to UAV data. The approach is investigated in terms of its applicability to high-resolution UAV orthoimages and in terms of its ability to delineate object outlines of roads and roofs. Results show that the approach is applicable to UAV orthoimages of 0.05 m GSD and extents of 100 million and 400 million pixels. Further, the approach delineates the objects with the high accuracy provided by the UAV orthoimages at completeness rates of up to 64%. The approach is not suitable as a standalone approach for object delineation. However, it shows high potential for combination with further methods that delineate objects at higher correctness rates in exchange for a lower localization quality. This study provides a basis for future work that will focus on the incorporation of multiple methods for an interactive, comprehensive and accurate object delineation from UAV data. This aims to support numerous application fields such as topographic and cadastral mapping.
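
    SLIC itself is a localized k-means in a combined color-spatial space. The sketch below is a simplified single-channel version (grayscale intensity instead of the CIELAB channels used on orthoimages, a toy image, and illustrative parameter values), meant only to show the mechanism of grid initialization, windowed assignment, and center updates:

```python
import math

def slic(img, s, m, n_iter=5):
    """Simplified SLIC: cluster pixels by intensity + position.
    s = grid spacing, m = compactness weight."""
    h, w = len(img), len(img[0])
    # initialize cluster centers [intensity, row, col] on a regular grid
    centers = [[img[r][c], r, c] for r in range(s // 2, h, s)
                                 for c in range(s // 2, w, s)]
    labels = [[0] * w for _ in range(h)]
    for _ in range(n_iter):
        dist = [[math.inf] * w for _ in range(h)]
        for k, (ci, cr, cc) in enumerate(centers):
            # search only a local window around each center (key SLIC idea)
            for r in range(max(0, cr - s), min(h, cr + s + 1)):
                for c in range(max(0, cc - s), min(w, cc + s + 1)):
                    dc = img[r][c] - ci                  # intensity distance
                    ds = math.hypot(r - cr, c - cc)      # spatial distance
                    d = math.hypot(dc, ds / s * m)       # combined distance
                    if d < dist[r][c]:
                        dist[r][c], labels[r][c] = d, k
        # move each center to the mean of its assigned pixels
        sums = [[0.0, 0.0, 0.0, 0] for _ in centers]
        for r in range(h):
            for c in range(w):
                k = labels[r][c]
                sums[k][0] += img[r][c]; sums[k][1] += r
                sums[k][2] += c; sums[k][3] += 1
        for k, (si, sr, sc, n) in enumerate(sums):
            if n:
                centers[k] = [si / n, round(sr / n), round(sc / n)]
    return labels

# 20x20 toy image: dark left half, bright right half
img = [[0 if c < 10 else 255 for c in range(20)] for r in range(20)]
labels = slic(img, s=10, m=10)
print(labels[2][2] != labels[2][17])   # superpixels respect the edge: True
```

    Superpixel boundaries snapping to intensity edges like this is what makes SLIC attractive for delineating road and roof outlines, at the cost of the over-segmentation the study then has to merge.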

  12. Model-driven approach to data collection and reporting for quality improvement.

    PubMed

    Curcin, Vasa; Woodcock, Thomas; Poots, Alan J; Majeed, Azeem; Bell, Derek

    2014-12-01

    Continuous data collection and analysis have been shown essential to achieving improvement in healthcare. However, the data required for local improvement initiatives are often not readily available from hospital Electronic Health Record (EHR) systems or not routinely collected. Furthermore, improvement teams are often restricted in time and funding thus requiring inexpensive and rapid tools to support their work. Hence, the informatics challenge in healthcare local improvement initiatives consists of providing a mechanism for rapid modelling of the local domain by non-informatics experts, including performance metric definitions, and grounded in established improvement techniques. We investigate the feasibility of a model-driven software approach to address this challenge, whereby an improvement model designed by a team is used to automatically generate required electronic data collection instruments and reporting tools. To that goal, we have designed a generic Improvement Data Model (IDM) to capture the data items and quality measures relevant to the project, and constructed Web Improvement Support in Healthcare (WISH), a prototype tool that takes user-generated IDM models and creates a data schema, data collection web interfaces, and a set of live reports, based on Statistical Process Control (SPC) for use by improvement teams. The software has been successfully used in over 50 improvement projects, with more than 700 users. We present in detail the experiences of one of those initiatives, Chronic Obstructive Pulmonary Disease project in Northwest London hospitals. The specific challenges of improvement in healthcare are analysed and the benefits and limitations of the approach are discussed. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
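
    The live reports described above are based on Statistical Process Control. A minimal sketch of the standard individuals (XmR) control-chart logic, with invented weekly counts and no connection to the actual WISH code:

```python
def xmr_limits(data):
    """Centre line and 3-sigma control limits for an individuals chart,
    with sigma estimated from the mean moving range (d2 = 1.128)."""
    mean = sum(data) / len(data)
    mr = [abs(a - b) for a, b in zip(data[1:], data[:-1])]
    sigma = (sum(mr) / len(mr)) / 1.128
    return mean, mean - 3 * sigma, mean + 3 * sigma

def out_of_control(data):
    """Indices of points signalling special-cause variation."""
    _, lcl, ucl = xmr_limits(data)
    return [i for i, x in enumerate(data) if x < lcl or x > ucl]

# hypothetical weekly readmission counts, with one unusual week at index 6
weekly = [12, 14, 13, 15, 12, 13, 30, 14, 12, 13]
print(out_of_control(weekly))  # [6]
```

    Separating common-cause from special-cause variation in this way is what lets improvement teams see whether a change actually shifted the process, rather than reacting to noise.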

  13. Model-driven approach to data collection and reporting for quality improvement

    PubMed Central

    Curcin, Vasa; Woodcock, Thomas; Poots, Alan J.; Majeed, Azeem; Bell, Derek

    2014-01-01

    Continuous data collection and analysis have been shown essential to achieving improvement in healthcare. However, the data required for local improvement initiatives are often not readily available from hospital Electronic Health Record (EHR) systems or not routinely collected. Furthermore, improvement teams are often restricted in time and funding thus requiring inexpensive and rapid tools to support their work. Hence, the informatics challenge in healthcare local improvement initiatives consists of providing a mechanism for rapid modelling of the local domain by non-informatics experts, including performance metric definitions, and grounded in established improvement techniques. We investigate the feasibility of a model-driven software approach to address this challenge, whereby an improvement model designed by a team is used to automatically generate required electronic data collection instruments and reporting tools. To that goal, we have designed a generic Improvement Data Model (IDM) to capture the data items and quality measures relevant to the project, and constructed Web Improvement Support in Healthcare (WISH), a prototype tool that takes user-generated IDM models and creates a data schema, data collection web interfaces, and a set of live reports, based on Statistical Process Control (SPC) for use by improvement teams. The software has been successfully used in over 50 improvement projects, with more than 700 users. We present in detail the experiences of one of those initiatives, Chronic Obstructive Pulmonary Disease project in Northwest London hospitals. The specific challenges of improvement in healthcare are analysed and the benefits and limitations of the approach are discussed. PMID:24874182

  14. Flood probability quantification for road infrastructure: Data-driven spatial-statistical approach and case study applications.

    PubMed

    Kalantari, Zahra; Cavalli, Marco; Cantone, Carolina; Crema, Stefano; Destouni, Georgia

    2017-03-01

    Climate-driven increase in the frequency of extreme hydrological events is expected to impose greater strain on the built environment and major transport infrastructure, such as roads and railways. This study develops a data-driven spatial-statistical approach to quantifying and mapping the probability of flooding at critical road-stream intersection locations, where water flow and sediment transport may accumulate and cause serious road damage. The approach is based on novel integration of key watershed and road characteristics, including also measures of sediment connectivity. The approach is concretely applied to and quantified for two specific study case examples in southwest Sweden, with documented road flooding effects of recorded extreme rainfall. The novel contributions of this study in combining a sediment connectivity account with that of soil type, land use, spatial precipitation-runoff variability and road drainage in catchments, and in extending the connectivity measure use for different types of catchments, improve the accuracy of model results for road flood probability. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. Parallel compression of data chunks of a shared data object using a log-structured file system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2016-10-25

    Techniques are provided for parallel compression of data chunks being written to a shared object. A client executing on a compute node or a burst buffer node in a parallel computing system stores a data chunk generated by the parallel computing system to a shared data object on a storage node by compressing the data chunk and providing the compressed data chunk to the storage node that stores the shared object. The client and storage node may employ Log-Structured File techniques. The compressed data chunk can be decompressed by the client when the data chunk is read. A storage node stores a data chunk as part of a shared object by receiving a compressed version of the data chunk from a compute node and storing the compressed version of the data chunk to the shared data object on the storage node.
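
    The per-chunk compress-on-write, decompress-on-read idea can be sketched as follows. This is a hedged illustration using Python's zlib with a dict standing in for the storage node; the log-structured bookkeeping (offsets, indices) of the actual technique is omitted:

```python
import zlib

# Each client compresses its own chunk independently, so many chunks can
# be compressed and written in parallel without coordination.

def write_chunk(store, chunk_id, data: bytes):
    """Client side: compress a chunk before shipping it to the storage node."""
    store[chunk_id] = zlib.compress(data)

def read_chunk(store, chunk_id) -> bytes:
    """Client side: fetch the compressed chunk and decompress on read."""
    return zlib.decompress(store[chunk_id])

shared_object = {}                      # stands in for the storage node
payload = b"checkpoint " * 1000
write_chunk(shared_object, 0, payload)
print(len(shared_object[0]) < len(payload))     # compressed smaller: True
print(read_chunk(shared_object, 0) == payload)  # round-trip intact: True
```

    Because each chunk is self-contained, readers can decompress only the chunks they need, which suits the N-to-1 shared-file write patterns common in HPC checkpointing.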

  16. Data-driven outbreak forecasting with a simple nonlinear growth model

    PubMed Central

    Lega, Joceline; Brown, Heidi E.

    2016-01-01

    Recent events have thrown the spotlight on infectious disease outbreak response. We developed a data-driven method, EpiGro, which can be applied to cumulative case reports to estimate the order of magnitude of the duration, peak and ultimate size of an ongoing outbreak. It is based on a surprisingly simple mathematical property of many epidemiological data sets, does not require knowledge or estimation of disease transmission parameters, is robust to noise and to small data sets, and runs quickly due to its mathematical simplicity. Using data from historic and ongoing epidemics, we present the model. We also provide modeling considerations that justify this approach and discuss its limitations. In the absence of other information or in conjunction with other models, EpiGro may be useful to public health responders. PMID:27770752
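
    One property consistent with the description above is that, for many outbreaks, incidence plotted against cumulative cases C is well approximated by a parabola dC = a*C + b*C^2, whose positive root -a/b estimates the final outbreak size. The sketch below uses synthetic logistic data; it is an illustration of that idea, not the published EpiGro code:

```python
def fit_final_size(cumulative):
    """Least-squares fit of incidence = a*C + b*C^2; returns -a/b,
    the estimated ultimate outbreak size."""
    xs = cumulative[:-1]
    ys = [c1 - c0 for c0, c1 in zip(cumulative[:-1], cumulative[1:])]
    # normal equations for the two-parameter linear model y = a*x + b*x^2
    s2 = sum(x ** 2 for x in xs); s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    det = s2 * s4 - s3 * s3
    a = (sxy * s4 - sx2y * s3) / det
    b = (s2 * sx2y - s3 * sxy) / det
    return -a / b

# synthetic logistic outbreak: growth rate r = 0.3, true final size K = 1000
cases, r, K = [5.0], 0.3, 1000.0
for _ in range(60):
    c = cases[-1]
    cases.append(c + r * c * (1 - c / K))
print(round(fit_final_size(cases)))  # 1000
```

    No transmission parameters are needed, which matches the abstract's claim; on real, noisy cumulative case reports the fit gives an order-of-magnitude estimate rather than the exact recovery seen on this clean synthetic series.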

  17. A Systematic Review of Published Respondent-Driven Sampling Surveys Collecting Behavioral and Biologic Data.

    PubMed

    Johnston, Lisa G; Hakim, Avi J; Dittrich, Samantha; Burnett, Janet; Kim, Evelyn; White, Richard G

    2016-08-01

    Reporting key details of respondent-driven sampling (RDS) survey implementation and analysis is essential for assessing the quality of RDS surveys. RDS is both a recruitment and an analytic method and, as such, it is important to adequately describe both aspects in publications. We extracted data from peer-reviewed literature published through September 2013 that reported collecting biological specimens using RDS. We identified 151 eligible peer-reviewed articles describing 222 surveys conducted in seven regions throughout the world. Most published surveys reported basic implementation information such as survey city, country, year, population sampled, interview method, and final sample size. However, many surveys did not report essential methodological and analytical information for assessing RDS survey quality, including number of recruitment sites, seeds at start and end, maximum number of waves, and whether data were adjusted for network size. Understanding the quality of data collection and analysis in RDS is useful for effectively planning public health service delivery and funding priorities.

  18. Space Object Classification Using Fused Features of Time Series Data

    NASA Astrophysics Data System (ADS)

    Jia, B.; Pham, K. D.; Blasch, E.; Shen, D.; Wang, Z.; Chen, G.

    In this paper, a fused feature vector consisting of raw time series and texture feature information is proposed for space object classification. The time series data includes historical orbit trajectories and asteroid light curves. The texture feature is derived from recurrence plots using Gabor filters for both unsupervised learning and supervised learning algorithms. The simulation results show that the classification algorithms using the fused feature vector achieve better performance than those using raw time series or texture features only.
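
    The recurrence plot that serves as the texture source above can be sketched simply: it marks pairs of time indices whose states are within a threshold eps of each other. The Gabor filtering stage is omitted here, and the signal and threshold are invented for illustration:

```python
import math

def recurrence_plot(series, eps):
    """Binary recurrence matrix: R[i][j] = 1 when states i and j are
    within eps of each other (scalar, no embedding, for simplicity)."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) < eps else 0
             for j in range(n)] for i in range(n)]

# a periodic signal yields the diagonal-line structure typical of
# regular dynamics (e.g. a stably spinning space object's light curve)
sig = [math.cos(2 * math.pi * t / 8) for t in range(32)]
rp = recurrence_plot(sig, eps=0.05)
print(all(rp[i][i] == 1 for i in range(32)))   # main diagonal: True
print(rp[0][8] == 1 and rp[0][4] == 0)         # period-8 recurrence: True
```

    Texture filters applied to such matrices pick up the diagonal and block structures that distinguish periodic, drifting, and chaotic dynamics, which is what makes the fused feature vector informative.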

  19. Co-Constructing Distributed Leadership: District and School Connections in Data-Driven Decision-Making

    ERIC Educational Resources Information Center

    Park, Vicki; Datnow, Amanda

    2009-01-01

    The purpose of this paper is to examine leadership practices in school systems that are implementing data-driven decision-making employing the theory of distributed leadership. With the advent of No Child Left Behind Act of 2001 (NCLB) in the US, educational leaders are now required to analyse, interpret and use data to make informed decisions in…

  20. Creating a System for Data-Driven Decision-Making: Applying the Principal-Agent Framework

    ERIC Educational Resources Information Center

    Wohlstetter, Priscilla; Datnow, Amanda; Park, Vicki

    2008-01-01

    The purpose of this article is to improve our understanding of data-driven decision-making strategies that are initiated at the district or system level. We apply principal-agent theory to the analysis of qualitative data gathered in a case study of 4 urban school systems. Our findings suggest educators at the school level need not only systemic…

  1. Using Flexible Data-Driven Frameworks to Enhance School Psychology Training and Practice

    ERIC Educational Resources Information Center

    Coleman, Stephanie L.; Hendricker, Elise

    2016-01-01

    While a great number of scientific advances have been made in school psychology, the research to practice gap continues to exist, which has significant implications for training future school psychologists. Training in flexible, data-driven models may help school psychology trainees develop important competencies that will benefit them throughout…

  2. Longitudinal aerodynamic characteristics of light, twin-engine, propeller-driven airplanes

    NASA Technical Reports Server (NTRS)

    Wolowicz, C. H.; Yancey, R. B.

    1972-01-01

    Representative state-of-the-art analytical procedures and design data for predicting the longitudinal static and dynamic stability and control characteristics of light, propeller-driven airplanes are presented. Procedures for predicting drag characteristics are also included. The procedures are applied to a twin-engine, propeller-driven airplane in the clean configuration from zero lift to stall conditions. The calculated characteristics are compared with wind-tunnel and flight data. Included in the comparisons are level-flight trim characteristics, period and damping of the short-period oscillatory mode, and windup-turn characteristics. All calculations are documented.

  3. The Distribution and Abundance of Bird Species: Towards a Satellite, Data Driven Avian Energetics and Species Richness Model

    NASA Technical Reports Server (NTRS)

    Smith, James A.

    2003-01-01

    This paper addresses the fundamental question of why birds occur where and when they do, i.e., what are the causative factors that determine the spatio-temporal distributions, abundance, or richness of bird species? In this paper we outline the first steps toward building a satellite, data-driven model of avian energetics and species richness based on individual bird physiology, morphology, and interaction with the spatio-temporal habitat. To evaluate our model, we will use the North American Breeding Bird Survey and Christmas Bird Count data for species richness, wintering and breeding range. Long term and current satellite data series include AVHRR, Landsat, and MODIS.

  4. Diagnostic quality driven physiological data collection for personal healthcare.

    PubMed

    Jea, David; Balani, Rahul; Hsu, Ju-Lan; Cho, Dae-Ki; Gerla, Mario; Srivastava, Mani B

    2008-01-01

    We believe that each individual is unique, and that for diagnostic purposes it is necessary to have a distinctive combination of signals and data features that fits the personal health status. It is essential to develop mechanisms for reducing the amount of data that needs to be transferred (to mitigate the troublesome periodic recharging of a device) while maintaining diagnostic accuracy. Thus, the system should not uniformly compress the collected physiological data, but compress it in a personalized fashion that preserves the 'important' signal features for each individual, such that the diagnosis can still be made with the required high confidence level. We present a diagnostic-quality-driven mechanism for remote ECG monitoring, which enables a notion of priorities encoded into the wave segments. The priority is specified by the diagnosis engine or medical experts and is dynamic and individual-dependent. The system pre-processes the collected physiological information according to the assigned priority before delivering it to the backend server. We demonstrate that the proposed approach provides accurate inference results while effectively compressing the data.
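The priority-driven compression described above can be sketched in a few lines of Python. This is an illustrative simplification only: the priority labels, the 4x decimation factor, and the segment layout below are invented for the example, not the paper's actual encoding.

```python
def compress_by_priority(segments):
    """Keep high-priority wave segments (e.g., QRS complexes) at full
    sampling rate and decimate low-priority ones, mimicking the idea of
    per-segment, priority-aware compression."""
    out = []
    for samples, priority in segments:
        if priority == "high":
            out.append(samples)        # preserve diagnostic detail
        else:
            out.append(samples[::4])   # 4x decimation for low priority
    return out

# Hypothetical ECG split into prioritized segments.
ecg = [
    (list(range(100)), "low"),   # baseline segment
    (list(range(40)), "high"),   # QRS complex
    (list(range(60)), "low"),
]
compressed = compress_by_priority(ecg)
saved = 1 - sum(map(len, compressed)) / sum(len(s) for s, _ in ecg)
```

In this toy case the high-priority segment survives untouched while overall volume drops by 60%, which is the trade-off the abstract describes.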

  5. Stochastic dynamics of extended objects in driven systems II: Current quantization in the low-temperature limit

    NASA Astrophysics Data System (ADS)

    Catanzaro, Michael J.; Chernyak, Vladimir Y.; Klein, John R.

    2016-12-01

    Driven Langevin processes have appeared in a variety of fields due to the relevance of natural phenomena having both deterministic and stochastic effects. The stochastic currents and fluxes in these systems provide a convenient set of observables to describe their non-equilibrium steady states. Here we consider stochastic motion of a (k-1)-dimensional object, which sweeps out a k-dimensional trajectory and gives rise to a higher, k-dimensional current. By employing the low-temperature (low-noise) limit, we reduce the problem to a discrete Markov chain model on a CW complex, a topological construction which generalizes the notion of a graph. This reduction allows the mean fluxes and currents of the process to be expressed in terms of solutions to the discrete Supersymmetric Fokker-Planck (SFP) equation. Taking the adiabatic limit, we show that generic driving leads to rational quantization of the generated higher-dimensional current. The latter is achieved by implementing recently developed tools, the higher-dimensional Kirchhoff tree and co-tree theorems. This extends to the discrete setting the study of the motion of extended objects performed in the prequel to this manuscript (Catanzaro et al.).

  6. Recent Data Sets on Object Manipulation: A Survey.

    PubMed

    Huang, Yongqiang; Bianchi, Matteo; Liarokapis, Minas; Sun, Yu

    2016-12-01

    Data sets are crucial not only for model learning and evaluation but also for advancing knowledge on human behavior, thus fostering mutual inspiration between neuroscience and robotics. However, choosing the right data set to use, or creating a new one, is not an easy task because of the variety of data that can be found in the related literature. The first step in tackling this issue is to collect and organize those that are available. In this work, we take a significant step forward by reviewing data sets that were published in the past 10 years and that are directly related to object manipulation and grasping. We report on the modalities, activities, and annotations of each individual data set and discuss our view on its use for object manipulation. We also compare and summarize the data sets. Finally, we conclude the survey by providing suggestions and discussing best practices for the creation of new data sets.

  7. Objective analysis of observational data from the FGGE observing systems

    NASA Technical Reports Server (NTRS)

    Baker, W.; Edelmann, D.; Iredell, M.; Han, D.; Jakkempudi, S.

    1981-01-01

    An objective analysis procedure for updating the GLAS second and fourth order general atmospheric circulation models using observational data from the first GARP global experiment is described. The objective analysis procedure is based on a successive corrections method and the model is updated in a data assimilation cycle. Preparation of the observational data for analysis and the objective analysis scheme are described. The organization of the program and description of the required data sets are presented. The program logic and detailed descriptions of each subroutine are given.

  8. Large Field Visualization with Demand-Driven Calculation

    NASA Technical Reports Server (NTRS)

    Moran, Patrick J.; Henze, Chris

    1999-01-01

    We present a system designed for the interactive definition and visualization of fields derived from large data sets: the Demand-Driven Visualizer (DDV). The system allows the user to write arbitrary expressions to define new fields, and then apply a variety of visualization techniques to the result. Expressions can include differential operators and numerous other built-in functions, all of which are evaluated at specific field locations completely on demand. The payoff of following a demand-driven design philosophy throughout becomes particularly evident when working with large time-series data, where the costs of eager evaluation alternatives can be prohibitive.
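The demand-driven evaluation idea above can be illustrated with a small lazy-field sketch in Python. This is a sketch of the general strategy, not DDV's implementation; the field expressions, constants, and probe points are invented for the example.

```python
import math

class DerivedField:
    """A field defined by an expression over other fields, evaluated
    lazily: values are computed only at the locations a visualization
    actually requests, and memoized for reuse."""
    def __init__(self, expr):
        self.expr = expr    # callable: location -> value
        self.cache = {}     # memoize already-computed points

    def __call__(self, point):
        if point not in self.cache:
            self.cache[point] = self.expr(point)
        return self.cache[point]

# Hypothetical base fields on a huge grid (never fully materialized).
pressure = DerivedField(lambda p: 101.3 * math.exp(-p[2] / 8.0))
temperature = DerivedField(lambda p: 288.0 - 6.5 * p[2])

# A user-defined derived field, in the spirit of DDV's arbitrary
# expressions; it pulls on its inputs only where it is itself probed.
density = DerivedField(lambda p: pressure(p) / (0.287 * temperature(p)))

# Only the three probed locations are ever evaluated.
samples = [density((0.0, 0.0, z)) for z in (0.0, 1.0, 2.0)]
```

The cache never grows beyond the points actually visualized, which is the payoff the abstract contrasts with eager evaluation of the whole grid.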

  9. Scholarly Concentration Program Development: A Generalizable, Data-Driven Approach.

    PubMed

    Burk-Rafel, Jesse; Mullan, Patricia B; Wagenschutz, Heather; Pulst-Korenberg, Alexandra; Skye, Eric; Davis, Matthew M

    2016-11-01

    Scholarly concentration programs-also known as scholarly projects, pathways, tracks, or pursuits-are increasingly common in U.S. medical schools. However, systematic, data-driven program development methods have not been described. The authors examined scholarly concentration programs at U.S. medical schools that U.S. News & World Report ranked as top 25 for research or primary care (n = 43 institutions), coding concentrations and mission statements. Subsequently, the authors conducted a targeted needs assessment via a student-led, institution-wide survey, eliciting learners' preferences for 10 "Pathways" (i.e., concentrations) and 30 "Topics" (i.e., potential content) augmenting core curricula at their institution. Exploratory factor analysis (EFA) and a capacity optimization algorithm characterized best institutional options for learner-focused Pathway development. The authors identified scholarly concentration programs at 32 of 43 medical schools (74%), comprising 199 distinct concentrations (mean concentrations per program: 6.2, mode: 5, range: 1-16). Thematic analysis identified 10 content domains; most common were "Global/Public Health" (30 institutions; 94%) and "Clinical/Translational Research" (26 institutions; 81%). The institutional needs assessment (n = 468 medical students; response rate 60% overall, 97% among first-year students) demonstrated myriad student preferences for Pathways and Topics. EFA of Topic preferences identified eight factors, systematically related to Pathway preferences, informing content development. Capacity modeling indicated that offering six Pathways could guarantee 95% of first-year students (162/171) their first- or second-choice Pathway. This study demonstrates a generalizable, data-driven approach to scholarly concentration program development that reflects student preferences and institutional strengths, while optimizing program diversity within capacity constraints.

  10. Unlearning Overgenerated "Be" through Data-Driven Learning in the Secondary EFL Classroom

    ERIC Educational Resources Information Center

    Moon, Soyeon; Oh, Sun-Young

    2018-01-01

    This paper reports on the cognitive and affective benefits of data-driven learning (DDL), in which Korean EFL learners at the secondary level notice and unlearn their "overgenerated 'be'" by comparing native English-speaker and learner corpora with guided induction. To select the target language item and compile learner-corpus-based…

  11. Object-based classification of global undersea topography and geomorphological features from the SRTM30_PLUS data

    NASA Astrophysics Data System (ADS)

    Dekavalla, Maria; Argialas, Demetre

    2017-07-01

    The analysis of undersea topography and geomorphological features provides necessary information to related disciplines and many applications. The development of an automated knowledge-based classification approach of undersea topography and geomorphological features is challenging due to their multi-scale nature. The aim of the study is to develop and evaluate an automated knowledge-based OBIA approach to: i) decompose the global undersea topography to multi-scale regions of distinct morphometric properties, and ii) assign the derived regions to characteristic geomorphological features. First, the global undersea topography was decomposed through the SRTM30_PLUS bathymetry data to the so-called morphometric objects of discrete morphometric properties and spatial scales defined by data-driven methods (local variance graphs and nested means) and multi-scale analysis. The derived morphometric objects were combined with additional relative topographic position information computed with a self-adaptive pattern recognition method (geomorphons), and auxiliary data and were assigned to characteristic undersea geomorphological feature classes through a knowledge base, developed from standard definitions. The decomposition of the SRTM30_PLUS data to morphometric objects was considered successful for the requirements of maximizing intra-object and inter-object heterogeneity, based on the near zero values of the Moran's I and the low values of the weighted variance index. The knowledge-based classification approach was tested for its transferability in six case studies of various tectonic settings and achieved the efficient extraction of 11 undersea geomorphological feature classes. The classification results for the six case studies were compared with the digital global seafloor geomorphic features map (GSFM). The 11 undersea feature classes and their producer's accuracies in respect to the GSFM relevant areas were Basin (95%), Continental Shelf (94.9%), Trough (88

  12. Haptic Classification of Common Objects: Knowledge-Driven Exploration.

    ERIC Educational Resources Information Center

    Lederman, Susan J.; Klatzky, Roberta L.

    1990-01-01

    Theoretical and empirical issues relating to haptic exploration and the representation of common objects during haptic classification were investigated in 3 experiments involving a total of 112 college students. Results are discussed in terms of a computational model of human haptic object classification with implications for dextrous robot…

  13. Simple proteomics data analysis in the object-oriented PowerShell.

    PubMed

    Mohammed, Yassene; Palmblad, Magnus

    2013-01-01

    Scripting languages such as Perl and Python are appreciated for solving simple, everyday tasks in bioinformatics. A more recent, object-oriented command shell and scripting language, Windows PowerShell, has many attractive features: an object-oriented interactive command line, fluent navigation and manipulation of XML files, ability to consume Web services from the command line, consistent syntax and grammar, rich regular expressions, and advanced output formatting. The key difference between classical command shells and scripting languages, such as bash, and object-oriented ones, such as PowerShell, is that in the latter the result of a command is a structured object with inherited properties and methods rather than a simple stream of characters. Conveniently, PowerShell is included in all new releases of Microsoft Windows and therefore already installed on most computers in classrooms and teaching labs. In this chapter we demonstrate how PowerShell in particular allows easy interaction with mass spectrometry data in XML formats, connection to Web services for tools such as BLAST, and presentation of results as formatted text or graphics. These features make PowerShell much more than "yet another scripting language."

  14. Data Driven Professional Development Design for Out-of-School Time Educators Using Planetary Science and Engineering Educational Materials

    NASA Astrophysics Data System (ADS)

    Clark, J.; Bloom, N.

    2017-12-01

    Data-driven design practices should be the basis for any effective educational product, particularly those used to support STEM learning and literacy. Planetary Learning that Advances the Nexus of Engineering, Technology, and Science (PLANETS) is a five-year NASA-funded (NNX16AC53A) interdisciplinary and cross-institutional partnership to develop and disseminate STEM out-of-school time (OST) curricular and professional development units that integrate planetary science, technology, and engineering. The Center for Science Teaching and Learning at Northern Arizona University, the U.S. Geological Survey Astrogeology Science Center, and the Museum of Science Boston are partners in developing, piloting, and researching the impact of three OST units. Two units are for middle-grades youth and one is for upper-elementary-aged youth. The presentation will highlight the data-driven development process of the educational products used to support educators teaching these curriculum units. This includes how data from the project needs assessment, curriculum pilot testing, and professional support product field tests are used in the design of products for OST educators. Based on data analysis, the project is developing and testing four tiers of professional support for OST educators. Tier 1 meets the immediate needs of OST educators to teach the curriculum and includes how-to videos and other direct support materials. Tier 2 provides additional content and pedagogical knowledge and includes short content videos designed to specifically address the content of the curriculum. Tier 3 elaborates on best practices in education and gives guidance on methods, for example, to develop cultural relevancy for underrepresented students. Tier 4 helps make connections to other NASA or educational products that support STEM learning in out-of-school settings. Examples of the tiers of support will be provided.

  15. The Financial and Non-Financial Aspects of Developing a Data-Driven Decision-Making Mindset in an Undergraduate Business Curriculum

    ERIC Educational Resources Information Center

    Bohler, Jeffrey; Krishnamoorthy, Anand; Larson, Benjamin

    2017-01-01

    Making data-driven decisions is becoming more important for organizations faced with confusing and often contradictory information available to them from their operating environment. This article examines one college of business' journey of developing a data-driven decision-making mindset within its undergraduate curriculum. Lessons learned may be…

  16. Data-driven outbreak forecasting with a simple nonlinear growth model.

    PubMed

    Lega, Joceline; Brown, Heidi E

    2016-12-01

    Recent events have thrown the spotlight on infectious disease outbreak response. We developed a data-driven method, EpiGro, which can be applied to cumulative case reports to estimate the order of magnitude of the duration, peak and ultimate size of an ongoing outbreak. It is based on a surprisingly simple mathematical property of many epidemiological data sets, does not require knowledge or estimation of disease transmission parameters, is robust to noise and to small data sets, and runs quickly due to its mathematical simplicity. Using data from historic and ongoing epidemics, we present the model. We also provide modeling considerations that justify this approach and discuss its limitations. In the absence of other information or in conjunction with other models, EpiGro may be useful to public health responders. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
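The abstract does not spell out EpiGro's "surprisingly simple mathematical property." One property of this kind, assumed here for illustration, is that under logistic growth the daily incidence is a quadratic function of the cumulative case count, so incidence peaks near half the ultimate outbreak size. The parameters below are invented; this is a sketch, not the published method.

```python
def logistic_cumulative(r, K, c0, days):
    """Simulate cumulative case counts with daily Euler steps of the
    logistic model dC/dt = r*C*(1 - C/K)."""
    C, series = c0, [c0]
    for _ in range(days):
        C += r * C * (1.0 - C / K)
        series.append(C)
    return series

cases = logistic_cumulative(r=0.2, K=10_000.0, c0=10.0, days=120)
incidence = [b - a for a, b in zip(cases, cases[1:])]

# Cumulative count on the day incidence peaked: for logistic growth
# this sits near K/2, i.e. half the ultimate size of the outbreak.
peak_day = max(range(len(incidence)), key=incidence.__getitem__)
peak_C = cases[peak_day]
```

Read in reverse, this is why a cumulative-case curve alone can hint at the order of magnitude of an outbreak's peak and final size without fitted transmission parameters.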

  17. Characterizing the (Perceived) Newsworthiness of Health Science Articles: A Data-Driven Approach

    PubMed Central

    Willis, Erin; Paul, Michael J; Elhadad, Noémie; Wallace, Byron C

    2016-01-01

    Background: Health science findings are primarily disseminated through manuscript publications. Information subsidies are used to communicate newsworthy findings to journalists in an effort to earn mass media coverage and further disseminate health science research to mass audiences. Journal editors and news journalists then select which news stories receive coverage and thus public attention. Objective: This study aims to identify attributes of published health science articles that correlate with (1) journal editor issuance of press releases and (2) mainstream media coverage. Methods: We constructed four novel datasets to identify factors that correlate with press release issuance and media coverage. These corpora include thousands of published articles, subsets of which received press release or mainstream media coverage. We used statistical machine learning methods to identify correlations between words in the science abstracts and press release issuance and media coverage. Further, we used a topic modeling-based machine learning approach to uncover latent topics predictive of the perceived newsworthiness of science articles. Results: Both press release issuance for, and media coverage of, health science articles are predictable from corresponding journal article content. For the former task, we achieved average areas under the curve (AUCs) of 0.666 (SD 0.019) and 0.882 (SD 0.018) on two separate datasets, comprising 3024 and 10,760 articles, respectively. For the latter task, models realized mean AUCs of 0.591 (SD 0.044) and 0.783 (SD 0.022) on two datasets, in this case containing 422 and 28,910 pairs, respectively. We reported the most-predictive words and topics for press release or news coverage. Conclusions: We have presented a novel data-driven characterization of content that renders health science "newsworthy." The analysis provides new insights into the news coverage selection process. For example, it appears epidemiological papers concerning common

  18. Global retrieval of soil moisture and vegetation properties using data-driven methods

    NASA Astrophysics Data System (ADS)

    Rodriguez-Fernandez, Nemesio; Richaume, Philippe; Kerr, Yann

    2017-04-01

    Data-driven methods such as neural networks (NNs) are a powerful tool to retrieve soil moisture from multi-wavelength remote sensing observations at global scale. In this presentation we will review a number of recent results regarding the retrieval of soil moisture with the Soil Moisture and Ocean Salinity (SMOS) satellite, either using SMOS brightness temperatures as input data for the retrieval or using SMOS soil moisture retrievals as the reference dataset for the training. The presentation will discuss several possibilities for both the input datasets and the datasets to be used as reference for the supervised learning phase. Regarding the input datasets, it will be shown that NNs take advantage of the synergy of SMOS data and data from other sensors such as the Advanced Scatterometer (ASCAT, active microwaves) and MODIS (visible and infrared). NNs have also been successfully used to construct long time series of soil moisture from the Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E) and SMOS. An NN with input data from AMSR-E observations and SMOS soil moisture as reference for the training was used to construct a dataset sharing a similar climatology and without a significant bias with respect to SMOS soil moisture. Regarding the reference data used to train the data-driven retrievals, we will show different possibilities depending on the application. Using actual in situ measurements is challenging at global scale due to the scarce distribution of sensors. In contrast, in situ measurements have been successfully used to retrieve SM at continental scale in North America, where the density of in situ measurement stations is high. Using global land surface models to train the NN constitutes an interesting alternative for implementing new remote sensing surface datasets. In addition, these datasets can be used to perform data assimilation into the model used as reference for the training. This approach has recently been tested at the European Centre

  19. Big-Data-Driven Stem Cell Science and Tissue Engineering: Vision and Unique Opportunities.

    PubMed

    Del Sol, Antonio; Thiesen, Hans J; Imitola, Jaime; Carazo Salas, Rafael E

    2017-02-02

    Achieving the promises of stem cell science to generate precise disease models and designer cell samples for personalized therapeutics will require harnessing pheno-genotypic cell-level data quantitatively and predictively in the lab and clinic. Those requirements could be met by developing a Big-Data-driven stem cell science strategy and community. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Testing the Accuracy of Data-driven MHD Simulations of Active Region Evolution and Eruption

    NASA Astrophysics Data System (ADS)

    Leake, J. E.; Linton, M.; Schuck, P. W.

    2017-12-01

    Models for the evolution of the solar coronal magnetic field are vital for understanding solar activity, yet the best measurements of the magnetic field lie at the photosphere, necessitating the recent development of coronal models which are "data-driven" at the photosphere. Using magnetohydrodynamic simulations of active region formation and our recently created validation framework, we investigate the source of errors in data-driven models that use surface measurements of the magnetic field, and derived MHD quantities, to model the coronal magnetic field. The primary sources of errors in these studies are the temporal and spatial resolution of the surface measurements. We will discuss the implications of these studies for accurately modeling the build-up and release of coronal magnetic energy based on photospheric magnetic field observations.

  1. Data-Driven User Feedback: An Improved Neurofeedback Strategy considering the Interindividual Variability of EEG Features.

    PubMed

    Han, Chang-Hee; Lim, Jeong-Hwan; Lee, Jun-Hak; Kim, Kangsan; Im, Chang-Hwan

    2016-01-01

    It has frequently been reported that some users of conventional neurofeedback systems can experience only a small portion of the total feedback range due to the large interindividual variability of EEG features. In this study, we proposed a data-driven neurofeedback strategy considering the individual variability of electroencephalography (EEG) features to permit users of the neurofeedback system to experience a wider range of auditory or visual feedback without a customization process. The main idea of the proposed strategy is to adjust the ranges of each feedback level using the density in the offline EEG database acquired from a group of individuals. Twenty-two healthy subjects participated in offline experiments to construct an EEG database, and five subjects participated in online experiments to validate the performance of the proposed data-driven user feedback strategy. Using the optimized bin sizes, the number of feedback levels that each individual experienced was significantly increased to 139% and 144% of the original results with uniform bin sizes in the offline and online experiments, respectively. Our results demonstrated that the use of our data-driven neurofeedback strategy could effectively increase the overall range of feedback levels that each individual experienced during neurofeedback training.
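Setting feedback-level boundaries from the density of an offline feature database, as described above, amounts to equal-frequency (quantile) binning. A minimal sketch of that idea follows; the synthetic "alpha-power" values and the choice of five levels are illustrative assumptions, not the study's actual features or bin counts.

```python
import statistics

def density_adjusted_levels(feature_db, n_levels):
    """Split the offline feature distribution into equal-frequency bins
    so every feedback level is reachable by a typical user.  Returns the
    n_levels - 1 quantile cut points (a simplification of the paper's
    density-based range adjustment)."""
    return statistics.quantiles(feature_db, n=n_levels)

def feedback_level(value, cuts):
    """Map a live EEG feature value to a level in 0..len(cuts)."""
    return sum(value > c for c in cuts)

# Hypothetical offline database of a skewed EEG feature, the way real
# band-power features often are distributed across individuals.
db = [0.1 * i ** 2 for i in range(1, 101)]
cuts = density_adjusted_levels(db, n_levels=5)
level = feedback_level(db[50], cuts)
```

With uniform (equal-width) bins most of this skewed database would fall into the lowest levels; quantile cuts spread users across the full feedback range, which is the effect the study quantifies.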

  2. Data-Driven User Feedback: An Improved Neurofeedback Strategy considering the Interindividual Variability of EEG Features

    PubMed Central

    Lim, Jeong-Hwan; Lee, Jun-Hak; Kim, Kangsan

    2016-01-01

    It has frequently been reported that some users of conventional neurofeedback systems can experience only a small portion of the total feedback range due to the large interindividual variability of EEG features. In this study, we proposed a data-driven neurofeedback strategy considering the individual variability of electroencephalography (EEG) features to permit users of the neurofeedback system to experience a wider range of auditory or visual feedback without a customization process. The main idea of the proposed strategy is to adjust the ranges of each feedback level using the density in the offline EEG database acquired from a group of individuals. Twenty-two healthy subjects participated in offline experiments to construct an EEG database, and five subjects participated in online experiments to validate the performance of the proposed data-driven user feedback strategy. Using the optimized bin sizes, the number of feedback levels that each individual experienced was significantly increased to 139% and 144% of the original results with uniform bin sizes in the offline and online experiments, respectively. Our results demonstrated that the use of our data-driven neurofeedback strategy could effectively increase the overall range of feedback levels that each individual experienced during neurofeedback training. PMID:27631005

  3. Data-Driven Sampling Matrix Boolean Optimization for Energy-Efficient Biomedical Signal Acquisition by Compressive Sensing.

    PubMed

    Wang, Yuhao; Li, Xin; Xu, Kai; Ren, Fengbo; Yu, Hao

    2017-04-01

    Compressive sensing is widely used in biomedical applications, and the sampling matrix plays a critical role on both quality and power consumption of signal acquisition. It projects a high-dimensional vector of data into a low-dimensional subspace by matrix-vector multiplication. An optimal sampling matrix can ensure accurate data reconstruction and/or high compression ratio. Most existing optimization methods can only produce real-valued embedding matrices that result in large energy consumption during data acquisition. In this paper, we propose an efficient method that finds an optimal Boolean sampling matrix in order to reduce the energy consumption. Compared to random Boolean embedding, our data-driven Boolean sampling matrix can improve the image recovery quality by 9 dB. Moreover, in terms of sampling hardware complexity, it reduces the energy consumption by 4.6× and the silicon area by 1.9× over the data-driven real-valued embedding.
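The role of the sampling matrix above can be sketched in plain Python: a random Boolean matrix, the baseline the paper optimizes beyond, compresses an n-sample signal into m measurements using only additions. The dimensions, seed, and signal below are illustrative; the paper's data-driven optimization of the matrix is not reproduced here.

```python
import random

def boolean_sampling_matrix(m, n, seed=0):
    """Random m x n Boolean (0/1) sampling matrix.  Boolean entries mean
    each measurement needs only additions of selected samples, not
    multiplications, which is the energy advantage discussed above."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]

def compress(phi, x):
    """y = phi @ x: project the length-n signal into m measurements."""
    return [sum(xj for pij, xj in zip(row, x) if pij) for row in phi]

n, m = 64, 16                       # 4x compression ratio
signal = [float(i % 8) for i in range(n)]
phi = boolean_sampling_matrix(m, n)
y = compress(phi, signal)
```

Reconstruction of `signal` from `y` would need a sparsifying basis and a recovery solver; the sketch only shows the acquisition side that the Boolean matrix makes cheap.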

  4. Data-driven approach for creating synthetic electronic medical records.

    PubMed

    Buczak, Anna L; Babin, Steven; Moniz, Linda

    2010-10-14

    New algorithms for disease outbreak detection are being developed to take advantage of full electronic medical records (EMRs) that contain a wealth of patient information. However, due to privacy concerns, even anonymized EMRs cannot be shared among researchers, resulting in great difficulty in comparing the effectiveness of these algorithms. To bridge the gap between novel bio-surveillance algorithms operating on full EMRs and the lack of non-identifiable EMR data, a method for generating complete and synthetic EMRs was developed. This paper describes a novel methodology for generating complete synthetic EMRs both for an outbreak illness of interest (tularemia) and for background records. The method developed has three major steps: 1) synthetic patient identity and basic information generation; 2) identification of care patterns that the synthetic patients would receive based on the information present in real EMR data for similar health problems; 3) adaptation of these care patterns to the synthetic patient population. We generated EMRs, including visit records, clinical activity, laboratory orders/results and radiology orders/results for 203 synthetic tularemia outbreak patients. Validation of the records by a medical expert revealed problems in 19% of the records; these were subsequently corrected. We also generated background EMRs for over 3000 patients in the 4-11 yr age group. Validation of those records by a medical expert revealed problems in fewer than 3% of these background patient EMRs and the errors were subsequently rectified. A data-driven method was developed for generating fully synthetic EMRs. The method is general and can be applied to any data set that has similar data elements (such as laboratory and radiology orders and results, clinical activity, prescription orders). The pilot synthetic outbreak records were for tularemia but our approach may be adapted to other infectious diseases. The pilot synthetic background records were in the 4

  5. Data-Driven Learning of Speech Acts Based on Corpora of DVD Subtitles

    ERIC Educational Resources Information Center

    Kitao, S. Kathleen; Kitao, Kenji

    2013-01-01

    Data-driven learning (DDL) is an inductive approach to language learning in which students study examples of authentic language and use them to find patterns of language use. This inductive approach to learning has the advantages of being learner-centered, encouraging hypothesis testing and learner autonomy, and helping develop learning skills.…

  6. A Causal, Data-driven Approach to Modeling the Kepler Data

    NASA Astrophysics Data System (ADS)

    Wang, Dun; Hogg, David W.; Foreman-Mackey, Daniel; Schölkopf, Bernhard

    2016-09-01

Astronomical observations are affected by several kinds of noise, each with its own causal source; there is photon noise, stochastic source variability, and residuals coming from imperfect calibration of the detector or telescope. The precision of NASA Kepler photometry for exoplanet science—the most precise photometric measurements of stars ever made—appears to be limited by unknown or untracked variations in spacecraft pointing and temperature, and unmodeled stellar variability. Here, we present the causal pixel model (CPM) for Kepler data, a data-driven model intended to capture variability but preserve transit signals. The CPM works at the pixel level so that it can capture very fine-grained information about the variation of the spacecraft. The CPM models the systematic effects in the time series of a pixel using the pixels of many other stars and the assumption that any shared signal in these causally disconnected light curves is caused by instrumental effects. In addition, we use the target star’s future and past (autoregression). By appropriately separating, for each data point, the data into training and test sets, we ensure that information about any transit will be perfectly isolated from the model. The method has four tuning parameters—the number of predictor stars or pixels, the autoregressive window size, and two L2-regularization amplitudes for model components, which we set by cross-validation. We determine values for tuning parameters that work well for most of the stars and apply the method to a corresponding set of target stars. We find that CPM can consistently produce low-noise light curves. In this paper, we demonstrate that pixel-level de-trending is possible while retaining transit signals, and we think that methods like CPM are generally applicable and might be useful for K2, TESS, etc., where the data are not clean postage stamps like Kepler.
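As a rough illustration of the pixel-level idea (not the authors' implementation), the sketch below fits one pixel's time series as an L2-regularized (ridge) linear combination of pixels from other stars, using synthetic data in which a shared instrumental trend contaminates every pixel. The real CPM additionally holds out the data point being predicted and adds autoregressive terms; all names and values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500   # number of time samples
P = 50    # number of predictor pixels drawn from other stars

# A shared instrumental systematic appears, with different amplitudes,
# in every pixel; each pixel also has independent noise.
systematic = np.sin(np.linspace(0.0, 20.0, T))
X = systematic[:, None] * rng.uniform(0.5, 2.0, P) \
    + 0.05 * rng.standard_normal((T, P))
target = 3.0 * systematic + 0.05 * rng.standard_normal(T)

# Ridge regression of the target pixel on the predictor pixels;
# lam plays the role of an L2-regularization amplitude.
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(P), X.T @ target)
prediction = X @ w

# Subtracting the prediction suppresses the shared systematic;
# a transit signal, present only in the target, would survive here.
residual = target - prediction
print(residual.std() < target.std())
```

Because the predictors share only the instrumental signal with the target, the fit removes the systematic while signals unique to the target pixel remain in the residual.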

  7. Data-centric method for object observation through scattering media

    NASA Astrophysics Data System (ADS)

    Tanida, Jun; Horisaki, Ryoichi

    2018-03-01

A data-centric method is introduced for object observation through scattering media. A large number of training pairs are used to characterize the relation between the object and the observation signals based on machine learning. Using the method, object information can be retrieved even from strongly disturbed signals. As potential applications, object recognition, imaging, and focusing through scattering media were demonstrated.

  8. Toward a Literature-Driven Definition of Big Data in Healthcare

    PubMed Central

    Baro, Emilie; Degoul, Samuel; Beuscart, Régis; Chazard, Emmanuel

    2015-01-01

Objective. The aim of this study was to provide a definition of big data in healthcare. Methods. A systematic search of PubMed literature published until May 9, 2014, was conducted. We noted the number of statistical individuals (n) and the number of variables (p) for all papers describing a dataset. These papers were classified into fields of study. Characteristics attributed to big data by authors were also considered. Based on this analysis, a definition of big data was proposed. Results. A total of 196 papers were included. Big data can be defined as datasets with log(n*p) ≥ 7. Properties of big data are its great variety and high velocity. Big data raises challenges on veracity, on all aspects of the workflow, on extracting meaningful information, and on sharing information. Big data requires new computational methods that optimize data management. Related concepts are data reuse, false knowledge discovery, and privacy issues. Conclusion. Big data is defined by volume. Big data should not be confused with data reuse: data can be big without being reused for another purpose, for example, in omics. Inversely, data can be reused without being necessarily big, for example, secondary use of Electronic Medical Records (EMR) data. PMID:26137488
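The volume criterion above is directly computable. A minimal helper, assuming the logarithm is base 10 (the function name and example sizes are illustrative, not from the paper):

```python
import math

def is_big_data(n, p, threshold=7.0):
    """Apply the criterion log10(n * p) >= 7 to a dataset with
    n statistical individuals and p variables."""
    return math.log10(n * p) >= threshold

# 100,000 patients x 200 variables: log10(2e7) ~ 7.30 -> big data
print(is_big_data(100_000, 200))
# 500 patients x 30 variables: log10(1.5e4) ~ 4.18 -> not big data
print(is_big_data(500, 30))
```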

  9. ESDORA: A Data Archive Infrastructure Using Digital Object Model and Open Source Frameworks

    NASA Astrophysics Data System (ADS)

    Shrestha, Biva; Pan, Jerry; Green, Jim; Palanisamy, Giriprakash; Wei, Yaxing; Lenhardt, W.; Cook, R. Bob; Wilson, B. E.; Leggott, M.

    2011-12-01

There is an array of challenges associated with preserving, managing, and using contemporary scientific data. Large volume, multiple formats and data services, and the lack of a coherent mechanism for metadata/data management are some of the common issues across data centers. It is often difficult to preserve the data history and lineage information, along with other descriptive metadata, hindering the true science value of the archived data products. In this project, we use digital object abstraction architecture as the information/knowledge framework to address these challenges. We have used the following open-source frameworks: Fedora-Commons Repository, Drupal Content Management System, Islandora (Drupal Module) and Apache Solr Search Engine. The system is an active archive infrastructure for Earth Science data resources, which includes ingestion, archiving, distribution, and discovery functionalities. We use an ingestion workflow to ingest the data and metadata, where many different aspects of data descriptions (including structured and non-structured metadata) are reviewed. The data and metadata are published after multiple rounds of review; they are staged during the review phase. Each digital object is encoded in XML for long-term preservation of the content and relations among the digital items. The software architecture provides a flexible, modularized framework for adding pluggable user-oriented functionality. Solr is used to enable word search as well as faceted search. A home-grown spatial search module is plugged in to allow users to make a spatial selection in a map view. An RDF semantic store within the Fedora-Commons Repository is used for storing information on data lineage, dissemination services, and text-based metadata. We use the semantic notion "isViewerFor" to register internally or externally referenced URLs, which are rendered within the same web browser when possible. With appropriate mapping of content into digital objects, many

  10. Advancing data reuse in phyloinformatics using an ontology-driven Semantic Web approach.

    PubMed

    Panahiazar, Maryam; Sheth, Amit P; Ranabahu, Ajith; Vos, Rutger A; Leebens-Mack, Jim

    2013-01-01

Phylogenetic analyses can resolve historical relationships among genes, organisms or higher taxa. Understanding such relationships can elucidate a wide range of biological phenomena, including, for example, the importance of gene and genome duplications in the evolution of gene function, the role of adaptation as a driver of diversification, or the evolutionary consequences of biogeographic shifts. Phyloinformaticists are developing data standards, databases and communication protocols (e.g. Application Programming Interfaces, APIs) to extend the accessibility of gene trees, species trees, and the metadata necessary to interpret these trees, thus enabling researchers across the life sciences to reuse phylogenetic knowledge. Specifically, Semantic Web technologies are being developed to make phylogenetic knowledge interpretable by web agents, thereby enabling intelligently automated, high-throughput reuse of results generated by phylogenetic research. This manuscript describes an ontology-driven, semantic problem-solving environment for phylogenetic analyses and introduces artefacts that can support phyloinformatic efforts to improve the accessibility of trees and underlying metadata. PhylOnt is an extensible ontology with concepts describing tree types and tree-building methodologies, including estimation methods, models and programs. In addition, we present the PhylAnt platform for annotating scientific articles and NeXML files with PhylOnt concepts. The novelty of this work is the annotation of NeXML files and phylogenetics-related documents with the PhylOnt ontology. This approach advances data reuse in phyloinformatics.

  11. The evolution of meaning: spatio-temporal dynamics of visual object recognition.

    PubMed

    Clarke, Alex; Taylor, Kirsten I; Tyler, Lorraine K

    2011-08-01

    Research on the spatio-temporal dynamics of visual object recognition suggests a recurrent, interactive model whereby an initial feedforward sweep through the ventral stream to prefrontal cortex is followed by recurrent interactions. However, critical questions remain regarding the factors that mediate the degree of recurrent interactions necessary for meaningful object recognition. The novel prediction we test here is that recurrent interactivity is driven by increasing semantic integration demands as defined by the complexity of semantic information required by the task and driven by the stimuli. To test this prediction, we recorded magnetoencephalography data while participants named living and nonliving objects during two naming tasks. We found that the spatio-temporal dynamics of neural activity were modulated by the level of semantic integration required. Specifically, source reconstructed time courses and phase synchronization measures showed increased recurrent interactions as a function of semantic integration demands. These findings demonstrate that the cortical dynamics of object processing are modulated by the complexity of semantic information required from the visual input.

  12. Data-Driven Learning for Beginners: The Case of German Verb-Preposition Collocations

    ERIC Educational Resources Information Center

    Vyatkina, Nina

    2016-01-01

    Research on data-driven learning (DDL), or teaching and learning languages with the help of electronic corpora, has shown that it is both effective and efficient. Nevertheless, DDL is still far from common pedagogical practice, not least because the empirical research on it is still limited and narrowly focused. This study addresses some gaps in…

  13. Data-driven advice for applying machine learning to bioinformatics problems

    PubMed Central

    Olson, Randal S.; La Cava, William; Mustahsan, Zairah; Varik, Akshay; Moore, Jason H.

    2017-01-01

    As the bioinformatics field grows, it must keep pace not only with new data but with new algorithms. Here we contribute a thorough analysis of 13 state-of-the-art, commonly used machine learning algorithms on a set of 165 publicly available classification problems in order to provide data-driven algorithm recommendations to current researchers. We present a number of statistical and visual comparisons of algorithm performance and quantify the effect of model selection and algorithm tuning for each algorithm and dataset. The analysis culminates in the recommendation of five algorithms with hyperparameters that maximize classifier performance across the tested problems, as well as general guidelines for applying machine learning to supervised classification problems. PMID:29218881

  14. Data-driven, Interpretable Photometric Redshifts Trained on Heterogeneous and Unrepresentative Data

    NASA Astrophysics Data System (ADS)

    Leistedt, Boris; Hogg, David W.

    2017-03-01

    We present a new method for inferring photometric redshifts in deep galaxy and quasar surveys, based on a data-driven model of latent spectral energy distributions (SEDs) and a physical model of photometric fluxes as a function of redshift. This conceptually novel approach combines the advantages of both machine learning methods and template fitting methods by building template SEDs directly from the spectroscopic training data. This is made computationally tractable with Gaussian processes operating in flux-redshift space, encoding the physics of redshifts and the projection of galaxy SEDs onto photometric bandpasses. This method alleviates the need to acquire representative training data or to construct detailed galaxy SED models; it requires only that the photometric bandpasses and calibrations be known or have parameterized unknowns. The training data can consist of a combination of spectroscopic and deep many-band photometric data with reliable redshifts, which do not need to entirely spatially overlap with the target survey of interest or even involve the same photometric bands. We showcase the method on the I-magnitude-selected, spectroscopically confirmed galaxies in the COSMOS field. The model is trained on the deepest bands (from SUBARU and HST) and photometric redshifts are derived using the shallower SDSS optical bands only. We demonstrate that we obtain accurate redshift point estimates and probability distributions despite the training and target sets having very different redshift distributions, noise properties, and even photometric bands. Our model can also be used to predict missing photometric fluxes or to simulate populations of galaxies with realistic fluxes and redshifts, for example.

  15. Data-Driven Zero-Sum Neuro-Optimal Control for a Class of Continuous-Time Unknown Nonlinear Systems With Disturbance Using ADP.

    PubMed

    Wei, Qinglai; Song, Ruizhuo; Yan, Pengfei

    2016-02-01

    This paper is concerned with a new data-driven zero-sum neuro-optimal control problem for continuous-time unknown nonlinear systems with disturbance. According to the input-output data of the nonlinear system, an effective recurrent neural network is introduced to reconstruct the dynamics of the nonlinear system. Considering the system disturbance as a control input, a two-player zero-sum optimal control problem is established. Adaptive dynamic programming (ADP) is developed to obtain the optimal control under the worst case of the disturbance. Three single-layer neural networks, including one critic and two action networks, are employed to approximate the performance index function, the optimal control law, and the disturbance, respectively, for facilitating the implementation of the ADP method. Convergence properties of the ADP method are developed to show that the system state will converge to a finite neighborhood of the equilibrium. The weight matrices of the critic and the two action networks are also convergent to finite neighborhoods of their optimal ones. Finally, the simulation results will show the effectiveness of the developed data-driven ADP methods.

  16. Proactive monitoring of an onshore wind farm through lidar measurements, SCADA data and a data-driven RANS solver

    NASA Astrophysics Data System (ADS)

    Iungo, Giacomo Valerio; Camarri, Simone; Ciri, Umberto; El-Asha, Said; Leonardi, Stefano; Rotea, Mario A.; Santhanagopalan, Vignesh; Viola, Francesco; Zhan, Lu

    2016-11-01

Site conditions, such as topography and local climate, as well as wind farm layout strongly affect the performance of a wind power plant. Therefore, predictions of wake interactions and their effects on power production still remain a great challenge in wind energy. For this study, an onshore wind turbine array was monitored through lidar measurements, SCADA and met-tower data. Power losses due to wake interactions were estimated to be approximately 4% and 2% of the total power production under stable and convective conditions, respectively. This dataset was then leveraged for the calibration of a data-driven RANS (DDRANS) solver, which is a compelling tool for the prediction of wind turbine wakes and power production. DDRANS is characterized by a computational cost as low as that of engineering wake models, with adequate accuracy achieved through data-driven tuning of the turbulence closure model. DDRANS is based on a parabolic formulation and axisymmetry and boundary layer approximations, which allow it to achieve low computational costs. The turbulence closure model consists of a mixing-length model, which is optimally calibrated with the experimental dataset. Assessment of DDRANS is then performed through lidar and SCADA data for different atmospheric conditions. This material is based upon work supported by the National Science Foundation under the I/UCRC WindSTAR, NSF Award IIP 1362033.

  17. A Cyber Enabled Collaborative Environment for Creating, Sharing and Using Data and Modeling Driven Curriculum Modules for Hydrology Education

    NASA Astrophysics Data System (ADS)

    Merwade, V.; Ruddell, B. L.; Fox, S.; Iverson, E. A. R.

    2014-12-01

With access to emerging datasets and computational tools, there is a need to bring these capabilities into hydrology classrooms. However, developing curriculum modules using data and models to augment classroom teaching is hindered by a steep technology learning curve, rapid technology turnover, and the lack of an organized community cyberinfrastructure (CI) for the dissemination, publication, and sharing of the latest tools and curriculum material for hydrology and geoscience education. The objective of this project is to overcome some of these limitations by developing a cyber-enabled collaborative environment for publishing, sharing and adoption of data- and modeling-driven curriculum modules in hydrology and geosciences classrooms. The CI is based on Carleton College's Science Education Resource Center (SERC) Content Management System. Building on its existing community authoring capabilities, the system is being extended to allow assembly of new teaching activities by drawing on a collection of interchangeable building blocks, each of which represents a step in the modeling process. Currently the system hosts more than 30 modules or steps, which can be combined to create multiple learning units. Two specific units, Unit Hydrograph and Rational Method, have been used in undergraduate hydrology classrooms at Purdue University and Arizona State University. The structure of the CI and the lessons learned from its implementation, including preliminary results from student assessments of learning, will be presented.

  18. Ontology-driven data integration and visualization for exploring regional geologic time and paleontological information

    NASA Astrophysics Data System (ADS)

    Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo

    2018-06-01

    Initiatives of open data promote the online publication and sharing of large amounts of geologic data. How to retrieve information and discover knowledge from the big data is an ongoing challenge. In this paper, we developed an ontology-driven data integration and visualization pilot system for exploring information of regional geologic time, paleontology, and fundamental geology. The pilot system (http://www2.cs.uidaho.edu/%7Emax/gts/)

  19. Design and Data in Balance: Using Design-Driven Decision Making to Enable Student Success

    ERIC Educational Resources Information Center

    Fairchild, Susan; Farrell, Timothy; Gunton, Brad; Mackinnon, Anne; McNamara, Christina; Trachtman, Roberta

    2014-01-01

    Data-driven approaches to school decision making have come into widespread use in the past decade, nationally and in New York City. New Visions has been at the forefront of those developments: in New Visions schools, teacher teams and school teams regularly examine student performance data to understand patterns and drive classroom- and…

  20. Size matters: large objects capture attention in visual search.

    PubMed

    Proulx, Michael J

    2010-12-23

Can objects or events ever capture one's attention in a purely stimulus-driven manner? A recent review of the literature set out the criteria required to find stimulus-driven attentional capture independent of goal-directed influences, and concluded that no published study has satisfied those criteria. Here, visual search experiments assessed whether an irrelevantly large object can capture attention. Capture of attention by this static visual feature was found. The results suggest that a large object can indeed capture attention in a stimulus-driven manner, independent of displaywide features of the task that might encourage a goal-directed bias for large items. It is concluded that these results are either consistent with the stimulus-driven criteria published previously or, alternatively, consistent with a flexible, goal-directed mechanism of saliency detection.

  1. Radiation-driven winds of hot stars. VI - Analytical solutions for wind models including the finite cone angle effect

    NASA Technical Reports Server (NTRS)

    Kudritzki, R. P.; Pauldrach, A.; Puls, J.; Abbott, D. C.

    1989-01-01

Analytical solutions for radiation-driven winds of hot stars including the important finite cone angle effect (see Pauldrach et al., 1986; Friend and Abbott, 1986) are derived which approximate the detailed numerical solutions of the exact wind equation of motion very well. They allow a detailed discussion of the finite cone angle effect and provide, for given line force parameters k, alpha, and delta, definite formulas for the mass-loss rate M and the terminal velocity v-infinity as functions of the stellar parameters.

  2. A data-driven approach to identify controls on global fire activity from satellite and climate observations (SOFIA V1)

    NASA Astrophysics Data System (ADS)

    Forkel, Matthias; Dorigo, Wouter; Lasslop, Gitta; Teubner, Irene; Chuvieco, Emilio; Thonicke, Kirsten

    2017-12-01

    Vegetation fires affect human infrastructures, ecosystems, global vegetation distribution, and atmospheric composition. However, the climatic, environmental, and socioeconomic factors that control global fire activity in vegetation are only poorly understood, and in various complexities and formulations are represented in global process-oriented vegetation-fire models. Data-driven model approaches such as machine learning algorithms have successfully been used to identify and better understand controlling factors for fire activity. However, such machine learning models cannot be easily adapted or even implemented within process-oriented global vegetation-fire models. To overcome this gap between machine learning-based approaches and process-oriented global fire models, we introduce a new flexible data-driven fire modelling approach here (Satellite Observations to predict FIre Activity, SOFIA approach version 1). SOFIA models can use several predictor variables and functional relationships to estimate burned area that can be easily adapted with more complex process-oriented vegetation-fire models. We created an ensemble of SOFIA models to test the importance of several predictor variables. SOFIA models result in the highest performance in predicting burned area if they account for a direct restriction of fire activity under wet conditions and if they include a land cover-dependent restriction or allowance of fire activity by vegetation density and biomass. The use of vegetation optical depth data from microwave satellite observations, a proxy for vegetation biomass and water content, reaches higher model performance than commonly used vegetation variables from optical sensors. We further analyse spatial patterns of the sensitivity between anthropogenic, climate, and vegetation predictor variables and burned area. We finally discuss how multiple observational datasets on climate, hydrological, vegetation, and socioeconomic variables together with data-driven

  3. The Correlation of Data-Driven Instruction to Teacher-Effectiveness, Teacher-Collaboration, and Teacher-Satisfaction

    ERIC Educational Resources Information Center

    Shaw, Rhonda R.

    2017-01-01

    Education reform is inevitable; however, the journey of reform must ensure that educators are equipped to meet the diverse needs of all children within the classrooms throughout. Data-driven decision making is going to be the driving force for making that happen. This mixed model research was designed to show how implementing data-driven…

  4. The `Henry Problem' of `density-driven' groundwater flow versus Tothian `groundwater flow systems' with variable density: A review of the influential Biscayne aquifer data.

    NASA Astrophysics Data System (ADS)

    Weyer, K. U.

    2017-12-01

Coastal groundwater flow investigations at the Biscayne Bay, south of Miami, Florida, gave rise to the concept of density-driven flow of seawater into coastal aquifers creating a saltwater wedge. Within that wedge, convection-driven return flow of seawater and a dispersion zone were assumed by Cooper et al. (1964) to be the cause of the Biscayne aquifer `sea water wedge'. This conclusion was based on the chloride distribution within the aquifer and on an analytical model concept assuming convection flow within a confined aquifer, without taking non-chemical field data into consideration. This concept was later labelled the `Henry Problem', which any numerical variable-density flow program must be able to simulate to be considered acceptable. Both `density-driven flow' and Tothian `groundwater flow systems' (with or without variable density conditions) are driven by gravitation. The difference between the two is the boundary conditions: `density-driven flow' occurs under hydrostatic boundary conditions, while Tothian `groundwater flow systems' occur under hydrodynamic boundary conditions. Revisiting the Cooper et al. (1964) publication with its record of piezometric field data (heads) showed that the so-called sea water wedge has been caused by discharging deep saline groundwater driven by gravitational flow and not by denser sea water. Density-driven flow of seawater into the aquifer was not found reflected in the head measurements for low and high tide conditions, which had been taken contemporaneously with the chloride measurements. These head measurements had not been included in the flow interpretation. The very same head measurements indicated a clear dividing line between shallow local fresh groundwater flow and saline deep groundwater flow without the existence of a dispersion zone or a convection cell. The Biscayne situation emphasizes the need for any chemical interpretation of flow patterns to be supported by head data as energy indicators of flow fields.

  5. Data-Driven Based Asynchronous Motor Control for Printing Servo Systems

    NASA Astrophysics Data System (ADS)

    Bian, Min; Guo, Qingyun

Modern digital printing equipment aims at an environmentally friendly industry with high dynamic performance and control precision and low vibration and abrasion, so a high-performance motion control system for printing servo systems is required. A control system for an asynchronous motor based on data acquisition was proposed, and an iterative learning control (ILC) algorithm was studied. PID control is widely used in motion control; however, it is sensitive to disturbances and to variation of model parameters. ILC applies the historical error data and present control signals to approximate the control signal directly, in order to fully track the expected trajectory without knowledge of the system models and structures. A motor control algorithm based on ILC and PID was constructed and simulation results are given. The results show that the data-driven control method is effective in dealing with bounded disturbances in the motion control of printing servo systems.
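The core ILC idea, reusing the previous iteration's tracking error to correct the control signal, can be sketched as the P-type update u_{k+1}(t) = u_k(t) + L·e_k(t+1). The sketch below applies it to an assumed first-order discrete plant; the plant, gain L, and horizon are illustrative, not the paper's printing-press model.

```python
import numpy as np

# Illustrative first-order plant: y[t+1] = a*y[t] + b*u[t].
a, b = 0.8, 0.5
T = 50
reference = np.sin(np.linspace(0.0, np.pi, T))  # desired trajectory

def run_plant(u):
    """Simulate one repetition of the plant from rest."""
    y = np.zeros(T)
    for t in range(T - 1):
        y[t + 1] = a * y[t] + b * u[t]
    return y

# P-type ILC: correct the whole control profile between repetitions
# using last repetition's error, with learning gain L (|1 - L*b| < 1).
L = 1.0
u = np.zeros(T)
errors = []
for k in range(20):
    y = run_plant(u)
    e = reference - y
    errors.append(np.abs(e).max())
    u[:-1] = u[:-1] + L * e[1:]

print(errors[-1] < errors[0])  # tracking error shrinks across iterations
```

Unlike PID, the update needs no plant model: it only stores the error profile from the previous repetition, which is why it suits repetitive printing motions.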

  6. X-ray astronomy in the laboratory with a miniature compact object produced by laser-driven implosion

    NASA Astrophysics Data System (ADS)

    Fujioka, Shinsuke; Takabe, Hideaki; Yamamoto, Norimasa; Salzmann, David; Wang, Feilu; Nishimura, Hiroaki; Li, Yutong; Dong, Quanli; Wang, Shoujun; Zhang, Yi; Rhee, Yong-Joo; Lee, Yong-Woo; Han, Jae-Min; Tanabe, Minoru; Fujiwara, Takashi; Nakabayashi, Yuto; Zhao, Gang; Zhang, Jie; Mima, Kunioki

    2009-11-01

X-ray spectroscopy is an important tool for understanding the extreme photoionization processes that drive the behaviour of non-thermal equilibrium plasmas in compact astrophysical objects such as black holes. Even so, the distance of these objects from the Earth and the inability to control or accurately ascertain the conditions that govern their behaviour make it difficult to interpret the origin of the features in astronomical X-ray measurements. Here, we describe an experiment that uses the implosion driven by a 3 TW, 4 kJ laser system to produce a 0.5 keV blackbody radiator that mimics the conditions that exist in the neighbourhood of a black hole. The X-ray spectra emitted from photoionized silicon plasmas resemble those observed from the binary stars Cygnus X-3 (refs 7, 8) and Vela X-1 (refs 9, 10, 11) with the Chandra X-ray satellite. As well as demonstrating the ability to create extreme radiation fields in a laboratory plasma, our theoretical interpretation of these laboratory spectra contrasts starkly with the generally accepted explanation for the origin of similar features in astronomical observations. Our experimental approach offers a powerful means to test and validate the computer codes used in X-ray astronomy.

  7. Wrapping SRS with CORBA: from textual data to distributed objects.

    PubMed

    Coupaye, T

    1999-04-01

Biological data come in very different shapes. Databanks are maintained and used by distinct organizations, and text is the de facto standard exchange format. The SRS system can integrate heterogeneous textual databanks, but it lacked a way to structure the extracted data. This paper presents a CORBA interface to the SRS system, which manages databanks in a flat-file format. SRS Object Servers are CORBA wrappers for SRS. They allow client applications (visualisation tools, data mining tools, etc.) to access and query SRS servers remotely through an Object Request Broker (ORB). They provide loader objects that contain the information extracted from the databanks by SRS. Loader objects are not hard-coded but generated in a flexible way by using loader specifications, which allow SRS administrators to package data coming from distinct databanks. The prototype may be available for beta-testing. Please contact the SRS group (http://srs.ebi.ac.uk).

  8. Open-source chemogenomic data-driven algorithms for predicting drug-target interactions.

    PubMed

    Hao, Ming; Bryant, Stephen H; Wang, Yanli

    2018-02-06

While novel technologies such as high-throughput screening have advanced together with significant investment by pharmaceutical companies during the past decades, the success rate for drug development has not improved, prompting researchers to look for new drug discovery strategies. Drug repositioning is a potential approach to solve this dilemma. However, experimental identification and validation of potential drug targets encoded by the human genome is both costly and time-consuming. Therefore, effective computational approaches have been proposed to facilitate drug repositioning, which have proved to be successful in drug discovery. Undoubtedly, the availability of open-accessible data from basic chemical biology research and the success of human genome sequencing are crucial to develop effective in silico drug repositioning methods allowing the identification of potential targets for existing drugs. In this work, we review several chemogenomic data-driven computational algorithms with source codes publicly accessible for predicting drug-target interactions (DTIs). We organize these algorithms by model properties and model evolutionary relationships. We re-implemented five representative algorithms in the R programming language and compared them by means of mean percentile ranking, a new recall-based evaluation metric in the DTI prediction research field. We anticipate that this review will be objective and helpful to researchers who would like to further improve existing algorithms or need to choose appropriate algorithms to infer potential DTIs in their projects. The source codes for DTI predictions are available at: https://github.com/minghao2016/chemogenomicAlg4DTIpred. Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US.
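The review's evaluation metric, mean percentile ranking, is described only as recall-based; the sketch below shows one common formulation (the rank of each drug's known target among all scored candidates, normalized by the number of candidates, then averaged). This formulation, and all names in it, are assumptions for illustration, not the authors' exact definition.

```python
import numpy as np

def mean_percentile_rank(scores, true_idx):
    """Mean percentile rank of each row's known target.

    scores   : (n_drugs, n_targets) array of predicted interaction scores
    true_idx : known-target index for each drug
    Lower values mean known targets are ranked nearer the top.
    """
    n_drugs, n_targets = scores.shape
    ranks = []
    for i in range(n_drugs):
        order = np.argsort(-scores[i])                 # descending by score
        rank = np.where(order == true_idx[i])[0][0] + 1
        ranks.append(rank / n_targets)                 # percentile in (0, 1]
    return float(np.mean(ranks))

# Toy check: a predictor that scores the true target highest for every
# drug ranks it first, so each percentile is 1/n_targets.
scores = np.eye(4) + 0.01
print(mean_percentile_rank(scores, np.arange(4)))  # 1/4 = 0.25
```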

  9. Solving Partial Differential Equations in a data-driven multiprocessor environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gaudiot, J.L.; Lin, C.M.; Hosseiniyar, M.

    1988-12-31

    Partial differential equations can be found in a host of engineering and scientific problems. The emergence of new parallel architectures has spurred research into the definition of parallel PDE solvers. Concurrently, highly programmable systems such as data-flow architectures have been proposed for the exploitation of large-scale parallelism. The implementation of some partial differential equation solvers (such as the Jacobi method) on a tagged-token data-flow graph is demonstrated here. Asynchronous methods (chaotic relaxation) are studied, and new scheduling approaches (the Token No-Labeling scheme) are introduced in order to support the implementation of the asynchronous methods in a data-driven environment. New high-level data-flow language program constructs are introduced in order to handle chaotic operations. Finally, the performance of the program graphs is demonstrated by a deterministic simulation of a message-passing data-flow multiprocessor. An analysis of the overhead in the data-flow graphs is undertaken to demonstrate the limits of parallel operations in data-flow PDE program graphs.
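
    As a concrete reference point for the Jacobi method named above, here is a minimal synchronous sketch for the 1-D Poisson problem -u'' = f with fixed boundary values (our illustration; in a tagged-token data-flow setting each interior update would instead fire as soon as its neighbor tokens arrive, and chaotic relaxation drops the synchronization entirely):

```python
def jacobi_sweep(u, f, h):
    # One synchronous Jacobi update for -u'' = f; boundary values stay fixed.
    return ([u[0]]
            + [0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
               for i in range(1, len(u) - 1)]
            + [u[-1]])

def jacobi(u, f, h, tol=1e-10, max_iter=20000):
    # Iterate until successive sweeps differ by less than tol.
    for _ in range(max_iter):
        nxt = jacobi_sweep(u, f, h)
        if max(abs(a - b) for a, b in zip(nxt, u)) < tol:
            return nxt
        u = nxt
    return u

# Laplace problem (f = 0) with u(0) = 0, u(1) = 1: the solution is linear.
u = jacobi([0.0, 0.0, 0.0, 0.0, 1.0], [0.0] * 5, h=0.25)
```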

  10. Driven by Data: How Three Districts Are Successfully Using Data, Rather than Gut Feelings, to Align Staff Development with School Needs

    ERIC Educational Resources Information Center

    Gold, Stephanie

    2005-01-01

    The concept of data-driven professional development is both straightforward and sensible. Implementing this approach is another story, which is why many administrators are turning to sophisticated tools to help manage data collection and analysis. These tools allow educators to assess and correlate student outcomes, instructional methods, and…

  11. Data driven models of the performance and repeatability of NIF high foot implosions

    NASA Astrophysics Data System (ADS)

    Gaffney, Jim; Casey, Dan; Callahan, Debbie; Hartouni, Ed; Ma, Tammy; Spears, Brian

    2015-11-01

    Recent high-foot (HF) inertial confinement fusion (ICF) experiments performed at the National Ignition Facility (NIF) have comprised enough laser shots that a data-driven analysis of capsule performance is feasible. In this work we use 20-30 individual implosions of similar design, spanning laser drive energies from 1.2 to 1.8 MJ, to quantify our current understanding of the behavior of HF ICF implosions. We develop a probabilistic model for the projected performance of a given implosion and use it to quantify uncertainties in predicted performance, including shot-to-shot variations and observation uncertainties. We investigate the statistical significance of the observed performance differences between different laser pulse shapes, ablator materials, and capsule designs. Finally, using a cross-validation technique, we demonstrate that 5-10 repeated shots of a similar design are required before real trends in the data can be distinguished from shot-to-shot variations. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-674957.
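
    The 5-10 repeated-shots finding can be motivated with a back-of-the-envelope calculation (our illustration, not the paper's cross-validation procedure): a trend of size Δ becomes distinguishable from shot-to-shot scatter σ roughly once the standard error of the mean drops below Δ:

```python
import math

def repeats_needed(shot_sigma, trend, z=1.96):
    """Smallest n with z * sigma / sqrt(n) < trend (95% confidence by default)."""
    return math.ceil((z * shot_sigma / trend) ** 2)

# e.g. 15% relative shot-to-shot scatter vs. a 10% performance trend:
n = repeats_needed(0.15, 0.10)
```

    With 15% scatter and a 10% trend this gives n = 9, consistent with the 5-10 shot range quoted above.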

  12. Cloudweaver: Adaptive and Data-Driven Workload Manager for Generic Clouds

    NASA Astrophysics Data System (ADS)

    Li, Rui; Chen, Lei; Li, Wen-Syan

    Cloud computing denotes the latest trend in application development for parallel computing on massive data volumes. It relies on clouds of servers to handle tasks that used to be managed by an individual server. With cloud computing, software vendors can provide business intelligence and data analytic services for internet-scale data sets. Many open-source projects, such as Hadoop, offer various software components that are essential for building a cloud infrastructure. Current Hadoop (and many others) requires users to configure cloud infrastructures via programs and APIs, and such configurations are fixed at runtime. In this chapter, we propose a workload manager (WLM), called CloudWeaver, which provides automated configuration of a cloud infrastructure for runtime execution. The workload management is data-driven and can adapt to the dynamic nature of operator throughput during different execution phases. CloudWeaver works for a single job as well as for a workload consisting of multiple jobs running concurrently, aiming at maximum throughput using a minimum set of processors.

  13. User-Preference-Driven Model Predictive Control of Residential Building Loads and Battery Storage for Demand Response

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Xin; Baker, Kyri A; Isley, Steven C

    This paper presents a user-preference-driven home energy management system (HEMS) for demand response (DR) with residential building loads and battery storage. The HEMS is based on a multi-objective model predictive control algorithm, where the objectives include energy cost, thermal comfort, and carbon emission. A multi-criterion decision making method originating from social science is used to quickly determine user preferences based on a brief survey and derive the weights of different objectives used in the optimization process. Besides the residential appliances used in the traditional DR programs, a home battery system is integrated into the HEMS to improve the flexibility and reliability of the DR resources. Simulation studies have been performed on field data from a residential building stock data set. Appliance models and usage patterns were learned from the data to predict the DR resource availability. Results indicate the HEMS was able to provide a significant amount of load reduction with less than 20% prediction error in both heating and cooling cases.
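
    The multi-objective trade-off described above is commonly handled by scalarizing the objectives with preference-derived weights. A minimal sketch follows (our illustration; the paper's MPC formulation and survey-based weight derivation are more involved, and all names here are hypothetical):

```python
def hems_cost(energy_cost, discomfort, emissions, weights):
    """Weighted-sum scalarization of the three HEMS objectives.

    weights: dict with keys 'cost', 'comfort', 'carbon' summing to 1,
    as might be derived from a user-preference survey (illustrative).
    """
    return (weights["cost"] * energy_cost
            + weights["comfort"] * discomfort
            + weights["carbon"] * emissions)

prefs = {"cost": 0.5, "comfort": 0.3, "carbon": 0.2}

# An MPC step would pick the candidate control action with the lowest cost:
candidates = {"charge_battery": (1.0, 0.5, 0.2),
              "precool": (1.4, 0.1, 0.3)}
best = min(candidates, key=lambda k: hems_cost(*candidates[k], prefs))
```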

  14. Automatic translation of MPI source into a latency-tolerant, data-driven form

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric

    Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. We reformulate MPI source into a task dependency graph representation, which partially orders the tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotation for a variety of applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo’s performance meets or exceeds that of labor-intensive hand coding. As a result, the translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic level optimization against a well-known library.

  15. Automatic translation of MPI source into a latency-tolerant, data-driven form

    DOE PAGES

    Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric; ...

    2017-03-06

    Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. We reformulate MPI source into a task dependency graph representation, which partially orders the tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotation for a variety of applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo’s performance meets or exceeds that of labor-intensive hand coding. As a result, the translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic level optimization against a well-known library.

  16. Automatic translation of MPI source into a latency-tolerant, data-driven form

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric

    Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. Bamboo reformulates MPI source into the form of a task dependency graph that expresses a partial ordering among tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotation for a variety of applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo's performance meets or exceeds that of labor-intensive hand coding. The translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic-level optimization against a well-known library.
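
    The task-dependency-graph execution model that Bamboo targets can be sketched with Kahn-style topological scheduling: a task fires only once all of its inputs are available. This toy scheduler is sequential and illustrative only (task names are hypothetical); Bamboo's runtime dispatches ready tasks concurrently, which is what lets computation overlap in-flight communication:

```python
from collections import deque

def data_driven_order(deps):
    """deps: task -> set of prerequisite tasks (its inputs).

    Returns an order in which each task fires only after all of its
    inputs are available (Kahn's algorithm).
    """
    pending = {t: set(d) for t, d in deps.items()}
    dependents = {t: set() for t in deps}
    for t, d in deps.items():
        for p in d:
            dependents[p].add(t)
    ready = deque(t for t, d in pending.items() if not d)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for child in dependents[t]:
            pending[child].discard(t)
            if not pending[child]:
                ready.append(child)  # all inputs arrived: child may fire
    return order
```

    Note that an independent task such as an interior-stencil computation can fire before a halo exchange completes, which is exactly the overlap being exploited.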

  17. Data-driven gradient algorithm for high-precision quantum control

    NASA Astrophysics Data System (ADS)

    Wu, Re-Bing; Chu, Bing; Owens, David H.; Rabitz, Herschel

    2018-04-01

    In the quest to achieve scalable quantum information processing technologies, gradient-based optimal control algorithms (e.g., grape) are broadly used for implementing high-precision quantum gates, but their performance is often hindered by deterministic or random errors in the system model and the control electronics. In this paper, we show that grape can be taught to be more effective by jointly learning from the design model and the experimental data obtained from process tomography. The resulting data-driven gradient optimization algorithm (d-grape) can in principle correct all deterministic gate errors, with a mild efficiency loss. The d-grape algorithm may become more powerful with broadband controls that involve a large number of control parameters, while other algorithms usually slow down due to the increased size of the search space. These advantages are demonstrated by simulating the implementation of a two-qubit controlled-not gate.
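
    The gradient-based optimization underlying grape can be caricatured in one parameter (our closed-form toy, not the paper's algorithm). The fidelity of an x-rotation R_x(θ) against a target X gate is sin²(θ/2), so gradient ascent recovers θ = π; in d-grape the model-based fidelity would be corrected by estimates from process tomography:

```python
import math

def fidelity(theta):
    # Gate fidelity of R_x(theta) vs. the target X gate: |tr(X.Rx)|^2 / 4 = sin^2(theta/2)
    return math.sin(theta / 2.0) ** 2

def optimize(theta=0.5, lr=0.5, steps=200, eps=1e-6):
    """Finite-difference gradient ascent on the fidelity (GRAPE caricature)."""
    for _ in range(steps):
        grad = (fidelity(theta + eps) - fidelity(theta - eps)) / (2 * eps)
        theta += lr * grad
    return theta
```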

  18. ARCADIA: a system for the integration of angiocardiographic data and images by an object-oriented DBMS.

    PubMed

    Pinciroli, F; Combi, C; Pozzi, G

    1995-02-01

    Use of database techniques to store medical records has been going on for more than 40 years, yet some aspects remain unresolved, e.g., the management of textual data and image data within a single system. Object-orientation techniques applied to a database management system (DBMS) allow the definition of suitable data structures (e.g., to store digital images), and some facilities allow the use of predefined structures when defining new ones. Currently available object-oriented DBMSs, however, still need improvement in both schema updates and query facilities. This paper describes a prototype of a medical record that includes some multimedia features, managing both textual and image data. The prototype considers data from the medical records of patients who underwent percutaneous transluminal coronary artery angioplasty. We developed it on a Sun workstation with the Unix operating system and ONTOS as the object-oriented DBMS.

  19. Modeling and query the uncertainty of network constrained moving objects based on RFID data

    NASA Astrophysics Data System (ADS)

    Han, Liang; Xie, Kunqing; Ma, Xiujun; Song, Guojie

    2007-06-01

    The management of network-constrained moving objects is increasingly practical, especially in intelligent transportation systems. In the past, the location information of moving objects on a network was collected by GPS, which is costly and raises problems of frequent updates and privacy. RFID (Radio Frequency IDentification) devices are now used more and more widely to collect location information. They are cheaper, require fewer updates, and intrude less on privacy. They detect the id of an object and the time at which the moving object passes a node of the network. They do not detect the object's exact movement inside an edge, which leads to a problem of uncertainty. How to model and query the uncertainty of network-constrained moving objects based on RFID data thus becomes a research issue. In this paper, a model is proposed to describe the uncertainty of network-constrained moving objects. A two-level index is presented to provide efficient access to the network and the movement data. The processing of imprecise time-slice queries and spatio-temporal range queries is studied. The processing includes four steps: spatial filter, spatial refinement, temporal filter, and probability calculation. Finally, experiments are conducted on simulated data to study the performance of the index. The precision and recall of the result set are defined, and how the query arguments affect precision and recall is also discussed.
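
    The probability-calculation step can be sketched under strong simplifying assumptions (ours, not the paper's model): position along an edge is interpolated between the RFID entry and exit times, then smeared uniformly over a band to represent the unknown motion inside the edge:

```python
def prob_in_range(t, t_in, t_out, edge_len, a, b, slack=0.1):
    """Probability that an object detected entering an edge at t_in and
    leaving at t_out lies within [a, b] along the edge at time t.

    Uniform-speed interpolation, with the position assumed uniformly
    distributed in a +/- slack*edge_len band (illustrative model only).
    """
    p = edge_len * (t - t_in) / (t_out - t_in)   # interpolated position
    d = slack * edge_len                          # half-width of uncertainty band
    lo, hi = max(0.0, p - d), min(edge_len, p + d)
    overlap = max(0.0, min(b, hi) - max(a, lo))
    return overlap / (hi - lo) if hi > lo else 0.0
```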

  20. Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks.

    PubMed

    Vlachas, Pantelis R; Byeon, Wonmin; Wan, Zhong Y; Sapsis, Themistoklis P; Koumoutsakos, Petros

    2018-05-01

    We introduce a data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of nonlinear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) in time series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation and a prototype climate model. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.

  1. Data-Driven Software Framework for Web-Based ISS Telescience

    NASA Technical Reports Server (NTRS)

    Tso, Kam S.

    2005-01-01

    Software that enables authorized users to monitor and control scientific payloads aboard the International Space Station (ISS) from diverse terrestrial locations equipped with Internet connections is undergoing development. This software reflects a data-driven approach to distributed operations. A Web-based software framework leverages prior developments in Java and Extensible Markup Language (XML) to create portable code and portable data, to which one can gain access via Web-browser software on almost any common computer. Open-source software is used extensively to minimize cost; the framework also accommodates enterprise-class server software to satisfy needs for high performance and security. To accommodate the diversity of ISS experiments and users, the framework emphasizes openness and extensibility. Users can take advantage of available viewer software to create their own client programs according to their particular preferences, and can upload these programs for custom processing of data, generation of views, and planning of experiments. The same software system, possibly augmented with a subset of data and additional software tools, could be used for public outreach by enabling public users to replay telescience experiments, conduct their experiments with simulated payloads, and create their own client programs and other custom software.

  2. Data-Driven Information Extraction from Chinese Electronic Medical Records.

    PubMed

    Xu, Dong; Zhang, Meizhuo; Zhao, Tianwan; Ge, Chen; Gao, Weiguo; Wei, Jia; Zhu, Kenny Q

    2015-01-01

    This study aims to propose a data-driven framework that takes unstructured free-text narratives in Chinese Electronic Medical Records (EMRs) as input and converts them into structured time-event-description triples, where the description is either an elaboration or an outcome of the medical event. Our framework uses a hybrid approach. It consists of constructing cross-domain core medical lexica, an unsupervised, iterative algorithm to accrue more accurate terms into the lexica, rules to address Chinese writing conventions and temporal descriptors, and a Support Vector Machine (SVM) algorithm that innovatively utilizes Normalized Google Distance (NGD) to estimate the correlation between medical events and their descriptions. The effectiveness of the framework was demonstrated with a dataset of 24,817 de-identified Chinese EMRs. The cross-domain medical lexica were capable of recognizing terms with an F1-score of 0.896. 98.5% of recorded medical events were linked to temporal descriptors. The NGD SVM description-event matching achieved an F1-score of 0.874. The end-to-end time-event-description extraction of our framework achieved an F1-score of 0.846. In terms of named entity recognition, the proposed framework outperforms state-of-the-art supervised learning algorithms (F1-score: 0.896 vs. 0.886). In event-description association, the NGD SVM is superior to an SVM using only local context and semantic features (F1-score: 0.874 vs. 0.838). The framework is data-driven, weakly supervised, and robust against the variation and noise that tend to occur in a large corpus. It addresses Chinese medical writing conventions and variations in writing styles through patterns used for discovering new terms and rules for updating the lexica.
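
    The Normalized Google Distance used here for event-description matching has a standard closed form based on hit counts f(x), f(y), the joint count f(x, y), and the index size N; a direct transcription (how these counts are obtained for medical terms is a separate matter):

```python
import math

def ngd(fx, fy, fxy, n):
    """Normalized Google Distance: smaller values indicate stronger association.

    fx, fy: counts of documents containing x and y; fxy: containing both;
    n: total number of indexed documents.
    """
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))
```

    Terms that always co-occur give a distance of 0; rarer co-occurrence pushes the distance up.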

  3. Individualized Prediction of Heat Stress in Firefighters: A Data-Driven Approach Using Classification and Regression Trees.

    PubMed

    Mani, Ashutosh; Rao, Marepalli; James, Kelley; Bhattacharya, Amit

    2015-01-01

    The purpose of this study was to explore data-driven models, based on decision trees, to develop practical and easy-to-use predictive models for early identification of firefighters who are likely to cross the threshold of hyperthermia during live-fire training. Predictive models were created for three consecutive live-fire training scenarios. The final predicted outcome was a categorical variable: will a firefighter cross the upper threshold of hyperthermia - Yes/No. Two tiers of models were built, one with and one without taking into account the outcome (whether a firefighter crossed hyperthermia or not) from the previous training scenario. The first tier of models included age, baseline heart rate and core body temperature, body mass index, and duration of training scenario as predictors. The second tier of models included the outcome of the previous scenario in the prediction space, in addition to all the predictors from the first tier of models. Classification and regression trees were used independently for prediction. The response variable for the regression tree was the quantitative variable: core body temperature at the end of each scenario. The predicted quantitative variable from regression trees was compared to the upper threshold of hyperthermia (38°C) to predict whether a firefighter would enter hyperthermia. The performance of classification and regression tree models was satisfactory for the second (success rate = 79%) and third (success rate = 89%) training scenarios but not for the first (success rate = 43%). Data-driven models based on decision trees can be a useful tool for predicting physiological response without modeling the underlying physiological systems. Early prediction of heat stress coupled with proactive interventions, such as pre-cooling, can help reduce heat stress in firefighters.

  4. Representing uncertainty in objective functions: extension to include the influence of serial correlation

    NASA Astrophysics Data System (ADS)

    Croke, B. F.

    2008-12-01

    The role of performance indicators is to give an accurate indication of the fit between a model and the system being modelled. As all measurements have an associated uncertainty (determining the significance that should be given to the measurement), performance indicators should take into account uncertainties in the observed quantities being modelled as well as in the model predictions (due to uncertainties in inputs, model parameters and model structure). In the presence of significant uncertainty in the observed and modelled output of a system, failure to adequately account for variations in the uncertainties means that the objective function only gives a measure of how well the model fits the observations, not how well the model fits the system being modelled. Since, in most cases, the interest lies in fitting the system response, it is vital that the objective function(s) be designed to account for these uncertainties. Most objective functions (e.g. those based on the sum of squared residuals) assume homoscedastic uncertainties. If the model contribution to the variations in residuals can be ignored, then transformations (e.g. Box-Cox) can be used to remove (or at least significantly reduce) heteroscedasticity. An alternative which is more generally applicable is to explicitly represent the uncertainties in the observed and modelled values in the objective function. Previous work on this topic addressed the modifications to standard objective functions (Nash-Sutcliffe efficiency, RMSE, chi-squared, coefficient of determination) using the optimal weighted averaging approach. This paper extends that previous work by addressing the issue of serial correlation. A form for an objective function that includes serial correlation will be presented, and the impact on model fit discussed.
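
    The explicit-uncertainty idea can be sketched as inverse-variance weighting of each residual; handling the serial correlation this abstract addresses would further require off-diagonal covariance terms (a full generalized least squares form), which this toy deliberately omits:

```python
def weighted_sse(obs, mod, var_obs, var_mod):
    """Sum of squared residuals, each term down-weighted by the combined
    observation + model-prediction variance (inverse-variance sketch)."""
    return sum((o - m) ** 2 / (vo + vm)
               for o, m, vo, vm in zip(obs, mod, var_obs, var_mod))
```

    A residual on a highly uncertain observation then contributes little, so the objective measures fit to the system rather than fit to noisy data points.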

  5. Data-driven mapping of hypoxia-related tumor heterogeneity using DCE-MRI and OE-MRI.

    PubMed

    Featherstone, Adam K; O'Connor, James P B; Little, Ross A; Watson, Yvonne; Cheung, Sue; Babur, Muhammad; Williams, Kaye J; Matthews, Julian C; Parker, Geoff J M

    2018-04-01

    Previous work has shown that combining dynamic contrast-enhanced (DCE)-MRI and oxygen-enhanced (OE)-MRI binary enhancement maps can identify tumor hypoxia. The current work proposes a novel, data-driven method for mapping tissue oxygenation and perfusion heterogeneity, based on clustering DCE/OE-MRI data. DCE-MRI and OE-MRI were performed on nine U87 (glioblastoma) and seven Calu6 (non-small cell lung cancer) murine xenograft tumors. Area under the curve and principal component analysis features were calculated and clustered separately using Gaussian mixture modelling. Evaluation metrics were calculated to determine the optimum feature set and cluster number. Outputs were quantitatively compared with a previous non-data-driven approach. The optimum method located six robustly identifiable clusters in the data, yielding tumor region maps with spatially contiguous regions in a rim-core structure, suggesting a biological basis. Mean within-cluster enhancement curves showed physiologically distinct, intuitive kinetics of enhancement. Regions of DCE/OE-MRI enhancement mismatch were located, and voxel categorization agreed well with the previous non-data-driven approach (Cohen's kappa = 0.61, proportional agreement = 0.75). The proposed method locates similar regions to the previously published method of binarization of DCE/OE-MRI enhancement, but renders a finer segmentation of intra-tumoral oxygenation and perfusion. This could aid in understanding the tumor microenvironment and its heterogeneity. Magn Reson Med 79:2236-2245, 2018. © 2017 The Authors Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

  6. Data-driven Model of the ICME Propagation through the Solar Corona and Inner Heliosphere

    NASA Astrophysics Data System (ADS)

    Yalim, M. S.; Pogorelov, N.; Singh, T.; Liu, Y.

    2017-12-01

    The solar wind (SW) emerging from the Sun is the main driver of solar events that may lead to geomagnetic storms, the primary causes of space weather disturbances. These disturbances affect the magnetic environment of Earth and may have hazardous effects on space-borne and ground-based technological systems, as well as on human health. Accurate modeling of the SW is therefore very important for understanding the mechanisms underlying such storms. Getting ready for the Parker Solar Probe mission, we have developed a data-driven magnetohydrodynamic (MHD) model of the global solar corona which utilizes characteristic boundary conditions implemented within the Multi-Scale Fluid-Kinetic Simulation Suite (MS-FLUKSS), a collection of problem-oriented routines incorporated into the Chombo adaptive mesh refinement framework developed at Lawrence Berkeley National Laboratory. Our global solar corona model can be driven both by synoptic and synchronic vector magnetogram data obtained by the Solar Dynamics Observatory/Helioseismic and Magnetic Imager (SDO/HMI) and by the horizontal velocity data on the photosphere obtained by applying the Differential Affine Velocity Estimator for Vector Magnetograms (DAVE4VM) method to the HMI-observed vector magnetic fields. Our CME generation model is based on Gibson-Low-type flux ropes, whose parameters are determined from analysis of observational data from STEREO/SECCHI, SDO/AIA, and SOHO/LASCO, and by applying the Graduated Cylindrical Shell model for the flux rope reconstruction. In this study, we present the results of three-dimensional global simulations of ICME propagation through our characteristically consistent MHD model of the background SW from the Sun to Earth, driven by HMI-observed vector magnetic fields, and validate our results using multiple spacecraft data at 1 AU.

  7. Federal Policy to Local Level Decision-Making: Data Driven Education Planning in Nigeria

    ERIC Educational Resources Information Center

    Iyengar, Radhika; Mahal, Angelique R.; Felicia, Ukaegbu-Nnamchi Ifeyinwa; Aliyu, Balaraba; Karim, Alia

    2015-01-01

    This article discusses the implementation of local level education data-driven planning as implemented by the Office of the Senior Special Assistant to the President of Nigeria on the Millennium Development Goals (OSSAP-MDGs) in partnership with The Earth Institute, Columbia University. It focuses on the design and implementation of the…

  8. Information-Theoretical Quantifier of Brain Rhythm Based on Data-Driven Multiscale Representation

    PubMed Central

    2015-01-01

    This paper presents a data-driven multiscale entropy measure to reveal the scale-dependent information quantity of electroencephalogram (EEG) recordings. This work is motivated by previous observations on the nonlinear and nonstationary nature of EEG over multiple time scales. Here, a new framework of entropy measures considering changing dynamics over multiple oscillatory scales is presented. First, to deal with nonstationarity over multiple scales, the EEG recording is decomposed by applying the empirical mode decomposition (EMD), which is known to be effective for extracting the constituent narrowband components without a predetermined basis. Calculating the Renyi entropy of the probability distributions of the intrinsic mode functions extracted by EMD then leads to a data-driven multiscale Renyi entropy. To validate the performance of the proposed entropy measure, actual EEG recordings from rats (n = 9) experiencing 7 min of cardiac arrest followed by resuscitation were analyzed. Simulation and experimental results demonstrate that the use of the multiscale Renyi entropy leads to better discriminative capability of the injury levels and improved correlations with the neurological deficit evaluation 72 hours after cardiac arrest, thus suggesting an effective diagnostic and prognostic tool. PMID:26380297
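
    The Renyi entropy applied to each intrinsic mode function has a simple closed form, H_α = log(Σ p_i^α) / (1 − α), which recovers Shannon entropy as α → 1. A direct transcription (the EMD decomposition and probability estimation steps are omitted):

```python
import math

def renyi_entropy(p, alpha=2.0):
    """Renyi entropy of a discrete distribution p (in nats).

    H_alpha = log(sum_i p_i**alpha) / (1 - alpha); alpha = 1 falls back
    to the Shannon entropy limit.
    """
    if alpha == 1.0:
        return -sum(x * math.log(x) for x in p if x > 0)
    return math.log(sum(x ** alpha for x in p)) / (1.0 - alpha)
```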

  9. Testing the Accuracy of Data-driven MHD Simulations of Active Region Evolution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leake, James E.; Linton, Mark G.; Schuck, Peter W., E-mail: james.e.leake@nasa.gov

    Models for the evolution of the solar coronal magnetic field are vital for understanding solar activity, yet the best measurements of the magnetic field lie at the photosphere, necessitating the development of coronal models which are “data-driven” at the photosphere. We present an investigation to determine the feasibility and accuracy of such methods. Our validation framework uses a simulation of active region (AR) formation, modeling the emergence of magnetic flux from the convection zone to the corona, as a ground-truth data set, to supply both the photospheric information and to perform the validation of the data-driven method. We focus our investigation on how the accuracy of the data-driven model depends on the temporal frequency of the driving data. The Helioseismic and Magnetic Imager on NASA’s Solar Dynamics Observatory produces full-disk vector magnetic field measurements at a 12-minute cadence. Using our framework we show that ARs that emerge over 25 hr can be modeled by the data-driving method with only ∼1% error in the free magnetic energy, assuming the photospheric information is specified every 12 minutes. However, for rapidly evolving features, under-sampling of the dynamics at this cadence leads to a strobe effect, generating large electric currents and incorrect coronal morphology and energies. We derive a sampling condition for the driving cadence based on the evolution of these small-scale features, and show that higher-cadence driving can lead to acceptable errors. Future work will investigate the source of errors associated with deriving plasma variables from the photospheric magnetograms as well as other sources of errors, such as reduced resolution, instrument bias, and noise.

  10. Data-driven risk identification in phase III clinical trials using central statistical monitoring.

    PubMed

    Timmermans, Catherine; Venet, David; Burzykowski, Tomasz

    2016-02-01

    Our interest lies in quality control for clinical trials, in the context of risk-based monitoring (RBM). We specifically study the use of central statistical monitoring (CSM) to support RBM. Under an RBM paradigm, we claim that CSM has a key role to play in identifying the "risks to the most critical data elements and processes" that will drive targeted oversight. In order to support this claim, we first see how to characterize the risks that may affect clinical trials. We then discuss how CSM can be understood as a tool for providing a set of data-driven key risk indicators (KRIs), which help to organize adaptive targeted monitoring. Several case studies are provided in which issues in a clinical trial were identified through targeted investigation after a risk had been flagged by CSM. Using CSM to build data-driven KRIs helps to identify different kinds of issues in clinical trials. This ability is directly linked to the exhaustiveness of the CSM approach and its flexibility in defining the risks searched for when identifying the KRIs. In practice, a CSM assessment of the clinical database seems essential to ensure data quality. The atypical data patterns found in some centers and variables are seen as KRIs under an RBM approach. Targeted monitoring or data management queries can be used to confirm whether the KRIs point to an actual issue or not.
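
    A deliberately simplistic stand-in for a CSM check (our illustration; real CSM applies many statistical tests jointly across variables): flag centers whose per-center mean of some variable is atypical relative to the across-center distribution, and treat those flags as candidate KRIs:

```python
def center_kris(center_means, threshold=2.5):
    """Return {center: z-score} for centers deviating more than `threshold`
    standard deviations from the across-center mean (crude KRI sketch)."""
    vals = list(center_means.values())
    mu = sum(vals) / len(vals)
    sd = (sum((v - mu) ** 2 for v in vals) / (len(vals) - 1)) ** 0.5
    if sd == 0:
        return {}
    return {c: round(abs(v - mu) / sd, 2)
            for c, v in center_means.items()
            if abs(v - mu) / sd > threshold}
```

    Note that a plain z-score is easily masked by the outlier itself inflating the spread; robust or leave-one-out statistics would be preferable in practice.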

  11. A data driven method for estimation of B(avail) and appK(D) using a single injection protocol with [¹¹C]raclopride in the mouse.

    PubMed

    Wimberley, Catriona J; Fischer, Kristina; Reilhac, Anthonin; Pichler, Bernd J; Gregoire, Marie Claude

    2014-10-01

    The partial saturation approach (PSA) is a simple, single-injection experimental protocol that estimates both B(avail) and appK(D) without the use of blood sampling. This makes it ideal for use in longitudinal studies of neurodegenerative diseases in the rodent. The aim of this study was to increase the range and applicability of the PSA by developing a data driven strategy for determining reliable regional estimates of receptor density (B(avail)) and in vivo affinity (1/appK(D)), and to validate the strategy using a simulation model. The data driven method uses a time window guided by the dynamic equilibrium state of the system, as opposed to a static time window. To test the method, simulations of partial saturation experiments were generated and validated against experimental data. The experimental conditions simulated included a range of receptor occupancy levels and three different B(avail) and appK(D) values to mimic disease states. The effect of using a reference region and of typical PET noise on the stability and accuracy of the estimates was also investigated. The investigations showed that the parameter estimates in a simulated healthy mouse using the data driven method were within 10-30% of the simulated input for the range of occupancy levels simulated. Throughout all experimental conditions simulated, the accuracy and robustness of the estimates using the data driven method were much improved over the typical method of using a static time window, especially at low receptor occupancy levels. Introducing a reference region caused a bias of approximately 10% over the range of occupancy levels. Based on extensive simulated experimental conditions, it was shown that the data driven method provides accurate and precise estimates of B(avail) and appK(D) for a broader range of conditions compared to the original method. Copyright © 2014 Elsevier Inc. All rights reserved.
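
    The parameter estimation behind a partial saturation experiment can be illustrated with the classical Scatchard linearization of bound versus free ligand concentrations, B/F = (B_avail - B)/appK_D. This sketch assumes ideal, noise-free samples already restricted to the equilibrium time window, which is exactly the window the data driven method above selects adaptively:

```python
import numpy as np

def estimate_bavail_appkd(bound, free):
    """Estimate (B_avail, appK_D) from concurrent bound (B) and free (F)
    ligand concentrations via a linear fit of B/F against B:
    slope = -1/appK_D, intercept = B_avail/appK_D."""
    bound = np.asarray(bound, float)
    free = np.asarray(free, float)
    slope, intercept = np.polyfit(bound, bound / free, 1)
    app_kd = -1.0 / slope
    return intercept * app_kd, app_kd

# Noise-free samples generated from B_avail = 20, appK_D = 5:
free = np.array([1.0, 2.0, 4.0, 8.0])
bound = 20.0 * free / (5.0 + free)
b_avail, app_kd = estimate_bavail_appkd(bound, free)
```

    With noisy PET data and an ill-chosen time window, this fit degrades badly, which is the motivation for the adaptive window described in the abstract.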

  12. Social comparison modulates reward-driven attentional capture.

    PubMed

    Jiao, Jun; Du, Feng; He, Xiaosong; Zhang, Kan

    2015-10-01

    It is well established that attention can be captured by task-irrelevant and non-salient objects associated with value through reward learning. However, it is unknown whether social comparison influences reward-driven attentional capture. The present study created four social contexts to examine whether different social comparisons modulate the reward-driven capture of attention. The results showed that reward-driven attentional capture varied across social comparison conditions. Most prominently, reward-driven attentional capture was dramatically reduced in the disadvantageous social comparison context, in which an individual is informed that the other participant is earning more monetary reward for performing the same task. These findings suggest that social comparison can affect the reward-driven capture of attention.

  13. Education as a Data-Driven Enterprise: A Primer for Leaders in Business, Philanthropy, and Education

    ERIC Educational Resources Information Center

    Alliance for Excellent Education, 2011

    2011-01-01

    With advances in research, technology, and assessments, and with a focused effort, the U.S. education system can lead the world in becoming a data-driven enterprise. This publication provides leaders from business, philanthropy, and education with background on data issues; describes challenges that must be overcome; and makes recommendations for…

  14. Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method.

    PubMed

    Zhang, Huaguang; Cui, Lili; Zhang, Xin; Luo, Yanhong

    2011-12-01

    In this paper, a novel data-driven robust approximate optimal tracking control scheme is proposed for unknown general nonlinear systems by using the adaptive dynamic programming (ADP) method. In the design of the controller, only available input-output data are required instead of known system dynamics. A data-driven model is established by a recurrent neural network (NN) to reconstruct the unknown system dynamics using available input-output data. By adding a novel adjustable term related to the modeling error, the resultant modeling error is first guaranteed to converge to zero. Then, based on the obtained data-driven model, the ADP method is utilized to design the approximate optimal tracking controller, which consists of the steady-state controller and the optimal feedback controller. Further, a robustifying term is developed to compensate for the NN approximation errors introduced by implementing the ADP method. Based on the Lyapunov approach, stability analysis of the closed-loop system is performed to show that the proposed controller guarantees that the system state asymptotically tracks the desired trajectory. Additionally, the obtained control input is proven to be close to the optimal control input within a small bound. Finally, two numerical examples are used to demonstrate the effectiveness of the proposed control scheme.

  15. Using semantic data modeling techniques to organize an object-oriented database for extending the mass storage model

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Short, Nicholas M., Jr.; Roelofs, Larry H.; Dorfman, Erik

    1991-01-01

    A methodology for optimizing organization of data obtained by NASA earth and space missions is discussed. The methodology uses a concept based on semantic data modeling techniques implemented in a hierarchical storage model. The modeling is used to organize objects in mass storage devices, relational database systems, and object-oriented databases. The semantic data modeling at the metadata record level is examined, including the simulation of a knowledge base and semantic metadata storage issues. The semantic data model hierarchy and its application for efficient data storage is addressed, as is the mapping of the application structure to the mass storage.

  16. Strategies for concurrent processing of complex algorithms in data driven architectures

    NASA Technical Reports Server (NTRS)

    Stoughton, John W.; Mielke, Roland R.

    1988-01-01

    The purpose is to document research to develop strategies for concurrent processing of complex algorithms in data driven architectures. The problem domain consists of decision-free algorithms having large-grained, computationally complex primitive operations. Such algorithms are often found in signal processing and control applications. The anticipated multiprocessor environment is a data flow architecture containing between two and twenty computing elements. Each computing element is a processor with local program memory that communicates with a common global data memory. A new graph-theoretic model called ATAMM, which establishes rules for relating a decomposed algorithm to its execution in a data flow architecture, is presented. The ATAMM model is used to determine strategies to achieve optimum time performance and to develop a system diagnostic software tool. In addition, preliminary work on a new multiprocessor operating system based on the ATAMM specifications is described.
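
    The data-driven firing rule underlying such architectures (a primitive executes as soon as all of its operands have arrived) can be sketched as follows. The class and two-node graph are illustrative only and omit ATAMM's timing and resource-management rules:

```python
class DataflowNode:
    """Toy data-driven execution: a node fires as soon as all of its
    operand slots are filled, mirroring the firing rule formalized by
    graph models such as ATAMM."""
    def __init__(self, name, n_operands, fn, downstream=None):
        self.name = name
        self.fn = fn
        self.slots = [None] * n_operands
        self.filled = [False] * n_operands
        self.downstream = downstream or []   # (node, slot_index) pairs

    def receive(self, slot, value):
        self.slots[slot] = value
        self.filled[slot] = True
        if all(self.filled):                 # firing rule: all operands present
            result = self.fn(*self.slots)
            self.filled = [False] * len(self.filled)
            for node, idx in self.downstream:
                node.receive(idx, result)
            return result

# Evaluate (a + b) * c driven purely by data arrival:
out = DataflowNode("mul", 2, lambda x, y: x * y)
add = DataflowNode("add", 2, lambda x, y: x + y, downstream=[(out, 0)])
add.receive(0, 2); add.receive(1, 3)   # "add" fires, forwards 5 to "mul"
result = out.receive(1, 4)             # "mul" fires: (2 + 3) * 4
```

    Scheduling fired primitives onto a fixed pool of processors, which this sketch ignores, is where the ATAMM performance strategies come in.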

  17. A data-driven multiplicative fault diagnosis approach for automation processes.

    PubMed

    Hao, Haiyang; Zhang, Kai; Ding, Steven X; Chen, Zhiwen; Lei, Yaguo

    2014-09-01

    This paper presents a new data-driven method for diagnosing multiplicative key performance degradation in automation processes. Different from the well-established additive fault diagnosis approaches, the proposed method aims at identifying those low-level components which increase the variability of process variables and cause performance degradation. Based on process data, features of multiplicative faults are extracted. To identify the root cause, the impact of the fault on each process variable is evaluated in terms of its contribution to performance degradation. Then, a numerical example is used to illustrate the functionalities of the method, and Monte Carlo simulation is performed to demonstrate its effectiveness from the statistical viewpoint. Finally, to show the practical applicability, a case study on the Tennessee Eastman process is presented. Copyright © 2013. Published by Elsevier Ltd.
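
    A crude illustration of the core idea, that multiplicative faults inflate the variability of process variables rather than shifting their means, is to rank variables by their variance ratio between a normal and a degraded data set. The variable names and data are hypothetical, and the paper's actual contribution analysis is considerably more involved:

```python
import statistics

def variance_contributions(normal, faulty):
    """Rank process variables by the increase in variability between a
    normal-operation data set and a degraded one; a crude proxy for
    contribution analysis of a multiplicative fault."""
    ratios = {}
    for var in normal:
        v0 = statistics.variance(normal[var])
        v1 = statistics.variance(faulty[var])
        ratios[var] = v1 / v0 if v0 > 0 else float("inf")
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)

# "flow" shows inflated variance under degradation; "temp" is unchanged:
ranked = variance_contributions(
    {"flow": [1, 2, 1, 2], "temp": [10, 11, 10, 11]},
    {"flow": [0, 3, 0, 3], "temp": [10, 11, 10, 11]})
```

    The top-ranked variable points toward the low-level component most likely responsible for the degradation.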

  18. On the data-driven inference of modulatory networks in climate science: an application to West African rainfall

    NASA Astrophysics Data System (ADS)

    González, D. L., II; Angus, M. P.; Tetteh, I. K.; Bello, G. A.; Padmanabhan, K.; Pendse, S. V.; Srinivas, S.; Yu, J.; Semazzi, F.; Kumar, V.; Samatova, N. F.

    2014-04-01

    Decades of hypothesis-driven and/or first-principles research have been applied towards the discovery and explanation of the mechanisms that drive climate phenomena, such as western African Sahel summer rainfall variability. Although connections between various climate factors have been theorized, not all of the key relationships are fully understood. We propose a data-driven approach to identify candidate players in this climate system, which can help explain underlying mechanisms and/or even suggest new relationships, to facilitate building a more comprehensive and predictive model of the modulatory relationships influencing a climate phenomenon of interest. We applied coupled heterogeneous association rule mining (CHARM), Lasso multivariate regression, and Dynamic Bayesian networks to find relationships within a complex system, and explored means with which to obtain a consensus result from the application of such varied methodologies. Using this fusion of approaches, we identified relationships among climate factors that modulate Sahel rainfall, including well-known associations from prior climate knowledge, as well as promising discoveries that invite further research by the climate science community.

  19. User-Preference-Driven Model Predictive Control of Residential Building Loads and Battery Storage for Demand Response: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, Xin; Baker, Kyri A.; Christensen, Dane T.

    This paper presents a user-preference-driven home energy management system (HEMS) for demand response (DR) with residential building loads and battery storage. The HEMS is based on a multi-objective model predictive control algorithm, where the objectives include energy cost, thermal comfort, and carbon emission. A multi-criterion decision making method originating from social science is used to quickly determine user preferences based on a brief survey and derive the weights of different objectives used in the optimization process. Besides the residential appliances used in the traditional DR programs, a home battery system is integrated into the HEMS to improve the flexibility and reliability of the DR resources. Simulation studies have been performed on field data from a residential building stock data set. Appliance models and usage patterns were learned from the data to predict the DR resource availability. Results indicate the HEMS was able to provide a significant amount of load reduction with less than 20% prediction error in both heating and cooling cases.
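
    The weighted-sum scalarization of the multiple objectives can be sketched as below; the objective names and the survey-derived weights are illustrative assumptions, not values from the paper:

```python
def hems_objective(weights, stage_costs):
    """Weighted-sum scalarization of the HEMS objectives (energy cost,
    thermal discomfort, carbon emission), as minimized at each MPC step.
    Weights would come from the user-preference survey."""
    return sum(weights[k] * stage_costs[k] for k in weights)

# Weights as they might be derived from a brief user survey:
weights = {"energy": 0.5, "comfort": 0.3, "carbon": 0.2}
j = hems_objective(weights, {"energy": 1.20, "comfort": 0.40, "carbon": 0.80})
```

    The MPC controller would minimize this scalar cost over the prediction horizon subject to appliance and battery constraints.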

  20. Key Design Elements of a Data Utility for National Biosurveillance: Event-driven Architecture, Caching, and Web Service Model

    PubMed Central

    Tsui, Fu-Chiang; Espino, Jeremy U.; Weng, Yan; Choudary, Arvinder; Su, Hoah-Der; Wagner, Michael M.

    2005-01-01

    The National Retail Data Monitor (NRDM) has monitored over-the-counter (OTC) medication sales in the United States since December 2002. The NRDM collects data from over 18,600 retail stores and processes over 0.6 million sales records per day. This paper describes key architectural features that we have found necessary for a data utility component in a national biosurveillance system. These elements include event-driven architecture to provide analyses of data in near real time, multiple levels of caching to improve query response time, high availability through the use of clustered servers, scalable data storage through the use of storage area networks and a web-service function for interoperation with affiliated systems. The methods and architectural principles are relevant to the design of any production data utility for public health surveillance—systems that collect data from multiple sources in near real time for use by analytic programs and user interfaces that have substantial requirements for time-series data aggregated in multiple dimensions. PMID:16779138

  2. The ``Missing Compounds'' affair in functionality-driven material discovery

    NASA Astrophysics Data System (ADS)

    Zunger, Alex

    2014-03-01

    In the paradigm of ``data-driven discovery,'' underlying one of the leading streams of the Materials Genome Initiative (MGI), one attempts to compute, high-throughput style, as many as possible of the properties of the N (about 10**5-10**6) compounds listed in databases of previously known compounds. One then inspects the ensuing Big Data, searching for useful trends. The alternative and complementary paradigm of ``functionality-directed search and optimization'' used here searches instead for the n (much smaller than N) configurations and compositions that have the desired value of the target functionality. Examples include the use of genetic and other search methods that optimize the structure or identity of atoms on lattice sites, using atomistic electronic structure (such as first-principles) approaches, in search of a given electronic property. This addresses a few of the bottlenecks that have faced the alternative data-driven/high-throughput/Big Data philosophy: (i) When the configuration space is theoretically of infinite size, building a complete database as in data-driven discovery is impossible, yet searching for the optimum functionality is still a well-posed problem. (ii) The configuration space that we explore might include artificially grown, kinetically stabilized systems (such as 2D layer stacks; superlattices; colloidal nanostructures; Fullerenes) that are not listed in compound databases (used by data-driven approaches). (iii) A large fraction of chemically plausible compounds have not been experimentally synthesized, so in the data-driven approach these are often skipped. In our approach we search explicitly for such ``Missing Compounds.'' It is likely that many interesting material properties will be found in cases (i)-(iii) that elude high-throughput searches based on databases encapsulating existing knowledge.
I will illustrate (a) Functionality-driven discovery of topological insulators and valley-split quantum-computer semiconductors, as well

  3. Data-Driven Approach To Determine Popular Proteins for Targeted Proteomics Translation of Six Organ Systems.

    PubMed

    Lam, Maggie P Y; Venkatraman, Vidya; Xing, Yi; Lau, Edward; Cao, Quan; Ng, Dominic C M; Su, Andrew I; Ge, Junbo; Van Eyk, Jennifer E; Ping, Peipei

    2016-11-04

    Amidst the proteomes of human tissues lie subsets of proteins that are closely involved in conserved pathophysiological processes. Much of biomedical research concerns interrogating disease signature proteins and defining their roles in disease mechanisms. With advances in proteomics technologies, it is now feasible to develop targeted proteomics assays that can accurately quantify protein abundance as well as their post-translational modifications; however, with a rapidly accumulating number of studies implicating proteins in diseases, current resources are insufficient to target every protein without judiciously prioritizing the proteins with high significance and impact for assay development. We describe here a data science method to prioritize and expedite assay development on high-impact proteins across research fields by leveraging the biomedical literature record to rank and normalize proteins that are popularly and preferentially published by biomedical researchers. We demonstrate this method by finding priority proteins across six major physiological systems (cardiovascular, cerebral, hepatic, renal, pulmonary, and intestinal). The described method is data-driven and builds upon the collective knowledge of previous publications referenced on PubMed to lend objectivity to target selection. The method and resulting popular protein lists may also be useful for exploring biological processes associated with various physiological systems and research topics, in addition to benefiting ongoing efforts to facilitate the broad translation of proteomics technologies.
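
    The core normalization idea (rank proteins by how preferentially a field publishes on them) can be sketched as follows. The protein symbols, counts, and scoring are hypothetical simplifications of the paper's PubMed-based method:

```python
def prioritize_proteins(pub_counts, field_totals):
    """Rank proteins within each field by the fraction of the field's
    publications that mention them; a schematic of literature-driven
    prioritization for targeted assay development."""
    ranked = {}
    for field, counts in pub_counts.items():
        total = field_totals[field]
        ranked[field] = sorted(
            ((protein, n / total) for protein, n in counts.items()),
            key=lambda kv: kv[1], reverse=True)
    return ranked

# Hypothetical publication counts for a cardiovascular literature slice:
ranked = prioritize_proteins(
    {"cardiovascular": {"TNNI3": 900, "MYH7": 600, "ALB": 300}},
    {"cardiovascular": 30000})
```

    Normalizing by field totals keeps ubiquitously studied proteins from dominating every field's list, which is the "preferentially published" aspect described above.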

  4. Towards efficient data exchange and sharing for big-data driven materials science: metadata and data formats

    NASA Astrophysics Data System (ADS)

    Ghiringhelli, Luca M.; Carbogno, Christian; Levchenko, Sergey; Mohamed, Fawzi; Huhs, Georg; Lüders, Martin; Oliveira, Micael; Scheffler, Matthias

    2017-11-01

    With big-data driven materials research, the new paradigm of materials science, sharing and wide accessibility of data are becoming crucial aspects. Obviously, a prerequisite for data exchange and big-data analytics is standardization, which means using consistent and unique conventions for, e.g., units, zero base lines, and file formats. There are two main strategies to achieve this goal. One accepts the heterogeneous nature of the community, which comprises scientists from physics, chemistry, bio-physics, and materials science, by complying with the diverse ecosystem of computer codes and thus develops "converters" for the input and output files of all important codes. These converters then translate the data of each code into a standardized, code-independent format. The other strategy is to provide standardized open libraries that code developers can adopt for shaping their inputs, outputs, and restart files, directly into the same code-independent format. In this perspective paper, we present both strategies and argue that they can and should be regarded as complementary, if not even synergetic. The formats and conventions presented here were agreed upon by two teams, the Electronic Structure Library (ESL) of the European Center for Atomic and Molecular Computations (CECAM) and the NOvel MAterials Discovery (NOMAD) Laboratory, a European Centre of Excellence (CoE). A key element of this work is the definition of hierarchical metadata describing state-of-the-art electronic-structure calculations.

  5. Numerical Investigations of Capabilities and Limits of Photospheric Data Driven Magnetic Flux Emergence

    NASA Astrophysics Data System (ADS)

    Linton, M.; Leake, J. E.; Schuck, P. W.

    2016-12-01

    The magnetic field of the solar atmosphere is the primary driver of solar activity. Understanding the magnetic state of the solar atmosphere is therefore of key importance to predicting solar activity. One promising means of studying the magnetic atmosphere is to dynamically build up and evolve this atmosphere from the time evolution of emerging magnetic field at the photosphere, where it can be measured with current solar vector magnetograms at high temporal and spatial resolution. We report here on a series of numerical experiments investigating the capabilities and limits of magnetohydrodynamical simulations of such a process, where a magnetic corona is dynamically built up and evolved from a time series of synthetic photospheric data. These synthetic data are composed of photospheric slices taken from self-consistent convection-zone-to-corona simulations of flux emergence. The driven coronae are then quantitatively compared against the coronae of the original simulations. We investigate and report on the fidelity of these driven simulations, both as a function of the emergence timescale of the magnetic flux, and as a function of the driving cadence of the input data. These investigations will then be used to outline future prospects and challenges for using observed photospheric data to drive such solar atmospheric simulations. This work was supported by the Chief of Naval Research and the NASA Living with a Star and Heliophysics Supporting Research programs.

  6. Data-Driven Astrochemistry: One Step Further within the Origin of Life Puzzle.

    PubMed

    Ruf, Alexander; d'Hendecourt, Louis L S; Schmitt-Kopplin, Philippe

    2018-06-01

    Astrochemistry, meteoritics and chemical analytics represent a manifold scientific field encompassing various disciplines. In this review, clarifications on astrochemistry, comet chemistry, laboratory astrophysics and meteoritic research with respect to organic and metalorganic chemistry will be given. The seemingly large number of observed astrochemical molecules necessarily requires explanations of molecular complexity and chemical evolution, which will be discussed. Special emphasis should be placed on data-driven analytical methods, including ultrahigh-resolving instruments and their interplay with quantum chemical computations. These methods enable remarkable insights into the complex chemical spaces that exist in meteorites and maximize the level of information on the huge astrochemical molecular diversity. In addition, they allow one to study even as-yet-undescribed chemistry, such as that involving organomagnesium compounds in meteorites. Both targeted and non-targeted analytical strategies will be explained and may touch upon epistemological problems. In addition, implications of (metal)organic matter for prebiotic chemistry leading to the emergence of life will be discussed. The precise description of astrochemical organic and metalorganic matter as seeds for life, and of their interactions within various astrophysical environments, appears essential to further study questions regarding the emergence of life at the most fundamental level, that of the molecular world and its self-organization properties.

  7. Data driven modeling of plastic deformation

    DOE PAGES

    Versino, Daniele; Tonda, Alberto; Bronkhorst, Curt A.

    2017-05-01

    In this paper the application of machine learning techniques for the development of constitutive material models is investigated. A flow stress model, for strain rates ranging from 10^-4 to 10^12 (quasi-static to highly dynamic) and temperatures ranging from room temperature to over 1000 K, is obtained by beginning directly with experimental stress-strain data for copper. An incrementally objective and fully implicit time integration scheme is employed to integrate the hypo-elastic constitutive model, which is then implemented into a finite element code for evaluation. Accuracy and performance of the flow stress models derived from symbolic regression are assessed by comparison to Taylor anvil impact data. The results obtained with the free-form constitutive material model are compared to well-established strength models such as the Preston-Tonks-Wallace (PTW) model and the Mechanical Threshold Stress (MTS) model. Here, preliminary results show candidate free-form models comparing well with data in regions of stress-strain space with sufficient experimental data, pointing to a potential means for both rapid prototyping in future model development, as well as the use of machine learning in capturing more data as a guide for more advanced model development.

  9. Power-Law Modeling of Cancer Cell Fates Driven by Signaling Data to Reveal Drug Effects

    PubMed Central

    Zhang, Fan; Wu, Min; Kwoh, Chee Keong; Zheng, Jie

    2016-01-01

    Extracellular signals are captured and transmitted by signaling proteins inside a cell. An important type of cellular response to the signals is the cell fate decision, e.g., apoptosis. However, the underlying mechanisms of cell fate regulation are still unclear, thus comprehensive and detailed kinetic models are not yet available. Alternatively, data-driven models are promising to bridge signaling data with the phenotypic measurements of cell fates. The traditional linear model for data-driven modeling of signaling pathways has its limitations because it assumes that a cell fate is proportional to the activities of signaling proteins, which is unlikely in complex biological systems. Therefore, we propose a power-law model to relate the activities of all the measured signaling proteins to the probabilities of cell fates. In our experiments, we compared our nonlinear power-law model with the linear model on three cancer datasets with phosphoproteomics and cell fate measurements, which demonstrated that the nonlinear model has superior performance on cell fate prediction. By in silico simulation of virtual protein knock-down, the proposed model is able to reveal drug effects which can complement traditional approaches such as binding affinity analysis. Moreover, our model is able to capture cell line specific information to distinguish one cell line from another in cell fate prediction. Our results show that the power-law data-driven model is able to perform better in cell fate prediction and provide more insights into the signaling pathways for cancer cell fates than the linear model. PMID:27764199
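
    A minimal sketch of such a power-law model, fit by ordinary least squares in log space; the data below are synthetic, and the paper's actual fitting procedure may differ:

```python
import numpy as np

def fit_power_law(activities, fates):
    """Fit p = c * prod_i x_i**w_i by least squares in log space:
    log p = log c + sum_i w_i * log x_i.
    Activities and fate probabilities must be strictly positive."""
    X = np.column_stack([np.ones(len(activities)), np.log(activities)])
    coef, *_ = np.linalg.lstsq(X, np.log(fates), rcond=None)
    return np.exp(coef[0]), coef[1:]          # scale c, exponents w

def predict(c, w, activities):
    """Predicted cell-fate probability for each sample."""
    return c * np.prod(np.asarray(activities, float) ** w, axis=-1)

# Synthetic signaling data: two proteins with known exponents [0.5, -0.25]:
x = np.array([[1.0, 2.0], [2.0, 1.0], [4.0, 2.0], [2.0, 4.0]])
p = predict(0.1, np.array([0.5, -0.25]), x)
c_hat, w_hat = fit_power_law(x, p)
```

    Unlike the linear model, the fitted exponents allow sub- and super-proportional dependence of a cell fate on each protein's activity, which is the nonlinearity the abstract argues for.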

  10. Data Driven Model Development for the Supersonic Semispan Transport (S(sup 4)T)

    NASA Technical Reports Server (NTRS)

    Kukreja, Sunil L.

    2011-01-01

    We investigate two common approaches to model development for robust control synthesis in the aerospace community; namely, reduced-order aeroservoelastic modelling based on structural finite-element and computational-fluid-dynamics-based aerodynamic models, and a data-driven system identification procedure. It is shown, via analysis of experimental SuperSonic SemiSpan Transport (S4T) wind-tunnel data using a system identification approach, that it is possible to estimate a model at a fixed Mach number which is parsimonious and robust across varying dynamic pressures.

  11. Numerical Investigations of Capabilities and Limits of Photospheric Data Driven Magnetic Flux Emergence

    NASA Astrophysics Data System (ADS)

    Linton, Mark; Leake, James; Schuck, Peter W.

    2016-05-01

    The magnetic field of the solar atmosphere is the primary driver of solar activity. Understanding the magnetic state of the solar atmosphere is therefore of key importance to predicting solar activity. One promising means of studying the magnetic atmosphere is to dynamically build up and evolve this atmosphere from the time evolution of the magnetic field at the photosphere, where it can be measured with current solar vector magnetograms at high temporal and spatial resolution. We report here on a series of numerical experiments investigating the capabilities and limits of magnetohydrodynamical simulations of such a process, where a magnetic corona is dynamically built up and evolved from a time series of synthetic photospheric data. These synthetic data are composed of photospheric slices taken from self-consistent convection-zone-to-corona simulations of flux emergence. The driven coronae are then quantitatively compared against the coronae of the original simulations. We investigate and report on the fidelity of these driven simulations, both as a function of the emergence timescale of the magnetic flux and as a function of the driving cadence of the input data. This work was supported by the Chief of Naval Research and the NASA Living with a Star and Heliophysics Supporting Research programs.

  12. Multi-processor including data flow accelerator module

    DOEpatents

    Davidson, George S.; Pierce, Paul E.

    1990-01-01

    An accelerator module for a data flow computer includes an intelligent memory. The module is added to a multiprocessor arrangement and uses a shared tagged memory architecture in the data flow computer. The intelligent memory module assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph. Each primitive computation is associated with a corresponding memory cell, including a number of slots for operands needed to execute a primitive computation, a primitive identifying pointer, and linking slots for distributing the result of the cell computation to other cells requiring that result as an operand. Circuitry is provided for utilizing tag bits to determine automatically when all operands required by a processor are available and for scheduling the primitive for execution in a queue. Each memory cell of the module may be associated with any of the primitives, and the particular primitive to be executed by the processor associated with the cell is identified by providing an index, such as the cell number for the primitive, to the primitive lookup table of starting addresses. The module thus serves to perform functions previously performed by a number of sections of data flow architectures and coexists with conventional shared memory therein. A multiprocessing system including the module operates in a hybrid mode, wherein the same processing modules are used to perform some processing in a sequential mode, under immediate control of an operating system, while performing other processing in a data flow mode.

  13. Combining TerraSAR-X and SPOT-5 data for object-based landslide detection

    NASA Astrophysics Data System (ADS)

    Friedl, B.; Hölbling, D.; Füreder, P.

    2012-04-01

    Landslide detection and classification is an essential requirement in pre- and post-disaster hazard analysis. In earlier studies, landslide detection was often achieved through time-consuming and cost-intensive field surveys and visual orthophoto interpretation. Recent studies show that Earth Observation (EO) data offer new opportunities for fast, reliable and accurate landslide detection and classification, which may contribute to effective landslide monitoring and landslide hazard management. To ensure the fast recognition and classification of landslides at a regional scale, a (semi-)automated object-based landslide detection approach is established for a study site situated in the Huaguoshan catchment, Southern Taiwan. The study site exhibits a high vulnerability to landslides and debris flows, which are predominantly typhoon-induced. Through the integration of optical satellite data (SPOT-5 with 2.5 m GSD), SAR (Synthetic Aperture Radar) data (TerraSAR-X Spotlight with 2.95 m GSD) and digital elevation information (DEM with 5 m GSD), including its derived products (e.g. slope, curvature, flow accumulation), landslides may be examined more efficiently than by relying on a single data source. The combination of optical and SAR data in an object-based image analysis (OBIA) domain for landslide detection and classification has not been investigated so far, even though SAR imagery shows valuable properties for landslide detection that differ from those of optical data (e.g. high sensitivity to surface roughness and soil moisture). The main purpose of this study is to recognize and analyze existing landslides by applying object-based image analysis making use of eCognition software. OBIA provides a framework for examining features defined by spectral, spatial, textural, contextual as well as hierarchical properties.
Objects are derived through image segmentation and serve as input for the classification process, which relies on transparent rulesets, representing knowledge

  14. Data-Driven Geospatial Visual Analytics for Real-Time Urban Flooding Decision Support

    NASA Astrophysics Data System (ADS)

    Liu, Y.; Hill, D.; Rodriguez, A.; Marini, L.; Kooper, R.; Myers, J.; Wu, X.; Minsker, B. S.

    2009-12-01

    Urban flooding is responsible for the loss of life and property as well as the release of pathogens and other pollutants into the environment. Previous studies have shown that the spatial distribution of intense rainfall significantly impacts the triggering and behavior of urban flooding. However, no general-purpose tools yet exist for deriving rainfall data and rendering them in real-time at the resolution of the hydrologic units used for analyzing urban flooding. This paper presents a new visual analytics system that derives and renders rainfall data from the NEXRAD weather radar system at the sewershed (i.e. urban hydrologic unit) scale in real-time for a Chicago stormwater management project. We introduce a lightweight Web 2.0 approach which takes advantage of scientific workflow management and publishing capabilities developed at NCSA (National Center for Supercomputing Applications), a streaming-data-aware semantic content management repository, web-based Google Earth/Maps, and time-aware KML (Keyhole Markup Language). A collection of polygon-based virtual sensors is created from the NEXRAD Level II data using spatial, temporal and thematic transformations at the sewershed level in order to produce persistent virtual rainfall data sources for the animation. The animated color-coded rainfall map of the sewersheds can be played in real-time as a movie using time-aware KML inside web browser-based Google Earth for visually analyzing the spatiotemporal patterns of rainfall intensity. Such a system provides valuable information for situational awareness and improved decision support during extreme storm events in an urban area. Further work includes incorporating additional data (such as basement flooding event data) or physics-based predictive models for more integrated data-driven decision support.
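    The pattern of rendering a polygon-based virtual sensor's readings as time-aware KML can be sketched with the standard library alone. The snippet below is an invented illustration (not the NCSA system): the sewershed name, coordinates, and intensity-to-color thresholds are all hypothetical, but the `TimeSpan` element is what lets Google Earth's time slider animate the color-coded map.

```python
# Build a time-aware KML document: one Placemark per rainfall reading,
# each carrying a TimeSpan and a polygon color keyed to intensity.
import xml.etree.ElementTree as ET

def rain_color(mm_per_hr):
    """Map rainfall intensity to an aabbggrr KML color (thresholds invented)."""
    if mm_per_hr < 2.5:
        return "7f00ff00"   # green
    if mm_per_hr < 7.5:
        return "7f00ffff"   # yellow
    return "7f0000ff"       # red

def sewershed_kml(name, coords, readings):
    """readings: list of (begin_iso, end_iso, mm_per_hr) for one polygon."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    doc = ET.SubElement(kml, "Document")
    for begin, end, mm in readings:
        pm = ET.SubElement(doc, "Placemark")
        ET.SubElement(pm, "name").text = f"{name} {mm} mm/h"
        span = ET.SubElement(pm, "TimeSpan")   # drives the time slider
        ET.SubElement(span, "begin").text = begin
        ET.SubElement(span, "end").text = end
        style = ET.SubElement(ET.SubElement(pm, "Style"), "PolyStyle")
        ET.SubElement(style, "color").text = rain_color(mm)
        poly = ET.SubElement(pm, "Polygon")
        ring = ET.SubElement(ET.SubElement(poly, "outerBoundaryIs"), "LinearRing")
        ET.SubElement(ring, "coordinates").text = " ".join(
            f"{lon},{lat},0" for lon, lat in coords)
    return ET.tostring(kml, encoding="unicode")

doc = sewershed_kml(
    "Sewershed-42",
    [(-87.65, 41.85), (-87.64, 41.85), (-87.64, 41.86), (-87.65, 41.85)],
    [("2009-09-13T10:00:00Z", "2009-09-13T10:05:00Z", 8.2)])
print(doc[:80])
```

    A real deployment would emit one such document per virtual sensor and refresh it as new NEXRAD scans arrive.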

  15. Assessment Data-Driven Inquiry: A Review of How to Use Assessment Results to Inform Chemistry Teaching

    ERIC Educational Resources Information Center

    Harshman, Jordan; Yezierski, Ellen

    2017-01-01

    With abundant access to assessments of all kinds, many high school chemistry teachers have the opportunity to gather data from their students on a daily basis. This data can serve multiple purposes, such as informing teachers of students' content difficulties and guiding instruction in a process of data-driven inquiry. In this paper, 83 resources…

  16. a Task-Driven Disaster Data Link Approach

    NASA Astrophysics Data System (ADS)

    Qiu, L. Y.; Zhu, Q.; Gu, J. Y.; Du, Z. Q.

    2015-08-01

    With the rapid development of sensor networks and Earth observation technology, a large quantity of disaster-related data is available, such as remotely sensed data, historical data, case data, simulation data, and disaster products. However, the inefficiency of current data management and service systems has become an increasingly serious problem due to the variety of tasks and the heterogeneity of the data. For emergency task-oriented applications, data searching mainly relies on human experience based on simple metadata indexes, whose long turnaround times and low accuracy cannot satisfy the velocity and veracity requirements of disaster products. In this paper, a task-oriented linking method is proposed for efficient disaster data management and intelligent service, with the objectives of 1) putting forward ontologies of disaster tasks and data to unify the different semantics of multi-source information, 2) identifying the semantic mapping from emergency tasks to multiple sources on the basis of the uniform description in 1), and 3) linking task-related data automatically and calculating the degree of correlation between each data source and a target task. The method breaks through the traditional static management of disaster data and establishes a basis for intelligent retrieval and active push of disaster information. The case study presented in this paper illustrates the use of the method with a flood emergency relief task.

  17. Parallel checksumming of data chunks of a shared data object using a log-structured file system

    DOEpatents

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2016-09-06

    Checksum values are generated and used to verify the data integrity. A client executing in a parallel computing system stores a data chunk to a shared data object on a storage node in the parallel computing system. The client determines a checksum value for the data chunk; and provides the checksum value with the data chunk to the storage node that stores the shared object. The data chunk can be stored on the storage node with the corresponding checksum value as part of the shared object. The storage node may be part of a Parallel Log-Structured File System (PLFS), and the client may comprise, for example, a Log-Structured File System client on a compute node or burst buffer. The checksum value can be evaluated when the data chunk is read from the storage node to verify the integrity of the data that is read.
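    The write/read path described above — checksum the chunk on write, store it alongside the data, and re-verify it when the chunk is read back — is straightforward to sketch. The snippet below is a minimal stand-in, not PLFS itself: a dictionary plays the storage node and `zlib.crc32` stands in for whatever checksum a real deployment would use.

```python
# Per-chunk checksumming: each chunk is stored with its checksum, and the
# checksum is recomputed and compared on every read to verify integrity.
import zlib

store = {}  # stand-in for the shared data object on a storage node

def write_chunk(offset, data):
    store[offset] = (data, zlib.crc32(data))   # chunk stored with its checksum

def read_chunk(offset):
    data, checksum = store[offset]
    if zlib.crc32(data) != checksum:           # integrity check on read
        raise IOError(f"chunk at {offset} is corrupt")
    return data

write_chunk(0, b"hello")
assert read_chunk(0) == b"hello"
store[0] = (b"hellp", store[0][1])             # simulate silent corruption
try:
    read_chunk(0)
except IOError as e:
    print(e)  # chunk at 0 is corrupt
```

    Because each chunk carries its own checksum, clients writing in parallel never need to coordinate: verification is local to whichever chunk is read.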

  18. A Hybrid Physics-Based Data-Driven Approach for Point-Particle Force Modeling

    NASA Astrophysics Data System (ADS)

    Moore, Chandler; Akiki, Georges; Balachandar, S.

    2017-11-01

    This study improves upon the physics-based pairwise interaction extended point-particle (PIEP) model. The PIEP model leverages a physical framework to predict fluid-mediated interactions between solid particles. While the PIEP model is a powerful tool, its pairwise assumption leads to increased error in flows with high particle volume fractions. To reduce this error, a regression algorithm is used to model the differences between the current PIEP model's predictions and the results of direct numerical simulations (DNS) for an array of monodisperse solid particles subjected to various flow conditions. The resulting statistical model and the physical PIEP model are superimposed to construct a hybrid, physics-based data-driven PIEP model. It must be noted that the performance of a pure data-driven approach without the model form provided by the physical PIEP model is substantially inferior. The hybrid model's predictive capabilities are analyzed using further DNS. In every case tested, the hybrid PIEP model's predictions are more accurate than those of the physical PIEP model. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1315138 and the U.S. DOE, NNSA, ASC Program, as a Cooperative Agreement under Contract No. DE-NA0002378.

  19. The effects of data-driven learning activities on EFL learners' writing development.

    PubMed

    Luo, Qinqin

    2016-01-01

    Data-driven learning has proved to be an effective approach for helping learners solve various writing problems, such as correcting lexical or grammatical errors, improving the use of collocations, and generating ideas in writing. This article reports on an empirical study in which data-driven learning was carried out with the assistance of the user-friendly BNCweb, and presents an evaluation of the outcome by comparing the effectiveness of BNCweb with the search engine Baidu, which is most commonly used as a reference resource by Chinese learners of English as a foreign language. Quantitative results from 48 Chinese college students revealed that the experimental group, which used BNCweb, performed significantly better in the post-test in terms of writing fluency and accuracy than the control group, which used the search engine Baidu. However, no significant difference was found between the two groups in terms of writing complexity. Qualitative results from the interviews revealed that learners generally showed a positive attitude toward the use of BNCweb, but some problems remained in using corpora in the writing process; thus the combined use of corpora and other types of reference resources was suggested as a possible way to counter the potential barriers for Chinese learners of English.

  20. Long-term Science Data Curation Using a Digital Object Model and Open-Source Frameworks

    NASA Astrophysics Data System (ADS)

    Pan, J.; Lenhardt, W.; Wilson, B. E.; Palanisamy, G.; Cook, R. B.

    2010-12-01

    Scientific digital content, including Earth Science observations and model output, has become more heterogeneous in format and more distributed across the Internet. In addition, data and metadata are becoming necessarily linked internally and externally on the Web. As a result, such content has become more difficult for providers to manage and preserve and for users to locate, understand, and consume. Specifically, it is increasingly harder to deliver relevant metadata and data processing lineage information along with the actual content consistently. Readme files, data quality information, production provenance, and other descriptive metadata are often separated at the storage level as well as in the data search and retrieval interfaces available to a user. Critical archival metadata, such as auditing trails and integrity checks, are often even more difficult for users to access, if they exist at all. We investigate the use of several open-source software frameworks to address these challenges. We use the Fedora Commons framework and its digital object abstraction as the repository, the Drupal CMS as the user interface, and the Islandora module as the connector from Drupal to the Fedora repository. With the digital object model, metadata describing data content and its provenance can be formally associated with that content, as can external references and other auxiliary information. Changes to an object are formally audited, and digital contents are versioned and have checksums automatically computed. Further, relationships among objects are formally expressed with RDF triples. Data replication, recovery, and metadata export are supported with standard protocols, such as OAI-PMH. We provide a tentative comparative analysis of the chosen software stack with the Open Archival Information System (OAIS) reference model, along with our initial results with the existing terrestrial ecology data collections at NASA’s ORNL Distributed Active Archive Center for

  1. Teacher Talk about Student Ability and Achievement in the Era of Data-Driven Decision Making

    ERIC Educational Resources Information Center

    Datnow, Amanda; Choi, Bailey; Park, Vicki; St. John, Elise

    2018-01-01

    Background: Data-driven decision making continues to be a common feature of educational reform agendas across the globe. In many U.S. schools, the teacher team meeting is a key setting in which data use is intended to take place, with the aim of planning instruction to address students' needs. However, most prior research has not examined how the…

  2. A Hypothesis-Driven Approach to Site Investigation

    NASA Astrophysics Data System (ADS)

    Nowak, W.

    2008-12-01

    Variability of subsurface formations and the scarcity of data lead to the notion of aquifer parameters as geostatistical random variables. Given an information need and limited resources for field campaigns, site investigation is often put into the context of optimal design. In optimal design, the types, numbers and positions of samples are optimized under case-specific objectives to meet the information needs. Past studies feature optimal data worth (balancing maximum financial profit in an engineering task against the cost of additional sampling), or aim at minimum prediction uncertainty of stochastic models for a prescribed investigation budget. Recent studies also account for sources of uncertainty outside the hydrogeological range, such as uncertain toxicity, ingestion and behavioral parameters of the affected population when predicting the human health risk from groundwater contaminations. The current study looks at optimal site investigation from a new angle. Answering a yes/no question under uncertainty directly requires recasting the original question as a hypothesis test; otherwise, the resulting answer would convey false confidence. A straightforward example is whether a recent contaminant spill will cause contaminant concentrations in excess of a legal limit at a nearby drinking water well. This question can only be answered down to a specified chance of error, i.e., based on the significance level used in hypothesis tests. Optimal design is placed into the hypothesis-driven context by using the chance of providing a false yes/no answer as the new criterion to be minimized. Different configurations apply for one-sided and two-sided hypothesis tests. If a false answer entails financial liability, the hypothesis-driven context can be re-cast in the context of data worth. The remaining difference is that failure is a hard constraint in the data worth context versus a monetary punishment term in the hypothesis-driven context. 
The basic principle

  3. Ontology-Based Retrieval of Spatially Related Objects for Location Based Services

    NASA Astrophysics Data System (ADS)

    Haav, Hele-Mai; Kaljuvee, Aivi; Luts, Martin; Vajakas, Toivo

    Advanced Location Based Service (LBS) applications have to integrate information stored in GIS, information about users' preferences (profile) as well as contextual information and information about application itself. Ontology engineering provides methods to semantically integrate several data sources. We propose an ontology-driven LBS development framework: the paper describes the architecture of ontologies and their usage for retrieval of spatially related objects relevant to the user. Our main contribution is to enable personalised ontology driven LBS by providing a novel approach for defining personalised semantic spatial relationships by means of ontologies. The approach is illustrated by an industrial case study.

  4. Automatic labeling and characterization of objects using artificial neural networks

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Hill, Scott E.; Cromp, Robert F.

    1989-01-01

    Existing NASA-supported scientific databases are usually developed, managed and populated in a tedious, error-prone and self-limiting way in terms of what can be described in a relational Data Base Management System (DBMS). The next-generation Earth remote sensing platforms, i.e., the Earth Observation System (EOS), will be capable of generating data at a rate of over 300 Mb per second from a suite of instruments designed for different applications. What is needed is an innovative approach that creates object-oriented databases that segment, characterize, catalog, and are manageable in a domain-specific context and whose contents are available interactively and in near-real-time to the user community. Described here is work in progress that utilizes an artificial neural net approach to characterize satellite imagery of undefined objects into high-level data objects. The characterized data is then dynamically allocated to an object-oriented database where it can be reviewed and assessed by a user. The definition, development, and evolution of the overall data system model are steps in the creation of an application-driven knowledge-based scientific information system.

  5. Data-Driven Hint Generation in Vast Solution Spaces: A Self-Improving Python Programming Tutor

    ERIC Educational Resources Information Center

    Rivers, Kelly; Koedinger, Kenneth R.

    2017-01-01

    To provide personalized help to students who are working on code-writing problems, we introduce a data-driven tutoring system, ITAP (Intelligent Teaching Assistant for Programming). ITAP uses state abstraction, path construction, and state reification to automatically generate personalized hints for students, even when given states that have not…

  6. Motion Pattern Encapsulation for Data-Driven Constraint-Based Motion Editing

    NASA Astrophysics Data System (ADS)

    Carvalho, Schubert R.; Boulic, Ronan; Thalmann, Daniel

    The growth of motion capture systems has contributed to the proliferation of human motion databases, mainly because human motion is important in many applications, ranging from games, entertainment and films to sports and medicine. However, captured motions are normally tailored to specific needs. In an effort to adapt and reuse captured human motions in new tasks and environments and to improve the animator's work, we present and discuss a new data-driven constraint-based animation system for interactive human motion editing. This method offers the compelling advantage that it provides faster deformations and more natural-looking motion results compared to the goal-directed constraint-based methods found in the literature.

  7. Advanced software development workstation. Comparison of two object-oriented development methodologies

    NASA Technical Reports Server (NTRS)

    Izygon, Michel E.

    1992-01-01

    This report is an attempt to clarify some of the concerns raised about the OMT method, specifically that OMT is weaker than the Booch method in a few key areas. This interim report specifically addresses the following issues: (1) is OMT object-oriented or only data-driven?; (2) can OMT be used as a front-end to implementation in C++?; (3) the inheritance concept in OMT is in contradiction with the 'pure and real' inheritance concept found in object-oriented (OO) design; (4) low support for software life-cycle issues, for project and risk management; (5) uselessness of functional modeling for the ROSE project; and (6) problems with event-driven and simulation systems. The conclusion of this report is that both Booch's method and Rumbaugh's method are good OO methods, each with strengths and weaknesses in different areas of the development process.

  8. Data-driven simultaneous fault diagnosis for solid oxide fuel cell system using multi-label pattern identification

    NASA Astrophysics Data System (ADS)

    Li, Shuanghong; Cao, Hongliang; Yang, Yupu

    2018-02-01

    Fault diagnosis is a key process for the reliability and safety of solid oxide fuel cell (SOFC) systems. However, it is difficult to rapidly and accurately identify faults in complicated SOFC systems, especially when simultaneous faults appear. In this research, a data-driven Multi-Label (ML) pattern identification approach is proposed to address the simultaneous fault diagnosis of SOFC systems. The framework of the simultaneous-fault diagnosis primarily includes two components: feature extraction and an ML-SVM classifier. The approach can be trained to diagnose simultaneous SOFC faults, such as fuel leakage and air leakage at different positions in the SOFC system, using simple training data sets consisting only of single faults, without demanding simultaneous-fault data. Experimental results show the proposed framework can diagnose simultaneous SOFC system faults with high accuracy while requiring only a small amount of training data and a low computational burden. In addition, Fault Inference Tree Analysis (FITA) is employed to identify the correlations among possible faults and their corresponding symptoms at the system component level.
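    The key multi-label idea — train one detector per fault label on single-fault data only, then let several detectors fire at once to flag a simultaneous fault — can be shown with a toy example. The sketch below is not the paper's ML-SVM; it substitutes a simple projection detector per label, and the fault signatures, sensor vectors, and threshold are all invented.

```python
# One-vs-rest multi-label diagnosis: each fault's mean deviation from the
# nominal baseline is learned from single-fault samples; at test time every
# label whose signature projects strongly onto the observation is reported.
BASELINE = [0.0, 0.0, 0.0, 0.0]  # nominal sensor readings

def deviation(x):
    return [a - b for a, b in zip(x, BASELINE)]

def train(single_fault_samples):
    """single_fault_samples: {label: list of sensor vectors} -> mean deviations."""
    detectors = {}
    for label, samples in single_fault_samples.items():
        devs = [deviation(s) for s in samples]
        detectors[label] = [sum(col) / len(devs) for col in zip(*devs)]
    return detectors

def predict(detectors, x, threshold=0.5):
    """Return every label whose signature projects above threshold onto x."""
    d = deviation(x)
    return {label for label, sig in detectors.items()
            if sum(a * b for a, b in zip(d, sig)) >= threshold}

training = {  # single-fault training data only
    "fuel_leak": [[1.0, 0.1, 0.0, 0.0], [0.9, 0.0, 0.1, 0.0]],
    "air_leak":  [[0.0, 0.0, 1.1, 0.9], [0.1, 0.0, 0.9, 1.0]],
}
det = train(training)
print(predict(det, [1.0, 0.0, 0.0, 0.0]))  # single fault: fuel_leak only
print(predict(det, [1.0, 0.0, 1.0, 1.0]))  # simultaneous: both labels fire
```

    Because each detector is trained independently, the combination of labels never has to appear in the training set — which is exactly what makes the single-fault-only training regime workable.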

  9. Event-driven simulation in SELMON: An overview of EDSE

    NASA Technical Reports Server (NTRS)

    Rouquette, Nicolas F.; Chien, Steve A.; Charest, Leonard, Jr.

    1992-01-01

    EDSE (event-driven simulation engine), a model-based event-driven simulator implemented for SELMON, a tool for sensor selection and anomaly detection in real-time monitoring is described. The simulator is used in conjunction with a causal model to predict future behavior of the model from observed data. The behavior of the causal model is interpreted as equivalent to the behavior of the physical system being modeled. An overview of the functionality of the simulator and the model-based event-driven simulation paradigm on which it is based is provided. Included are high-level descriptions of the following key properties: event consumption and event creation, iterative simulation, synchronization and filtering of monitoring data from the physical system. Finally, how EDSE stands with respect to the relevant open issues of discrete-event and model-based simulation is discussed.
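    The event-consumption/event-creation loop described above is the generic core of any discrete-event simulator. The sketch below is an illustrative reduction, not EDSE itself: events are consumed in time order, and handling an event may create new future events; the valve/pressure/alarm chain is invented.

```python
# Minimal discrete-event loop: pop the earliest event, log it, and let its
# handlers schedule consequent events further along the timeline.
import heapq

def simulate(initial_events, handlers, horizon):
    queue = list(initial_events)             # (time, name) tuples
    heapq.heapify(queue)
    log = []
    while queue:
        t, name = heapq.heappop(queue)       # event consumption
        if t > horizon:
            break
        log.append((t, name))
        for dt, new in handlers.get(name, []):
            heapq.heappush(queue, (t + dt, new))  # event creation
    return log

# a valve opening predicts a pressure rise, which predicts an alarm check
handlers = {
    "valve_open":    [(2, "pressure_rise")],
    "pressure_rise": [(1, "alarm_check")],
}
print(simulate([(0, "valve_open")], handlers, horizon=10))
```

    In EDSE's setting the handler table would come from the causal model, and the log would be compared against observed monitoring data rather than merely printed.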

  10. Nursing Theory, Terminology, and Big Data: Data-Driven Discovery of Novel Patterns in Archival Randomized Clinical Trial Data.

    PubMed

    Monsen, Karen A; Kelechi, Teresa J; McRae, Marion E; Mathiason, Michelle A; Martin, Karen S

    The growth and diversification of nursing theory, nursing terminology, and nursing data enable a convergence of theory- and data-driven discovery in the era of big data research. Existing datasets can be viewed through theoretical and terminology perspectives using visualization techniques in order to reveal new patterns and generate hypotheses. The Omaha System is a standardized terminology and metamodel that makes explicit the theoretical perspective of the nursing discipline and enables terminology-theory testing research. The purpose of this paper is to illustrate the approach by exploring a large research dataset consisting of 95 variables (demographics, temperature measures, anthropometrics, and standardized instruments measuring quality of life and self-efficacy) from a theory-based perspective using the Omaha System. Aims were to (a) examine the Omaha System dataset to understand the sample at baseline relative to Omaha System problem terms and outcome measures, (b) examine relationships within the normalized Omaha System dataset at baseline in predicting adherence, and (c) examine relationships within the normalized Omaha System dataset at baseline in predicting incident venous ulcer. Variables from a randomized clinical trial of a cryotherapy intervention for the prevention of venous ulcers were mapped onto Omaha System terms and measures to derive a theoretical framework for the terminology-theory testing study. The original dataset was recoded using the mapping to create an Omaha System dataset, which was then examined using visualization to generate hypotheses. The hypotheses were tested using standard inferential statistics. Logistic regression was used to predict adherence and incident venous ulcer. 
Findings revealed novel patterns in the psychosocial characteristics of the sample that were discovered to be drivers of both adherence (Mental health Behavior: OR = 1.28, 95% CI [1.02, 1.60]; AUC = .56) and incident venous ulcer (Mental health Behavior

  11. Improvements in estimating proportions of objects from multispectral data

    NASA Technical Reports Server (NTRS)

    Horwitz, H. M.; Hyde, P. D.; Richardson, W.

    1974-01-01

    Methods for estimating proportions of objects and materials imaged within the instantaneous field of view of a multispectral sensor were developed further. Improvements in the basic proportion estimation algorithm were devised as well as improved alien object detection procedures. Also, a simplified signature set analysis scheme was introduced for determining the adequacy of signature set geometry for satisfactory proportion estimation. Averaging procedures used in conjunction with the mixtures algorithm were examined theoretically and applied to artificially generated multispectral data. A computationally simpler estimator was considered and found unsatisfactory. Experiments conducted to find a suitable procedure for setting the alien object threshold yielded little definitive result. Mixtures procedures were used on a limited amount of ERTS data to estimate wheat proportion in selected areas. Results were unsatisfactory, partly because of the ill-conditioned nature of the pure signature set.

  12. New data-driven estimation of terrestrial CO2 fluxes in Asia using a standardized database of eddy covariance measurements, remote sensing data, and support vector regression

    NASA Astrophysics Data System (ADS)

    Ichii, Kazuhito; Ueyama, Masahito; Kondo, Masayuki; Saigusa, Nobuko; Kim, Joon; Alberto, Ma. Carmelita; Ardö, Jonas; Euskirchen, Eugénie S.; Kang, Minseok; Hirano, Takashi; Joiner, Joanna; Kobayashi, Hideki; Marchesini, Luca Belelli; Merbold, Lutz; Miyata, Akira; Saitoh, Taku M.; Takagi, Kentaro; Varlagin, Andrej; Bret-Harte, M. Syndonia; Kitamura, Kenzo; Kosugi, Yoshiko; Kotani, Ayumi; Kumar, Kireet; Li, Sheng-Gong; Machimura, Takashi; Matsuura, Yojiro; Mizoguchi, Yasuko; Ohta, Takeshi; Mukherjee, Sandipan; Yanagi, Yuji; Yasuda, Yukio; Zhang, Yiping; Zhao, Fenghua

    2017-04-01

    The lack of a standardized database of eddy covariance observations has been an obstacle for data-driven estimation of terrestrial CO2 fluxes in Asia. In this study, we developed such a standardized database from 54 sites in various databases by applying consistent postprocessing for data-driven estimation of gross primary productivity (GPP) and net ecosystem CO2 exchange (NEE). Data-driven estimation was conducted using a machine learning algorithm, support vector regression (SVR), with remote sensing data for the 2000 to 2015 period. Site-level evaluation of the estimated CO2 fluxes shows that although performance varies across vegetation and climate classifications, 8-day GPP and NEE are reproduced (r2 = 0.73 and 0.42, respectively). Evaluation of spatially estimated GPP with Global Ozone Monitoring Experiment 2 sensor-based Sun-induced chlorophyll fluorescence shows that monthly GPP variations at subcontinental scale were reproduced by SVR (r2 = 1.00, 0.94, 0.91, and 0.89 for Siberia, East Asia, South Asia, and Southeast Asia, respectively). Evaluation of spatially estimated NEE with net atmosphere-land CO2 fluxes of the Greenhouse Gases Observing Satellite (GOSAT) Level 4A product shows that monthly variations of these data were consistent in Siberia and East Asia, whereas inconsistency was found in South Asia and Southeast Asia. Furthermore, differences in the land CO2 fluxes from SVR-NEE and GOSAT Level 4A were partially explained by accounting for the differences in the definition of land CO2 fluxes. These data-driven estimates provide a new opportunity to assess CO2 fluxes in Asia and to evaluate and constrain terrestrial ecosystem models.
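    The upscaling workflow has three steps: flux-tower sites supply training pairs (remote-sensing predictors → observed flux), a regression is fit, and the fitted model is applied wall-to-wall to gridded predictors. The sketch below illustrates only that workflow; an RBF-kernel Nadaraya-Watson regressor stands in for the paper's SVR, and the predictor names and values are invented.

```python
# Fit a kernel regressor on site-level pairs, then apply it per grid cell.
import math

def fit(X, y, gamma=2.0):
    """Nadaraya-Watson regression with an RBF kernel (stand-in for SVR)."""
    def model(x):
        w = [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))
             for xi in X]
        return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return model

# training pairs from tower sites: (NDVI, land surface temperature) -> GPP
sites_X = [(0.2, 0.3), (0.5, 0.5), (0.8, 0.6)]
sites_y = [1.0, 4.0, 8.0]
gpp = fit(sites_X, sites_y)

# apply the fitted model to every grid cell's remote-sensing predictors
grid = [(0.25, 0.32), (0.75, 0.58)]
print([round(gpp(cell), 2) for cell in grid])
```

    The standardized database matters precisely because the training pairs from all 54 sites must be mutually consistent for a single fitted model to be applied across the whole domain.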

  13. Human body segmentation via data-driven graph cut.

    PubMed

    Li, Shifeng; Lu, Huchuan; Shao, Xingqing

    2014-11-01

    Human body segmentation is a challenging and important problem in computer vision. Existing methods usually entail a time-consuming training phase for prior knowledge learning, with complex shape matching for body segmentation. In this paper, we propose a data-driven method that integrates top-down body pose information and bottom-up low-level visual cues for segmenting humans in static images within the graph cut framework. The key idea of our approach is to first exploit human kinematics to search for body part candidates via dynamic programming for high-level evidence. Then, body-part classifiers are used to obtain bottom-up cues of the human body distribution for low-level evidence. All the evidence collected from the top-down and bottom-up procedures is integrated in a graph cut framework for human body segmentation. Qualitative and quantitative experimental results demonstrate the merits of the proposed method in segmenting human bodies with arbitrary poses from cluttered backgrounds.
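    The graph cut machinery the method builds on can be shown on a toy problem: pixels become graph nodes, source/sink edge capacities encode per-pixel label affinities (the role played by the pose and appearance evidence above), neighbor edges encode smoothness, and a min-cut yields the labeling. The sketch below is a generic illustration with invented costs, not the paper's energy function.

```python
# Binary labeling of a 1-D chain of pixels via Edmonds-Karp max-flow/min-cut.
from collections import deque

def max_flow(graph, s, t):
    """Edmonds-Karp; returns the residual graph after the flow saturates."""
    res = {u: dict(vs) for u, vs in graph.items()}
    for u, vs in graph.items():
        for v in vs:
            res.setdefault(v, {}).setdefault(u, 0)  # reverse residual edges
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if v not in parent and c > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return res
        v, aug = t, float("inf")            # bottleneck of the augmenting path
        while parent[v] is not None:
            aug = min(aug, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:        # push flow along the path
            res[parent[v]][v] -= aug
            res[v][parent[v]] += aug
            v = parent[v]

def segment(fg, bg, n, smooth):
    """cap(s,i) = foreground affinity, cap(i,t) = background affinity."""
    g = {"s": {}, "t": {}, **{i: {} for i in range(n)}}
    for i in range(n):
        g["s"][i] = fg[i]
        g[i]["t"] = bg[i]
        if i + 1 < n:                       # pairwise smoothness term
            g[i][i + 1] = smooth
            g[i + 1][i] = smooth
    res = max_flow(g, "s", "t")
    seen, stack = {"s"}, ["s"]              # pixels still reachable from the
    while stack:                            # source in the residual = foreground
        u = stack.pop()
        for v, c in res[u].items():
            if c > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return [i for i in range(n) if i in seen]

print(segment([5, 4, 1, 0], [0, 1, 4, 5], 4, smooth=2))  # → [0, 1]
```

    In the paper's setting the unary capacities would come from the fused top-down and bottom-up evidence, and the graph would span a 2-D pixel grid rather than a chain.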

  14. Object-oriented crop mapping and monitoring using multi-temporal polarimetric RADARSAT-2 data

    NASA Astrophysics Data System (ADS)

    Jiao, Xianfeng; Kovacs, John M.; Shang, Jiali; McNairn, Heather; Walters, Dan; Ma, Baoluo; Geng, Xiaoyuan

    2014-10-01

    The aim of this paper is to assess the accuracy of an object-oriented classification of polarimetric Synthetic Aperture Radar (PolSAR) data to map and monitor crops, using 19 RADARSAT-2 fine beam polarimetric (FQ) images of an agricultural area in north-eastern Ontario, Canada. Polarimetric images and field data were acquired during the 2011 and 2012 growing seasons. The classification and field data collection focused on the main crop types grown in the region: wheat, oat, soybean, canola and forage. The polarimetric parameters were extracted with PolSAR analysis using both the Cloude-Pottier and Freeman-Durden decompositions. The object-oriented classification, with a single date of PolSAR data, was able to classify all five crop types with an accuracy of 95% and a Kappa of 0.93, a 6% improvement over classification with linear polarizations only. However, the time of acquisition is crucial. The larger-biomass crops of canola and soybean were most accurately mapped, whereas the identification of oat and wheat was more variable. The multi-temporal data using the Cloude-Pottier decomposition parameters provided the best classification accuracy compared to the linear polarizations and the Freeman-Durden decomposition parameters. In general, the object-oriented classifications were able to accurately map crop types by reducing the noise inherent in the SAR data. Furthermore, using the crop classification maps we were able to monitor crop growth stage based on a trend analysis of the radar response. Based on field data from canola crops, there was a strong relationship between the phenological growth stage, based on the BBCH scale, and the HV backscatter and entropy.

  15. The effect of input data transformations on object-based image analysis

    PubMed Central

    LIPPITT, CHRISTOPHER D.; COULTER, LLOYD L.; FREEMAN, MARY; LAMANTIA-BISHOP, JEFFREY; PANG, WYSON; STOW, DOUGLAS A.

    2011-01-01

    The effect of using spectral transform images as input data on segmentation quality and its potential effect on products generated by object-based image analysis are explored in the context of land cover classification in Accra, Ghana. Five image data transformations are compared to untransformed spectral bands in terms of their effect on segmentation quality and final product accuracy. The relationship between segmentation quality and product accuracy is also briefly explored. Results suggest that input data transformations can aid in the delineation of landscape objects by image segmentation, but the effect is idiosyncratic to the transformation and object of interest. PMID:21673829

  16. Lexical Awareness and Development through Data Driven Learning: Attitudes and Beliefs of EFL Learners

    ERIC Educational Resources Information Center

    Asik, Asuman; Vural, Arzu Sarlanoglu; Akpinar, Kadriye Dilek

    2016-01-01

    Data-driven learning (DDL) is an innovative approach that has developed from corpus linguistics. It plays a significant role in the progression of foreign language pedagogy, since it offers learners plentiful authentic examples from corpora and lets them analyze language rules with the help of online corpora and concordancers. The present study…

  17. Realistic Data-Driven Traffic Flow Animation Using Texture Synthesis.

    PubMed

    Chao, Qianwen; Deng, Zhigang; Ren, Jiaping; Ye, Qianqian; Jin, Xiaogang

    2018-02-01

    We present a novel data-driven approach to populate virtual road networks with realistic traffic flows. Specifically, given a limited set of vehicle trajectories as the input samples, our approach first synthesizes a large set of vehicle trajectories. By taking the spatio-temporal information of traffic flows as a 2D texture, the generation of new traffic flows can be formulated as a texture synthesis process, which is solved by minimizing a newly developed traffic texture energy. The synthesized output captures the spatio-temporal dynamics of the input traffic flows, and the vehicle interactions in it strictly follow traffic rules. After that, we position the synthesized vehicle trajectory data to virtual road networks using a cage-based registration scheme, where a few traffic-specific constraints are enforced to maintain each vehicle's original spatial location and synchronize its motion in concert with its neighboring vehicles. Our approach is intuitive to control and scalable to the complexity of virtual road networks. We validated our approach through many experiments and paired comparison user studies.

  18. Data-driven classification of patients with primary progressive aphasia.

    PubMed

    Hoffman, Paul; Sajjadi, Seyed Ahmad; Patterson, Karalyn; Nestor, Peter J

    2017-11-01

    Current diagnostic criteria classify primary progressive aphasia into three variants-semantic (sv), nonfluent (nfv) and logopenic (lv) PPA-though the adequacy of this scheme is debated. This study took a data-driven approach, applying k-means clustering to data from 43 PPA patients. The algorithm grouped patients based on similarities in language, semantic and non-linguistic cognitive scores. The optimum solution consisted of three groups. One group, almost exclusively those diagnosed as svPPA, displayed a selective semantic impairment. A second cluster, with impairments to speech production, repetition and syntactic processing, contained a majority of patients with nfvPPA but also some lvPPA patients. The final group exhibited more severe deficits to speech, repetition and syntax as well as semantic and other cognitive deficits. These results suggest that, amongst cases of non-semantic PPA, differentiation mainly reflects overall degree of language/cognitive impairment. The observed patterns were scarcely affected by inclusion/exclusion of non-linguistic cognitive scores. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
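The grouping step described above can be sketched with a minimal k-means implementation. The toy patient profiles, feature choices, and fixed initialization below are illustrative assumptions, not the study's actual data:

```python
import numpy as np

def kmeans(X, centers, iters=100):
    """Minimal k-means: alternate nearest-centre assignment and centre update."""
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# toy z-scored (semantic, speech/syntax, other-cognitive) deficit profiles
X = np.array([[2.0, 0.1, 0.0],   # selective semantic impairment
              [1.8, 0.2, 0.1],
              [0.1, 1.9, 0.2],   # speech/repetition/syntax impairment
              [0.2, 2.1, 0.3],
              [1.5, 1.8, 1.9],   # globally severe deficits
              [1.7, 2.0, 2.1]])
labels, _ = kmeans(X, centers=X[[0, 2, 4]].copy())  # fixed init for reproducibility
print(labels)   # [0 0 1 1 2 2]
```

With three well-separated toy profiles, the algorithm recovers one cluster per impairment pattern, mirroring the three-group solution reported above.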

  19. Migraine Subclassification via a Data-Driven Automated Approach Using Multimodality Factor Mixture Modeling of Brain Structure Measurements.

    PubMed

    Schwedt, Todd J; Si, Bing; Li, Jing; Wu, Teresa; Chong, Catherine D

    2017-07-01

    The current subclassification of migraine is based on headache frequency and aura status. The variability in migraine symptoms, disease course, and response to treatment suggests the presence of additional heterogeneity or subclasses within migraine. The study objective was to subclassify migraine via a data-driven approach, identifying latent factors by jointly exploiting multiple sets of brain structural features obtained via magnetic resonance imaging (MRI). Migraineurs (n = 66) and healthy controls (n = 54) had brain MRI measurements of cortical thickness, cortical surface area, and volumes for 68 regions. A multimodality factor mixture model was used to subclassify MRIs and to determine the brain structural factors that most contributed to the subclassification. Clinical characteristics of subjects in each subgroup were compared. Automated MRI classification divided the subjects into two subgroups. Migraineurs in subgroup #1 had more severe allodynia symptoms during migraines (6.1 ± 5.3 vs. 3.6 ± 3.2, P = .03), more years with migraine (19.2 ± 11.3 years vs 13 ± 8.3 years, P = .01), and higher Migraine Disability Assessment (MIDAS) scores (25 ± 22.9 vs 15.7 ± 12.2, P = .04). There were no differences in headache frequency or migraine aura status between the two subgroups. Data-driven subclassification of brain MRIs based upon structural measurements identified two subgroups. Amongst migraineurs, the subgroups differed in allodynia symptom severity, years with migraine, and migraine-related disability. Since allodynia is associated with this imaging-based subclassification of migraine, and prior publications suggest that allodynia impacts migraine treatment response and disease prognosis, future migraine diagnostic criteria could consider allodynia when defining migraine subgroups. © 2017 American Headache Society.

  20. How much are Chevrolet Volts in The EV Project driven in EV Mode?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smart, John

    2013-08-01

    This report summarizes key conclusions from analysis of data collected from Chevrolet Volts participating in The EV Project. Topics include how many miles are driven in EV mode, how far vehicles are driven between charging events, and how much energy is charged from the electric grid per charging event.

  1. Impact of Data Placement on Resilience in Large-Scale Object Storage Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carns, Philip; Harms, Kevin; Jenkins, John

    Distributed object storage architectures have become the de facto standard for high-performance storage in big data, cloud, and HPC computing. Object storage deployments using commodity hardware to reduce costs often employ object replication as a method to achieve data resilience. Repairing object replicas after failure is a daunting task for systems with thousands of servers and billions of objects, however, and it is increasingly difficult to evaluate such scenarios at scale on real-world systems. Resilience and availability are both compromised if objects are not repaired in a timely manner. In this work we leverage a high-fidelity discrete-event simulation model to investigate replica reconstruction on large-scale object storage systems with thousands of servers, billions of objects, and petabytes of data. We evaluate the behavior of CRUSH, a well-known object placement algorithm, and identify configuration scenarios in which aggregate rebuild performance is constrained by object placement policies. After determining the root cause of this bottleneck, we then propose enhancements to CRUSH and the usage policies atop it to enable scalable replica reconstruction. We use these methods to demonstrate a simulated aggregate rebuild rate of 410 GiB/s (within 5% of projected ideal linear scaling) on a 1,024-node commodity storage system. We also uncover an unexpected phenomenon in rebuild performance based on the characteristics of the data stored on the system.

  2. Improving the Fitness of High-Dimensional Biomechanical Models via Data-Driven Stochastic Exploration

    PubMed Central

    Bustamante, Carlos D.; Valero-Cuevas, Francisco J.

    2010-01-01

    The field of complex biomechanical modeling has begun to rely on Monte Carlo techniques to investigate the effects of parameter variability and measurement uncertainty on model outputs, search for optimal parameter combinations, and define model limitations. However, advanced stochastic methods to perform data-driven explorations, such as Markov chain Monte Carlo (MCMC), become necessary as the number of model parameters increases. Here we demonstrate the feasibility of, and to our knowledge the first use of, an MCMC approach to improve the fitness of realistically large biomechanical models. We used a Metropolis–Hastings algorithm to search increasingly complex parameter landscapes (3, 8, 24, and 36 dimensions) to uncover underlying distributions of anatomical parameters of a “truth model” of the human thumb on the basis of simulated kinematic data (thumbnail location, orientation, and linear and angular velocities) polluted by zero-mean, uncorrelated multivariate Gaussian “measurement noise.” Driven by these data, ten Markov chains searched each model parameter space for the subspace that best fit the data (posterior distribution). As expected, the convergence time increased, more local minima were found, and marginal distributions broadened as the parameter space complexity increased. In the 36-D scenario, some chains found local minima but the majority of chains converged to the true posterior distribution (confirmed using a cross-validation dataset), thus demonstrating the feasibility and utility of these methods for realistically large biomechanical problems. PMID:19272906
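The search strategy can be illustrated on a deliberately tiny 1-D analogue: a Metropolis–Hastings chain recovering a single "truth" parameter from measurements polluted by zero-mean Gaussian noise. The truth value, noise level, chain length, and proposal width are illustrative assumptions, not the thumb model's parameters:

```python
import numpy as np

rng = np.random.default_rng(42)
truth, noise_sd = 3.0, 0.5
data = truth + rng.normal(0.0, noise_sd, size=200)   # simulated noisy measurements

def log_posterior(theta):
    # flat prior, Gaussian likelihood of the data given the parameter
    return -0.5 * np.sum((data - theta) ** 2) / noise_sd ** 2

theta, samples = 0.0, []
for _ in range(5000):
    proposal = theta + rng.normal(0.0, 0.1)          # random-walk proposal
    # accept with probability min(1, posterior ratio)
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal
    samples.append(theta)

posterior = np.array(samples[1000:])                 # discard burn-in
print(posterior.mean())                              # close to the truth value 3.0
```

The retained samples approximate the posterior distribution over the parameter; in higher dimensions the same accept/reject loop runs over a parameter vector, which is where convergence slows and local minima appear, as the abstract describes.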

  3. A Standard-Driven Data Dictionary for Data Harmonization of Heterogeneous Datasets in Urban Geological Information Systems

    NASA Astrophysics Data System (ADS)

    Liu, G.; Wu, C.; Li, X.; Song, P.

    2013-12-01

    The 3D urban geological information system has been a major part of the national urban geological survey project of China Geological Survey in recent years. Large amount of multi-source and multi-subject data are to be stored in the urban geological databases. There are various models and vocabularies drafted and applied by industrial companies in urban geological data. The issues such as duplicate and ambiguous definition of terms and different coding structure increase the difficulty of information sharing and data integration. To solve this problem, we proposed a national standard-driven information classification and coding method to effectively store and integrate urban geological data, and we applied the data dictionary technology to achieve structural and standard data storage. The overall purpose of this work is to set up a common data platform to provide information sharing service. Research progresses are as follows: (1) A unified classification and coding method for multi-source data based on national standards. Underlying national standards include GB 9649-88 for geology and GB/T 13923-2006 for geography. Current industrial models are compared with national standards to build a mapping table. The attributes of various urban geological data entity models are reduced to several categories according to their application phases and domains. Then a logical data model is set up as a standard format to design data file structures for a relational database. (2) A multi-level data dictionary for data standardization constraint. 
Three levels of data dictionary are designed: model data dictionary is used to manage system database files and enhance maintenance of the whole database system; attribute dictionary organizes fields used in database tables; term and code dictionary is applied to provide a standard for urban information system by adopting appropriate classification and coding methods; comprehensive data dictionary manages system operation and security. (3

  4. Rule Driven Multi-Objective Management (RDMOM) - An Alternative Form for Describing and Developing Effective Water Resources Management Strategies

    NASA Astrophysics Data System (ADS)

    Sheer, D. P.

    2011-12-01

    Economics provides a model for describing human behavior applied to the management of water resources, but that model assumes, among other things, that managers have a way of directly relating immediate actions to long-term economic outcomes. This is rarely the case in water resources problems where uncertainty has significant impacts on the effectiveness of management strategies and where the management objectives are very difficult to commensurate. The difficulty in using economics is even greater in multiparty disputes, where each party has a different relative value for each of the management objectives, and many of the management objectives are shared. A three step approach to collaborative decision making can overcome these difficulties. The first step involves creating science based performance measures and evaluation tools to estimate the effect of alternative management strategies on each of the non-commensurate objectives. The second step involves developing short-term surrogate operating objectives that implicitly deal with all of the aspects of the long term uncertainty. Management that continually "optimizes" the short-term objectives subject to physical and other constraints that change through time can be characterized as Rule Driven Multi-Objective Management (RDMOM). RDMOM strategies are then tested in simulation models to provide the basis for evaluating performance measures. Participants in the collaborative process then engage in multiparty discussions that create new alternatives, and "barter" a deal. RDMOM does not assume that managers fully understand the link between current actions and long term goals. Rather, it assumes that managers operate to achieve short-term surrogate objectives which they believe will achieve an appropriate balance of both short and long-term incommensurable benefits. A reservoir rule curve is a simple, but often not particularly effective, example of the real-world implementation of RDMOM. 
Water managers find they

  5. Evaluation of global water quality - the potential of a data- and model-driven analysis

    NASA Astrophysics Data System (ADS)

    Bärlund, Ilona; Flörke, Martina; Alcamo, Joseph; Völker, Jeanette; Malsy, Marcus; Kaus, Andrew; Reder, Klara; Büttner, Olaf; Katterfeld, Christiane; Dietrich, Désirée; Borchardt, Dietrich

    2016-04-01

    The ongoing socio-economic development presents a new challenge for water quality worldwide, especially in developing and emerging countries. It is estimated that, due to population growth and the extension of water supply networks, the amount of wastewater will rise sharply. This can lead to an increased risk of surface water quality degradation if the wastewater is not sufficiently treated. This development has impacts on ecosystems and human health, as well as food security. The United Nations Member States have adopted targets for sustainable development. They include, inter alia, sustainable protection of water quality and sustainable use of water resources. To achieve these goals, appropriate monitoring strategies and the development of indicators for water quality are required. Within the pre-study for a 'World Water Quality Assessment' (WWQA) led by the United Nations Environment Programme (UNEP), a methodology for assessing water quality, taking into account the above-mentioned objectives, has been developed. The novelty of this methodology is its linked model- and data-driven approach. The focus is on parameters reflecting the key water quality issues, such as increased wastewater pollution, salinization or eutrophication. The results from the pre-study show, for example, that about one seventh of all watercourses in Latin America, Africa and Asia already show high organic pollution. This is of central importance for inland fisheries and associated food security. In addition, it could be demonstrated that global water quality databases have large gaps. These must be closed in the future in order to obtain an overall picture of global water quality and to target measures more efficiently. The aim of this presentation is to introduce the methodology developed within the WWQA pre-study and to show selected examples of its application in Latin America, Africa and Asia.

  6. A Hybrid Knowledge-Based and Data-Driven Approach to Identifying Semantically Similar Concepts

    PubMed Central

    Pivovarov, Rimma; Elhadad, Noémie

    2012-01-01

    An open research question when leveraging ontological knowledge is when to treat different concepts separately from each other and when to aggregate them. For instance, concepts for the terms "paroxysmal cough" and "nocturnal cough" might be aggregated in a kidney disease study, but should be left separate in a pneumonia study. Determining whether two concepts are similar enough to be aggregated can help build better datasets for data mining purposes and avoid signal dilution. Quantifying the similarity among concepts is a difficult task, however, in part because such similarity is context-dependent. We propose a comprehensive method, which computes a similarity score for a concept pair by combining data-driven and ontology-driven knowledge. We demonstrate our method on concepts from SNOMED-CT and on a corpus of clinical notes of patients with chronic kidney disease. By combining information from usage patterns in clinical notes and from ontological structure, the method can prune out concepts that are simply related from those which are semantically similar. When evaluated against a list of concept pairs annotated for similarity, our method reaches an AUC (area under the curve) of 92%. PMID:22289420

  7. DATA QUALITY OBJECTIVES-FOUNDATION OF A SUCCESSFUL MONITORING PROGRAM

    EPA Science Inventory

    The data quality objectives (DQO) process is a fundamental site characterization tool and the foundation of a successful monitoring program. The DQO process is a systematic planning approach based on the scientific method of inquiry. The process identifies the goals of data col...

  8. Data-driven asthma endotypes defined from blood biomarker and gene expression data

    EPA Science Inventory

    The diagnosis and treatment of childhood asthma is complicated by its mechanistically distinct subtypes (endotypes) driven by genetic susceptibility and modulating environmental factors. Clinical biomarkers and blood gene expression were collected from a stratified, cross-section...

  9. Multi-objective Optimization of Solar-driven Hollow-fiber Membrane Distillation Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nenoff, Tina M.; Moore, Sarah E.; Mirchandani, Sera

    Securing additional water sources remains a primary concern for arid regions in both the developed and developing world. Climate change is causing fluctuations in the frequency and duration of precipitation, which can be seen as prolonged droughts in some arid areas. Droughts decrease the reliability of surface water supplies, which forces communities to find alternate primary water sources. In many cases, ground water can supplement the use of surface supplies during periods of drought, reducing the need for above-ground storage without sacrificing reliability objectives. Unfortunately, accessible ground waters are often brackish, requiring desalination prior to use, and underdeveloped infrastructure and inconsistent electrical grid access can create obstacles to groundwater desalination in developing regions. The objectives of the proposed project are to (i) mathematically simulate the operation of hollow fiber membrane distillation systems and (ii) optimize system design for off-grid treatment of brackish water. It is anticipated that methods developed here can be used to supply potable water at many off-grid locations in semi-arid regions including parts of the Navajo Reservation. This research is a collaborative project between Sandia and the University of Arizona.

  10. Data-driven approach for assessing utility of medical tests using electronic medical records.

    PubMed

    Skrøvseth, Stein Olav; Augestad, Knut Magne; Ebadollahi, Shahram

    2015-02-01

    To precisely define the utility of tests in a clinical pathway through data-driven analysis of the electronic medical record (EMR). The information content was defined in terms of the entropy of the expected value of the test related to a given outcome. A kernel density classifier was used to estimate the necessary distributions. To validate the method, we used data from the EMR of the gastrointestinal department at a university hospital. Blood tests from patients undergoing gastrointestinal surgery were analyzed with respect to a second surgery within 30 days of the index surgery. The information content is clearly reflected in the patient pathway for certain combinations of tests and outcomes. C-reactive protein tests coupled to anastomosis leakage, a severe complication, show a clear pattern of information gain through the patient trajectory, where the greatest gain from the test is 3-4 days post index surgery. We have defined the information content in a data-driven and information-theoretic way such that the utility of a test can be precisely defined. The results reflect clinical knowledge. In the case we studied, the tests carry little negative impact. The general approach can be expanded to cases that carry a substantial negative impact, such as certain radiological techniques. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
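The entropy-based utility measure can be sketched, under simplifying assumptions, as the mutual information between a binary test result and a binary outcome computed from a joint count table. The counts below are hypothetical, not taken from the paper's EMR data:

```python
import math

def mutual_information(counts):
    """counts[t][o]: number of patients with test result t and outcome o."""
    n = sum(sum(row) for row in counts)
    p_t = [sum(row) / n for row in counts]                           # marginal of test
    p_o = [sum(counts[t][o] for t in range(2)) / n for o in range(2)]  # marginal of outcome
    mi = 0.0
    for t in range(2):
        for o in range(2):
            p = counts[t][o] / n
            if p > 0:
                mi += p * math.log2(p / (p_t[t] * p_o[o]))
    return mi

# hypothetical: elevated CRP vs. reoperation within 30 days
uninformative = [[50, 50], [50, 50]]   # test result independent of outcome
informative   = [[90, 10], [20, 80]]   # test result tracks outcome
print(mutual_information(uninformative))  # 0.0 bits
print(mutual_information(informative))    # roughly 0.4 bits
```

A test carrying zero bits about the outcome has no utility on this measure; tracking the measure along the patient trajectory, as the paper does for CRP, shows on which post-operative day the test is most informative.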

  11. Test-driven programming

    NASA Astrophysics Data System (ADS)

    Georgiev, Bozhidar; Georgieva, Adriana

    2013-12-01

    This paper presents some possibilities for implementing test-driven development as a programming method. It offers a different point of view on creating advanced programming techniques: tests are built before the program source, together with all necessary software tools and modules. This nontraditional approach of building tests first eases the programmer's work and is a preferable way of software development. It allows comparatively simple programming (applied with different object-oriented programming languages such as Java, XML, Python, etc.), offers a predictable way to develop software tools, and helps create better software that is also easier to maintain. Test-driven programming is able to replace more complicated traditional paradigms used by many programmers.

  12. Selecting a Persistent Data Support Environment for Object-Oriented Applications

    DTIC Science & Technology

    1998-03-01

    key features of most object DBMS products is contained in the Needs Assessment for Objects from Barry and Associates. The developer should...data structure and behavior in a self-contained module enhances maintainability of the system and promotes reuse of modules for similar domains...considered together, represent a survey of commercial object-oriented database management systems. These references contain detailed information needed

  13. Data-driven nutrient analysis and reality check: Human inputs, catchment delivery and management effects

    NASA Astrophysics Data System (ADS)

    Destouni, G.

    2017-12-01

    Measures for mitigating nutrient loads to aquatic ecosystems should have observable effects, e.g., in the Baltic region after the joint first periods of nutrient management actions under the Baltic Sea Action Plan (BSAP; since 2007) and the EU Water Framework Directive (WFD; since 2009). Looking for such observable effects, all openly available water and nutrient monitoring data since 2003 are compiled and analyzed for Sweden as a case study. Results show that hydro-climatically driven water discharge dominates the determination of waterborne loads of both phosphorus and nitrogen. Furthermore, the nutrient loads and water discharge are all similarly well correlated with the ecosystem status classification of Swedish water bodies according to the WFD. Nutrient concentrations, which are hydro-climatically correlated and should thus reflect human effects better than loads, have changed only slightly over the study period (2003-2013) and even increased in moderate-to-bad status waters, where the WFD and BSAP jointly target nutrient decreases. These results indicate insufficient distinction and mitigation of human-driven nutrient components by the internationally harmonized applications of both the WFD and the BSAP. Aiming for better general identification of such components, nutrient data for the large transboundary catchments of the Baltic Sea and the Sava River are compared. The comparison shows cross-regional consistency in nutrient relationships to driving hydro-climatic conditions (water discharge) for nutrient loads, and socio-economic conditions (population density and farmland share) for nutrient concentrations. A data-driven screening methodology is further developed for estimating nutrient input and retention-delivery in catchments. Its first application to nested Sava River catchments identifies characteristic regional values of nutrient input per area and relative delivery, and hotspots of much larger inputs, related to urban high-population areas.

  14. A Multi-layer, Data-driven Advanced Reasoning Tool for Intelligent Data Mining and Analysis for Smart Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Ning; Du, Pengwei; Greitzer, Frank L.

    2012-12-31

    This paper presents the multi-layer, data-driven advanced reasoning tool (M-DART), a proof-of-principle decision support tool for improved power system operation. M-DART will cross-correlate and examine different data sources to assess anomalies, infer root causes, and anneal data into actionable information. By performing higher-level reasoning “triage” of diverse data sources, M-DART focuses on early detection of emerging power system events and identifies highest priority actions for the human decision maker. M-DART represents a significant advancement over today’s grid monitoring technologies that apply offline analyses to derive model-based guidelines for online real-time operations and use isolated data processing mechanisms focusing on individual data domains. The development of the M-DART will bridge these gaps by reasoning about results obtained from multiple data sources that are enabled by the smart grid infrastructure. This hybrid approach integrates a knowledge base that is trained offline but tuned online to capture model-based relationships while revealing complex causal relationships among data from different domains.

  15. [Spatial analysis of Oncomelania snail information based on grid data-driven].

    PubMed

    Liu, Gang; Huang, Qiong-Yao; Liu, Yun-Ziang; Wang, Jiang-Tao; Peng, Fei; Liu, Nian-Meng

    2011-06-01

    To explore the relationship between the Oncomelania snail situation and the distance to the water source, soil humidity, vegetation and water level in flood seasons in the islets of the Changsha section of the Xiang River. Combined with the NDVI and soil humidity of the islets, GIS spatial analysis based on a grid data-driven approach was used to analyze the snail situation in the Changsha section of the Xiang River from 2005 to 2009. The relationship between the snail density and the water level in flood seasons was analyzed. In 2005, the snails in Zengpi Islet were mainly distributed at a range of 40-240 m from the nearest water source, and the number at spots with a distance of 60 m was the largest. There was an obvious positive correlation between the snail density and the water level in flood seasons. The ranges of the Normalized Difference Vegetation Index and soil humidity of Zengpi Islet in 2005 were 0-0.982 and 0-0.298, respectively, and the main vegetation in the Changsha section of the Xiang River was weed and sedge. A map of the snail situation by year was drawn according to the standard water level, which reflected the snail situation intuitively. By using spatial analysis based on a grid data-driven approach, the situation of vegetation, soil humidity and snails can be reflected accurately, which helps us understand the endemic situation in a timely manner. Even under the circumstance of human intervention, the water level in flood seasons is still an important factor influencing the change of the snail situation.

  16. A Physics-driven Neural Networks-based Simulation System (PhyNNeSS) for multimodal interactive virtual environments involving nonlinear deformable objects

    PubMed Central

    De, Suvranu; Deo, Dhannanjay; Sankaranarayanan, Ganesh; Arikatla, Venkata S.

    2012-01-01

    Background While an update rate of 30 Hz is considered adequate for real time graphics, a much higher update rate of about 1 kHz is necessary for haptics. Physics-based modeling of deformable objects, especially when large nonlinear deformations and complex nonlinear material properties are involved, at these very high rates is one of the most challenging tasks in the development of real time simulation systems. While some specialized solutions exist, there is no general solution for arbitrary nonlinearities. Methods In this work we present PhyNNeSS - a Physics-driven Neural Networks-based Simulation System - to address this long-standing technical challenge. The first step is an off-line pre-computation step in which a database is generated by applying carefully prescribed displacements to each node of the finite element models of the deformable objects. In the next step, the data is condensed into a set of coefficients describing neurons of a Radial Basis Function network (RBFN). During real-time computation, these neural networks are used to reconstruct the deformation fields as well as the interaction forces. Results We present realistic simulation examples from interactive surgical simulation with real time force feedback. As an example, we have developed a deformable human stomach model and a Penrose-drain model used in the Fundamentals of Laparoscopic Surgery (FLS) training tool box. Conclusions A unique computational modeling system has been developed that is capable of simulating the response of nonlinear deformable objects in real time. The method distinguishes itself from previous efforts in that a systematic physics-based pre-computational step allows training of neural networks which may be used in real time simulations. We show, through careful error analysis, that the scheme is scalable, with the accuracy being controlled by the number of neurons used in the simulation. PhyNNeSS has been integrated into SoFMIS (Software Framework for Multimodal
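The precompute-then-reconstruct idea can be sketched in one dimension: fit radial basis function coefficients to sampled displacement data offline, then evaluate the network cheaply at run time. The stand-in deformation function, center count, and kernel width below are illustrative assumptions, not PhyNNeSS's actual finite element data:

```python
import numpy as np

def rbf_design(x, centers, width):
    # Gaussian RBF features for each input sample
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

# offline: "database" of prescribed inputs and resulting displacements
x_train = np.linspace(0.0, 1.0, 50)
u_train = np.sin(2 * np.pi * x_train)        # stand-in 1-D deformation field
centers = np.linspace(0.0, 1.0, 12)          # neuron centres
Phi = rbf_design(x_train, centers, width=0.1)
coeffs, *_ = np.linalg.lstsq(Phi, u_train, rcond=None)  # train the network

# online: fast reconstruction at new query points (one matrix product)
x_query = np.array([0.25, 0.75])
u_pred = rbf_design(x_query, centers, 0.1) @ coeffs
print(u_pred)   # close to [1.0, -1.0], the true sin values
```

The online step is a single small matrix product, which is what makes kilohertz-rate evaluation plausible; accuracy is controlled by the number of neurons, mirroring the scalability claim in the abstract.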

  17. The utilization of neural nets in populating an object-oriented database

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Hill, Scott E.; Cromp, Robert F.

    1989-01-01

    Existing NASA-supported scientific databases are usually developed, managed and populated in a tedious, error-prone and self-limiting way in terms of what can be described in a relational Data Base Management System (DBMS). The next-generation Earth remote sensing platforms (i.e., the Earth Observation System (EOS)) will be capable of generating data at a rate of over 300 Mb per second from a suite of instruments designed for different applications. What is needed is an innovative approach that creates object-oriented databases that segment, characterize, catalog and are manageable in a domain-specific context and whose contents are available interactively and in near-real-time to the user community. Described here is work in progress that utilizes an artificial neural net approach to characterize satellite imagery of undefined objects into high-level data objects. The characterized data is then dynamically allocated to an object-oriented database where it can be reviewed and assessed by a user. The definition, development, and evolution of the overall data system model are steps in the creation of an application-driven knowledge-based scientific information system.

  18. Data-driven discovery of Koopman eigenfunctions using deep learning

    NASA Astrophysics Data System (ADS)

    Lusch, Bethany; Brunton, Steven L.; Kutz, J. Nathan

    2017-11-01

    Koopman operator theory transforms any autonomous non-linear dynamical system into an infinite-dimensional linear system. Since linear systems are well-understood, a mapping of non-linear dynamics to linear dynamics provides a powerful approach to understanding and controlling fluid flows. However, finding the correct change of variables remains an open challenge. We present a strategy to discover an approximate mapping using deep learning. Our neural networks find this change of variables, its inverse, and a finite-dimensional linear dynamical system defined on the new variables. Our method is completely data-driven and only requires measurements of the system, i.e. it does not require derivatives or knowledge of the governing equations. We find a minimal set of approximate Koopman eigenfunctions that are sufficient to reconstruct and advance the system to future states. We demonstrate the method on several dynamical systems.
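
The workflow described above (encode, advance linearly, decode) can be sketched on a textbook system whose exact finite Koopman embedding is known. Here the hand-coded encoder stands in for the learned networks; the system dx1/dt = mu*x1, dx2/dt = lam*(x2 - x1^2), the parameter values, and the Euler stepping are all illustrative assumptions, not the paper's architecture.

```python
import numpy as np

# Stand-in for a learned encoder/decoder: this system admits an exact
# 3-D linear embedding y = [x1, x2, x1^2].
mu, lam, dt = -0.05, -1.0, 0.01

def encode(x):                      # change of variables phi(x)
    return np.array([x[0], x[1], x[0] ** 2])

def decode(y):                      # inverse mapping back to the state
    return y[:2]

# Dynamics are linear in the embedded variables
A = np.array([[mu,  0.0,  0.0],
              [0.0, lam, -lam],
              [0.0, 0.0,  2 * mu]])

def advance(x, steps):
    """Advance the state by evolving only the linear embedded system."""
    y = encode(x)
    for _ in range(steps):
        y = y + dt * (A @ y)        # Euler step on the linear system
    return decode(y)

pred = advance(np.array([1.0, 0.5]), steps=100)
```

The point of the sketch: once the change of variables is found, all prediction happens in linear coordinates, which is what makes the learned representation useful for analysis and control.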

  19. Skin cancer screening: recommendations for data-driven screening guidelines and a review of the US Preventive Services Task Force controversy

    PubMed Central

    Johnson, Mariah M; Leachman, Sancy A; Aspinwall, Lisa G; Cranmer, Lee D; Curiel-Lewandrowski, Clara; Sondak, Vernon K; Stemwedel, Clara E; Swetter, Susan M; Vetto, John; Bowles, Tawnya; Dellavalle, Robert P; Geskin, Larisa J; Grossman, Douglas; Grossmann, Kenneth F; Hawkes, Jason E; Jeter, Joanne M; Kim, Caroline C; Kirkwood, John M; Mangold, Aaron R; Meyskens, Frank; Ming, Michael E; Nelson, Kelly C; Piepkorn, Michael; Pollack, Brian P; Robinson, June K; Sober, Arthur J; Trotter, Shannon; Venna, Suraj S; Agarwala, Sanjiv; Alani, Rhoda; Averbook, Bruce; Bar, Anna; Becevic, Mirna; Box, Neil; E Carson, William; Cassidy, Pamela B; Chen, Suephy C; Chu, Emily Y; Ellis, Darrel L; Ferris, Laura K; Fisher, David E; Kendra, Kari; Lawson, David H; Leming, Philip D; Margolin, Kim A; Markovic, Svetomir; Martini, Mary C; Miller, Debbie; Sahni, Debjani; Sharfman, William H; Stein, Jennifer; Stratigos, Alexander J; Tarhini, Ahmad; Taylor, Matthew H; Wisco, Oliver J; Wong, Michael K

    2017-01-01

    Melanoma is usually apparent on the skin and readily detected by trained medical providers using a routine total body skin examination, yet this malignancy is responsible for the majority of skin cancer-related deaths. Currently, there is no national consensus on skin cancer screening in the USA, but dermatologists and primary care providers are routinely confronted with making the decision about when to recommend total body skin examinations and at what interval. The objectives of this paper are: to propose rational, risk-based, data-driven guidelines commensurate with the US Preventive Services Task Force screening guidelines for other disorders; to compare our proposed guidelines to recommendations made by other national and international organizations; and to review the US Preventive Services Task Force's 2016 Draft Recommendation Statement on skin cancer screening. PMID:28758010

  20. Features of Cross-Correlation Analysis in a Data-Driven Approach for Structural Damage Assessment

    PubMed Central

    Camacho Navarro, Jhonatan; Ruiz, Magda; Villamizar, Rodolfo; Mujica, Luis

    2018-01-01

    This work discusses the advantage of using cross-correlation analysis in a data-driven approach based on principal component analysis (PCA) and piezodiagnostics to obtain successful diagnosis of events in structural health monitoring (SHM). In this sense, the identification of noisy data and outliers, as well as the management of data cleansing stages can be facilitated through the implementation of a preprocessing stage based on cross-correlation functions. Additionally, this work evidences an improvement in damage detection when the cross-correlation is included as part of the whole damage assessment approach. The proposed methodology is validated by processing data measurements from piezoelectric devices (PZT), which are used in a piezodiagnostics approach based on PCA and baseline modeling. Thus, the influence of cross-correlation analysis used in the preprocessing stage is evaluated for damage detection by means of statistical plots and self-organizing maps. Three laboratory specimens were used as test structures in order to demonstrate the validity of the methodology: (i) a carbon steel pipe section with leak and mass damage types, (ii) an aircraft wing specimen, and (iii) a blade of a commercial aircraft turbine, where damages are specified as mass-added. As the main concluding remark, the suitability of cross-correlation features combined with a PCA-based piezodiagnostic approach in order to achieve a more robust damage assessment algorithm is verified for SHM tasks. PMID:29762505

  1. Features of Cross-Correlation Analysis in a Data-Driven Approach for Structural Damage Assessment.

    PubMed

    Camacho Navarro, Jhonatan; Ruiz, Magda; Villamizar, Rodolfo; Mujica, Luis; Quiroga, Jabid

    2018-05-15

    This work discusses the advantage of using cross-correlation analysis in a data-driven approach based on principal component analysis (PCA) and piezodiagnostics to obtain successful diagnosis of events in structural health monitoring (SHM). In this sense, the identification of noisy data and outliers, as well as the management of data cleansing stages can be facilitated through the implementation of a preprocessing stage based on cross-correlation functions. Additionally, this work evidences an improvement in damage detection when the cross-correlation is included as part of the whole damage assessment approach. The proposed methodology is validated by processing data measurements from piezoelectric devices (PZT), which are used in a piezodiagnostics approach based on PCA and baseline modeling. Thus, the influence of cross-correlation analysis used in the preprocessing stage is evaluated for damage detection by means of statistical plots and self-organizing maps. Three laboratory specimens were used as test structures in order to demonstrate the validity of the methodology: (i) a carbon steel pipe section with leak and mass damage types, (ii) an aircraft wing specimen, and (iii) a blade of a commercial aircraft turbine, where damages are specified as mass-added. As the main concluding remark, the suitability of cross-correlation features combined with a PCA-based piezodiagnostic approach in order to achieve a more robust damage assessment algorithm is verified for SHM tasks.
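
The preprocessing idea above can be sketched as scoring each piezoelectric record against a baseline signal and flagging poorly correlated records before PCA. The zero-lag correlation statistic and the 0.8 threshold are illustrative choices for the sketch, not parameters from the paper.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation at zero lag between two signals."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.mean(a * b))

def flag_outliers(records, baseline, threshold=0.8):
    """Indices of records that correlate poorly with the baseline."""
    return [i for i, r in enumerate(records) if ncc(r, baseline) < threshold]

t = np.linspace(0.0, 1.0, 200)
baseline = np.sin(2 * np.pi * 5 * t)
records = [1.1 * baseline,                                  # healthy record
           np.random.default_rng(0).standard_normal(200)]   # noisy outlier
bad = flag_outliers(records, baseline)
```

Records surviving this data-cleansing stage would then feed the PCA baseline model, which is where the damage indices are actually computed.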

  2. A data set for evaluating the performance of multi-class multi-object video tracking

    NASA Astrophysics Data System (ADS)

    Chakraborty, Avishek; Stamatescu, Victor; Wong, Sebastien C.; Wigley, Grant; Kearney, David

    2017-05-01

    One of the challenges in evaluating multi-object video detection, tracking and classification systems is having publicly available data sets with which to compare different systems. However, the measures of performance for tracking and classification are different. Data sets that are suitable for evaluating tracking systems may not be appropriate for classification. Tracking video data sets typically only have ground truth track IDs, while classification video data sets only have ground truth class-label IDs. The former identifies the same object over multiple frames, while the latter identifies the type of object in individual frames. This paper describes an advancement of the ground truth meta-data for the DARPA Neovision2 Tower data set to allow both the evaluation of tracking and classification. The ground truth data sets presented in this paper contain unique object IDs across 5 different classes of object (Car, Bus, Truck, Person, Cyclist) for 24 videos of 871 image frames each. In addition to the object IDs and class labels, the ground truth data also contains the original bounding box coordinates together with new bounding boxes in instances where un-annotated objects were present. The unique IDs are maintained during occlusions between multiple objects or when objects re-enter the field of view. This will provide: a solid foundation for evaluating the performance of multi-object tracking of different types of objects, a straightforward comparison of tracking system performance using the standard Multi Object Tracking (MOT) framework, and classification performance using the Neovision2 metrics. These data have been hosted publicly.
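
Evaluation against such bounding-box ground truth typically starts by matching predictions to annotations with intersection-over-union (IoU). A minimal sketch, assuming boxes in (x1, y1, x2, y2) corner format (the data set's actual box encoding is not specified here):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

MOT-style metrics are then accumulated over frames from these per-box matches, with the persistent object IDs determining identity switches.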

  3. Using XML Configuration-Driven Development to Create a Customizable Ground Data System

    NASA Technical Reports Server (NTRS)

    Nash, Brent; DeMore, Martha

    2009-01-01

    The Mission Data Processing and Control Subsystem (MPCS) is being developed as a multi-mission Ground Data System with the Mars Science Laboratory (MSL) as the first fully supported mission. MPCS is a fully featured, Java-based Ground Data System (GDS) for telecommand and telemetry processing based on Configuration-Driven Development (CDD). The eXtensible Markup Language (XML) is the ideal language for CDD because it is easily readable and editable by all levels of users and is also backed by a World Wide Web Consortium (W3C) standard and numerous powerful processing tools that make it uniquely flexible. The CDD approach adopted by MPCS minimizes changes to compiled code by using XML to create a series of configuration files that provide both coarse- and fine-grained control over all aspects of GDS operation.
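
The CDD idea can be illustrated with standard XML tooling: behavior is driven by an editable configuration file rather than compiled code. The element and attribute names below are hypothetical stand-ins, not the actual MPCS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration fragment in the spirit of CDD
CONFIG = """
<gds mission="demo">
  <telemetry frameLength="1115" checksum="crc16"/>
  <downlink host="localhost" port="5100"/>
</gds>
"""

root = ET.fromstring(CONFIG)
frame_length = int(root.find("telemetry").get("frameLength"))
port = int(root.find("downlink").get("port"))
checksum = root.find("telemetry").get("checksum")
```

Changing frame length, ports, or checksum policy then requires only an edit to the XML, which any level of user can read and validate against a schema.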

  4. Accessing and distributing EMBL data using CORBA (common object request broker architecture).

    PubMed

    Wang, L; Rodriguez-Tomé, P; Redaschi, N; McNeil, P; Robinson, A; Lijnzaad, P

    2000-01-01

    The EMBL Nucleotide Sequence Database is a comprehensive database of DNA and RNA sequences and related information traditionally made available in flat-file format. Queries through tools such as SRS (Sequence Retrieval System) also return data in flat-file format. Flat files have a number of shortcomings, however, and the resources therefore currently lack a flexible environment to meet individual researchers' needs. The Object Management Group's common object request broker architecture (CORBA) is an industry standard that provides platform-independent programming interfaces and models for portable distributed object-oriented computing applications. Its independence from programming languages, computing platforms and network protocols makes it attractive for developing new applications for querying and distributing biological data. A CORBA infrastructure developed by EMBL-EBI provides an efficient means of accessing and distributing EMBL data. The EMBL object model is defined such that it provides a basis for specifying interfaces in interface definition language (IDL) and thus for developing the CORBA servers. The mapping from the object model to the relational schema in the underlying Oracle database uses the facilities provided by PersistenceTM, an object/relational tool. The techniques of developing loaders and 'live object caching' with persistent objects achieve a smart live object cache where objects are created on demand. The objects are managed by an evictor pattern mechanism. The CORBA interfaces to the EMBL database address some of the problems of traditional flat-file formats and provide an efficient means for accessing and distributing EMBL data. CORBA also provides a flexible environment for users to develop their applications by building clients to our CORBA servers, which can be integrated into existing systems.
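
The "live object cache" managed by an evictor, as described above, can be sketched in spirit as a create-on-demand cache with least-recently-used eviction. This is a generic illustration of the pattern, not the EMBL-EBI server code; the loader function and capacity are illustrative.

```python
from collections import OrderedDict

class EvictorCache:
    """Create-on-demand object cache with LRU eviction (evictor pattern)."""

    def __init__(self, loader, capacity=3):
        self.loader = loader           # builds a live object from its key
        self.capacity = capacity
        self._live = OrderedDict()     # key -> live object, in LRU order

    def get(self, key):
        if key in self._live:
            self._live.move_to_end(key)         # mark most recently used
        else:
            self._live[key] = self.loader(key)  # create object on demand
            if len(self._live) > self.capacity:
                self._live.popitem(last=False)  # evict least recently used
        return self._live[key]

loads = []
def loader(key):
    loads.append(key)          # record each on-demand creation
    return {"id": key}

cache = EvictorCache(loader, capacity=2)
cache.get("A"); cache.get("B"); cache.get("A"); cache.get("C")  # evicts "B"
```

In the CORBA setting, the loader role is played by mapping relational rows to persistent objects, so only objects actually requested by clients are materialized.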

  5. Accessing and distributing EMBL data using CORBA (common object request broker architecture)

    PubMed Central

    Wang, Lichun; Rodriguez-Tomé, Patricia; Redaschi, Nicole; McNeil, Phil; Robinson, Alan; Lijnzaad, Philip

    2000-01-01

    Background: The EMBL Nucleotide Sequence Database is a comprehensive database of DNA and RNA sequences and related information traditionally made available in flat-file format. Queries through tools such as SRS (Sequence Retrieval System) also return data in flat-file format. Flat files have a number of shortcomings, however, and the resources therefore currently lack a flexible environment to meet individual researchers' needs. The Object Management Group's common object request broker architecture (CORBA) is an industry standard that provides platform-independent programming interfaces and models for portable distributed object-oriented computing applications. Its independence from programming languages, computing platforms and network protocols makes it attractive for developing new applications for querying and distributing biological data. Results: A CORBA infrastructure developed by EMBL-EBI provides an efficient means of accessing and distributing EMBL data. The EMBL object model is defined such that it provides a basis for specifying interfaces in interface definition language (IDL) and thus for developing the CORBA servers. The mapping from the object model to the relational schema in the underlying Oracle database uses the facilities provided by PersistenceTM, an object/relational tool. The techniques of developing loaders and 'live object caching' with persistent objects achieve a smart live object cache where objects are created on demand. The objects are managed by an evictor pattern mechanism. Conclusions: The CORBA interfaces to the EMBL database address some of the problems of traditional flat-file formats and provide an efficient means for accessing and distributing EMBL data. CORBA also provides a flexible environment for users to develop their applications by building clients to our CORBA servers, which can be integrated into existing systems. PMID:11178259

  6. Collaborative Data-Driven Decision Making: A Qualitative Study of the Lived Experiences of Primary Grade Classroom Teachers

    ERIC Educational Resources Information Center

    Ralston, Christine R.

    2012-01-01

    The purpose of this qualitative study was to describe the lived experiences of primary classroom teachers participating in collaborative data-driven decision making. Hermeneutic phenomenology served as the theoretical framework. Data were collected by conducting interviews with thirteen classroom teachers who taught in grades kindergarten through…

  7. Methods and apparatus for extraction and tracking of objects from multi-dimensional sequence data

    NASA Technical Reports Server (NTRS)

    Hill, Matthew L. (Inventor); Chang, Yuan-Chi (Inventor); Li, Chung-Sheng (Inventor); Castelli, Vittorio (Inventor); Bergman, Lawrence David (Inventor)

    2008-01-01

    An object tracking technique is provided which, given: (i) a potentially large data set; (ii) a set of dimensions along which the data has been ordered; and (iii) a set of functions for measuring the similarity between data elements, a set of objects are produced. Each of these objects is defined by a list of data elements. Each of the data elements on this list contains the probability that the data element is part of the object. The method produces these lists via an adaptive, knowledge-based search function which directs the search for high-probability data elements. This serves to reduce the number of data element combinations evaluated while preserving the most flexibility in defining the associations of data elements which comprise an object.

  8. Methods and apparatus for extraction and tracking of objects from multi-dimensional sequence data

    NASA Technical Reports Server (NTRS)

    Hill, Matthew L. (Inventor); Chang, Yuan-Chi (Inventor); Li, Chung-Sheng (Inventor); Castelli, Vittorio (Inventor); Bergman, Lawrence David (Inventor)

    2005-01-01

    An object tracking technique is provided which, given: (i) a potentially large data set; (ii) a set of dimensions along which the data has been ordered; and (iii) a set of functions for measuring the similarity between data elements, a set of objects are produced. Each of these objects is defined by a list of data elements. Each of the data elements on this list contains the probability that the data element is part of the object. The method produces these lists via an adaptive, knowledge-based search function which directs the search for high-probability data elements. This serves to reduce the number of data element combinations evaluated while preserving the most flexibility in defining the associations of data elements which comprise an object.

  9. Error determination of a successive correction type objective analysis scheme. [for surface meteorological data

    NASA Technical Reports Server (NTRS)

    Smith, D. R.; Leslie, F. W.

    1984-01-01

    The Purdue Regional Objective Analysis of the Mesoscale (PROAM) is a successive correction type scheme for the analysis of surface meteorological data. The scheme is subjected to a series of experiments to evaluate its performance under a variety of analysis conditions. The tests include use of a known analytic temperature distribution to quantify error bounds for the scheme. Similar experiments were conducted using actual atmospheric data. Results indicate that the multiple pass technique increases the accuracy of the analysis. Furthermore, the tests suggest appropriate values for the analysis parameters in resolving disturbances for the data set used in this investigation.
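
A single-variable sketch of a successive-correction analysis with Cressman-type weights follows. It reproduces only the generic idea of the scheme family (interpolate the current analysis to the observations, then spread the increments back to the grid over several passes); PROAM's actual weighting, shrinking influence radii, and parameter values are not reproduced here.

```python
import numpy as np

def cressman_w(d2, R):
    """Cressman weight from squared distance d2 and influence radius R."""
    w = (R ** 2 - d2) / (R ** 2 + d2)
    return np.where(d2 < R ** 2, w, 0.0)

def successive_correction(grid, obs_pts, obs_vals, R, n_passes=3):
    """Iteratively blend observation increments into a gridded analysis."""
    analysis = np.zeros(len(grid))
    d2 = np.sum((grid[:, None, :] - obs_pts[None, :, :]) ** 2, axis=2)
    w = cressman_w(d2, R)                       # (n_grid, n_obs) weights
    for _ in range(n_passes):
        # background interpolated from the grid to each observation site
        denom_o = np.maximum(w.sum(axis=0), 1e-12)
        bg = (w * analysis[:, None]).sum(axis=0) / denom_o
        # spread the observation increments back to the grid
        incr = obs_vals - bg
        denom_g = np.maximum(w.sum(axis=1), 1e-12)
        analysis = analysis + (w * incr[None, :]).sum(axis=1) / denom_g
    return analysis

grid = np.array([[0.0, 0.0], [0.5, 0.0]])
obs = np.array([[0.0, 0.0]])
result = successive_correction(grid, obs, np.array([5.0]), R=1.0)
```

Each pass reduces the residual at the observation sites, which is why the multiple-pass technique increases accuracy relative to a single correction.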

  10. Prognostics of Power Mosfets Under Thermal Stress Accelerated Aging Using Data-Driven and Model-Based Methodologies

    NASA Technical Reports Server (NTRS)

    Celaya, Jose; Saxena, Abhinav; Saha, Sankalita; Goebel, Kai F.

    2011-01-01

    An approach for predicting the remaining useful life of power MOSFET (metal-oxide-semiconductor field-effect transistor) devices has been developed. Power MOSFETs are semiconductor switching devices that are instrumental in electronics equipment such as those used in operation and control of modern aircraft and spacecraft. The MOSFETs examined here were aged under thermal overstress in a controlled experiment and continuous performance degradation data were collected from the accelerated aging experiment. Die-attach degradation was determined to be the primary failure mode. The collected run-to-failure data were analyzed and it was revealed that ON-state resistance increased as the die-attach degraded under high thermal stresses. Results from finite element simulation analysis support the observations from the experimental data. Data-driven and model-based prognostics algorithms were investigated where ON-state resistance was used as the primary precursor-of-failure feature. A Gaussian process regression algorithm was explored as an example for a data-driven technique and an extended Kalman filter and a particle filter were used as examples for model-based techniques. Both methods were able to provide valid results. Prognostic performance metrics were employed to evaluate and compare the algorithms.
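
The data-driven branch can be sketched with a bare-bones Gaussian-process regression on a synthetic ON-resistance trend. The squared-exponential kernel, its hyperparameters, and the linear degradation data are all illustrative assumptions, not the study's measured precursor data or tuned model.

```python
import numpy as np

def rbf_kernel(a, b, length=10.0, var=1.0):
    """Squared-exponential covariance between two sets of aging times."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_predict(t_train, y_train, t_query, noise=1e-4):
    """Gaussian-process posterior mean at the query times."""
    K = rbf_kernel(t_train, t_train) + noise * np.eye(len(t_train))
    alpha = np.linalg.solve(K, y_train)
    return rbf_kernel(t_query, t_train) @ alpha

# Synthetic ON-resistance precursor: slow drift with aging time (hours)
t = np.arange(0.0, 51.0, 5.0)
r_on = 0.10 + 0.001 * t
pred = gp_predict(t, r_on, np.array([25.0]))
```

In a prognostic setting the regression would be extrapolated forward until the predicted precursor crosses a failure threshold, yielding a remaining-useful-life estimate with uncertainty.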

  11. Linear dynamical modes as new variables for data-driven ENSO forecast

    NASA Astrophysics Data System (ADS)

    Gavrilov, Andrey; Seleznev, Aleksei; Mukhin, Dmitry; Loskutov, Evgeny; Feigin, Alexander; Kurths, Juergen

    2018-05-01

    A new data-driven model for analysis and prediction of spatially distributed time series is proposed. The model is based on a linear dynamical mode (LDM) decomposition of the observed data which is derived from a recently developed nonlinear dimensionality reduction approach. The key point of this approach is its ability to take into account simple dynamical properties of the observed system by means of revealing the system's dominant time scales. The LDMs are used as new variables for empirical construction of a nonlinear stochastic evolution operator. The method is applied to the sea surface temperature anomaly field in the tropical belt where the El Nino Southern Oscillation (ENSO) is the main mode of variability. The advantage of LDMs versus traditionally used empirical orthogonal function decomposition is demonstrated for this data. Specifically, it is shown that the new model has a competitive ENSO forecast skill in comparison with the other existing ENSO models.
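
For contrast, the traditional empirical orthogonal function (EOF) decomposition that the LDMs are compared against can be sketched via SVD of the anomaly field. The synthetic rank-1 "SST anomaly" field below is an illustrative assumption, not the paper's data.

```python
import numpy as np

def eof_decompose(field, n_modes):
    """EOF (PCA) decomposition of a (time, space) anomaly field."""
    mean = field.mean(axis=0)
    U, s, Vt = np.linalg.svd(field - mean, full_matrices=False)
    pcs = U[:, :n_modes] * s[:n_modes]   # principal-component time series
    eofs = Vt[:n_modes]                  # spatial patterns
    return pcs, eofs, mean

# Synthetic field: one oscillating spatial pattern at 4 grid points
t = np.linspace(0.0, 4 * np.pi, 50)
pattern = np.array([1.0, 0.5, -0.5, -1.0])
field = np.outer(np.sin(t), pattern)
pcs, eofs, mean = eof_decompose(field, n_modes=1)
recon = pcs @ eofs + mean
```

The LDM approach replaces these purely variance-ranked modes with modes chosen to capture dominant time scales, which is the advantage demonstrated in the paper.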

  12. 34 CFR 364.42 - What objectives and information must be included in the State plan?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 34 Education 2 2010-07-01 2010-07-01 false What objectives and information must be included in the State plan? 364.42 Section 364.42 Education Regulations of the Offices of the Department of Education (Continued) OFFICE OF SPECIAL EDUCATION AND REHABILITATIVE SERVICES, DEPARTMENT OF EDUCATION STATE...

  13. 34 CFR 364.42 - What objectives and information must be included in the State plan?

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 34 Education 2 2011-07-01 2010-07-01 true What objectives and information must be included in the State plan? 364.42 Section 364.42 Education Regulations of the Offices of the Department of Education (Continued) OFFICE OF SPECIAL EDUCATION AND REHABILITATIVE SERVICES, DEPARTMENT OF EDUCATION STATE...

  14. Water Resources Data for Illinois - Water Year 2005 (Includes Historical Data)

    USGS Publications Warehouse

    LaTour, J.K.; Weldon, E.A.; Dupre, D.H.; Halfar, T.M.

    2006-01-01

    This annual Water-Data Report for Illinois contains current water year (Oct. 1, 2004, to Sept. 30, 2005) and historical data of discharge, stage, water quality and biology of streams; stage of lakes and reservoirs; levels and quality of ground water; and records of precipitation, air temperature, dew point, solar radiation, and wind speed. The current year's (2005) data provided in this report include (1) discharge for 182 surface-water gaging stations and for 9 crest-stage partial-record stations; (2) stage for 33 surface-water gaging stations; (3) water-quality records for 10 surface-water stations; (4) sediment-discharge records for 14 surface-water stations; (5) water-level records for 98 ground-water wells; (6) water-quality records for 17 ground-water wells; (7) precipitation records for 48 rain gages; (8) records of air temperature, dew point, solar radiation and wind speed for 1 meteorological station; and (9) biological records for 6 sample sites. Also included are miscellaneous data collected at various sites not in the systematic data-collection network. Data were collected and compiled as a part of the National Water Information System (NWIS) maintained by the U.S. Geological Survey in cooperation with Federal, State, and local agencies.

  15. Organizing and Typing Persistent Objects Within an Object-Oriented Framework

    NASA Technical Reports Server (NTRS)

    Madany, Peter W.; Campbell, Roy H.

    1991-01-01

    Conventional operating systems provide little or no direct support for the services required for an efficient persistent object system implementation. We have built a persistent object scheme using a customization and extension of an object-oriented operating system called Choices. Choices includes a framework for the storage of persistent data that is suited to the construction of both conventional file systems and persistent object systems. In this paper we describe three areas in which persistent object support differs from file system support: storage organization, storage management, and typing. Persistent object systems must support various sizes of objects efficiently. Customizable containers, which are themselves persistent objects and can be nested, support a wide range of object sizes in Choices. Collections of persistent objects that are accessed as an aggregate and collections of light-weight persistent objects can be clustered in containers that are nested within containers for larger objects. Automated garbage collection schemes are added to storage management and have a major impact on persistent object applications. The Choices persistent object store provides extensible sets of persistent object types. The store contains not only the data for persistent objects but also the names of the classes to which they belong and the code for the operations of the classes. Besides presenting persistent object storage organization, storage management, and typing, this paper discusses how persistent objects are named and used within the Choices persistent data/file system framework.

  16. How State Education Agencies in the Northeast and Islands Region Support Data-Driven Decisionmaking in Districts and Schools. Issues & Answers. REL 2009-No. 072

    ERIC Educational Resources Information Center

    LaPointe, Michelle A.; Brett, Jessica; Kagle, Melissa; Midouhas, Emily; Sanchez, Maria Teresa

    2009-01-01

    The report examines the initiatives of state education agencies in the Northeast and Islands Region to support data-driven decisionmaking in districts and schools and describes the service providers hired to support this work. Four components of data-driven decisionmaking initiatives are identified: (1) Centralized data system/warehouse; (2) Tools…

  17. Object-based detection of vehicles using combined optical and elevation data

    NASA Astrophysics Data System (ADS)

    Schilling, Hendrik; Bulatov, Dimitri; Middelmann, Wolfgang

    2018-02-01

    The detection of vehicles is an important and challenging topic that is relevant for many applications. In this work, we present a workflow that utilizes optical and elevation data to detect vehicles in remotely sensed urban data. This workflow consists of three consecutive stages: candidate identification, classification, and single vehicle extraction. Unlike in most previous approaches, fusion of both data sources is strongly pursued at all stages. While the first stage utilizes the fact that most man-made objects are rectangular in shape, the second and third stages employ machine learning techniques combined with specific features. The stages are designed to handle multiple sensor input, which results in a significant improvement. A detailed evaluation shows the benefits of our workflow, which includes hand-tailored features; even in comparison with classification approaches based on Convolutional Neural Networks, which are state of the art in computer vision, we could obtain a comparable or superior performance (F1 score of 0.96-0.94).

  18. Asynchronous Object Storage with QoS for Scientific and Commercial Big Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brim, Michael J; Dillow, David A; Oral, H Sarp

    2013-01-01

    This paper presents our design for an asynchronous object storage system intended for use in scientific and commercial big data workloads. Use cases from the target workload domains are used to motivate the key abstractions used in the application programming interface (API). The architecture of the Scalable Object Store (SOS), a prototype object storage system that supports the API's facilities, is presented. The SOS serves as a vehicle for future research into scalable and resilient big data object storage. We briefly review our research into providing efficient storage servers capable of providing quality of service (QoS) contracts relevant for big data use cases.

  19. Cognitive load privileges memory-based over data-driven processing, not group-level over person-level processing.

    PubMed

    Skorich, Daniel P; Mavor, Kenneth I

    2013-09-01

    In the current paper, we argue that categorization and individuation, as traditionally discussed and as experimentally operationalized, are defined in terms of two confounded underlying dimensions: a person/group dimension and a memory-based/data-driven dimension. In a series of three experiments, we unconfound these dimensions and impose a cognitive load. Across the three experiments, two with laboratory-created targets and one with participants' friends as the target, we demonstrate that cognitive load privileges memory-based over data-driven processing, not group- over person-level processing. We discuss the results in terms of their implications for conceptualizations of the categorization/individuation distinction, for the equivalence of person and group processes, for the ultimate 'purpose' and meaningfulness of group-based perception and, fundamentally, for the process of categorization, broadly defined. © 2012 The British Psychological Society.

  20. Central focused convolutional neural networks: Developing a data-driven model for lung nodule segmentation.

    PubMed

    Wang, Shuo; Zhou, Mu; Liu, Zaiyi; Liu, Zhenyu; Gu, Dongsheng; Zang, Yali; Dong, Di; Gevaert, Olivier; Tian, Jie

    2017-08-01

    Accurate lung nodule segmentation from computed tomography (CT) images is of great importance for image-driven lung cancer analysis. However, the heterogeneity of lung nodules and the presence of similar visual characteristics between nodules and their surroundings make it difficult for robust nodule segmentation. In this study, we propose a data-driven model, termed the Central Focused Convolutional Neural Networks (CF-CNN), to segment lung nodules from heterogeneous CT images. Our approach combines two key insights: 1) the proposed model captures a diverse set of nodule-sensitive features from both 3-D and 2-D CT images simultaneously; 2) when classifying an image voxel, the effects of its neighbor voxels can vary according to their spatial locations. We describe this phenomenon by proposing a novel central pooling layer retaining much information on voxel patch center, followed by a multi-scale patch learning strategy. Moreover, we design a weighted sampling to facilitate the model training, where training samples are selected according to their degree of segmentation difficulty. The proposed method has been extensively evaluated on the public LIDC dataset including 893 nodules and an independent dataset with 74 nodules from Guangdong General Hospital (GDGH). We showed that CF-CNN achieved superior segmentation performance with average dice scores of 82.15% and 80.02% for the two datasets respectively. Moreover, we compared our results with the inter-radiologists consistency on LIDC dataset, showing a difference in average dice score of only 1.98%. Copyright © 2017. Published by Elsevier B.V.
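
The Dice scores reported above measure voxel overlap between a predicted mask and the reference segmentation. A minimal sketch of the metric (the masks below are toy examples, not LIDC data):

```python
import numpy as np

def dice_score(pred, truth):
    """Dice coefficient between two binary segmentation masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:                 # both masks empty: perfect agreement
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom

pred = np.array([[1, 1, 0],
                 [1, 0, 0]])
truth = np.array([[1, 1, 1],
                  [0, 0, 0]])
score = dice_score(pred, truth)    # overlap of 2 voxels, sizes 3 and 3
```

Comparing model Dice against inter-radiologist Dice, as the paper does, frames segmentation quality relative to the natural variability of expert annotation.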

  1. Space Object Maneuver Detection Algorithms Using TLE Data

    NASA Astrophysics Data System (ADS)

    Pittelkau, M.

    2016-09-01

    An important aspect of Space Situational Awareness (SSA) is detection of deliberate and accidental orbit changes of space objects. Although space surveillance systems detect orbit maneuvers within their tracking algorithms, maneuver data are not readily disseminated for general use. However, two-line element (TLE) data are available and can be used to detect maneuvers of space objects. This work is an attempt to improve upon existing TLE-based maneuver detection algorithms. Three adaptive maneuver detection algorithms are developed and evaluated: The first is a fading-memory Kalman filter, which is equivalent to the sliding-window least-squares polynomial fit, but computationally more efficient and adaptive to the noise in the TLE data. The second algorithm is based on a sample cumulative distribution function (CDF) computed from a histogram of the magnitude-squared |ΔV|² of change-in-velocity vectors (ΔV), which are computed from the TLE data. A maneuver detection threshold is computed from the median estimated from the CDF, or from the CDF and a specified probability of false alarm. The third algorithm is a median filter. The median filter is the simplest of a class of nonlinear filters called order-statistics filters, which is within the theory of robust statistics. The output of the median filter is practically insensitive to outliers, or large maneuvers. The median of the |ΔV|² data is proportional to the variance of the ΔV, so the variance is estimated from the output of the median filter. A maneuver is detected when the input data exceeds a constant times the estimated variance.
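
The median-filter detector can be sketched as follows: because a running median ignores isolated outliers, it tracks the noise floor of the change-in-velocity series rather than the maneuvers themselves. The window length, the threshold multiple k, and the synthetic series are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np

def detect_maneuvers(dv2, window=11, k=25.0):
    """Flag samples whose |dV|^2 exceeds k times a running median."""
    half = window // 2
    padded = np.pad(dv2, half, mode="edge")
    med = np.array([np.median(padded[i:i + window])
                    for i in range(len(dv2))])
    return np.where(dv2 > k * med)[0]

dv2 = np.full(60, 1.0)     # synthetic noise floor of |dV|^2 values
dv2[30] = 100.0            # a single large maneuver
hits = detect_maneuvers(dv2)
```

Scaling the median into a variance estimate, as the abstract describes, turns the same statistic into an adaptive detection threshold.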

  2. Double shell tanks (DST) chemistry control data quality objectives

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    BANNING, D.L.

    2001-10-09

    One of the main functions of the River Protection Project is to store the Hanford Site tank waste until the Waste Treatment Plant (WTP) is ready to receive and process the waste. Waste from the older single-shell tanks is being transferred to the newer double-shell tanks (DSTs). Therefore, the integrity of the DSTs must be maintained until the waste from all tanks has been retrieved and transferred to the WTP. To help maintain the integrity of the DSTs over the life of the project, specific chemistry limits have been established to control corrosion of the DSTs. These waste chemistry limits are presented in the Technical Safety Requirements (TSR) document HNF-SD-WM-TSR-006, Section 5.15, Rev 2B (CHG 2001). In order to control the chemistry in the DSTs, the Chemistry Control Program will require analyses of the tank waste. This document describes the Data Quality Objectives (DQO) process undertaken to ensure appropriate data will be collected to control the waste chemistry in the DSTs. The DQO process was implemented in accordance with Data Quality Objectives for Sampling and Analyses, HNF-IP-0842, Rev. 1b, Vol. IV, Section 4.16 (Banning 2001) and the U.S. Environmental Protection Agency EPA QA/G4, Guidance for the Data Quality Objectives Process (EPA 1994), with some modifications to accommodate project- or tank-specific requirements and constraints.

  3. Detecting Inspection Objects of Power Line from Cable Inspection Robot LiDAR Data

    PubMed Central

    Qin, Xinyan; Wu, Gongping; Fan, Fei

    2018-01-01

    Power lines are extending to complex environments (e.g., lakes and forests), and the distribution of power lines in a tower is becoming complicated (e.g., multi-loop and multi-bundle). Additionally, the power line inspection workload is becoming heavier and more difficult. Advanced LiDAR technology is increasingly being used to overcome these difficulties. Based on precise cable inspection robot (CIR) LiDAR data and the distinctive position and orientation system (POS) data, we propose a novel methodology to detect inspection objects surrounding power lines. The proposed method comprises four steps: firstly, the original point cloud is divided into single-span data as the processing unit; secondly, an optimal elevation threshold is constructed to remove ground points without relying on existing filtering algorithms, improving data processing efficiency and extraction accuracy; thirdly, each power line and its surrounding data are extracted by a structured partition based on POS data (SPPD) algorithm, from “layer” to “block”, according to the power line distribution; finally, a partition recognition method based on the distribution characteristics of inspection objects is proposed, highlighting the feature information and improving the recognition effect. Local neighborhood statistics and a 3D region growing method are used to recognize different inspection objects surrounding power lines in a partition. Three datasets were collected by two CIR LiDAR systems in our study. The experimental results demonstrate that an average accuracy of 90.6% and an average precision of 98.2% can be achieved at the point cloud level. The successful extraction indicates that the proposed method is feasible and promising. Our study can be used to obtain precise dimensions of fittings for modeling, as well as automatic detection and location of security risks, improving the intelligence level of power line inspection. PMID:29690560

  4. Detecting Inspection Objects of Power Line from Cable Inspection Robot LiDAR Data.

    PubMed

    Qin, Xinyan; Wu, Gongping; Lei, Jin; Fan, Fei; Ye, Xuhui

    2018-04-22

    Power lines are extending to complex environments (e.g., lakes and forests), and the distribution of power lines in a tower is becoming complicated (e.g., multi-loop and multi-bundle). Additionally, the power line inspection workload is becoming heavier and more difficult. Advanced LiDAR technology is increasingly being used to overcome these difficulties. Based on precise cable inspection robot (CIR) LiDAR data and the distinctive position and orientation system (POS) data, we propose a novel methodology to detect inspection objects surrounding power lines. The proposed method comprises four steps: firstly, the original point cloud is divided into single-span data as the processing unit; secondly, an optimal elevation threshold is constructed to remove ground points without relying on existing filtering algorithms, improving data processing efficiency and extraction accuracy; thirdly, each power line and its surrounding data are extracted by a structured partition based on POS data (SPPD) algorithm, from "layer" to "block", according to the power line distribution; finally, a partition recognition method based on the distribution characteristics of inspection objects is proposed, highlighting the feature information and improving the recognition effect. Local neighborhood statistics and a 3D region growing method are used to recognize different inspection objects surrounding power lines in a partition. Three datasets were collected by two CIR LiDAR systems in our study. The experimental results demonstrate that an average accuracy of 90.6% and an average precision of 98.2% can be achieved at the point cloud level. The successful extraction indicates that the proposed method is feasible and promising. Our study can be used to obtain precise dimensions of fittings for modeling, as well as automatic detection and location of security risks, improving the intelligence level of power line inspection.
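    The region-growing step used to group points into one inspection object can be sketched as a flood fill over a distance threshold. The radius and the toy 3-D points are assumptions for illustration:

```python
def region_grow(points, seed_idx, radius=1.5):
    """Collect all points reachable from the seed via hops shorter than radius."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    region = {seed_idx}
    frontier = [seed_idx]
    while frontier:
        i = frontier.pop()
        for j, p in enumerate(points):
            if j not in region and dist2(points[i], p) <= radius ** 2:
                region.add(j)
                frontier.append(j)
    return region

# Two well-separated clusters of 3-D points
pts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 10, 10), (11, 10, 10)]
cluster = region_grow(pts, 0)
```

    Growing from the first point collects only its nearby neighbors, leaving the distant cluster to be recognized as a separate object.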

  5. Assessing the Potential for Sediment Gravity-Driven Underflows at the Currently Active Mouth of the Huanghe Delta

    NASA Astrophysics Data System (ADS)

    Mullane, M.; Kumpf, L. L.; Kineke, G. C.

    2017-12-01

    The Huanghe (Yellow River), once known for extremely high suspended-sediment concentrations (SSCs) that could produce hyperpycnal plumes (10s of g/l), has experienced a dramatic reduction in sediment load following the construction of several reservoirs, notably the Xiaolangdi reservoir completed in 1999. Except for managed flushing events, SSC in the lower river is now on the order of 1 g/l or less. Adaptations of the Chezy equation for gravity-driven transport show that the dominant parameters driving hyperpycnal underflows include concentration (and therefore density), the thickness of a sediment-laden layer, and bed slope. The objective of this research was to assess the potential for gravity-driven underflows given modern conditions at the active river mouth. Multiple shore-normal transects were conducted during research cruises in mid-July of 2016 and 2017 using a Knudsen dual-frequency echosounder to collect bathymetric data and to document the potential presence of fluid mud layers. An instrumented profiling tripod equipped with a CTD, an optical backscatterance sensor, and an in-situ pump system was used to sample water column parameters. SSCs were determined from near-bottom and surface water samples. Echosounder data were analyzed for bed slopes at the delta front and differences in depth of return for the two frequencies (50 and 200 kHz), which could indicate fluid muds. Bathymetric data analysis yielded bed slope measurements near or above the threshold value to produce gravity-driven underflows (0.46°). The maximum observed thickness of a potential fluid mud layer was 0.7 m, and the highest sampled near-bed SSCs were nearly 14 g/l for both field campaigns. These results indicate that the modern delta maintains potential for sediment gravity-driven underflows, even during ambient conditions prior to maximum summer discharge. These results will inform future work quantitatively comparing the contributions of all sediment dispersal mechanisms near the active Huanghe

  6. Electron Driven Processes in Atmospheric Behaviour

    NASA Astrophysics Data System (ADS)

    Campbell, L.; Brunger, M. J.; Teubner, P. J. O.

    2006-11-01

    Electron impact plays an important role in many atmospheric processes. Calculation of these is important for basic understanding, atmospheric modeling and remote sensing. Accurate atomic and molecular data, including electron impact cross sections, are required for such calculations. Five electron-driven processes are considered: auroral and dayglow emissions, the reduction of atmospheric electron density by vibrationally excited N2, NO production and infrared emission from NO. In most cases the predictions are compared with measurements. The dependence on experimental atomic and molecular data is also investigated.

  7. Data-driven CT protocol review and management—experience from a large academic hospital.

    PubMed

    Zhang, Da; Savage, Cristy A; Li, Xinhua; Liu, Bob

    2015-03-01

    Protocol review plays a critical role in CT quality assurance, but large numbers of protocols and inconsistent protocol names on scanners and in exam records make thorough protocol review formidable. In this investigation, we report on a data-driven cataloging process that can be used to assist in the reviewing and management of CT protocols. We collected lists of scanner protocols, as well as 18 months of recent exam records, for 10 clinical scanners. We developed computer algorithms to automatically deconstruct the protocol names on the scanner and in the exam records into core names and descriptive components. Based on the core names, we were able to group the scanner protocols into a much smaller set of "core protocols," and to easily link exam records with the scanner protocols. We calculated the percentage of usage for each core protocol, from which the most heavily used protocols were identified. From the percentage-of-usage data, we found that, on average, 18, 33, and 49 core protocols per scanner covered 80%, 90%, and 95%, respectively, of all exams. These numbers are one order of magnitude smaller than the typical numbers of protocols that are loaded on a scanner (200-300, as reported in the literature). Duplicated, outdated, and rarely used protocols on the scanners were easily pinpointed in the cataloging process. The data-driven cataloging process can facilitate the task of protocol review. Copyright © 2015 American College of Radiology. Published by Elsevier Inc. All rights reserved.
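    The cataloging idea can be sketched in a few lines: strip descriptive modifiers from each protocol name to obtain a "core name", then rank core protocols by usage. The modifier list and example names below are invented for illustration, not the paper's actual parsing rules:

```python
import re
from collections import Counter

MODIFIERS = {"wo", "w", "contrast", "routine", "ped", "lowdose"}  # assumed

def core_name(protocol):
    """Split a protocol name on separators and drop descriptive modifiers."""
    parts = re.split(r"[\s_\-/]+", protocol.lower())
    return " ".join(p for p in parts if p and p not in MODIFIERS)

exam_records = [
    "Head Routine", "HEAD_wo_contrast", "Chest w contrast",
    "chest-routine", "Abdomen LowDose", "head routine",
]
usage = Counter(core_name(r) for r in exam_records)
top, count = usage.most_common(1)[0]
```

    Inconsistently named records ("Head Routine", "HEAD_wo_contrast") collapse onto one core protocol, which is what lets a few dozen core protocols account for most exams.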

  8. The subjective experience of object recognition: comparing metacognition for object detection and object categorization.

    PubMed

    Meuwese, Julia D I; van Loon, Anouk M; Lamme, Victor A F; Fahrenfort, Johannes J

    2014-05-01

    Perceptual decisions seem to be made automatically and almost instantly. Constructing a unitary subjective conscious experience takes more time. For example, when trying to avoid a collision with a car on a foggy road you brake or steer away in a reflex, before realizing you were in a near accident. This subjective aspect of object recognition has been given little attention. We used metacognition (assessed with confidence ratings) to measure subjective experience during object detection and object categorization for degraded and masked objects, while objective performance was matched. Metacognition was equal for degraded and masked objects, but categorization led to higher metacognition than did detection. This effect turned out to be driven by a difference in metacognition for correct rejection trials, which seemed to be caused by an asymmetry of the distractor stimulus: It does not contain object-related information in the detection task, whereas it does contain such information in the categorization task. Strikingly, this asymmetry selectively impacted metacognitive ability when objective performance was matched. This finding reveals a fundamental difference in how humans reflect versus act on information: When matching the amount of information required to perform two tasks at some objective level of accuracy (acting), metacognitive ability (reflecting) is still better in tasks that rely on positive evidence (categorization) than in tasks that rely more strongly on an absence of evidence (detection).

  9. Data-driven RANS for simulations of large wind farms

    NASA Astrophysics Data System (ADS)

    Iungo, G. V.; Viola, F.; Ciri, U.; Rotea, M. A.; Leonardi, S.

    2015-06-01

    In the wind energy industry there is a growing need for real-time predictions of wind turbine wake flows in order to optimize power plant control and inhibit detrimental wake interactions. To this aim, a data-driven RANS approach is proposed in order to achieve very low computational costs and adequate accuracy through the data assimilation procedure. The RANS simulations are implemented with a classical Boussinesq hypothesis and a mixing length turbulence closure model, which is calibrated through the available data. High-fidelity LES simulations of a utility-scale wind turbine operating with different tip speed ratios are used as database. It is shown that the mixing length model for the RANS simulations can be calibrated accurately through the Reynolds stress of the axial and radial velocity components, and the gradient of the axial velocity in the radial direction. It is found that the mixing length is roughly invariant in the very near wake, then it increases linearly with the downstream distance in the diffusive region. The variation rate of the mixing length in the downstream direction is proposed as a criterion to detect the transition between near wake and transition region of a wind turbine wake. Finally, RANS simulations were performed with the calibrated mixing length model, and a good agreement with the LES simulations is observed.
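    The calibrated mixing-length behavior described above (roughly invariant in the very near wake, then linear growth in the diffusive region) can be written as a simple piecewise function. The constants below are placeholders, not the calibrated values from the study:

```python
def mixing_length(x_over_d, l0=0.05, x_t=2.0, slope=0.03):
    """Mixing length (in rotor diameters) vs. downstream distance x/D."""
    if x_over_d <= x_t:
        return l0                         # near wake: invariant
    return l0 + slope * (x_over_d - x_t)  # diffusive region: linear growth

def eddy_viscosity(l_m, dUdr):
    """Mixing-length closure: nu_t = l_m^2 * |dU/dr|."""
    return l_m ** 2 * abs(dUdr)

lm_near = mixing_length(1.0)
lm_far = mixing_length(6.0)
```

    The kink at `x_t` is exactly the feature the authors propose as a criterion for detecting the transition between the near wake and the transition region.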

  10. Optimal information networks: Application for data-driven integrated health in populations

    PubMed Central

    Servadio, Joseph L.; Convertino, Matteo

    2018-01-01

    Development of composite indicators for integrated health in populations typically relies on a priori assumptions rather than model-free, data-driven evidence. Traditional variable selection processes tend not to consider relatedness and redundancy among variables, instead considering only individual correlations. In addition, a unified method for assessing integrated health statuses of populations is lacking, making systematic comparison among populations impossible. We propose the use of maximum entropy networks (MENets) that use transfer entropy to assess interrelatedness among selected variables considered for inclusion in a composite indicator. We also define optimal information networks (OINs) that are scale-invariant MENets, which use the information in constructed networks for optimal decision-making. Health outcome data from multiple cities in the United States are applied to this method to create a systemic health indicator, representing integrated health in a city. PMID:29423440
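    Transfer entropy, the pairwise quantity the MENets build on, can be estimated for discrete series with a plug-in estimator. Using history length 1 and raw counts is a simplifying assumption for this sketch:

```python
import math
from collections import Counter

def transfer_entropy(x, y):
    """T(X->Y) = sum p(y1,y0,x0) * log2[ p(y1|y0,x0) / p(y1|y0) ]."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))
    pairs_yx = Counter(zip(y[:-1], x[:-1]))
    pairs_yy = Counter(zip(y[1:], y[:-1]))
    singles_y = Counter(y[:-1])
    n = len(x) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_y1_given_y0x0 = c / pairs_yx[(y0, x0)]
        p_y1_given_y0 = pairs_yy[(y1, y0)] / singles_y[y0]
        te += p_joint * math.log2(p_y1_given_y0x0 / p_y1_given_y0)
    return te

# y copies x with a one-step lag, so information flows from X to Y
x = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1] * 20
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
```

    The asymmetry (T(X→Y) much larger than T(Y→X)) is what makes transfer entropy a directed measure of interrelatedness, unlike plain correlation.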

  11. CEREF: A hybrid data-driven model for forecasting annual streamflow from a socio-hydrological system

    NASA Astrophysics Data System (ADS)

    Zhang, Hongbo; Singh, Vijay P.; Wang, Bin; Yu, Yinghao

    2016-09-01

    Hydrological forecasting is complicated by flow regime alterations in a coupled socio-hydrologic system, which encounters increasingly non-stationary, nonlinear and irregular changes that make decision support difficult for future water resources management. Currently, many hybrid data-driven models, based on the decomposition-prediction-reconstruction principle, have been developed to improve predictions of annual streamflow. However, many problems require further investigation, chief among which is that the direction of trend components decomposed from an annual streamflow series is always difficult to ascertain. In this paper, a hybrid data-driven model is proposed to address this issue, combining empirical mode decomposition (EMD), radial basis function neural networks (RBFNN), and an external forces (EF) variable; it is called the CEREF model. The hybrid model employs EMD for decomposition and RBFNN for intrinsic mode function (IMF) forecasting, and determines future trend component directions by regression with EF as basin water demand, representing the social component of the socio-hydrologic system. The Wuding River basin was considered for the case study, and two standard statistical measures, root mean squared error (RMSE) and mean absolute error (MAE), were used to evaluate the performance of the CEREF model and compare it with other models: autoregressive (AR), RBFNN and EMD-RBFNN. Results indicated that the CEREF model had lower RMSE and MAE statistics (42.8% and 7.6%, respectively) than the other models, and provided a superior alternative for forecasting annual runoff in the Wuding River basin. Moreover, the CEREF model can enlarge the effective intervals of streamflow forecasting compared to the EMD-RBFNN model by introducing the water demand planned by the government department to improve long-term prediction accuracy. In addition, we considered the high-frequency component, a frequent subject of concern in EMD

  12. Enhancing Transparency and Control When Drawing Data-Driven Inferences About Individuals.

    PubMed

    Chen, Daizhuo; Fraiberger, Samuel P; Moakler, Robert; Provost, Foster

    2017-09-01

    Recent studies show the remarkable power of fine-grained information disclosed by users on social network sites to infer users' personal characteristics via predictive modeling. Similar fine-grained data are being used successfully in other commercial applications. In response, attention is turning increasingly to the transparency that organizations provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. In this article, we focus on inferences about personal characteristics based on information disclosed by users' online actions. As a use case, we explore personal inferences that are made possible from "Likes" on Facebook. We first present a means for providing transparency into the information responsible for inferences drawn by data-driven models. We then introduce the "cloaking device", a mechanism for users to inhibit the use of particular pieces of information in inference. Using these analytical tools we ask two main questions: (1) How much information must users cloak to significantly affect inferences about their personal traits? We find that usually users must cloak only a small portion of their actions to inhibit inference. We also find that, encouragingly, false-positive inferences are significantly easier to cloak than true-positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. We demonstrate a simple modeling change that requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make control easier or harder for their users.
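    A toy version of the cloaking idea: greedily withhold the Likes contributing most positive evidence until a linear model's inference flips. The model weights, bias, and feature names are invented for illustration:

```python
weights = {"like_A": 2.0, "like_B": 1.5, "like_C": 0.4, "like_D": -1.0}
bias = -2.2

def predict(likes):
    """Linear model: the trait is inferred when the score is positive."""
    score = bias + sum(weights[f] for f in likes)
    return score > 0

def cloak(likes):
    """Hide the strongest positive evidence first; return the Likes to cloak."""
    cloaked = []
    remaining = sorted(likes, key=lambda f: weights[f], reverse=True)
    while remaining and predict(remaining):
        cloaked.append(remaining.pop(0))
    return cloaked

user_likes = ["like_A", "like_B", "like_C", "like_D"]
to_cloak = cloak(user_likes)
```

    Here hiding a single high-weight Like suffices to flip the inference, mirroring the paper's finding that users typically need to cloak only a small portion of their actions.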

  13. Smart campus: Data on energy consumption in an ICT-driven university.

    PubMed

    Popoola, Segun I; Atayero, Aderemi A; Okanlawon, Theresa T; Omopariola, Benson I; Takpor, Olusegun A

    2018-02-01

    In this data article, we present a comprehensive dataset on electrical energy consumption in a university that is practically driven by Information and Communication Technologies (ICTs). The total amount of electricity consumed at Covenant University, Ota, Nigeria was measured, monitored, and recorded on daily basis for a period of 12 consecutive months (January-December, 2016). Energy readings were observed from the digital energy meter (EDMI Mk10E) located at the distribution substation that supplies electricity to the university community. The complete energy data are clearly presented in tables and graphs for relevant utility and potential reuse. Also, descriptive first-order statistical analyses of the energy data are provided in this data article. For each month, the histogram distribution and time series plot of the monthly energy consumption data are analyzed to show insightful trends of energy consumption in the university. Furthermore, data on the significant differences in the means of daily energy consumption are made available as obtained from one-way Analysis of Variance (ANOVA) and multiple comparison post-hoc tests. The information provided in this data article will foster research development in the areas of energy efficiency, planning, policy formulation, and management towards the realization of smart campuses.
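    The one-way ANOVA used to compare daily consumption across months reduces to a ratio of between-group to within-group variance, which can be computed by hand. The daily kWh readings below are made up; the real dataset covers Jan–Dec 2016 at Covenant University:

```python
import statistics

def anova_f(groups):
    """One-way ANOVA F statistic: between-group vs. within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

jan = [100, 102, 98, 101, 99]   # hypothetical daily kWh readings
jul = [120, 119, 121, 122, 118]
f_stat = anova_f([jan, jul])
```

    A large F indicates the monthly means differ far more than the day-to-day scatter within each month, which is what the post-hoc multiple comparisons then localize.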

  14. Which Variables Associated with Data-Driven Instruction Are Believed to Best Predict Urban Student Achievement?

    ERIC Educational Resources Information Center

    Greer, Wil

    2013-01-01

    This study identified the variables associated with data-driven instruction (DDI) that are perceived to best predict student achievement. Of the DDI variables discussed in the literature, 51 had a sufficiently large research base to warrant statistical analysis. Of these, 26 were statistically significant. Multiple regression and an…

  15. Radiation Pressure-Driven Magnetic Disk Winds in Broad Absorption Line Quasi-Stellar Objects

    NASA Technical Reports Server (NTRS)

    DeKool, Martin; Begelman, Mitchell C.

    1995-01-01

    We explore a model in which QSO broad absorption lines (BALs) are formed in a radiation pressure-driven wind emerging from a magnetized accretion disk. The magnetic field threading the disk material is dragged by the flow and is compressed by the radiation pressure until it is dynamically important and strong enough to contribute to the confinement of the BAL clouds. We construct a simple self-similar model for such radiatively driven magnetized disk winds in order to explore their properties. It is found that solutions exist for which the entire magnetized flow is confined to a thin wedge over the surface of the disk. For reasonable values of the mass-loss rate, a typical magnetic field strength such that the magnetic pressure is comparable to the inferred gas pressure in BAL clouds, and a moderate amount of internal soft X-ray absorption, we find that the opening angle of the flow is approximately 0.1 rad, in good agreement with the observed covering factor of the broad absorption line region.

  16. [Rationalities of knowledge production: on transformations of objects, technologies and information in biomedicine and the life sciences].

    PubMed

    Paul, Norbert W

    2009-09-01

    For decades, scientific change has been interpreted in the light of paradigm shifts and scientific revolutions. The Kuhnian interpretation of scientific change, however, is now increasingly confronted with non-disciplinary thinking in both science and studies on science. This paper explores how research in biomedicine and the life sciences can be characterized by different rationalities, sometimes converging, sometimes contradictory, all present at the same time with varying degrees of influence, impact, and visibility. In general, the rationality of objects is generated by fitting new objects and findings into a new experimental context. The rationality of hypotheses is a move towards the construction of novel explanatory tools and models. This often meshes inseparably with the third, the technological rationality, in which a technology-driven, self-supporting and sometimes self-referential refinement of methods and technologies comes along with an extension into other fields. During the second and the third phase, the new and emerging fields tend to expand their explanatory reach not only across disciplinary boundaries but also into the social sphere, creating what has been characterized as "exceptionalism" (e.g. genetic exceptionalism or neuro-exceptionalism). Finally, recent biomedicine and the life sciences have reached a level at which experimental work becomes more and more data-driven, because the technologically constructed experimental systems generate a plethora of findings (data) which at some point start to blur the original hypotheses. For the rationality of information, the materiality of research practices becomes secondary and research objects increasingly recede from view. Finally, the credibility of science as a practice becomes more and more dependent on consensus about the applicability and relevance of its results. The rationality of interest (and accountability) has become more and more characteristic for a research process which is no longer

  17. Digital Equivalent Data System for XRF Labeling of Objects

    NASA Technical Reports Server (NTRS)

    Schramm, Harry F.; Kaiser, Bruce

    2005-01-01

    A digital equivalent data system (DEDS) is a system for identifying objects by means of the x-ray fluorescence (XRF) spectra of labeling elements that are encased in or deposited on the objects. As such, a DEDS is a revolutionary new major subsystem of an XRF system. A DEDS embodies the means for converting the spectral data output of an XRF scanner to an ASCII alphanumeric or barcode label that can be used to identify (or verify the assumed or apparent identity of) an XRF-scanned object. A typical XRF spectrum of interest contains peaks at photon energies associated with specific elements on the Periodic Table (see figure). The height of each spectral peak above the local background spectral intensity is proportional to the relative abundance of the corresponding element. Alphanumeric values are assigned to the relative abundances of the elements. Hence, if an object contained labeling elements in suitably chosen proportions, an alphanumeric representation of the object could be extracted from its XRF spectrum. The mixture of labeling elements and the scheme for reading the XRF spectrum would be compatible with one of the labeling conventions now used for bar codes and binary matrix patterns (essentially, two-dimensional bar codes that resemble checkerboards). A further benefit of such compatibility is that it would enable the conversion of the XRF spectral output to a bar- or matrix-coded label, if needed. In short, a process previously used only for material composition analysis has been reapplied to the world of identification. This new level of verification is now being used for "authentication."
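    One hypothetical way to read such a label: quantize each labeling element's relative peak height to a digit and concatenate the digits. The elements, number of levels, and mapping here are assumptions, not the actual DEDS encoding:

```python
LABEL_ELEMENTS = ["Ti", "Ni", "Zn", "Zr"]  # assumed labeling elements
LEVELS = 4  # distinguishable relative-abundance levels per element

def decode_label(peak_heights):
    """Map each element's normalized peak height to one of LEVELS digits."""
    hmax = max(peak_heights[e] for e in LABEL_ELEMENTS)
    digits = []
    for e in LABEL_ELEMENTS:
        level = min(LEVELS - 1, int(peak_heights[e] / hmax * LEVELS))
        digits.append(str(level))
    return "".join(digits)

# Hypothetical background-subtracted peak heights from one XRF scan
spectrum = {"Ti": 0.95, "Ni": 0.50, "Zn": 0.24, "Zr": 0.01}
label = decode_label(spectrum)
```

    With 4 elements at 4 levels each, such a scheme would distinguish 4⁴ = 256 labels; adding elements or levels grows the label space multiplicatively.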

  18. Public Data Set: Non-inductively Driven Tokamak Plasmas at Near-Unity βt in the Pegasus Toroidal Experiment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reusch, Joshua A.; Bodner, Grant M.; Bongard, Michael W.

    This public data set contains openly-documented, machine readable digital research data corresponding to figures published in J.A. Reusch et al., 'Non-inductively Driven Tokamak Plasmas at Near-Unity βt in the Pegasus Toroidal Experiment,' Phys. Plasmas 25, 056101 (2018).

  19. Space Objects Maneuvering Detection and Prediction via Inverse Reinforcement Learning

    NASA Astrophysics Data System (ADS)

    Linares, R.; Furfaro, R.

    This paper determines the behavior of Space Objects (SOs) using inverse Reinforcement Learning (RL) to estimate the reward function that each SO is using for control. The approach discussed in this work can be used to analyze maneuvering of SOs from observational data. The inverse RL problem is solved using the Feature Matching approach. This approach determines the optimal reward function that a SO is using while maneuvering by assuming that the observed trajectories are optimal with respect to the SO's own reward function. This paper uses estimated orbital elements data to determine the behavior of SOs in a data-driven fashion.
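    The feature-matching idea can be shown on a toy problem: recover reward weights such that the trajectory maximizing w·φ reproduces the observed feature expectations. The candidate maneuver profiles, features, and perceptron-style update are invented for illustration and are not the paper's formulation:

```python
def best_traj(trajs, w):
    """Trajectory with the highest reward w . phi under the current weights."""
    return max(trajs, key=lambda phi: sum(wi * fi for wi, fi in zip(w, phi)))

def feature_matching(trajs, expert_phi, iters=50, lr=0.1):
    """Update w toward the expert's features: w += lr * (phi_expert - phi_best)."""
    w = [0.0] * len(expert_phi)
    for _ in range(iters):
        phi = best_traj(trajs, w)
        w = [wi + lr * (e - p) for wi, e, p in zip(w, expert_phi, phi)]
    return w

# Candidate maneuver profiles described by (fuel_used, station_keeping_error)
trajs = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]
expert = (0.1, 0.9)   # the observed SO behavior
w = feature_matching(trajs, list(expert))
recovered = best_traj(trajs, w)
```

    Once the weights make the expert's trajectory optimal, the update vanishes; the learned w then serves as an estimate of the reward function driving the observed maneuvers.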

  20. A DATA-DRIVEN MODEL FOR SPECTRA: FINDING DOUBLE REDSHIFTS IN THE SLOAN DIGITAL SKY SURVEY

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tsalmantza, P.; Hogg, David W., E-mail: vivitsal@mpia.de

    2012-07-10

    We present a data-driven method, heteroscedastic matrix factorization, a kind of probabilistic factor analysis, for modeling or performing dimensionality reduction on observed spectra or other high-dimensional data with known but non-uniform observational uncertainties. The method uses an iterative inverse-variance-weighted least-squares minimization procedure to generate a best set of basis functions. The method is similar to principal components analysis (PCA), but with the substantial advantage that it uses measurement uncertainties in a responsible way and accounts naturally for poorly measured and missing data; it models the variance in the noise-deconvolved data space. A regularization can be applied, in the form of a smoothness prior (inspired by Gaussian processes) or a non-negative constraint, without making the method prohibitively slow. Because the method optimizes a justified scalar (related to the likelihood), the basis provides a better fit to the data in a probabilistic sense than any PCA basis. We test the method on Sloan Digital Sky Survey (SDSS) spectra, concentrating on spectra known to contain two redshift components: these are spectra of gravitational lens candidates and massive black hole binaries. We apply a hypothesis test to compare one-redshift and two-redshift models for these spectra, utilizing the data-driven model trained on a random subset of all SDSS spectra. This test confirms 129 of the 131 lens candidates in our sample and all of the known binary candidates, and turns up very few false positives.
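    The core of the iterative inverse-variance-weighted least-squares idea can be sketched for rank 1: alternate closed-form weighted updates for the two factors. The real method uses multiple basis vectors plus regularization; this is only the central mechanism, on made-up data:

```python
def hmf_rank1(X, W, iters=100):
    """Fit X[i][j] ~ g[i]*h[j], weighting each squared residual by W[i][j]."""
    n, m = len(X), len(X[0])
    g, h = [1.0] * n, [1.0] * m
    for _ in range(iters):
        for i in range(n):  # update g holding h fixed (weighted least squares)
            num = sum(W[i][j] * X[i][j] * h[j] for j in range(m))
            den = sum(W[i][j] * h[j] ** 2 for j in range(m))
            g[i] = num / den
        for j in range(m):  # update h holding g fixed
            num = sum(W[i][j] * X[i][j] * g[i] for i in range(n))
            den = sum(W[i][j] * g[i] ** 2 for i in range(n))
            h[j] = num / den
    return g, h

# Rank-1 data with one badly measured (down-weighted) corrupted entry
X = [[2.0, 4.0, 6.0], [3.0, 6.0, 99.0]]
W = [[1.0, 1.0, 1.0], [1.0, 1.0, 1e-6]]  # inverse-variance weights
g, h = hmf_rank1(X, W)
```

    Because the corrupted entry carries almost no weight, the fit reconstructs the value the clean rank-1 structure implies for it, which is exactly how the method "accounts naturally for poorly measured and missing data."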

  1. Differential Cloud Particles Evolution Algorithm Based on Data-Driven Mechanism for Applications of ANN

    PubMed Central

    2017-01-01

    Computational scientists have designed many useful algorithms by exploring a biological process or imitating natural evolution. These algorithms can be used to solve engineering optimization problems. Inspired by changes in the state of matter, we propose a novel optimization algorithm called the differential cloud particles evolution algorithm based on a data-driven mechanism (CPDD). In the proposed algorithm, the optimization process is divided into two stages, namely a fluid stage and a solid stage. In the fluid stage, the algorithm integrates global exploration with local exploitation; in the solid stage, it performs mainly local exploitation. The quality of the solution and the efficiency of the search are influenced greatly by the control parameters. Therefore, the data-driven mechanism is designed to obtain better control parameters and ensure good performance on numerical benchmark problems. In order to verify the effectiveness of CPDD, numerical experiments are carried out on all the CEC2014 contest benchmark functions. Finally, two application problems of artificial neural networks are examined. The experimental results show that CPDD is competitive with respect to eight other state-of-the-art intelligent optimization algorithms. PMID:28761438

  2. The fiber walk: a model of tip-driven growth with lateral expansion.

    PubMed

    Bucksch, Alexander; Turk, Greg; Weitz, Joshua S

    2014-01-01

    Tip-driven growth processes underlie the development of many plants. To date, tip-driven growth processes have been modeled as an elongating path or series of segments, without taking into account lateral expansion during elongation. Instead, models of growth often introduce an explicit thickness by expanding the area around the completed elongated path. Modeling expansion in this way can lead to contradictions in the physical plausibility of the resulting surface and to uncertainty about how the object reached certain regions of space. Here, we introduce fiber walks as a self-avoiding random walk model for tip-driven growth processes that includes lateral expansion. In 2D, the fiber walk takes place on a square lattice and the space occupied by the fiber is modeled as a lateral contraction of the lattice. This contraction influences the possible subsequent steps of the fiber walk. The boundary of the area consumed by the contraction is derived as the dual of the lattice faces adjacent to the fiber. We show that fiber walks generate fibers that have well-defined curvatures, and thus enable the identification of the process underlying the occupancy of physical space. Hence, fiber walks provide a base from which to model both the extension and expansion of physical biological objects with finite thickness.
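    The starting point of the fiber-walk model is a 2-D self-avoiding random walk on the square lattice; the sketch below shows that base process only, omitting the lattice-contraction step that encodes lateral expansion:

```python
import random

def self_avoiding_walk(steps, rng=random.Random(42)):
    """Grow a path on the square lattice, never revisiting a site."""
    pos = (0, 0)
    visited = {pos}
    path = [pos]
    for _ in range(steps):
        moves = [(pos[0] + dx, pos[1] + dy)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        free = [m for m in moves if m not in visited]
        if not free:          # walk is trapped; a growing tip would stop here
            break
        pos = rng.choice(free)
        visited.add(pos)
        path.append(pos)
    return path

path = self_avoiding_walk(100)
```

    In the full fiber walk, each step additionally contracts the lattice around the occupied sites, so the fiber's finite thickness constrains where the tip can move next.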

  3. The Fiber Walk: A Model of Tip-Driven Growth with Lateral Expansion

    PubMed Central

    Bucksch, Alexander; Turk, Greg; Weitz, Joshua S.

    2014-01-01

    Tip-driven growth processes underlie the development of many plants. To date, tip-driven growth processes have been modeled as an elongating path or series of segments, without taking into account lateral expansion during elongation. Instead, models of growth often introduce an explicit thickness by expanding the area around the completed elongated path. Modeling expansion in this way can lead to contradictions in the physical plausibility of the resulting surface and to uncertainty about how the object reached certain regions of space. Here, we introduce fiber walks as a self-avoiding random walk model for tip-driven growth processes that includes lateral expansion. In 2D, the fiber walk takes place on a square lattice and the space occupied by the fiber is modeled as a lateral contraction of the lattice. This contraction influences the possible subsequent steps of the fiber walk. The boundary of the area consumed by the contraction is derived as the dual of the lattice faces adjacent to the fiber. We show that fiber walks generate fibers that have well-defined curvatures, and thus enable the identification of the process underlying the occupancy of physical space. Hence, fiber walks provide a base from which to model both the extension and expansion of physical biological objects with finite thickness. PMID:24465607

  4. Simulation analysis of photometric data for attitude estimation of unresolved space objects

    NASA Astrophysics Data System (ADS)

    Du, Xiaoping; Gou, Ruixin; Liu, Hao; Hu, Heng; Wang, Yang

    2017-10-01

    Acquiring attitude information for unresolved space objects, such as micro-nano satellites and GEO objects, from ground-based optical observations is a challenge for space surveillance. In this paper, a method is proposed to estimate the attitude state of a space object from simulation analysis of photometric data in different attitude states. The object shape model was established and the parameters of the BRDF model were determined; the space object photometric model was then established. Furthermore, the photometric data of space objects in different states are analyzed by simulation and the regular characteristics of the photometric curves are summarized. The simulation results show that the photometric characteristics are useful for attitude inversion, providing a new approach to space object identification.
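    The paper's BRDF and shape models are not specified in the abstract. Purely as an illustration of why photometry carries attitude information, a toy single-facet Lambertian model shows how observed brightness varies as the facet's orientation changes; the function names, the facet geometry, and the 1/π normalisation are all our assumptions:

    ```python
    import math

    def lambertian_facet_flux(normal, sun_dir, obs_dir, area=1.0, albedo=0.3):
        """Reflected flux from one flat facet under a Lambertian BRDF:
        proportional to albedo * area * cos(sun angle) * cos(view angle),
        clamped to zero when the facet faces away from sun or observer."""
        dot = lambda a, b: sum(x * y for x, y in zip(a, b))
        mu_s = max(dot(normal, sun_dir), 0.0)
        mu_o = max(dot(normal, obs_dir), 0.0)
        return albedo / math.pi * area * mu_s * mu_o

    def light_curve(spin_rate, times, sun_dir=(0, 0, 1), obs_dir=(0, 1, 0)):
        """Toy photometric curve of a facet spinning about the x-axis at
        spin_rate rad/s: brightness depends on attitude, which is what
        makes attitude inversion from photometry possible."""
        curve = []
        for t in times:
            a = spin_rate * t
            normal = (0.0, math.sin(a), math.cos(a))
            curve.append(lambertian_facet_flux(normal, sun_dir, obs_dir))
        return curve
    ```

    Faster spin compresses the light-curve period, and the curve shape changes with the spin axis, which is the kind of regularity the abstract refers to.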

  5. Agile data management for curation of genomes to watershed datasets

    NASA Astrophysics Data System (ADS)

    Varadharajan, C.; Agarwal, D.; Faybishenko, B.; Versteeg, R.

    2015-12-01

    A software platform is being developed for data management and assimilation [DMA] as part of the U.S. Department of Energy's Genomes to Watershed Sustainable Systems Science Focus Area 2.0. The DMA components and capabilities are driven by the project's science priorities, and development follows agile techniques. The goal of the DMA software platform is to enable users to integrate and synthesize diverse and disparate field, laboratory, and simulation datasets, including geological, geochemical, geophysical, microbiological, hydrological, and meteorological data across a range of spatial and temporal scales. The DMA objectives are (a) developing an integrated interface to the datasets, (b) storing field monitoring data and laboratory analytical results of water and sediment samples in a database, (c) providing automated QA/QC analysis of data, and (d) working with data providers to modify high-priority field and laboratory data collection and reporting procedures as needed. The first three objectives are driven by user needs, while the last is driven by data management needs. The project needs and priorities are reassessed regularly with the users, and after each user session we identify development priorities to match the identified user priorities. For instance, data QA/QC and collection activities have focused on the data and products needed for on-going scientific analyses (e.g. water level and geochemistry). We have also developed, tested and released a broker and portal that integrates diverse datasets from two different databases used for curation of project data. The development of the user interface was based on a user-centered design process involving several user interviews and constant interaction with data providers. The initial version focuses on the most requested feature: finding the data needed for analyses through an intuitive interface. Once the data are found, the user can immediately plot and download them.

  6. Development of a Scale to Measure Learners' Perceived Preferences and Benefits of Data-Driven Learning

    ERIC Educational Resources Information Center

    Mizumoto, Atsushi; Chujo, Kiyomi; Yokota, Kenji

    2016-01-01

    In spite of researchers' and practitioners' increasing attention to data-driven learning (DDL) and increasing numbers of DDL studies, a multi-item scale to measure learners' attitude toward DDL has not been developed thus far. In the present study, we developed and validated a psychometric scale to measure learners' perceived preferences and…

  7. Data-driven adaptive fractional order PI control for PMSM servo system with measurement noise and data dropouts.

    PubMed

    Xie, Yuanlong; Tang, Xiaoqi; Song, Bao; Zhou, Xiangdong; Guo, Yixuan

    2018-04-01

    In this paper, data-driven adaptive fractional order proportional integral (AFOPI) control is presented for a permanent magnet synchronous motor (PMSM) servo system perturbed by measurement noise and data dropouts. The proposed method directly exploits the closed-loop process data for the AFOPI controller design under unknown noise distribution and data missing probability. Firstly, the proposed method casts the AFOPI controller tuning problem as a parameter identification problem using modified lp-norm virtual reference feedback tuning (VRFT). Then, iteratively reweighted least squares is integrated into the lp-norm VRFT to give a consistent compensation solution for the AFOPI controller. The measurement noise and data dropouts are estimated and eliminated by periodic feedback compensation, so that the AFOPI controller is updated online to accommodate time-varying operating conditions. Moreover, convergence and stability are guaranteed by mathematical analysis. Finally, the effectiveness of the proposed method is demonstrated both in simulations and in experiments on a practical PMSM servo system. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.
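    The lp-norm VRFT formulation is specific to the paper, but the generic iteratively reweighted least squares (IRLS) scheme it builds on is standard and can be sketched as follows. Here A would collect the regressor data and b the virtual reference signal; the function name and defaults are ours:

    ```python
    import numpy as np

    def irls_lp(A, b, p=1.2, iters=50, eps=1e-8):
        """Solve min_x ||A x - b||_p by iteratively reweighted least
        squares: each pass solves a weighted least-squares problem whose
        weights |r_i|^(p-2) come from the current residual r."""
        x = np.linalg.lstsq(A, b, rcond=None)[0]          # start from p = 2
        for _ in range(iters):
            r = A @ x - b
            # sqrt of the IRLS weights, clamped away from zero residuals
            sw = np.sqrt(np.maximum(np.abs(r), eps) ** (p - 2.0))
            x = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)[0]
        return x
    ```

    For p = 2 the weights are all one and the scheme reduces to ordinary least squares; for p < 2 it down-weights large residuals, which is what gives the lp norm its robustness to outliers such as dropout-corrupted samples.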

  8. Data-driven modeling and predictive control for boiler-turbine unit using fuzzy clustering and subspace methods.

    PubMed

    Wu, Xiao; Shen, Jiong; Li, Yiguo; Lee, Kwang Y

    2014-05-01

    This paper develops a novel data-driven fuzzy modeling strategy and predictive controller for a boiler-turbine unit using fuzzy clustering and subspace identification (SID) methods. To deal with the nonlinear behavior of the boiler-turbine unit, fuzzy clustering is used to provide an appropriate division of the operating region and to develop the structure of the fuzzy model. Then, by combining the input data with the corresponding fuzzy membership functions, the SID method is extended to extract the local state-space model parameters. Owing to the advantages of both methods, the resulting fuzzy model can represent the boiler-turbine unit very closely, and a fuzzy model predictive controller is designed based on this model. As an alternative approach, a direct data-driven fuzzy predictive control is also developed following the same clustering and subspace methods, where intermediate subspace matrices developed during the identification procedure are utilized directly as the predictor. Simulation results show the advantages and effectiveness of the proposed approach. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
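    The coupling of clustering with subspace identification is paper-specific, but the fuzzy c-means step that typically supplies the operating-region partition is a textbook algorithm and can be sketched on its own (variable names and defaults are ours; the subspace identification step is not shown):

    ```python
    import numpy as np

    def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
        """Standard fuzzy c-means: alternate centre and membership
        updates.  Returns cluster centres and the membership matrix U,
        whose rows sum to 1 (soft assignment of samples to regions)."""
        rng = np.random.default_rng(seed)
        U = rng.random((len(X), c))
        U /= U.sum(axis=1, keepdims=True)
        for _ in range(iters):
            Um = U ** m
            centers = (Um.T @ X) / Um.sum(axis=0)[:, None]      # weighted means
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
            inv = d ** (-2.0 / (m - 1.0))                       # closeness weights
            U = inv / inv.sum(axis=1, keepdims=True)
        return centers, U
    ```

    In the scheme described above, each row of U would then weight the data when fitting one local state-space model per cluster, and the same memberships would blend the local models at run time.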

  9. Electromagnetic Properties Analysis on Hybrid-driven System of Electromagnetic Motor

    NASA Astrophysics Data System (ADS)

    Zhao, Jingbo; Han, Bingyuan; Bei, Shaoyi

    2018-01-01

    A hybrid-driven system composed of permanent magnets and electromagnets, applied in an electromagnetic motor, was analyzed. An equivalent magnetic circuit was used to establish mathematical models of the hybrid-driven system, and from these models the air-gap flux, air-gap magnetic flux density, and electromagnetic force were derived. Taking the air-gap magnetic flux density and electromagnetic force as the main research objects, the hybrid-driven system was studied, and its electromagnetic properties under different working current modes were examined. The results show that the hybrid-driven system can increase the air-gap magnetic flux density and electromagnetic force more effectively while guaranteeing output stability. The effectiveness and feasibility of the hybrid-driven system are verified, providing a theoretical basis for its design.

  10. Developing Data Citations from Digital Object Identifier Metadata

    NASA Technical Reports Server (NTRS)

    James, Nathan; Wanchoo, Lalit

    2015-01-01

    NASA's Earth Science Data and Information System (ESDIS) Project has been processing information for the registration of Digital Object Identifiers (DOIs) for the last five years, with an automated system in operation for the last two. The ESDIS DOI registration system has registered over 2,000 DOIs, with over 1,000 more held in reserve until all required information has been collected. By working towards the goal of assigning DOIs to the more than 8,000 data collections under its management, ESDIS has taken the first step towards facilitating data citations for those products. Jeanne Behnke, ESDIS Deputy Project Manager, has reviewed and approved the poster.

  11. User-driven Cloud Implementation of environmental models and data for all

    NASA Astrophysics Data System (ADS)

    Gurney, R. J.; Percy, B. J.; Elkhatib, Y.; Blair, G. S.

    2014-12-01

    Environmental data and models come from disparate sources over a variety of geographical and temporal scales, with different resolutions and data standards, often including terabytes of data and model simulations. Unfortunately, these data and models tend to remain solely within the custody of the private and public organisations which create the data, and of the scientists who build models and generate results. Although many models and datasets are theoretically available to others, the lack of ease of access tends to keep them out of reach of many. We have developed an intuitive web-based tool that utilises environmental models and datasets located in a cloud to produce results that are appropriate to the user. Storyboards showing the interfaces and visualisations have been created for each of several exemplars. A library of virtual machine images has been prepared to serve these exemplars, each image tailored to run computer models appropriate to the end user. Two approaches have been used: first, RESTful web services conforming to the Open Geospatial Consortium (OGC) Web Processing Service (WPS) interface standard, using the Python-based PyWPS; second, a MySQL database interrogated using PHP code. In all cases, the web client sends the server an HTTP GET request to execute the process with a number of parameter values and, once execution terminates, an XML or JSON response is sent back and parsed at the client side to extract the results. All web services are stateless, i.e. application state is not maintained by the server, reducing its operational overheads and simplifying infrastructure management tasks such as load balancing and failure recovery. A hybrid cloud solution has been used, with models and data sited on both private and public clouds. The storyboards have been transformed into intuitive web interfaces at the client side using HTML, CSS and JavaScript, utilising plug-ins such as jQuery and Flot (for graphics), and Google Maps.
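    The stateless GET-based invocation pattern described above can be sketched as follows; the endpoint, process identifier, and parameter names are placeholders, not the project's actual service:

    ```python
    import json
    from urllib.parse import urlencode
    from urllib.request import urlopen

    def build_wps_query(process_id, **params):
        """Build the query string for a WPS-style Execute request."""
        return urlencode({"service": "WPS", "request": "Execute",
                          "identifier": process_id, **params})

    def run_process(base_url, process_id, **params):
        """One stateless round trip: HTTP GET with all parameters in the
        URL, then parse the JSON body.  No session state is kept between
        calls, matching the design described above."""
        with urlopen(f"{base_url}?{build_wps_query(process_id, **params)}") as resp:
            return json.loads(resp.read().decode("utf-8"))
    ```

    Because every request is self-contained, any server replica can answer it, which is what makes the load balancing and failure recovery mentioned above straightforward.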

  12. Data-driven and hybrid coastal morphological prediction methods for mesoscale forecasting

    NASA Astrophysics Data System (ADS)

    Reeve, Dominic E.; Karunarathna, Harshinie; Pan, Shunqi; Horrillo-Caraballo, Jose M.; Różyński, Grzegorz; Ranasinghe, Roshanka

    2016-03-01

    It is now common for coastal planning to anticipate changes anywhere from 70 to 100 years into the future. The process models developed and used for scheme design or for large-scale oceanography are currently inadequate for this task, which has prompted the development of a plethora of alternative methods. Some, such as reduced-complexity or hybrid models, simplify the governing equations while retaining the processes considered to govern observed morphological behaviour. The computational cost of these models is low and they have proven effective in exploring morphodynamic trends and improving our understanding of mesoscale behaviour. One drawback is that there is no generally agreed set of principles on which to make the simplifying assumptions, and predictions can vary considerably between models. An alternative approach is data-driven techniques that are based entirely on analysis and extrapolation of observations. Here, we discuss the application of some of the better known and emerging methods in this category to argue that, with the increasing availability of observations from coastal monitoring programmes and the development of more sophisticated statistical analysis techniques, data-driven models provide a valuable addition to the armoury of methods available for mesoscale prediction. The continuation of established monitoring programmes is paramount, and those that provide contemporaneous records of the driving forces and the shoreline response are the most valuable in this regard. In the second part of the paper we discuss some recent research that combines some of the hybrid techniques with data analysis methods in order to synthesise a more consistent means of predicting mesoscale coastal morphological evolution. While encouraging in certain applications, a universally applicable approach has yet to be found. The route to linking different model types is highlighted as a major challenge and requires further research to establish its viability. We argue that

  13. 78 FR 68073 - Announcement of Solicitation of Written Comments on Modifications of Healthy People 2020 Objectives

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-13

    ..., health-service, or policy interventions. 3. Objectives should drive actions that will work toward the... populations categorized by race/ethnicity, socioeconomic status, gender, disability status, sexual orientation... care. 9. Healthy People 2020, like past versions, is heavily data driven. Valid, reliable, nationally...

  14. 77 FR 62514 - Announcement of Solicitation of Written Comments on Modifications of Healthy People 2020 Objectives

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-15

    ... interventions. 3. Objectives should drive actions that will work toward the achievement of the proposed targets... populations categorized by race/ethnicity, socioeconomic status, gender, disability status, sexual orientation... care. 9. Healthy People 2020, like past versions, will be heavily data driven. Valid, reliable...

  15. Multiple object tracking with non-unique data-to-object association via generalized hypothesis testing. [tracking several aircraft near each other or ships at sea

    NASA Technical Reports Server (NTRS)

    Porter, D. W.; Lefler, R. M.

    1979-01-01

    A generalized hypothesis testing approach is applied to the problem of tracking several objects when several different associations of data with objects are possible. Such problems occur, for instance, when attempting to distinctly track several aircraft maneuvering near each other or when tracking ships at sea. Conceptually, the problem is solved by first associating data with objects in a statistically reasonable fashion and then tracking with a bank of Kalman filters. The objects are assumed to have motion characterized by a fixed but unknown deterministic portion plus a random process portion modeled by a shaping filter. For example, an object might be assumed to follow a mean straight-line path about which it maneuvers in a random manner. Several hypothesized associations of data with objects are possible because of ambiguity as to which object the data comes from, false alarm/detection errors, and possible uncertainty in the number of objects being tracked. The statistical likelihood function is computed for each possible hypothesized association of data with objects. The generalized likelihood is then computed by maximizing the likelihood over the parameters that define the deterministic motion of the objects.
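    A scalar toy version of this idea can be sketched by scoring each hypothesized data-to-object association with the Kalman-filter measurement log-likelihood it induces and keeping the best-scoring hypothesis. This brute-force enumeration (with our own names, and without the deterministic-motion maximization or false-alarm handling of the paper) is only workable for a handful of objects; practical trackers prune the hypothesis set:

    ```python
    import math
    from itertools import permutations

    def kalman_update(mean, var, z, r):
        """One scalar Kalman measurement update; also returns the
        log-likelihood of z under the predicted measurement distribution."""
        s = var + r                                     # innovation variance
        ll = -0.5 * (math.log(2.0 * math.pi * s) + (z - mean) ** 2 / s)
        k = var / s                                     # Kalman gain
        return mean + k * (z - mean), (1.0 - k) * var, ll

    def best_association(tracks, measurements, r=1.0):
        """Score every data-to-object assignment by its total measurement
        log-likelihood and keep the best: a brute-force stand-in for the
        generalized hypothesis testing described in the abstract."""
        best, best_ll = None, -math.inf
        for perm in permutations(range(len(measurements))):
            ll = sum(kalman_update(tracks[i][0], tracks[i][1],
                                   measurements[j], r)[2]
                     for i, j in enumerate(perm))
            if ll > best_ll:
                best, best_ll = perm, ll
        return best, best_ll
    ```

    With two well-separated tracks, the hypothesis that swaps nearby measurements onto distant tracks scores far worse, so the correct association wins.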

  16. Validation of buoyancy driven spectral tensor model using HATS data

    NASA Astrophysics Data System (ADS)

    Chougule, A.; Mann, J.; Kelly, M.; Larsen, G. C.

    2016-09-01

    We present a homogeneous spectral tensor model for wind velocity and temperature fluctuations, driven by mean vertical shear and mean temperature gradient. Results from the model, including one-dimensional velocity and temperature spectra and the associated co-spectra, are shown in this paper. The model also reproduces two-point statistics, such as coherence and phases, via cross-spectra between two points separated in space. Model results are compared with observations from the Horizontal Array Turbulence Study (HATS) field program (Horst et al. 2004). The spectral velocity tensor in the model is described via five parameters: the dissipation rate (ɛ), the length scale of energy-containing eddies (L), a turbulence anisotropy parameter (Γ), the gradient Richardson number (Ri) representing atmospheric stability, and the rate of destruction of temperature variance (η_θ).

  17. Experimental studies of characteristic combustion-driven flows for CFD validation

    NASA Technical Reports Server (NTRS)

    Santoro, R. J.; Moser, M.; Anderson, W.; Pal, S.; Ryan, H.; Merkle, C. L.

    1992-01-01

    A series of rocket-related studies intended to develop a suitable data base for validation of Computational Fluid Dynamics (CFD) models of characteristic combustion-driven flows was undertaken at the Propulsion Engineering Research Center at Penn State. Included are studies of coaxial and impinging jet injectors as well as chamber wall heat transfer effects. The objective of these studies is to provide fundamental understanding and benchmark quality data for phenomena important to rocket combustion under well-characterized conditions. Diagnostic techniques utilized in these studies emphasize determinations of velocity, temperature, spray and droplet characteristics, and combustion zone distribution. Since laser diagnostic approaches are favored, the development of an optically accessible rocket chamber has been a high priority in the initial phase of the project. During the design phase for this chamber, the advice and input of the CFD modeling community were actively sought through presentations and written surveys. Based on this procedure, a suitable uni-element rocket chamber was fabricated and is presently under preliminary testing. Results of these tests, as well as the survey findings leading to the chamber design, were presented.

  18. Collaborative Project: The problem of bias in defining uncertainty in computationally enabled strategies for data-driven climate model development. Final Technical Report.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huerta, Gabriel

    The objective of the project is to develop strategies for better representing scientific sensibilities within statistical measures of model skill that can then be used within a Bayesian statistical framework for data-driven climate model development and improved measures of model scientific uncertainty. One of the thorny issues in model evaluation is quantifying the effect of biases on climate projections. While no bias is desirable, only those biases that affect feedbacks contribute to scatter in climate projections. The effort at the University of Texas is to analyze previously calculated ensembles of CAM3.1 with perturbed parameters to discover how biases affect projections of global warming. The hypothesis is that compensating errors in the control model can be identified by their effect on a combination of processes, and that developing metrics sensitive to dependencies among state variables would provide a way to select versions of climate models that may reduce scatter in climate projections. Gabriel Huerta at the University of New Mexico is responsible for developing statistical methods for evaluating these field dependencies. The UT effort will incorporate these developments into MECS, a set of python scripts being developed at the University of Texas for managing the workflow associated with data-driven climate model development over HPC resources. This report reflects the main activities at the University of New Mexico, where the PI (Huerta) and the postdocs (Nosedal, Hattab and Karki) worked on the project.

  19. The power of event-driven analytics in Large Scale Data Processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sebastiao, Nuno; Marques, Paulo

    2011-02-24

    FeedZai is a software company specialized in creating high-throughput low-latency data processing solutions. FeedZai develops a product called "FeedZai Pulse" for continuous event-driven analytics that makes application development easier for end users. It automatically calculates key performance indicators and baselines, showing how current performance differs from previous history, creating timely business intelligence updated to the second. The tool does predictive analytics and trend analysis, displaying data on real-time web-based graphics. In 2010 FeedZai won the European EBN Smart Entrepreneurship Competition, in the Digital Models category, being considered one of the "top-20 smart companies in Europe". The main objective of this seminar/workshop is to explore large-scale data processing using Complex Event Processing and, in particular, the possible uses of Pulse in the scope of the data processing needs of CERN. Pulse is available as open-source and can be licensed both for non-commercial and commercial applications. FeedZai is interested in exploring possible synergies with CERN in high-volume low-latency data processing applications. The seminar will be structured in two sessions, the first one aimed at presenting the general scope of FeedZai's activities, and the second focused on Pulse itself: 10:00-11:00 FeedZai and Large Scale Data Processing; Introduction to FeedZai; FeedZai Pulse and Complex Event Processing; Demonstration; Use-Cases and Applications; Conclusion and Q&A. 11:00-11:15 Coffee break. 11:15-12:30 FeedZai Pulse Under the Hood; A First FeedZai Pulse Application; PulseQL overview; Defining KPIs and Baselines; Conclusion and Q&A. About the speakers: Nuno Sebastião is the CEO of FeedZai. Having worked for many years for the European Space Agency (ESA), he was responsible for the overall design and development of the Satellite Simulation Infrastructure of the agency. Having

  20. The power of event-driven analytics in Large Scale Data Processing

    ScienceCinema

    None

    2017-12-09

    FeedZai is a software company specialized in creating high-throughput low-latency data processing solutions. FeedZai develops a product called "FeedZai Pulse" for continuous event-driven analytics that makes application development easier for end users. It automatically calculates key performance indicators and baselines, showing how current performance differs from previous history, creating timely business intelligence updated to the second. The tool does predictive analytics and trend analysis, displaying data on real-time web-based graphics. In 2010 FeedZai won the European EBN Smart Entrepreneurship Competition, in the Digital Models category, being considered one of the "top-20 smart companies in Europe". The main objective of this seminar/workshop is to explore large-scale data processing using Complex Event Processing and, in particular, the possible uses of Pulse in the scope of the data processing needs of CERN. Pulse is available as open-source and can be licensed both for non-commercial and commercial applications. FeedZai is interested in exploring possible synergies with CERN in high-volume low-latency data processing applications. The seminar will be structured in two sessions, the first one aimed at presenting the general scope of FeedZai's activities, and the second focused on Pulse itself: 10:00-11:00 FeedZai and Large Scale Data Processing; Introduction to FeedZai; FeedZai Pulse and Complex Event Processing; Demonstration; Use-Cases and Applications; Conclusion and Q&A. 11:00-11:15 Coffee break. 11:15-12:30 FeedZai Pulse Under the Hood; A First FeedZai Pulse Application; PulseQL overview; Defining KPIs and Baselines; Conclusion and Q&A. About the speakers: Nuno Sebastião is the CEO of FeedZai. Having worked for many years for the European Space Agency (ESA), he was responsible for the overall design and development of the Satellite Simulation Infrastructure of the agency. Having left ESA to found FeedZai, Nuno is