Sample records for data-driven modeling approach

  1. Data-driven Modelling for decision making under uncertainty

    NASA Astrophysics Data System (ADS)

    Angria S, Layla; Dwi Sari, Yunita; Zarlis, Muhammad; Tulus

    2018-01-01

    The issue of decision making under uncertainty has become a topic of lively discussion in operations research. Many models have been presented, one of which is data-driven modelling (DDM). The purpose of this paper is to extract and recognize patterns in data, and to find the best model for a decision-making problem under uncertainty by using a data-driven modelling approach with linear programming, linear and nonlinear differential equations, and a Bayesian approach. Model criteria are tested to determine the smallest error; the model with the smallest error is selected as the best model to use.
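    The model-selection criterion described in this abstract (fit several candidate models, keep the one with the smallest error) can be sketched in a few lines. Everything below, including the candidate model families and the synthetic data, is an invented illustration rather than the authors' implementation:

```python
# Candidate models are fitted to the same data and compared by mean squared
# error (MSE); the smallest-error model is selected, as in the abstract.
# The data and the two candidate model families are hypothetical.

data = [(float(x), 0.5 * x * x + 1.0) for x in range(10)]

def fit_linear(pts):
    # Ordinary least squares for y = a*x + b.
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return lambda x: a * x + b

def fit_quadratic(pts):
    # Reuse the linear fit on (x^2, y) pairs: y = c*x^2 + d.
    g = fit_linear([(x * x, y) for x, y in pts])
    return lambda x: g(x * x)

def mse(model, pts):
    return sum((model(x) - y) ** 2 for x, y in pts) / len(pts)

candidates = {"linear": fit_linear(data), "quadratic": fit_quadratic(data)}
errors = {name: mse(m, data) for name, m in candidates.items()}
best = min(errors, key=errors.get)
print(best)  # → quadratic (it matches the data-generating process exactly)
```

    In practice the error would be computed on held-out data rather than the training points, but the selection step is the same.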

  2. A model of strength

    USGS Publications Warehouse

    Johnson, Douglas H.; Cook, R.D.

    2013-01-01

    In her AAAS News & Notes piece "Can the Southwest manage its thirst?" (26 July, p. 362), K. Wren quotes Ajay Kalra, who advocates a particular method for predicting Colorado River streamflow "because it eschews complex physical climate models for a statistical data-driven modeling approach." A preference for data-driven models may be appropriate in this individual situation, but it is not so generally. Data-driven models often come with a warning against extrapolating beyond the range of the data used to develop the models. When the future is like the past, data-driven models can work well for prediction, but it is easy to over-model local or transient phenomena, often leading to predictive inaccuracy (1). Mechanistic models are built on established knowledge of the process that connects the response variables with the predictors, using information obtained outside of an extant data set. One may shy away from a mechanistic approach when the underlying process is judged to be too complicated, but good predictive models can be constructed with statistical components that account for ingredients missing in the mechanistic analysis. Models with sound mechanistic components are more generally applicable and robust than data-driven models.

  3. Accurate position estimation methods based on electrical impedance tomography measurements

    NASA Astrophysics Data System (ADS)

    Vergara, Samuel; Sbarbaro, Daniel; Johansen, T. A.

    2017-08-01

    Electrical impedance tomography (EIT) is a technology that estimates the electrical properties of a body or a cross section. Its main advantages are its non-invasiveness, low cost and operation free of radiation. The estimation of the conductivity field leads to low-resolution images compared with other technologies, and high computational cost. However, in many applications the target information lies in a low intrinsic dimensionality of the conductivity field. This work addresses the estimation of this low-dimensional information, proposing optimization-based and data-driven approaches. The accuracy of the results obtained with these approaches depends on modelling and experimental conditions. Optimization approaches are sensitive to model discretization, the type of cost function and the search algorithm. Data-driven methods are sensitive to the assumed model structure and the data set used for parameter estimation. The system configuration and experimental conditions, such as the number of electrodes and the signal-to-noise ratio (SNR), also have an impact on the results. In order to illustrate the effects of all these factors, the position estimation of a circular anomaly is addressed. Optimization methods based on weighted error cost functions and derivative-free optimization algorithms provided the best results. Data-driven approaches based on linear models provided, in this case, good estimates, but the use of nonlinear models enhanced the estimation accuracy. The results obtained by optimization-based algorithms were less sensitive to experimental conditions, such as the number of electrodes and SNR, than the data-driven approaches. Position estimation mean squared errors for simulation and experimental conditions were more than twice as large for the optimization-based approaches as for the data-driven ones.
The experimental position estimation mean squared error of the data-driven models using a 16-electrode setup was less than 0.05% of the tomograph radius value. These results demonstrate that the proposed approaches can estimate an object's position accurately based on EIT measurements if enough process information is available for training or modelling. Since they do not require complex calculations, it is possible to use them in real-time applications without requiring high-performance computers.

  4. A data driven control method for structure vibration suppression

    NASA Astrophysics Data System (ADS)

    Xie, Yangmin; Wang, Chao; Shi, Hang; Shi, Junwei

    2018-02-01

    High radio-frequency space applications have motivated continuous research on vibration suppression of large space structures in both academia and industry. This paper introduces a novel data-driven control method to suppress vibrations of flexible structures and experimentally validates the suppression performance. Unlike model-based control approaches, the data-driven control method designs a controller directly from input-output test data of the structure, without requiring parametric dynamics, and hence is free of system modeling. It utilizes the discrete frequency response obtained via spectral analysis and formulates a non-convex optimization problem to obtain optimized controller parameters for a predefined controller structure. This approach is then experimentally applied to an end-driven flexible beam-mass structure. The experimental results show that the presented method achieves competitive disturbance rejection compared to a model-based mixed-sensitivity controller under the same design criterion, but with much lower controller order and less design effort, demonstrating that the proposed data-driven control is an effective approach for vibration suppression of flexible structures.

  5. Contrasting analytical and data-driven frameworks for radiogenomic modeling of normal tissue toxicities in prostate cancer.

    PubMed

    Coates, James; Jeyaseelan, Asha K; Ybarra, Norma; David, Marc; Faria, Sergio; Souhami, Luis; Cury, Fabio; Duclos, Marie; El Naqa, Issam

    2015-04-01

    We explore analytical and data-driven approaches to investigate the integration of genetic variations (single nucleotide polymorphisms [SNPs] and copy number variations [CNVs]) with dosimetric and clinical variables in modeling radiation-induced rectal bleeding (RB) and erectile dysfunction (ED) in prostate cancer patients. Sixty-two patients who underwent curative hypofractionated radiotherapy (66 Gy in 22 fractions) between 2002 and 2010 were retrospectively genotyped for CNV and SNP rs5489 in the xrcc1 DNA repair gene. Fifty-four patients had full dosimetric profiles. Two parallel modeling approaches were compared to assess the risk of severe RB (Grade⩾3) and ED (Grade⩾1): maximum-likelihood-estimated generalized Lyman-Kutcher-Burman (LKB) models and logistic regression. Statistical resampling based on cross-validation was used to evaluate model predictive power and generalizability to unseen data. Integration of the biological variables xrcc1 CNV and SNP improved the fit of the RB and ED analytical and data-driven models. Cross-validation of the generalized LKB models yielded increases in classification performance of 27.4% for RB and 14.6% for ED, respectively, when xrcc1 CNV and SNP were included. Biological variables added to logistic regression modeling improved classification performance over standard dosimetric models by 33.5% for RB and 21.2% for ED. As a proof of concept, we demonstrated that the combination of genetic and dosimetric variables can provide significant improvement in normal tissue complication probability (NTCP) prediction using analytical and data-driven approaches. The improvement in prediction performance was more pronounced in the data-driven approaches. Moreover, we have shown that CNVs, in addition to SNPs, may be useful structural genetic variants in predicting radiation toxicities. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
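    The logistic-regression comparison this abstract reports (dosimetric variables alone versus dosimetric plus genetic variables) can be illustrated with a deliberately simplified sketch; the data generator, feature names, and training settings below are all invented and are not the study's cohort or code:

```python
import math
import random

random.seed(0)

def make_patient():
    # Hypothetical cohort: toxicity risk depends on a dose metric and a
    # binary genetic variant (both invented for this sketch).
    dose = random.uniform(60.0, 72.0)
    snp = random.randint(0, 1)
    logit = 0.4 * (dose - 66.0) + 2.5 * snp - 1.0
    toxic = 1 if random.random() < 1 / (1 + math.exp(-logit)) else 0
    return [dose - 66.0, float(snp)], toxic   # centered dose, genetic flag

data = [make_patient() for _ in range(400)]
train_set, test_set = data[:300], data[300:]

def train(xs, ys, epochs=300, lr=0.05):
    # Plain stochastic-gradient logistic regression.
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
            g = p - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def accuracy(w, b, xs, ys):
    hits = sum(
        int((sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == bool(y))
        for x, y in zip(xs, ys))
    return hits / len(ys)

xs_tr, ys_tr = [x for x, _ in train_set], [y for _, y in train_set]
xs_te, ys_te = [x for x, _ in test_set], [y for _, y in test_set]

# Dosimetric-only model drops the genetic column; the full model keeps it.
w_dose, b_dose = train([x[:1] for x in xs_tr], ys_tr)
w_full, b_full = train(xs_tr, ys_tr)
acc_dose = accuracy(w_dose, b_dose, [x[:1] for x in xs_te], ys_te)
acc_full = accuracy(w_full, b_full, xs_te, ys_te)
print(acc_dose, acc_full)
```

    On this synthetic cohort the genetic column carries real signal, so the full model typically classifies held-out patients better, mirroring the study's direction of effect (the numbers themselves are meaningless).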

  6. Reverse engineering biomolecular systems using -omic data: challenges, progress and opportunities.

    PubMed

    Quo, Chang F; Kaddi, Chanchala; Phan, John H; Zollanvari, Amin; Xu, Mingqing; Wang, May D; Alterovitz, Gil

    2012-07-01

    Recent advances in high-throughput biotechnologies have led to rapidly growing research interest in reverse engineering of biomolecular systems (REBMS). 'Data-driven' approaches, i.e. data mining, can be used to extract patterns from large volumes of biochemical data at molecular-level resolution, while 'design-driven' approaches, i.e. systems modeling, can be used to simulate emergent system properties. Consequently, both data- and design-driven approaches applied to -omic data may lead to novel insights in reverse engineering biological systems that could not be expected before using low-throughput platforms. However, there exist several challenges in this fast-growing field of reverse engineering biomolecular systems: (i) to integrate heterogeneous biochemical data for data mining, (ii) to combine top-down and bottom-up approaches for systems modeling and (iii) to validate system models experimentally. In addition to reviewing progress made by the community and opportunities encountered in addressing these challenges, we explore the emerging field of synthetic biology, which is an exciting approach to validate and analyze theoretical system models directly through experimental synthesis, i.e. analysis-by-synthesis. The ultimate goal is to address the present and future challenges in reverse engineering biomolecular systems (REBMS) using an integrated workflow of data mining, systems modeling and synthetic biology.

  7. Data Driven Model Development for the Supersonic Semispan Transport (S(sup 4)T)

    NASA Technical Reports Server (NTRS)

    Kukreja, Sunil L.

    2011-01-01

    We investigate two common approaches to model development for robust control synthesis in the aerospace community; namely, reduced order aeroservoelastic modelling based on structural finite-element and computational fluid dynamics based aerodynamic models, and a data-driven system identification procedure. It is shown, via analysis of experimental SuperSonic SemiSpan Transport (S4T) wind-tunnel data using a system identification approach, that it is possible to estimate a model at a fixed Mach number which is parsimonious and robust across varying dynamic pressures.

  8. A data-driven approach to identify controls on global fire activity from satellite and climate observations (SOFIA V1)

    NASA Astrophysics Data System (ADS)

    Forkel, Matthias; Dorigo, Wouter; Lasslop, Gitta; Teubner, Irene; Chuvieco, Emilio; Thonicke, Kirsten

    2017-12-01

    Vegetation fires affect human infrastructure, ecosystems, global vegetation distribution, and atmospheric composition. However, the climatic, environmental, and socioeconomic factors that control global fire activity in vegetation are only poorly understood, and they are represented in global process-oriented vegetation-fire models with various complexities and formulations. Data-driven modelling approaches such as machine learning algorithms have been used successfully to identify and better understand the factors controlling fire activity. However, such machine learning models cannot be easily adapted or even implemented within process-oriented global vegetation-fire models. To overcome this gap between machine learning-based approaches and process-oriented global fire models, we introduce here a new flexible data-driven fire modelling approach (Satellite Observations to predict FIre Activity, SOFIA approach version 1). SOFIA models use several predictor variables and functional relationships to estimate burned area that can be easily adapted within more complex process-oriented vegetation-fire models. We created an ensemble of SOFIA models to test the importance of several predictor variables. SOFIA models achieve the highest performance in predicting burned area if they account for a direct restriction of fire activity under wet conditions and if they include a land cover-dependent restriction or allowance of fire activity by vegetation density and biomass. The use of vegetation optical depth data from microwave satellite observations, a proxy for vegetation biomass and water content, reaches higher model performance than commonly used vegetation variables from optical sensors. We further analyse spatial patterns of the sensitivity between anthropogenic, climate, and vegetation predictor variables and burned area.
We finally discuss how multiple observational datasets on climate, hydrological, vegetation, and socioeconomic variables together with data-driven modelling and model-data integration approaches can guide the future development of global process-oriented vegetation-fire models.

  9. Data Driven Model Development for the SuperSonic SemiSpan Transport (S(sup 4)T)

    NASA Technical Reports Server (NTRS)

    Kukreja, Sunil L.

    2011-01-01

    In this report, we will investigate two common approaches to model development for robust control synthesis in the aerospace community; namely, reduced order aeroservoelastic modelling based on structural finite-element and computational fluid dynamics based aerodynamic models, and a data-driven system identification procedure. It is shown via analysis of experimental SuperSonic SemiSpan Transport (S4T) wind-tunnel data that by using a system identification approach it is possible to estimate a model at a fixed Mach, which is parsimonious and robust across varying dynamic pressures.

  10. Open data models for smart health interconnected applications: the example of openEHR.

    PubMed

    Demski, Hans; Garde, Sebastian; Hildebrand, Claudia

    2016-10-22

    Smart Health is known as a concept that enhances networking, intelligent data processing and combining patient data with other parameters. Open data models can play an important role in creating a framework for providing interoperable data services that support the development of innovative Smart Health applications profiting from data fusion and sharing. This article describes a model-driven engineering approach based on standardized clinical information models and explores its application for the development of interoperable electronic health record systems. The following possible model-driven procedures were considered: provision of data schemes for data exchange, automated generation of artefacts for application development and native platforms that directly execute the models. The applicability of the approach in practice was examined using the openEHR framework as an example. A comprehensive infrastructure for model-driven engineering of electronic health records is presented using the example of the openEHR framework. It is shown that data schema definitions to be used in common practice software development processes can be derived from domain models. The capabilities for automatic creation of implementation artefacts (e.g., data entry forms) are demonstrated. Complementary programming libraries and frameworks that foster the use of open data models are introduced. Several compatible health data platforms are listed. They provide standard based interfaces for interconnecting with further applications. Open data models help build a framework for interoperable data services that support the development of innovative Smart Health applications. Related tools for model-driven application development foster semantic interoperability and interconnected innovative applications.
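    The model-driven pattern described here (one domain model yielding both data schemas and entry forms) can be illustrated with a toy fragment; the field names and generated artefacts below are made up and are not real openEHR archetypes:

```python
# A single domain model drives both schema derivation and form generation,
# echoing the model-driven procedures listed in the abstract. The model is
# a hypothetical fragment for illustration only.

model = [
    {"name": "systolic_bp", "type": "int", "unit": "mmHg"},
    {"name": "heart_rate", "type": "int", "unit": "/min"},
    {"name": "note", "type": "str", "unit": None},
]

def to_schema(model):
    # Derive a minimal JSON-schema-like structure from the domain model.
    types = {"int": "integer", "str": "string"}
    return {"type": "object",
            "properties": {f["name"]: {"type": types[f["type"]]} for f in model}}

def to_form(model):
    # Generate an HTML data-entry form fragment from the same model.
    rows = []
    for f in model:
        label = f["name"] + (f" ({f['unit']})" if f["unit"] else "")
        kind = "number" if f["type"] == "int" else "text"
        rows.append(f'<label>{label}<input name="{f["name"]}" type="{kind}"></label>')
    return "\n".join(rows)

schema = to_schema(model)
form = to_form(model)
print(schema["properties"]["systolic_bp"])  # → {'type': 'integer'}
```

    The point is that the validator and the form never drift apart, because both are regenerated from the one model.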

  11. Data-driven integration of genome-scale regulatory and metabolic network models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Imam, Saheed; Schauble, Sascha; Brooks, Aaron N.

    Microbes are diverse and extremely versatile organisms that play vital roles in all ecological niches. Understanding and harnessing microbial systems will be key to the sustainability of our planet. One approach to improving our knowledge of microbial processes is through data-driven and mechanism-informed computational modeling. Individual models of biological networks (such as metabolism, transcription, and signaling) have played pivotal roles in driving microbial research through the years. These networks, however, are highly interconnected and function in concert, a fact that has led to the development of a variety of approaches aimed at simulating the integrated functions of two or more network types. Though the task of integrating these different models is fraught with new challenges, the large amounts of high-throughput data sets being generated, and algorithms being developed, mean that the time is at hand for concerted efforts to build integrated regulatory-metabolic networks in a data-driven fashion. Lastly, in this perspective, we review current approaches for constructing integrated regulatory-metabolic models and outline new strategies for future development of these network models for any microbial system.

  12. Data-driven integration of genome-scale regulatory and metabolic network models

    DOE PAGES

    Imam, Saheed; Schauble, Sascha; Brooks, Aaron N.; ...

    2015-05-05

    Microbes are diverse and extremely versatile organisms that play vital roles in all ecological niches. Understanding and harnessing microbial systems will be key to the sustainability of our planet. One approach to improving our knowledge of microbial processes is through data-driven and mechanism-informed computational modeling. Individual models of biological networks (such as metabolism, transcription, and signaling) have played pivotal roles in driving microbial research through the years. These networks, however, are highly interconnected and function in concert, a fact that has led to the development of a variety of approaches aimed at simulating the integrated functions of two or more network types. Though the task of integrating these different models is fraught with new challenges, the large amounts of high-throughput data sets being generated, and algorithms being developed, mean that the time is at hand for concerted efforts to build integrated regulatory-metabolic networks in a data-driven fashion. Lastly, in this perspective, we review current approaches for constructing integrated regulatory-metabolic models and outline new strategies for future development of these network models for any microbial system.

  13. Data-driven Modeling of Metal-oxide Sensors with Dynamic Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Gosangi, Rakesh; Gutierrez-Osuna, Ricardo

    2011-09-01

    We present a data-driven probabilistic framework to model the transient response of MOX sensors modulated with a sequence of voltage steps. Analytical models of MOX sensors are usually built based on the physico-chemical properties of the sensing materials. Although such models provide insight into sensor behavior, building them requires a thorough understanding of the underlying operating principles. Here we propose a data-driven approach to characterize the dynamical relationship between sensor inputs and outputs. Namely, we use dynamic Bayesian networks (DBNs), probabilistic models that represent temporal relations between a set of random variables. We identify a set of control variables that influence the sensor responses, create a graphical representation that captures the causal relations between these variables, and finally train the model with experimental data. We validated the approach on experimental data in terms of predictive accuracy and classification performance. Our results show that DBNs can accurately predict the dynamic response of MOX sensors, as well as capture the discriminatory information present in the sensor transients.
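    As a loose illustration of the counting-based side of such probabilistic models (not the authors' DBN), the sketch below learns a conditional probability table P(S_t | S_{t-1}, V_t) for a discretized sensor state and voltage input by frequency counting over a hypothetical training sequence:

```python
from collections import defaultdict

# Hypothetical training sequence of (voltage_level, sensor_state) pairs,
# both discretized to {0, 1}.
sequence = [(0, 0), (1, 1), (1, 1), (0, 0), (1, 1), (0, 1), (0, 0), (1, 1),
            (1, 1), (0, 0)]

# Learn P(S_t | S_{t-1}, V_t) by frequency counting over transitions.
counts = defaultdict(lambda: defaultdict(int))
for (_, s_prev), (v, s) in zip(sequence, sequence[1:]):
    counts[(s_prev, v)][s] += 1

def predict(s_prev, v):
    c = counts[(s_prev, v)]
    total = sum(c.values())
    if total == 0:
        return {0: 0.5, 1: 0.5}   # fall back to a uniform prior when unseen
    return {s: n / total for s, n in c.items()}

dist = predict(s_prev=1, v=0)
print(dist)  # → {0: 0.75, 1: 0.25}
```

    A real DBN would additionally encode the causal graph over several control variables and use proper inference, but the table-learning step has this flavor.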

  14. Data-Driven Learning of Q-Matrix

    ERIC Educational Resources Information Center

    Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2012-01-01

    The recent surge of interest in cognitive assessment has led to developments of novel statistical models for diagnostic classification. Central to many such models is the well-known "Q"-matrix, which specifies the item-attribute relationships. This article proposes a data-driven approach to identification of the "Q"-matrix and estimation of…

  15. Empirical Performance Model-Driven Data Layout Optimization and Library Call Selection for Tensor Contraction Expressions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Qingda; Gao, Xiaoyang; Krishnamoorthy, Sriram

    Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of loop unrolling is determined by executing different versions of the computation. In contrast, optimizing compilers use a model-driven approach to program transformation. While the model-driven approach of optimizing compilers is generally orders of magnitude faster than ATLAS-like library generators, its effectiveness can be limited by the accuracy of the performance models used. In this paper, we describe an approach where a class of computations is modeled in terms of constituent operations that are empirically measured, thereby allowing modeling of the overall execution time. The performance model with empirically determined cost components is used to perform data layout optimization together with the selection of library calls and layout transformations in the context of the Tensor Contraction Engine, a compiler for a high-level domain-specific language for expressing computational models in quantum chemistry. The effectiveness of the approach is demonstrated through experimental measurements on representative computations from quantum chemistry.

  16. Vector Autoregression, Structural Equation Modeling, and Their Synthesis in Neuroimaging Data Analysis

    PubMed Central

    Chen, Gang; Glen, Daniel R.; Saad, Ziad S.; Hamilton, J. Paul; Thomason, Moriah E.; Gotlib, Ian H.; Cox, Robert W.

    2011-01-01

    Vector autoregression (VAR) and structural equation modeling (SEM) are two popular brain-network modeling tools. VAR, which is a data-driven approach, assumes that connected regions exert time-lagged influences on one another. In contrast, the hypothesis-driven SEM is used to validate an existing connectivity model in which connected regions have contemporaneous interactions among them. We present the two models in detail and discuss their applicability to FMRI data and their interpretational limits. We also propose a unified approach that models both lagged and contemporaneous effects. The unifying model, structural vector autoregression (SVAR), may improve statistical and explanatory power, and avoids some prevalent pitfalls that can occur when VAR and SEM are utilized separately. PMID:21975109
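    The data-driven half of this comparison can be illustrated with a minimal first-order VAR fitted by least squares; the 2x2 coefficient matrix and noiseless series below are invented for illustration:

```python
# First-order vector autoregression x_t = A x_{t-1}, with A estimated from
# the series by ordinary least squares (normal equations, 2x2 case).

A_true = [[0.8, 0.1],
          [-0.2, 0.7]]

def step(x):
    return [A_true[0][0]*x[0] + A_true[0][1]*x[1],
            A_true[1][0]*x[0] + A_true[1][1]*x[1]]

# Simulate a noiseless series from several starting points so the
# regressors span the plane.
series = []
for x0 in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    x = x0
    for _ in range(5):
        nx = step(x)
        series.append((x, nx))
        x = nx

# Least squares for each row of A: solve the 2x2 normal equations.
Sxx = sum(p[0]*p[0] for p, _ in series)
Sxy = sum(p[0]*p[1] for p, _ in series)
Syy = sum(p[1]*p[1] for p, _ in series)
det = Sxx*Syy - Sxy*Sxy

A_hat = []
for i in range(2):
    Sxz = sum(p[0]*q[i] for p, q in series)
    Syz = sum(p[1]*q[i] for p, q in series)
    A_hat.append([(Syy*Sxz - Sxy*Syz) / det,
                  (Sxx*Syz - Sxy*Sxz) / det])

print(A_hat)  # recovers A_true up to floating-point error
```

    With noise and FMRI-scale dimensionality the estimation is of course harder, but the lagged-influence structure VAR assumes is exactly this.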

  17. Combining Knowledge and Data Driven Insights for Identifying Risk Factors using Electronic Health Records

    PubMed Central

    Sun, Jimeng; Hu, Jianying; Luo, Dijun; Markatou, Marianthi; Wang, Fei; Edabollahi, Shahram; Steinhubl, Steven E.; Daar, Zahra; Stewart, Walter F.

    2012-01-01

    Background: The ability to identify the risk factors related to an adverse condition, e.g., a heart failure (HF) diagnosis, is very important for improving care quality and reducing cost. Existing approaches for risk factor identification are either knowledge driven (from guidelines or literature) or data driven (from observational data). No existing method provides a model that effectively combines expert knowledge with data-driven insight for risk factor identification. Methods: We present a systematic approach to enhance known knowledge-based risk factors with additional potential risk factors derived from data. The core of our approach is a sparse regression model with regularization terms that correspond to both knowledge- and data-driven risk factors. Results: The approach is validated using a large dataset containing 4,644 heart failure cases and 45,981 controls. The outpatient electronic health records (EHRs) for these patients include diagnoses, medications, and lab results from 2003–2010. We demonstrate that the proposed method can identify complementary risk factors that are not among the existing known factors and can better predict the onset of HF. We quantitatively compare different sets of risk factors in the context of predicting the onset of HF using the Area Under the ROC Curve (AUC) as the performance metric. The combined knowledge- and data-driven risk factors significantly outperform knowledge-based risk factors alone. Furthermore, the additional risk factors were confirmed to be clinically meaningful by a cardiologist. Conclusion: We present a systematic framework for combining knowledge- and data-driven insights for risk factor identification. We demonstrate the power of this framework in the context of predicting the onset of HF, where our approach can successfully identify intuitive and predictive risk factors beyond a set of known HF risk factors. PMID:23304365
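    The core modeling idea (a regression whose regularization treats known and candidate risk factors differently) can be sketched with a small ridge-style example; the data, penalty weights, and training loop below are invented illustrations, not the paper's model:

```python
import random

random.seed(1)

# Columns: [known_factor, useful_candidate, noise_candidate]; only the
# first two actually drive the outcome in this synthetic data.
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [x[0] * 1.0 + x[1] * 0.8 + random.gauss(0, 0.1) for x in X]

# Light penalty on the known factor, heavy penalty on the candidates:
# only candidates with strong support in the data keep sizeable weights.
penalty = [0.01, 1.0, 1.0]

w, lr, n = [0.0, 0.0, 0.0], 0.01, len(X)
for _ in range(1000):
    grad = [0.0, 0.0, 0.0]
    for xi, yi in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
        for j in range(3):
            grad[j] += err * xi[j]
    for j in range(3):
        # Gradient of (1/2n)||Xw - y||^2 plus a per-coefficient L2 penalty.
        w[j] -= lr * (grad[j] / n + penalty[j] * w[j])

print([round(v, 2) for v in w])  # known factor kept; noise column shrunk near zero
```

    The paper uses a sparse (L1-style) penalty rather than this L2 sketch, but the asymmetry between knowledge-based and candidate terms is the shared ingredient.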

  18. Autonomous Soil Assessment System: A Data-Driven Approach to Planetary Mobility Hazard Detection

    NASA Astrophysics Data System (ADS)

    Raimalwala, K.; Faragalli, M.; Reid, E.

    2018-04-01

    The Autonomous Soil Assessment System predicts mobility hazards for rovers. Its development and performance are presented, with focus on its data-driven models, machine learning algorithms, and real-time sensor data fusion for predictive analytics.

  19. Computational Fluid Dynamics Simulation of Flows in an Oxidation Ditch Driven by a New Surface Aerator.

    PubMed

    Huang, Weidong; Li, Kun; Wang, Gan; Wang, Yingzhe

    2013-11-01

    In this article, we present a newly designed inverse umbrella surface aerator and test its performance in driving the flow of an oxidation ditch. Results show that it performs better in driving the oxidation ditch than the original design, with a higher average velocity and a more uniform flow field. We also present a computational fluid dynamics model for predicting the flow field in an oxidation ditch driven by a surface aerator. An improved momentum source term approach for simulating the flow field of an oxidation ditch driven by an inverse umbrella surface aerator was developed and validated through experiments. Four turbulence models were investigated with this approach, including the standard k - ɛ model, the RNG k - ɛ model, the realizable k - ɛ model, and the Reynolds stress model, and the predicted data were compared with those calculated with the multiple rotating reference frame (MRF) and sliding mesh (SM) approaches. Results of the momentum source term approach are in good agreement with the experimental data, and its prediction accuracy is better than MRF and close to SM. It is also found that the momentum source term approach has lower computational expense, is simpler to preprocess, and is easier to use.

  20. Human systems immunology: hypothesis-based modeling and unbiased data-driven approaches.

    PubMed

    Arazi, Arnon; Pendergraft, William F; Ribeiro, Ruy M; Perelson, Alan S; Hacohen, Nir

    2013-10-31

    Systems immunology is an emerging paradigm that aims at a more systematic and quantitative understanding of the immune system. Two major approaches have been utilized to date in this field: unbiased data-driven modeling to comprehensively identify molecular and cellular components of a system and their interactions; and hypothesis-based quantitative modeling to understand the operating principles of a system by extracting a minimal set of variables and rules underlying them. In this review, we describe applications of the two approaches to the study of viral infections and autoimmune diseases in humans, and discuss possible ways by which these two approaches can synergize when applied to human immunology. Copyright © 2012 Elsevier Ltd. All rights reserved.
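    As a hedged illustration of the hypothesis-based strand discussed here, the sketch below integrates the standard target-cell-limited model of viral infection (dT/dt = -βTV, dI/dt = βTV - δI, dV/dt = pI - cV) with a simple Euler scheme; the parameter values are arbitrary illustrative choices, not fitted to any data:

```python
# Target-cell-limited model: susceptible cells T, infected cells I, virions V.
beta, delta, p, c = 1e-5, 0.5, 10.0, 3.0
T, I, V = 1e5, 0.0, 10.0
dt = 0.001

peak_V = V
for _ in range(20000):          # integrate 20 time units with forward Euler
    dT = -beta * T * V
    dI = beta * T * V - delta * I
    dV = p * I - c * V
    T += dT * dt
    I += dI * dt
    V += dV * dt
    peak_V = max(peak_V, V)

print(peak_V > V)  # viral load rose to a peak and is now declining
```

    This is the "minimal set of variables and rules" style the review describes: three state variables and four rate constants reproduce the rise-and-fall of acute infection, and fitting such rates to patient data is where the quantitative-modeling work happens.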

  1. Combining Model-Based and Feature-Driven Diagnosis Approaches - A Case Study on Electromechanical Actuators

    NASA Technical Reports Server (NTRS)

    Narasimhan, Sriram; Roychoudhury, Indranil; Balaban, Edward; Saxena, Abhinav

    2010-01-01

    Model-based diagnosis typically uses analytical redundancy to compare predictions from a model against observations from the system being diagnosed. However, this approach does not work very well when it is not feasible to create analytic relations describing all the observed data, e.g., for vibration data, which is usually sampled at very high rates and requires very detailed finite element models to describe its behavior. In such cases, features (in the time and frequency domains) that contain diagnostic information are extracted from the data. Since this is a computationally intensive process, it is not efficient to extract all the features all the time. In this paper, we present an approach that combines the analytic model-based and feature-driven diagnosis approaches. The analytic approach is used to reduce the set of possible faults, and then features are chosen to best distinguish among the remaining faults. We describe an implementation of this approach on the Flyable Electro-mechanical Actuator (FLEA) test bed.

  2. A Conceptual Analytics Model for an Outcome-Driven Quality Management Framework as Part of Professional Healthcare Education.

    PubMed

    Hervatis, Vasilis; Loe, Alan; Barman, Linda; O'Donoghue, John; Zary, Nabil

    2015-10-06

    Preparing the future health care professional workforce in a changing world is a significant undertaking. Educators and other decision makers look to evidence-based knowledge to improve the quality of education. Analytics, the use of data to generate insights and support decisions, has been applied successfully across numerous application domains. Health care professional education is one area where great potential is yet to be realized. Previous research on academic and learning analytics has mainly focused on technical issues; the focus of this study relates to its practical implementation in the setting of health care education. The aim of this study is to create a conceptual model for a deeper understanding of the process of synthesizing and transforming data into information to support educators' decision making. A deductive case study approach was applied to develop the conceptual model. The analytics loop works both in theory and in practice. The conceptual model encompasses the underlying data, the quality indicators, and decision support for educators. The model illustrates how a theory can be applied to a traditional data-driven analytics approach, as well as to the context- or need-driven analytics approach.

  3. A Conceptual Analytics Model for an Outcome-Driven Quality Management Framework as Part of Professional Healthcare Education

    PubMed Central

    Loe, Alan; Barman, Linda; O'Donoghue, John; Zary, Nabil

    2015-01-01

    Background Preparing the future health care professional workforce in a changing world is a significant undertaking. Educators and other decision makers look to evidence-based knowledge to improve the quality of education. Analytics, the use of data to generate insights and support decisions, has been applied successfully across numerous application domains. Health care professional education is one area where great potential is yet to be realized. Previous research on academic and learning analytics has mainly focused on technical issues; the focus of this study relates to its practical implementation in the setting of health care education. Objective The aim of this study is to create a conceptual model for a deeper understanding of the process of synthesizing and transforming data into information to support educators' decision making. Methods A deductive case study approach was applied to develop the conceptual model. Results The analytics loop works both in theory and in practice. The conceptual model encompasses the underlying data, the quality indicators, and decision support for educators. Conclusions The model illustrates how a theory can be applied to a traditional data-driven analytics approach, as well as to the context- or need-driven analytics approach. PMID:27731840

  4. Data-driven modeling and predictive control for boiler-turbine unit using fuzzy clustering and subspace methods.

    PubMed

    Wu, Xiao; Shen, Jiong; Li, Yiguo; Lee, Kwang Y

    2014-05-01

    This paper develops a novel data-driven fuzzy modeling strategy and predictive controller for a boiler-turbine unit using fuzzy clustering and subspace identification (SID) methods. To deal with the nonlinear behavior of the boiler-turbine unit, fuzzy clustering is used to provide an appropriate division of the operation region and develop the structure of the fuzzy model. Then, by combining the input data with the corresponding fuzzy membership functions, the SID method is extended to extract the local state-space model parameters. Owing to the advantages of both methods, the resulting fuzzy model can represent the boiler-turbine unit very closely, and a fuzzy model predictive controller is designed based on this model. As an alternative approach, a direct data-driven fuzzy predictive control is also developed following the same clustering and subspace methods, where intermediate subspace matrices developed during the identification procedure are utilized directly as the predictor. Simulation results show the advantages and effectiveness of the proposed approach.
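
    The regime-partitioning step described above can be sketched with a plain fuzzy c-means implementation. This is a minimal illustration of the clustering idea, not the paper's algorithm; the data, cluster count, and parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic scalar operating data drawn from two regimes (e.g., two load levels).
x = np.concatenate([rng.normal(1.0, 0.2, 100), rng.normal(4.0, 0.2, 100)])[:, None]

def fuzzy_cmeans(data, c=2, m=2.0, iters=50):
    """Plain fuzzy c-means: returns cluster centers and membership matrix U (n x c)."""
    n = len(data)
    U = rng.dirichlet(np.ones(c), size=n)            # random memberships, rows sum to 1
    for _ in range(iters):
        w = U ** m
        centers = (w.T @ data) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(data[:, None, :] - centers[None], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))               # u_ik proportional to d^(-2/(m-1))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

centers, U = fuzzy_cmeans(x)
```

    The membership matrix U would then weight the data when identifying each local state-space model.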

  5. Computational Fluid Dynamics Simulation of Flows in an Oxidation Ditch Driven by a New Surface Aerator

    PubMed Central

    Huang, Weidong; Li, Kun; Wang, Gan; Wang, Yingzhe

    2013-01-01

    In this article, we present a newly designed inverse umbrella surface aerator and test its performance in driving flow in an oxidation ditch. Results show that it drives the oxidation ditch better than the original design, with higher average velocity and a more uniform flow field. We also present a computational fluid dynamics model for predicting the flow field in an oxidation ditch driven by a surface aerator. The improved momentum source term approach to simulating the flow field of an oxidation ditch driven by an inverse umbrella surface aerator was developed and validated through experiments. Four turbulence models were investigated with the approach, including the standard k−ɛ model, RNG k−ɛ model, realizable k−ɛ model, and Reynolds stress model, and the predicted data were compared with those calculated with the multiple rotating reference frame (MRF) approach and the sliding mesh (SM) approach. Results of the momentum source term approach are in good agreement with the experimental data, and its prediction accuracy is better than MRF and close to SM. It is also found that the momentum source term approach has lower computational expense, is simpler to preprocess, and is easier to use. PMID:24302850

  6. Reflection of a Year Long Model-Driven Business and UI Modeling Development Project

    NASA Astrophysics Data System (ADS)

    Sukaviriya, Noi; Mani, Senthil; Sinha, Vibha

    Model-driven software development enables users to specify an application at a high level - a level that better matches the problem domain. It also promises better analysis and automation. Our work brings together two collaborating domains - business processes and human interactions - to build an application. Business modeling expresses business operations and flows, then generates the business flow implementation. Human interaction modeling expresses a UI design and its relationship with business data, logic, and flow, and can generate a working UI. This double modeling approach automates the production of a working system with UI and business logic connected. This paper discusses the human aspects of this modeling approach after a year of building a procurement outsourcing contract application with it - the result of which was deployed in December 2008. The paper discusses the happy endings and some heartaches across multiple areas. We end with insights on how a model-driven approach could serve the humans in the process better.

  7. A data-driven approach for modeling post-fire debris-flow volumes and their uncertainty

    USGS Publications Warehouse

    Friedel, Michael J.

    2011-01-01

    This study demonstrates the novel application of genetic programming to evolve nonlinear post-fire debris-flow volume equations from variables associated with a data-driven conceptual model of the western United States. The search space is constrained using a multi-component objective function that simultaneously minimizes root-mean-squared and unit errors for the evolution of the fittest equations. An optimization technique is then used to estimate the limits of nonlinear prediction uncertainty associated with the debris-flow equations. In contrast to a published multiple linear regression three-variable equation, linking basin area with slopes greater than or equal to 30 percent, burn severity characterized as area burned moderate plus high, and total storm rainfall, the data-driven approach discovers many nonlinear and several dimensionally consistent equations that are unbiased and have less prediction uncertainty. Of the nonlinear equations, the best performance (lowest prediction uncertainty) is achieved when using three variables: average basin slope, total burned area, and total storm rainfall. Further reduction in uncertainty is possible for the nonlinear equations when dimensional consistency is not a priority and by subsequently applying a gradient solver to the fittest solutions. The data-driven modeling approach can be applied to nonlinear multivariate problems in all fields of study.
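
    As a hedged illustration of the kind of nonlinear multiplicative volume equation discussed above, a power-law model in the three named variables can be fit by linear least squares in log space. This is not the paper's genetic-programming search; all data, ranges, and exponents below are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 80
slope = rng.uniform(10, 40, n)        # average basin slope (%)   (hypothetical)
burned = rng.uniform(0.5, 20, n)      # total burned area (km^2)  (hypothetical)
rain = rng.uniform(5, 60, n)          # total storm rainfall (mm) (hypothetical)

# Synthetic "observed" volumes from a known multiplicative law plus noise.
vol = 50 * slope**0.8 * burned**1.1 * rain**0.6 * rng.lognormal(0, 0.1, n)

# Fit log V = log c + a*log S + b*log B + g*log R by ordinary least squares.
X = np.column_stack([np.ones(n), np.log(slope), np.log(burned), np.log(rain)])
coef, *_ = np.linalg.lstsq(X, np.log(vol), rcond=None)
```

    Genetic programming differs in that it searches over equation *forms* as well as coefficients; the log-linear fit above only recovers coefficients of an assumed form.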

  8. The Data-Driven Approach to Spectroscopic Analyses

    NASA Astrophysics Data System (ADS)

    Ness, M.

    2018-01-01

    I review the data-driven approach to spectroscopy, The Cannon, a method for deriving fundamental diagnostics of galaxy formation, namely precise chemical compositions and stellar ages, across the many stellar surveys that are mapping the Milky Way. With The Cannon, the abundances and stellar parameters from the multitude of stellar surveys can be placed directly on the same scale, using stars in common between the surveys. Furthermore, the information that resides in the data can be fully extracted; this has resulted in higher-precision stellar parameters and abundances being delivered from spectroscopic data and has opened up new avenues in galactic archeology, for example, in the determination of ages for red giant stars across the Galactic disk. Coupled with Gaia distances, proper motions, and derived orbit families, the stellar age and individual abundance information delivered at the precision obtained with the data-driven approach provides very strong constraints on the evolution and birthplace of stars in the Milky Way. I review the role of data-driven spectroscopy as we enter the era where we have both the data and the tools to build the ultimate conglomerate of galactic information, and highlight further applications of data-driven models in the coming decade.
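
    The core of a Cannon-style data-driven model is a per-pixel regression of flux on stellar labels, trained on a labelled reference set and then inverted to infer labels for new spectra. The following is a toy linear-in-labels sketch of that idea only, not the published model (The Cannon uses a richer label vectorizer and per-pixel scatter terms); all sizes and values are synthetic.

```python
import numpy as np

rng = np.random.default_rng(7)

# Training set: "stars" with known labels (a constant term plus two
# Teff-like / [Fe/H]-like labels), each producing a spectrum that is
# linear in the labels under this toy generative model.
n_stars, n_pixels = 50, 200
labels = np.column_stack([np.ones(n_stars),
                          rng.uniform(-1, 1, n_stars),
                          rng.uniform(-1, 1, n_stars)])
theta_true = rng.standard_normal((3, n_pixels))
flux = labels @ theta_true + 0.01 * rng.standard_normal((n_stars, n_pixels))

# Training step: fit per-pixel coefficients from the labelled reference set.
theta_hat, *_ = np.linalg.lstsq(labels, flux, rcond=None)

# Test step: infer the labels of a new star from its spectrum alone
# (the leading entry plays the role of the constant term and is recovered too).
true_new = np.array([1.0, 0.3, -0.5])
flux_new = true_new @ theta_true
inferred, *_ = np.linalg.lstsq(theta_hat.T, flux_new, rcond=None)
```

    Placing surveys on a common scale then amounts to training on stars observed by both surveys.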

  9. Statistical and engineering methods for model enhancement

    NASA Astrophysics Data System (ADS)

    Chang, Chia-Jung

    Models that describe the performance of a physical process are essential for quality prediction, experimental planning, process control, and optimization. Engineering models developed from the underlying physics/mechanics of the process, such as analytic models or finite element models, are widely used to capture the deterministic trend of the process. However, there usually exists stochastic randomness in the system, which may introduce discrepancy between physics-based model predictions and observations in reality. Alternatively, statistical models can be developed to obtain predictions purely based on the data generated from the process. However, such models tend to perform poorly when predictions are made away from the observed data points. This dissertation contributes to model enhancement research by integrating physics-based models and statistical models to mitigate their individual drawbacks and provide models with better accuracy by combining the strengths of both. The proposed model enhancement methodologies comprise two streams: (1) data-driven enhancement approaches and (2) engineering-driven enhancement approaches. Through these efforts, more adequate models are obtained, which leads to better performance in system forecasting, process monitoring, and decision optimization. Among data-driven enhancement approaches, the Gaussian process (GP) model provides a powerful methodology for calibrating a physical model in the presence of model uncertainties. However, if the data contain systematic experimental errors, the GP model can lead to an unnecessarily complex adjustment of the physical model. In Chapter 2, we propose a novel enhancement procedure, named “Minimal Adjustment”, which brings the physical model closer to the data by making minimal changes to it. 
    This is achieved by approximating the GP model with a linear regression model and then applying simultaneous variable selection over the model and experimental bias terms. Two real examples and simulations are presented to demonstrate the advantages of the proposed approach. Instead of enhancing the model from a data-driven perspective, an alternative approach is to adjust the model by incorporating additional domain or engineering knowledge when available. This often leads to models that are very simple and easy to interpret. The concepts of engineering-driven enhancement are carried through two applications to demonstrate the proposed methodologies. In the first application, which focuses on polymer composite quality, nanoparticle dispersion has been identified as a crucial factor affecting the mechanical properties. Transmission electron microscopy (TEM) images are commonly used to represent nanoparticle dispersion without further quantification of its characteristics. In Chapter 3, we develop an engineering-driven nonhomogeneous Poisson random field modeling strategy to characterize the nanoparticle dispersion status of nanocomposite polymers, quantitatively representing the nanomaterial quality presented through image data. The model parameters are estimated with a Bayesian MCMC technique to overcome the challenge of the limited amount of accessible data due to time-consuming sampling schemes. The second application statistically calibrates the engineering-driven force models of the laser-assisted micro milling (LAMM) process, which facilitates a systematic understanding and optimization of the targeted processes. In Chapter 4, the force prediction interval is derived by incorporating the variability in the runout parameters as well as the variability in the measured cutting forces. The experimental results indicate that the model predicts the cutting force profile with good accuracy using a 95% confidence interval. 
    To conclude, this dissertation draws attention to model enhancement, which has considerable impact on the modeling, design, and optimization of various processes and systems. The fundamental methodologies of model enhancement are developed and then applied to various applications. This research produced engineering-compliant models for adequate system prediction from observational data with complex variable relationships and uncertainty, which facilitate process planning, monitoring, and real-time control.
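
    The data-driven enhancement idea, bringing a physics model closer to observations by jointly estimating an experimental bias term and a model adjustment, can be sketched on a toy Hooke's-law example. This is a minimal stand-in for the general concept, not the dissertation's Minimal Adjustment procedure; all numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(2)

# "Physics" model: Hooke's law F = k x with a nominal stiffness k;
# the synthetic "reality" has a different k and a constant offset bias.
x = np.linspace(0.1, 2.0, 40)
k_true, bias_true = 5.0, 0.7
F_obs = k_true * x + bias_true + rng.normal(0, 0.05, x.size)

k_nominal = 4.5
physics_pred = k_nominal * x

# Enhancement: regress the physics-model residuals on [1, x], i.e. jointly
# estimate a bias term (adj[0]) and a correction to the stiffness (adj[1]).
A = np.column_stack([np.ones_like(x), x])
adj, *_ = np.linalg.lstsq(A, F_obs - physics_pred, rcond=None)
enhanced_pred = physics_pred + A @ adj

rmse_physics = np.sqrt(np.mean((F_obs - physics_pred) ** 2))
rmse_enhanced = np.sqrt(np.mean((F_obs - enhanced_pred) ** 2))
```

    Here the adjustment stays interpretable: one term is an experimental bias, the other a physical-parameter correction.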

  10. Machine Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chikkagoudar, Satish; Chatterjee, Samrat; Thomas, Dennis G.

    The absence of a robust and unified theory of cyber dynamics presents challenges and opportunities for using machine learning based data-driven approaches to further the understanding of the behavior of such complex systems. Analysts can also use machine learning approaches to gain operational insights. In order to be operationally beneficial, cybersecurity machine learning based models need to have the ability to: (1) represent a real-world system, (2) infer system properties, and (3) learn and adapt based on expert knowledge and observations. Probabilistic models and probabilistic graphical models provide these necessary properties and are further explored in this chapter. Bayesian networks and hidden Markov models are introduced as examples of widely used data-driven classification/modeling strategies.
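
    A minimal sketch of one of the models named above: the forward algorithm for a hidden Markov model, on a made-up two-state "benign/compromised host" example. All probabilities and the alert-level encoding are invented for illustration.

```python
import numpy as np

# Toy 2-state HMM: hidden host states emit discrete alert levels 0/1/2.
pi = np.array([0.8, 0.2])                      # initial state distribution
A = np.array([[0.9, 0.1],                      # state transition matrix
              [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1],                 # emission probabilities per state
              [0.1, 0.3, 0.6]])

def forward(obs):
    """Forward algorithm: P(observation sequence) under the HMM."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]          # propagate then weight by emission
    return alpha.sum()

p_quiet = forward([0, 0, 0])   # three low-alert observations
p_noisy = forward([2, 2, 2])   # three high-alert observations
```

    An analyst could flag hosts whose recent observation sequences have low likelihood under the "benign-dominated" model.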

  11. Can We Practically Bring Physics-based Modeling Into Operational Analytics Tools?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Granderson, Jessica; Bonvini, Marco; Piette, Mary Ann

    Analytics software is increasingly used to improve and maintain operational efficiency in commercial buildings. Energy managers, owners, and operators are using a diversity of commercial offerings often referred to as Energy Information Systems, Fault Detection and Diagnostic (FDD) systems, or more broadly Energy Management and Information Systems, to cost-effectively enable savings on the order of ten to twenty percent. Most of these systems use data from meters and sensors, with rule-based and/or data-driven models, to characterize system and building behavior. In contrast, physics-based modeling uses first principles and engineering models (e.g., efficiency curves) to characterize system and building behavior. Historically, these physics-based approaches have been used in the design phase of the building life cycle or in retrofit analyses. Researchers have begun exploring the benefits of integrating physics-based models with operational data analytics tools, bridging the gap between design and operations. In this paper, we detail the development and operator use of a software tool that uses hybrid data-driven and physics-based approaches to cooling plant FDD and optimization. Specifically, we describe the system architecture, models, and FDD and optimization algorithms; advantages and disadvantages with respect to purely data-driven approaches; and practical implications for scaling and replicating these techniques. Finally, we conclude with an evaluation of the future potential for such tools and future research opportunities.

  12. A Systems' Biology Approach to Study MicroRNA-Mediated Gene Regulatory Networks

    PubMed Central

    Kunz, Manfred; Vera, Julio; Wolkenhauer, Olaf

    2013-01-01

    MicroRNAs (miRNAs) are potent effectors in gene regulatory networks where aberrant miRNA expression can contribute to human diseases such as cancer. For a better understanding of the regulatory role of miRNAs in coordinating gene expression, we here present a systems biology approach combining data-driven modeling and model-driven experiments. Such an approach is characterized by an iterative process, including biological data acquisition and integration, network construction, mathematical modeling and experimental validation. To demonstrate the application of this approach, we adopt it to investigate mechanisms of collective repression on p21 by multiple miRNAs. We first construct a p21 regulatory network based on data from the literature and further expand it using algorithms that predict molecular interactions. Based on the network structure, a detailed mechanistic model is established and its parameter values are determined using data. Finally, the calibrated model is used to study the effect of different miRNA expression profiles and cooperative target regulation on p21 expression levels in different biological contexts. PMID:24350286
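
    A toy version of the collective-repression idea, p21 production repressed multiplicatively by several miRNAs, can be written as a one-state kinetic model. The rate law and all parameters below are invented for illustration and are not the paper's calibrated model.

```python
import numpy as np

def simulate_p21(mirna_levels, t_end=50.0, dt=0.01):
    """Toy kinetic model: p21 protein is produced at a constant rate and
    degraded linearly, with production repressed multiplicatively by each
    miRNA present (made-up parameters, forward Euler integration)."""
    k_prod, k_deg, k_rep = 1.0, 0.1, 0.5
    repression = np.prod([1.0 / (1.0 + k_rep * m) for m in mirna_levels]) if mirna_levels else 1.0
    p = 0.0
    for _ in range(int(t_end / dt)):
        p += dt * (k_prod * repression - k_deg * p)
    return p

p_none = simulate_p21([])            # no miRNA
p_one  = simulate_p21([2.0])         # one miRNA
p_two  = simulate_p21([2.0, 2.0])    # cooperative repression by two miRNAs
```

    Even this toy shows the qualitative effect studied in the paper: each additional repressing miRNA lowers the steady-state p21 level.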

  13. Data-driven non-linear elasticity: constitutive manifold construction and problem discretization

    NASA Astrophysics Data System (ADS)

    Ibañez, Ruben; Borzacchiello, Domenico; Aguado, Jose Vicente; Abisset-Chavanne, Emmanuelle; Cueto, Elias; Ladeveze, Pierre; Chinesta, Francisco

    2017-11-01

    The use of constitutive equations calibrated from data has been implemented into standard numerical solvers for successfully addressing a variety of problems encountered in simulation-based engineering sciences (SBES). However, the complexity keeps increasing due to the need for increasingly detailed models as well as the use of engineered materials. Data-driven simulation constitutes a potential change of paradigm in SBES. Standard simulation in computational mechanics is based on the use of two very different types of equations. The first, of axiomatic character, is related to balance laws (momentum, mass, energy, etc.), whereas the second consists of models that scientists have extracted from collected data, either natural or synthetic. Data-driven (or data-intensive) simulation consists of directly linking experimental data to computers in order to perform numerical simulations. These simulations employ the universally recognized laws while minimizing the need for explicit, often phenomenological, models. The main drawback of such an approach is the large amount of required data, some of which are inaccessible with today's testing facilities. Such difficulty can be circumvented in many cases, and in any case alleviated, by considering complex tests, collecting as many data as possible, and then using a data-driven inverse approach to generate the whole constitutive manifold from a few complex experimental tests, as discussed in the present work.
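
    In the simplest setting, the data-driven paradigm replaces a fitted constitutive equation with a direct search over the experimental stress-strain data cloud. For a single 1-D bar under a prescribed force, where equilibrium alone fixes the stress, the idea degenerates to a nearest-data-point lookup; this deliberately minimal sketch (with synthetic data and made-up numbers) is only meant to convey that no constitutive equation is ever fit.

```python
import numpy as np

rng = np.random.default_rng(8)

# Synthetic stress-strain data cloud sampled from a hidden nonlinear material.
eps_data = np.linspace(0, 0.1, 200)
sig_data = 200 * np.tanh(30 * eps_data) + rng.normal(0, 0.5, 200)

# Model-free solve for one bar: equilibrium fixes the stress (sigma = F / A);
# the material response is read off the nearest data point instead of a
# fitted constitutive equation.
F, A = 1500.0, 10.0
sig_eq = F / A
i = np.argmin((sig_data - sig_eq) ** 2)
eps_solution, sig_solution = eps_data[i], sig_data[i]
```

    In real problems with many elements, equilibrium no longer fixes each stress, and the method becomes a constrained distance minimization between the mechanical state and the data manifold.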

  14. Design and experiment of data-driven modeling and flutter control of a prototype wing

    NASA Astrophysics Data System (ADS)

    Lum, Kai-Yew; Xu, Cai-Lin; Lu, Zhenbo; Lai, Kwok-Leung; Cui, Yongdong

    2017-06-01

    This paper presents an approach for data-driven modeling of aeroelasticity and its application to flutter control design of a wind-tunnel wing model. Modeling is centered on system identification of unsteady aerodynamic loads using computational fluid dynamics data, and adopts a nonlinear multivariable extension of the Hammerstein-Wiener system. The formulation is in modal coordinates of the elastic structure, and yields a reduced-order model of the aeroelastic feedback loop that is parametrized by airspeed. Flutter suppression is thus cast as a robust stabilization problem over uncertain airspeed, for which a low-order H∞ controller is computed. The paper discusses in detail parameter sensitivity and observability of the model, the former to justify the chosen model structure, and the latter to provide a criterion for physical sensor placement. Wind tunnel experiments confirm the validity of the modeling approach and the effectiveness of the control design.
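
    The Hammerstein-Wiener structure mentioned above chains a static input nonlinearity, a linear dynamic block, and a static output nonlinearity. The following sketch simulates that structure; the particular nonlinearities and coefficients are illustrative inventions, not the paper's identified aerodynamic model.

```python
import numpy as np

def hammerstein_wiener(u, a=0.8, b=0.2):
    """Hammerstein-Wiener structure: static input nonlinearity -> first-order
    linear dynamics -> static output nonlinearity (all choices illustrative)."""
    v = u + 0.3 * u**2                       # static input nonlinearity
    x = np.zeros_like(v)
    for k in range(1, len(v)):
        x[k] = a * x[k - 1] + b * v[k - 1]   # linear dynamic block
    return np.tanh(x)                        # static output nonlinearity

u = np.sin(np.linspace(0, 10, 200))          # e.g., a modal-coordinate input
y = hammerstein_wiener(u)
```

    System identification then amounts to fitting the two static maps and the linear block so that y matches CFD-generated load data.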

  15. Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach.

    PubMed

    Motakis, E S; Nason, G P; Fryzlewicz, P; Rutter, G A

    2006-10-15

    Many standard statistical techniques are effective on data that are normally distributed with constant variance. Microarray data typically violate these assumptions since they come from non-Gaussian distributions with a non-trivial mean-variance relationship. Several methods have been proposed that transform microarray data to stabilize variance and draw its distribution towards the Gaussian. Some methods, such as log or generalized log, rely on an underlying model for the data. Others, such as the spread-versus-level plot, do not. We propose an alternative data-driven multiscale approach, called the Data-Driven Haar-Fisz for microarrays (DDHFm) with replicates. DDHFm has the advantage of being 'distribution-free' in the sense that no parametric model for the underlying microarray data is required to be specified or estimated; hence, DDHFm can be applied very generally, not just to microarray data. DDHFm achieves very good variance stabilization of microarray data with replicates and produces transformed intensities that are approximately normally distributed. Simulation studies show that it performs better than other existing methods. Application of DDHFm to real one-color cDNA data validates these results. The R package of the Data-Driven Haar-Fisz transform (DDHFm) for microarrays is available in Bioconductor and CRAN.
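
    For orientation, the classical Poisson Haar-Fisz transform can be sketched in a few lines; DDHFm generalizes it by *estimating* the mean-variance function from replicates instead of assuming the square-root scaling used below. Data and parameters here are synthetic.

```python
import numpy as np

def haar_fisz(v):
    """Classical Poisson Haar-Fisz variance stabilization for a count vector
    whose length is a power of two: Haar analysis with each detail divided by
    the square root of its local mean, followed by Haar synthesis."""
    s = np.asarray(v, dtype=float)
    levels = int(np.log2(len(s)))
    details = []
    for _ in range(levels):                  # Haar analysis with Fisz scaling
        sm = (s[0::2] + s[1::2]) / 2
        d = (s[0::2] - s[1::2]) / 2
        f = np.divide(d, np.sqrt(sm), out=np.zeros_like(d), where=sm > 0)
        details.append(f)
        s = sm
    for f in reversed(details):              # Haar synthesis from scaled details
        up = np.empty(2 * len(s))
        up[0::2] = s + f
        up[1::2] = s - f
        s = up
    return s

rng = np.random.default_rng(3)
counts = rng.poisson(lam=np.repeat([5.0, 50.0], 64))   # two intensity regimes
stabilized = haar_fisz(counts)
```

    The raw halves have strongly unequal variance; after the transform the two halves have comparable variance, which is the stabilization DDHFm achieves without assuming a Poisson mean-variance law.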

  16. Geoscience Meets Social Science: A Flexible Data Driven Approach for Developing High Resolution Population Datasets at Global Scale

    NASA Astrophysics Data System (ADS)

    Rose, A.; McKee, J.; Weber, E.; Bhaduri, B. L.

    2017-12-01

    Leveraging decades of expertise in population modeling, and in response to growing demand for higher resolution population data, Oak Ridge National Laboratory is now generating LandScan HD at global scale. LandScan HD is conceived as a 90m resolution population distribution where modeling is tailored to the unique geography and data conditions of individual countries or regions by combining social, cultural, physiographic, and other information with novel geocomputation methods. Similarities among these areas are exploited in order to leverage existing training data and machine learning algorithms to rapidly scale development. Drawing on ORNL's unique set of capabilities, LandScan HD adapts highly mature population modeling methods developed for LandScan Global and LandScan USA, settlement mapping research and production in high-performance computing (HPC) environments, land use and neighborhood mapping through image segmentation, and facility-specific population density models. Adopting a flexible methodology to accommodate different geographic areas, LandScan HD accounts for the availability, completeness, and level of detail of relevant ancillary data. Beyond core population and mapped settlement inputs, these factors determine the model complexity for an area, requiring that for any given area, a data-driven model could support either a simple top-down approach, a more detailed bottom-up approach, or a hybrid approach.

  17. Data-Driven H∞ Control for Nonlinear Distributed Parameter Systems.

    PubMed

    Luo, Biao; Huang, Tingwen; Wu, Huai-Ning; Yang, Xiong

    2015-11-01

    The data-driven H∞ control problem of nonlinear distributed parameter systems is considered in this paper. An off-policy learning method is developed to learn the H∞ control policy from real system data rather than the mathematical model. First, Karhunen-Loève decomposition is used to compute the empirical eigenfunctions, which are then employed to derive a reduced-order model (ROM) of the slow subsystem based on singular perturbation theory. The H∞ control problem is reformulated based on the ROM, which can, in theory, be transformed into solving the Hamilton-Jacobi-Isaacs (HJI) equation. To learn the solution of the HJI equation from real system data, a data-driven off-policy learning approach is proposed based on the simultaneous policy update algorithm, and its convergence is proved. For implementation purposes, a neural network (NN)-based action-critic structure is developed, where a critic NN and two action NNs are employed to approximate the value function, control, and disturbance policies, respectively. Subsequently, a least-square NN weight-tuning rule is derived with the method of weighted residuals. Finally, the developed data-driven off-policy learning approach is applied to a nonlinear diffusion-reaction process, and the obtained results demonstrate its effectiveness.
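
    The Karhunen-Loève step, computing empirical eigenfunctions for the reduced-order model, is in practice an SVD of a snapshot matrix. The sketch below uses a synthetic diffusion-like field, not the paper's system, to show the mechanics.

```python
import numpy as np

# Snapshot matrix for a synthetic 1-D field: rows = time, cols = space.
xg = np.linspace(0, 1, 100)
t = np.linspace(0, 1, 60)[:, None]
snapshots = (np.exp(-3 * t) * np.sin(np.pi * xg)            # slow mode
             + 0.1 * np.exp(-30 * t) * np.sin(3 * np.pi * xg))  # fast mode

# Karhunen-Loeve / POD: the empirical eigenfunctions are the right singular
# vectors of the mean-centered snapshot matrix.
U, svals, Vt = np.linalg.svd(snapshots - snapshots.mean(axis=0),
                             full_matrices=False)
energy = svals**2 / np.sum(svals**2)      # fraction of variance per mode
modes = Vt[:2]                            # dominant empirical eigenfunctions
reduced = snapshots @ modes.T             # reduced-order (ROM) coordinates
```

    The singular-perturbation argument in the paper justifies keeping only the slow, high-energy modes when designing the controller.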

  18. A Hybrid Physics-Based Data-Driven Approach for Point-Particle Force Modeling

    NASA Astrophysics Data System (ADS)

    Moore, Chandler; Akiki, Georges; Balachandar, S.

    2017-11-01

    This study improves upon the physics-based pairwise interaction extended point-particle (PIEP) model. The PIEP model leverages a physical framework to predict fluid-mediated interactions between solid particles. While the PIEP model is a powerful tool, its pairwise assumption leads to increased error in flows with high particle volume fractions. To reduce this error, a regression algorithm is used to model the differences between the current PIEP model's predictions and the results of direct numerical simulations (DNS) for an array of monodisperse solid particles subjected to various flow conditions. The resulting statistical model and the physical PIEP model are superimposed to construct a hybrid, physics-based data-driven PIEP model. It must be noted that the performance of a purely data-driven approach without the model form provided by the physical PIEP model is substantially inferior. The hybrid model's predictive capabilities are analyzed using more DNS. In every case tested, the hybrid PIEP model's predictions are more accurate than those of the physical PIEP model. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1315138 and the U.S. DOE, NNSA, ASC Program, as a Cooperative Agreement under Contract No. DE-NA0002378.
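
    The superposition idea, a physics-based prediction plus a regression model of its residuals against reference data, can be sketched with a synthetic drag-law example. The Stokes baseline, the Schiller-Naumann-like "truth", and the polynomial regressor are illustrative stand-ins, not the PIEP model itself.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic "DNS truth": drag coefficient with a nonlinear Reynolds-number
# correction (used here purely as made-up ground truth).
Re = rng.uniform(1, 100, 200)
cd_truth = (24 / Re) * (1 + 0.15 * Re**0.687)

cd_physics = 24 / Re                  # simple physics-based (Stokes) baseline

# Data-driven part: polynomial regression on the physics model's residuals.
X = np.column_stack([np.ones_like(Re), Re, Re**2])
w, *_ = np.linalg.lstsq(X, cd_truth - cd_physics, rcond=None)

cd_hybrid = cd_physics + X @ w        # superimposed hybrid prediction

err_physics = np.mean(np.abs(cd_truth - cd_physics))
err_hybrid = np.mean(np.abs(cd_truth - cd_hybrid))
```

    As in the paper, the physics baseline supplies the model form and the statistical part only corrects its residual error, which is far easier than learning the whole force law from scratch.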

  19. Scenario driven data modelling: a method for integrating diverse sources of data and data streams

    PubMed Central

    2011-01-01

    Background Biology is rapidly becoming a data-intensive, data-driven science. It is essential that data is represented and connected in ways that best represent its full conceptual content and allow both automated integration and data-driven decision-making. Recent advancements in distributed multi-relational directed graphs, implemented in the form of the Semantic Web, make it possible to deal with complicated heterogeneous data in new and interesting ways. Results This paper presents a new approach, scenario driven data modelling (SDDM), that integrates multi-relational directed graphs with data streams. SDDM can be applied to virtually any data integration challenge with widely divergent types of data and data streams. In this work, we explored integrating genetics data with reports from traditional media. SDDM was applied to the New Delhi metallo-beta-lactamase gene (NDM-1), an emerging global health threat. The SDDM process constructed a scenario, created an RDF multi-relational directed graph that linked diverse types of data to the Semantic Web, implemented RDF conversion tools (RDFizers) to bring content into the Semantic Web, identified data streams and analytical routines to analyse those streams, and identified user requirements and graph traversals to meet end-user requirements. Conclusions We provided an example where SDDM was applied to a complex data integration challenge. The process created a model of the emerging NDM-1 health threat, identified and filled gaps in that model, and constructed reliable software that monitored data streams based on the scenario-derived multi-relational directed graph. The SDDM process significantly reduced the software requirements phase by letting the scenario and resulting multi-relational directed graph define what is possible and then set the scope of the user requirements. 
Approaches like SDDM will be critical to the future of data-intensive, data-driven science because they automate the process of converting massive data streams into usable knowledge. PMID:22165854

  20. Parameterized data-driven fuzzy model based optimal control of a semi-batch reactor.

    PubMed

    Kamesh, Reddi; Rani, K Yamuna

    2016-09-01

    A parameterized data-driven fuzzy (PDDF) model structure is proposed for semi-batch processes, and its application to optimal control is illustrated. The orthonormally parameterized input trajectories, initial states, and process parameters are the inputs to the model, which predicts the output trajectories in terms of Fourier coefficients. Fuzzy rules are formulated based on the signs of a linear data-driven model, while the defuzzification step incorporates a linear regression model to shift from the input domain to the output domain. The fuzzy model is employed to formulate an optimal control problem for single-rate as well as multi-rate systems. A simulation study on a multivariable semi-batch reactor system reveals that the proposed PDDF modeling approach captures the nonlinear and time-varying behavior inherent in the semi-batch system fairly accurately, and the results of operating trajectory optimization using the proposed model are comparable to those obtained using the exact first-principles model, and comparable to or better than results from parameterized data-driven artificial neural network model based optimization.
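
    The orthonormal input-trajectory parameterization works by representing u(t) with a handful of basis coefficients, so the optimal-control search runs over a small coefficient vector instead of a full trajectory. A hedged sketch follows; the sine basis and all values are illustrative choices, not the paper's parameterization.

```python
import numpy as np

def trajectory_from_coeffs(coeffs, n_points=100):
    """Reconstruct an input trajectory on [0, 1] from a low-dimensional
    orthonormal (Fourier-like) parameterization: a mean level plus
    sqrt(2)*sin(k*pi*t) harmonics."""
    t = np.linspace(0, 1, n_points)
    u = np.full(n_points, float(coeffs[0]))
    for k, c in enumerate(coeffs[1:], start=1):
        u += c * np.sqrt(2) * np.sin(np.pi * k * t)
    return u

# Three coefficients stand in for an entire feed-rate profile, so an
# optimizer only has to search this 3-vector.
u = trajectory_from_coeffs([1.0, 0.5, -0.2])
```

    An optimizer would evaluate the fuzzy model on trajectories generated this way and adjust the coefficient vector directly.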

  1. Linear dynamical modes as new variables for data-driven ENSO forecast

    NASA Astrophysics Data System (ADS)

    Gavrilov, Andrey; Seleznev, Aleksei; Mukhin, Dmitry; Loskutov, Evgeny; Feigin, Alexander; Kurths, Juergen

    2018-05-01

    A new data-driven model for the analysis and prediction of spatially distributed time series is proposed. The model is based on a linear dynamical mode (LDM) decomposition of the observed data, which is derived from a recently developed nonlinear dimensionality reduction approach. The key point of this approach is its ability to take into account simple dynamical properties of the observed system by revealing the system's dominant time scales. The LDMs are used as new variables for empirical construction of a nonlinear stochastic evolution operator. The method is applied to the sea surface temperature anomaly field in the tropical belt, where the El Niño Southern Oscillation (ENSO) is the main mode of variability. The advantage of LDMs over the traditionally used empirical orthogonal function decomposition is demonstrated for these data. Specifically, it is shown that the new model has a competitive ENSO forecast skill in comparison with other existing ENSO models.
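
    The overall recipe, decompose the field into a few spatial modes and build an evolution operator on the reduced variables, can be sketched with the traditional EOF (PCA) decomposition and a linear one-step operator on synthetic data. The LDM decomposition itself is nonlinear and is not reproduced here; the field below is invented.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic SST-anomaly-like field: two spatial patterns oscillating in
# quadrature (a propagating mode) plus observational noise.
x = np.linspace(0, 1, 30)
t = np.arange(300)
p1, p2 = np.sin(np.pi * x), np.sin(2 * np.pi * x)
omega = 2 * np.pi / 40
field = (np.outer(np.cos(omega * t), p1) + np.outer(np.sin(omega * t), p2)
         + 0.05 * rng.standard_normal((300, 30)))

# EOF decomposition: leading spatial patterns of the anomaly field.
anom = field - field.mean(axis=0)
_, _, Vt = np.linalg.svd(anom, full_matrices=False)
pcs = anom @ Vt[:2].T                 # principal-component time series

# Empirical linear evolution operator on the reduced variables: x_{t+1} = x_t A.
A, *_ = np.linalg.lstsq(pcs[:-1], pcs[1:], rcond=None)
skill = 1 - np.mean((pcs[1:] - pcs[:-1] @ A) ** 2) / np.mean(pcs[1:] ** 2)
```

    The paper's contribution is replacing the EOF step with LDMs, which select modes by their dynamical time scales rather than by variance alone.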

  2. A Data-Driven Approach to Interactive Visualization of Power Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhu, Jun

    Driven by emerging industry standards, electric utilities and grid coordination organizations are eager to seek advanced tools to assist grid operators in performing mission-critical tasks and enable them to make quick and accurate decisions. The emerging field of visual analytics holds tremendous promise for improving business practices in today’s electric power industry. The investigation conducted, however, has revealed that existing commercial power grid visualization tools rely heavily on human designers, hindering users’ ability to discover. Additionally, for a large grid, it is very labor-intensive and costly to build and maintain the pre-designed visual displays. This project proposes a data-driven approach to overcome these common challenges. The proposed approach relies on developing powerful data manipulation algorithms to create visualizations based on the characteristics of empirically or mathematically derived data. The resulting visual presentations emphasize what the data is rather than how the data should be presented, thus fostering comprehension and discovery. Furthermore, the data-driven approach formulates visualizations on the fly. It does not require a visualization design stage, completely eliminating or significantly reducing the cost of building and maintaining visual displays. The research and development (R&D) conducted in this project is divided into two phases. The first phase (Phase I & II) focused on developing data-driven techniques for visualization of the power grid and its operation. Various data-driven visualization techniques were investigated, including pattern recognition for auto-generation of one-line diagrams, fuzzy-model-based rich data visualization for situational awareness, etc. The R&D conducted during the second phase (Phase IIB) focused on enhancing the prototyped data-driven visualization tool based on the gathered requirements and use cases. 
The goal is to evolve the tool prototyped during the first phase into a commercial-grade product. We use one of the identified application areas as an example to demonstrate how research results achieved in this project are successfully utilized to address an emerging industry need. In summary, the data-driven visualization approach developed in this project has proven promising for building the next-generation power grid visualization tools. Application of this approach has resulted in a state-of-the-art commercial tool currently being leveraged by more than 60 utility organizations in North America and Europe.

  3. Real-time Social Internet Data to Guide Forecasting Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Del Valle, Sara Y.

    Our goal is to improve decision support by monitoring and forecasting events using social media, mathematical models, and quantified model uncertainty. Our approach is real-time, data-driven forecasting with quantified uncertainty: not just for weather anymore. Information flowing from human observations of events through an Internet system and classification algorithms is used to produce forecasts with quantified uncertainty. In summary, we want to develop new tools to extract useful information from Internet data streams, develop new approaches to assimilate real-time information into predictive models, and validate these approaches by forecasting events; our ultimate goal is to develop an event forecasting system using mathematical approaches and heterogeneous data streams.

  4. ODISEES: Ontology-Driven Interactive Search Environment for Earth Sciences

    NASA Technical Reports Server (NTRS)

    Rutherford, Matthew T.; Huffer, Elisabeth B.; Kusterer, John M.; Quam, Brandi M.

    2015-01-01

    This paper discusses the Ontology-driven Interactive Search Environment for Earth Sciences (ODISEES) project currently being developed to aid researchers attempting to find usable data among an overabundance of closely related data. ODISEES' ontological structure relies on a modular, adaptable concept modeling approach, which allows the domain to be modeled more or less as it is without worrying about terminology or external requirements. In the model, variables are individually assigned semantic content based on the characteristics of the measurements they represent, allowing intuitive discovery and comparison of data without requiring the user to sift through large numbers of data sets and variables to find the desired information.

  5. A review of surrogate models and their application to groundwater modeling

    NASA Astrophysics Data System (ADS)

    Asher, M. J.; Croke, B. F. W.; Jakeman, A. J.; Peeters, L. J. M.

    2015-08-01

    The spatially and temporally variable parameters and inputs to complex groundwater models typically result in long runtimes which hinder comprehensive calibration, sensitivity, and uncertainty analysis. Surrogate modeling aims to provide a simpler, and hence faster, model which emulates the specified output of a more complex model as a function of its inputs and parameters. In this review paper, we summarize surrogate modeling techniques in three categories: data-driven, projection-based, and hierarchical approaches. Data-driven surrogates approximate a groundwater model through an empirical model that captures the input-output mapping of the original model. Projection-based models reduce the dimensionality of the parameter space by projecting the governing equations onto a basis of orthonormal vectors. In hierarchical or multifidelity methods, the surrogate is created by simplifying the representation of the physical system, for example by ignoring certain processes or reducing the numerical resolution. In discussing the application of these methods to groundwater modeling, we note several imbalances in the existing literature: a large body of work on data-driven approaches seemingly ignores major drawbacks of the methods; only a fraction of the literature focuses on creating surrogates to reproduce outputs of fully distributed groundwater models, despite these being ubiquitous in practice; and a number of the more advanced surrogate modeling methods have yet to be fully applied in a groundwater modeling context.
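As a toy illustration of the projection-based category (not drawn from the review itself), the sketch below extracts an orthonormal basis from snapshots of a synthetic "full" model via proper orthogonal decomposition (SVD) and reconstructs a new state from two retained modes; all data, dimensions, and mode shapes are invented.

```python
import numpy as np

# Sketch of a projection-based surrogate on a hypothetical toy system:
# snapshots of a "full" model state are collected column-wise, a reduced
# basis is extracted via SVD (proper orthogonal decomposition), and new
# states are approximated in the span of the leading modes.
rng = np.random.default_rng(0)

n, m = 200, 30                      # state dimension, number of snapshots
t = np.linspace(0, 1, n)
# Toy snapshot matrix: two smooth spatial modes with random amplitudes
snapshots = (np.outer(np.sin(np.pi * t), rng.normal(size=m))
             + 0.1 * np.outer(np.sin(3 * np.pi * t), rng.normal(size=m)))

U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
k = 2                               # retain the two leading POD modes
basis = U[:, :k]                    # n x k orthonormal basis

# Project a new full-order state onto the reduced basis and reconstruct
x_new = 2.0 * np.sin(np.pi * t) + 0.1 * np.sin(3 * np.pi * t)
x_reduced = basis.T @ x_new         # k coefficients instead of n values
x_approx = basis @ x_reduced

rel_err = np.linalg.norm(x_new - x_approx) / np.linalg.norm(x_new)
print(f"relative reconstruction error: {rel_err:.2e}")
```

Because the new state lies in the span of the two snapshot modes, the two-mode basis reconstructs it essentially exactly; for states outside that span, the truncation error grows with the discarded singular values.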

  6. Intercomparisons of Prognostic, Diagnostic, and Inversion Modeling Approaches for Estimation of Net Ecosystem Exchange over the Pacific Northwest Region

    NASA Astrophysics Data System (ADS)

    Turner, D. P.; Jacobson, A. R.; Nemani, R. R.

    2013-12-01

    The recent development of large spatially-explicit datasets for multiple variables relevant to monitoring terrestrial carbon flux offers the opportunity to estimate the terrestrial land flux using several alternative, potentially complementary, approaches. Here we developed and compared regional estimates of net ecosystem exchange (NEE) over the Pacific Northwest region of the U.S. using three approaches. In the prognostic modeling approach, the process-based Biome-BGC model was driven by distributed meteorological station data and was informed by Landsat-based coverages of forest stand age and disturbance regime. In the diagnostic modeling approach, the quasi-mechanistic CFLUX model estimated net ecosystem production (NEP) by upscaling eddy covariance flux tower observations. The model was driven by distributed climate data and MODIS FPAR (the fraction of incident PAR that is absorbed by the vegetation canopy). It was informed by coarse resolution (1 km) data about forest stand age. In both the prognostic and diagnostic modeling approaches, emissions estimates for biomass burning, harvested products, and river/stream evasion were added to model-based NEP to obtain NEE. The inversion model (CarbonTracker) relied on observations of atmospheric CO2 concentration to optimize prior surface carbon flux estimates. The Pacific Northwest is heterogeneous with respect to land cover and forest management, and repeated surveys of forest inventory plots support the presence of a strong regional carbon sink. The diagnostic model suggested a stronger carbon sink than the prognostic model, and a much larger sink than the inversion model. The introduction of Landsat data on disturbance history served to reduce uncertainty with respect to regional NEE in the diagnostic and prognostic modeling approaches. The FPAR data were particularly helpful in capturing the seasonality of the carbon flux in the diagnostic modeling approach.
The inversion approach took advantage of a global network of CO2 observation stations, but had difficulty resolving regional fluxes such as that in the PNW given the still sparse nature of the CO2 measurement network.

  7. Data driven approaches vs. qualitative approaches in climate change impact and vulnerability assessment.

    NASA Astrophysics Data System (ADS)

    Zebisch, Marc; Schneiderbauer, Stefan; Petitta, Marcello

    2015-04-01

    In the last decade the scope of climate change science has broadened significantly. Fifteen years ago the focus was mainly on understanding climate change, providing climate change scenarios, and giving ideas about potential climate change impacts. Today, adaptation to climate change has become an increasingly important field of politics, and one role of science is to inform and consult this process. Therefore, climate change science no longer focuses only on data-driven approaches (such as climate or climate impact models) but is progressively applying and relying on qualitative approaches, including opinion and expertise acquired through interactive processes with local stakeholders and decision makers. Furthermore, climate change science is facing the challenge of normative questions, such as: how important is a decrease of yield in a developed country where agriculture only represents 3% of the GDP and the supply of agricultural products is strongly linked to global markets and less dependent on local production? In this talk we will present examples from various applied research and consultancy projects on climate change vulnerabilities, ranging from data-driven methods (e.g. remote sensing and modelling) to semi-quantitative and qualitative assessment approaches. Furthermore, we will discuss bottlenecks, pitfalls and opportunities in transferring climate change science to policy- and decision-maker-oriented climate services.

  8. Data-driven Analysis and Prediction of Arctic Sea Ice

    NASA Astrophysics Data System (ADS)

    Kondrashov, D. A.; Chekroun, M.; Ghil, M.; Yuan, X.; Ting, M.

    2015-12-01

    We present results of data-driven predictive analyses of sea ice over the main Arctic regions. Our approach relies on the Multilayer Stochastic Modeling (MSM) framework of Kondrashov, Chekroun and Ghil [Physica D, 2015] and leads to prognostic models of sea ice concentration (SIC) anomalies on seasonal time scales. This approach is applied to monthly time series of leading principal components from the multivariate Empirical Orthogonal Function decomposition of SIC and selected climate variables over the Arctic. We evaluate the predictive skill of MSM models by performing retrospective "no-look-ahead" forecasts for up to 6 months ahead. In particular, we show that the memory effects included in our non-Markovian linear MSM models improve predictions of large-amplitude SIC anomalies in certain Arctic regions. Further improvements within the MSM framework will adopt a nonlinear formulation, as well as alternative data-adaptive decompositions.
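The underlying "decompose the field, then model the amplitudes" idea can be sketched on synthetic data: an EOF decomposition of a toy anomaly field, followed by a simple AR(1) fit to the leading principal component. This is only a minimal stand-in for the MSM framework (which is multilayered and non-Markovian); the field, pattern, and memory coefficient below are invented.

```python
import numpy as np

# Minimal sketch (not the MSM framework itself): EOF decomposition of a
# toy anomaly field, then an AR(1) fit to the leading principal component.
rng = np.random.default_rng(1)

T, S = 240, 50                      # months, spatial points
pc_true = np.zeros(T)
for i in range(1, T):               # synthetic PC with 0.8 lag-1 memory
    pc_true[i] = 0.8 * pc_true[i - 1] + rng.normal()
pattern = np.sin(np.linspace(0, np.pi, S))
field = np.outer(pc_true, pattern) + 0.05 * rng.normal(size=(T, S))

# EOFs are the right singular vectors of the (time x space) anomaly matrix
field -= field.mean(axis=0)
U, s, Vt = np.linalg.svd(field, full_matrices=False)
pc1 = U[:, 0] * s[0]                # leading principal component

# AR(1) coefficient by least squares: pc1[t] ~ phi * pc1[t-1]
phi = np.dot(pc1[1:], pc1[:-1]) / np.dot(pc1[:-1], pc1[:-1])
print(f"estimated AR(1) memory: {phi:.2f}")
```

The recovered coefficient approximates the 0.8 memory built into the synthetic PC; the MSM framework generalizes this step with additional stochastic layers that carry memory beyond lag 1.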

  9. Framework for developing hybrid process-driven, artificial neural network and regression models for salinity prediction in river systems

    NASA Astrophysics Data System (ADS)

    Hunter, Jason M.; Maier, Holger R.; Gibbs, Matthew S.; Foale, Eloise R.; Grosvenor, Naomi A.; Harders, Nathan P.; Kikuchi-Miller, Tahali C.

    2018-05-01

    Salinity modelling in river systems is complicated by a number of processes, including in-stream salt transport and various mechanisms of saline accession that vary dynamically as a function of water level and flow, often at different temporal scales. Traditionally, salinity models in rivers have either been process- or data-driven. The primary problem with process-based models is that in many instances, not all of the underlying processes are fully understood or able to be represented mathematically. There are also often insufficient historical data to support model development. The major limitation of data-driven models, such as artificial neural networks (ANNs) in comparison, is that they provide limited system understanding and are generally not able to be used to inform management decisions targeting specific processes, as different processes are generally modelled implicitly. In order to overcome these limitations, a generic framework for developing hybrid process and data-driven models of salinity in river systems is introduced and applied in this paper. As part of the approach, the most suitable sub-models are developed for each sub-process affecting salinity at the location of interest based on consideration of model purpose, the degree of process understanding and data availability, which are then combined to form the hybrid model. The approach is applied to a 46 km reach of the Murray River in South Australia, which is affected by high levels of salinity. In this reach, the major processes affecting salinity include in-stream salt transport, accession of saline groundwater along the length of the reach and the flushing of three waterbodies in the floodplain during overbank flows of various magnitudes. 
Based on trade-offs between the degree of process understanding and data availability, a process-driven model is developed for in-stream salt transport, an ANN model is used to model saline groundwater accession and three linear regression models are used to account for the flushing of the different floodplain storages. The resulting hybrid model performs very well on approximately 3 years of daily validation data, with a Nash-Sutcliffe efficiency (NSE) of 0.89 and a root mean squared error (RMSE) of 12.62 mg L-1 (over a range from approximately 50 to 250 mg L-1). Each component of the hybrid model results in noticeable improvements in model performance corresponding to the range of flows for which they are developed. The predictive performance of the hybrid model is significantly better than that of a benchmark process-driven model (NSE = -0.14, RMSE = 41.10 mg L-1, Gbench index = 0.90) and slightly better than that of a benchmark data-driven (ANN) model (NSE = 0.83, RMSE = 15.93 mg L-1, Gbench index = 0.36). Apart from improved predictive performance, the hybrid model also has advantages over the ANN benchmark model in terms of increased capacity for improving system understanding and greater ability to support management decisions.
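The reported skill scores are standard and straightforward to reproduce. The sketch below computes NSE, RMSE, and a benchmark-relative Gbench index (defined here as one minus the ratio of model to benchmark squared errors, a common formulation) on invented salinity values; none of the numbers are from the study reach.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, sim):
    """Root mean squared error in the units of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def gbench(obs, sim, bench):
    """Benchmark efficiency: skill of `sim` relative to a benchmark model."""
    obs, sim, bench = (np.asarray(a, float) for a in (obs, sim, bench))
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - bench) ** 2)

# Toy salinity series (mg/L); values are illustrative only
obs   = np.array([60.0, 80.0, 120.0, 150.0, 110.0, 90.0])
hyb   = np.array([62.0, 78.0, 118.0, 148.0, 112.0, 91.0])
bench = np.array([75.0, 75.0, 100.0, 130.0, 120.0, 100.0])

print(f"NSE={nse(obs, hyb):.3f}  RMSE={rmse(obs, hyb):.2f}  "
      f"Gbench={gbench(obs, hyb, bench):.2f}")
```

NSE compares the model against the observation mean, whereas Gbench compares it against a competing model, which is why a model can score high on both scales yet differ in how much it improves on its benchmark.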

  10. Research activities at the Australian Bureau of Meteorology for the regional ionospheric specification and forecasting

    NASA Astrophysics Data System (ADS)

    Bouya, Zahra; Terkildsen, Michael

    2016-07-01

    The Australian Space Forecast Centre (ASFC) provides space weather forecasts to a diverse group of customers. Space Weather Services (SWS) within the Australian Bureau of Meteorology is focussed both on developing tailored products and services for the key customer groups, and on supporting ASFC operations. Research in SWS is largely centred on the development of data-driven models using a range of solar-terrestrial data. This paper will cover some data requirements, approaches and recent SWS activities for data-driven modelling, with a focus on regional ionospheric specification and forecasting.

  11. Hourly runoff forecasting for flood risk management: Application of various computational intelligence models

    NASA Astrophysics Data System (ADS)

    Badrzadeh, Honey; Sarukkalige, Ranjan; Jayawardena, A. W.

    2015-10-01

    Reliable river flow forecasts play a key role in flood risk mitigation. Among different approaches to river flow forecasting, data-driven approaches have become increasingly popular in recent years due to their minimal information requirements and their ability to simulate the nonlinear and non-stationary characteristics of hydrological processes. In this study, attempts are made to apply four different types of data-driven approaches, namely traditional artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), wavelet neural networks (WNN), and hybrid ANFIS with multi-resolution analysis using wavelets (WNF). The developed models were applied to real-time flood forecasting at Casino station on the Richmond River, Australia, which is highly prone to flooding. Hourly rainfall and runoff data were used to drive the models, which were used for forecasting with 1, 6, 12, 24, 36 and 48 h lead-times. The performance of the models was further improved by adding upstream river flow data (Wiangaree station) as another effective input. All models perform satisfactorily up to a 12 h lead-time. However, the hybrid wavelet-based models significantly outperform the ANFIS and ANN models in longer lead-time forecasting. The results confirm the robustness of the proposed structure of the hybrid models for real-time runoff forecasting in the study area.
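Whatever the network type, lead-time forecasting with data-driven models reduces to building lagged input-output pairs before anything is trained. The helper below is an illustrative sketch, not the paper's exact configuration; the series, lag depth, and lead time are invented.

```python
import numpy as np

def make_lagged_features(rain, flow, n_lags=3, lead=6):
    """Build (X, y) pairs for lead-time forecasting: predictors are the
    last n_lags hourly rainfall and runoff values, the target is runoff
    `lead` hours ahead. Illustrative helper, not the paper's setup."""
    X, y = [], []
    for t in range(n_lags, len(flow) - lead):
        X.append(np.r_[rain[t - n_lags:t], flow[t - n_lags:t]])
        y.append(flow[t + lead])
    return np.array(X), np.array(y)

rain = np.sin(np.linspace(0, 20, 500)) ** 2      # toy hourly rainfall
flow = np.roll(rain, 4) * 3.0                    # toy runoff lagging rainfall
X, y = make_lagged_features(rain, flow, n_lags=3, lead=6)
print(X.shape, y.shape)                          # (491, 6) (491,)
```

An upstream gauge, as with the Wiangaree inflow in the study, would simply contribute additional lagged columns to X.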

  12. Integrating geo web services for a user driven exploratory analysis

    NASA Astrophysics Data System (ADS)

    Moncrieff, Simon; Turdukulov, Ulanbek; Gulland, Elizabeth-Kate

    2016-04-01

    In data exploration, several online data sources may need to be dynamically aggregated or summarised over spatial region, time interval, or set of attributes. With respect to thematic data, web services are mainly used to present results leading to a supplier driven service model limiting the exploration of the data. In this paper we propose a user need driven service model based on geo web processing services. The aim of the framework is to provide a method for the scalable and interactive access to various geographic data sources on the web. The architecture combines a data query, processing technique and visualisation methodology to rapidly integrate and visually summarise properties of a dataset. We illustrate the environment on a health related use case that derives Age Standardised Rate - a dynamic index that needs integration of the existing interoperable web services of demographic data in conjunction with standalone non-spatial secure database servers used in health research. Although the example is specific to the health field, the architecture and the proposed approach are relevant and applicable to other fields that require integration and visualisation of geo datasets from various web services and thus, we believe is generic in its approach.

  13. Model‐Based Approach to Predict Adherence to Protocol During Antiobesity Trials

    PubMed Central

    Sharma, Vishnu D.; Combes, François P.; Vakilynejad, Majid; Lahu, Gezim; Lesko, Lawrence J.

    2017-01-01

    Abstract Development of antiobesity drugs is continuously challenged by high dropout rates during clinical trials. The objective was to develop a population pharmacodynamic model that describes the temporal changes in body weight, considering disease progression, lifestyle intervention, and drug effects. Markov modeling (MM) was applied for quantification and characterization of responders and nonresponders as key drivers of dropout rates, to ultimately support clinical trial simulations and the outcome in terms of trial adherence. Subjects (n = 4591) from 6 Contrave® trials were included in this analysis. An indirect-response model developed by van Wart et al. was used as a starting point. Inclusion of drug effect was dose-driven, using a population dose- and time-dependent pharmacodynamic (DTPD) model. Additionally, a population-pharmacokinetic parameter- and data (PPPD)-driven model was developed using the final DTPD model structure and final parameter estimates from a previously developed population pharmacokinetic model based on available Contrave® pharmacokinetic concentrations. Lastly, an MM was developed to predict transition probabilities among responder, nonresponder, and dropout states, driven by the pharmacodynamic effect resulting from the DTPD or PPPD model. Covariates included in the models were diabetes mellitus and race. The linked DTPD-MM and PPPD-MM predicted transition rates among responder, nonresponder, and dropout states well. The analysis concluded that body-weight change is an important factor influencing dropout rates, and the MM showed that overall a DTPD model-driven approach provides a prediction of clinical trial outcome probabilities similar to a pharmacokinetic-driven approach. PMID:28858397
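The responder/nonresponder/dropout mechanics of such a Markov model can be sketched with an invented transition matrix: state occupancy is propagated by repeated matrix multiplication, with dropout as an absorbing state. The probabilities below are hypothetical, not estimates from the Contrave® trials.

```python
import numpy as np

# Hypothetical 3-state Markov sketch (responder / nonresponder / dropout):
# rows are from-states, columns are to-states, and each row sums to 1.
states = ["responder", "nonresponder", "dropout"]
P = np.array([[0.85, 0.10, 0.05],
              [0.15, 0.70, 0.15],
              [0.00, 0.00, 1.00]])    # dropout is absorbing

occupancy = np.array([0.5, 0.5, 0.0])    # invented mix at baseline
for visit in range(12):                  # e.g. 12 monthly visits
    occupancy = occupancy @ P            # propagate one visit forward

print({s: round(p, 3) for s, p in zip(states, occupancy)})
```

In the paper's models the transition probabilities are not fixed constants as here, but are driven by the pharmacodynamic (body-weight) effect from the DTPD or PPPD model.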

  14. Using connectome-based predictive modeling to predict individual behavior from brain connectivity

    PubMed Central

    Shen, Xilin; Finn, Emily S.; Scheinost, Dustin; Rosenberg, Monica D.; Chun, Marvin M.; Papademetris, Xenophon; Constable, R Todd

    2017-01-01

    Neuroimaging is a fast-developing research area in which anatomical and functional images of human brains are collected using techniques such as functional magnetic resonance imaging (fMRI), diffusion tensor imaging (DTI), and electroencephalography (EEG). Technical advances and large-scale datasets have allowed for the development of models capable of predicting individual differences in traits and behavior using brain connectivity measures derived from neuroimaging data. Here, we present connectome-based predictive modeling (CPM), a data-driven protocol for developing predictive models of brain-behavior relationships from connectivity data using cross-validation. This protocol includes the following steps: 1) feature selection, 2) feature summarization, 3) model building, and 4) assessment of prediction significance. We also include suggestions for visualizing the most predictive features (i.e., brain connections). The final result should be a generalizable model that takes brain connectivity data as input and generates predictions of behavioral measures in novel subjects, accounting for a significant amount of the variance in these measures. It has been demonstrated that the CPM protocol performs equivalently to or better than most existing approaches in brain-behavior prediction. Moreover, because CPM focuses on linear modeling and a purely data-driven approach, neuroscientists with limited or no experience in machine learning or optimization will find the protocol easy to implement. Depending on the volume of data to be processed, the protocol can take 10–100 minutes for model building, 1–48 hours for permutation testing, and 10–20 minutes for visualization of results. PMID:28182017
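The four CPM steps can be sketched end-to-end on synthetic data. This is a minimal, hedged rendition: it substitutes a top-k edge selection for CPM's usual significance-threshold selection, and all sizes and data are invented.

```python
import numpy as np

# Sketch of the CPM steps: (1) select edges correlated with behavior in
# training subjects, (2) summarize selected edges into one score per
# subject, (3) fit a linear model, (4) predict held-out subjects
# (leave-one-out cross-validation). Synthetic data throughout.
rng = np.random.default_rng(2)
n_subj, n_edges = 40, 300
edges = rng.normal(size=(n_subj, n_edges))        # connectivity features
behavior = edges[:, :10].sum(axis=1) + 0.5 * rng.normal(size=n_subj)

preds = np.empty(n_subj)
for i in range(n_subj):                           # leave-one-out CV
    train = np.delete(np.arange(n_subj), i)
    X, y = edges[train], behavior[train]
    # Step 1: correlate every edge with behavior in the training set
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt(
        (Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    sel = np.argsort(-np.abs(r))[:20]             # top-k in lieu of p-threshold
    # Step 2: one summary score per subject; Step 3: 1-D linear fit
    score = X[:, sel].sum(axis=1)
    slope, intercept = np.polyfit(score, y, 1)
    # Step 4: apply the frozen model to the held-out subject
    preds[i] = slope * edges[i, sel].sum() + intercept

r_pred = np.corrcoef(preds, behavior)[0, 1]
print(f"leave-one-out prediction r = {r_pred:.2f}")
```

Step 4's prediction-behavior correlation is what CPM then submits to permutation testing for the significance assessment mentioned in the protocol.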

  15. Imaging plus X: multimodal models of neurodegenerative disease.

    PubMed

    Oxtoby, Neil P; Alexander, Daniel C

    2017-08-01

    This article argues that the time is approaching for data-driven disease modelling to take centre stage in the study and management of neurodegenerative disease. The snowstorm of data now available to the clinician defies qualitative evaluation; the heterogeneity of data types complicates integration through traditional statistical methods; and the large datasets becoming available remain far from the big-data sizes necessary for fully data-driven machine-learning approaches. The recent emergence of data-driven disease progression models provides a balance between imposed knowledge of disease features and patterns learned from data. The resulting models are both predictive of disease progression in individual patients and informative in terms of revealing underlying biological patterns. Largely inspired by observational models, data-driven disease progression models have emerged in the last few years as a feasible means for understanding the development of neurodegenerative diseases. These models have revealed insights into frontotemporal dementia, Huntington's disease, multiple sclerosis, Parkinson's disease and other conditions. For example, event-based models have revealed finer graded understanding of progression patterns; self-modelling regression and differential equation models have provided data-driven biomarker trajectories; spatiotemporal models have shown that brain shape changes, for example of the hippocampus, can occur before detectable neurodegeneration; and network models have provided some support for prion-like mechanistic hypotheses of disease propagation. The most mature results are in sporadic Alzheimer's disease, in large part because of the availability of the Alzheimer's disease neuroimaging initiative dataset. Results generally support the prevailing amyloid-led hypothetical model of Alzheimer's disease, while revealing finer detail and insight into disease progression. 
The emerging field of disease progression modelling provides a natural mechanism to integrate different kinds of information, for example from imaging, serum and cerebrospinal fluid markers and cognitive tests, to obtain new insights into progressive diseases. Such insights include fine-grained longitudinal patterns of neurodegeneration, from early stages, and the heterogeneity of these trajectories over the population. More pragmatically, such models enable finer precision in patient staging and stratification, prediction of progression rates and earlier and better identification of at-risk individuals. We argue that this will make disease progression modelling invaluable for recruitment and end-points in future clinical trials, potentially ameliorating the high failure rate in trials of, e.g., Alzheimer's disease therapies. We review the state of the art in these techniques and discuss the future steps required to translate the ideas to front-line application.

  16. Data-driven Inference and Investigation of Thermosphere Dynamics and Variations

    NASA Astrophysics Data System (ADS)

    Mehta, P. M.; Linares, R.

    2017-12-01

    This paper presents a methodology for data-driven inference and investigation of thermosphere dynamics and variations. The approach uses data-driven modal analysis to extract the most energetic modes of variation for neutral thermospheric species using proper orthogonal decomposition, where the time-independent modes or basis represent the dynamics and the time-dependent coefficients or amplitudes represent the model parameters. The data-driven modal analysis approach, combined with sparse, discrete observations, is used to infer amplitudes for the dynamic modes and to calibrate the energy content of the system. In this work, two different data types, namely the number density measurements from TIMED/GUVI and the mass density measurements from CHAMP/GRACE, are simultaneously ingested for an accurate and self-consistent specification of the thermosphere. The assimilation process is achieved with a non-linear least squares solver and allows estimation/tuning of the model parameters or amplitudes rather than the drivers. In this work, we use the Naval Research Lab's MSIS model to derive the most energetic modes for six different species: He, O, N2, O2, H, and N. We examine the dominant drivers of variations for helium in MSIS and observe that seasonal latitudinal variation accounts for about 80% of the dynamic energy, with a strong preference of helium for the winter hemisphere. We also observe enhanced helium presence near the poles at GRACE altitudes during periods of low solar activity (Feb 2007), as previously deduced. We will also examine the storm-time response of helium derived from observations. The results are expected to be useful in tuning/calibration of the physics-based models.
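The amplitude-inference step can be illustrated in miniature: given a fixed basis of modes, amplitudes are recovered from a few noisy point observations by least squares (linear here for simplicity, whereas the paper uses a nonlinear solver). The modes, geometry, and noise level below are invented.

```python
import numpy as np

# Sketch of inferring modal amplitudes from sparse observations: a known
# basis of spatial modes, a handful of noisy point measurements, and a
# linear least-squares solve for the amplitudes.
rng = np.random.default_rng(3)

x = np.linspace(0, 1, 100)
modes = np.column_stack([np.sin(np.pi * x),
                         np.sin(2 * np.pi * x),
                         np.sin(3 * np.pi * x)])
amp_true = np.array([2.0, -1.0, 0.5])
state = modes @ amp_true

obs_idx = rng.choice(100, size=12, replace=False)   # sparse sampling
obs = state[obs_idx] + 0.01 * rng.normal(size=12)

# Least-squares estimate of the amplitudes from the sampled rows
amp_est, *_ = np.linalg.lstsq(modes[obs_idx], obs, rcond=None)
print(np.round(amp_est, 2))
```

The same structure carries over when two observation types are ingested at once, as with the GUVI and CHAMP/GRACE densities: each data type contributes additional rows to the sampled design matrix.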

  17. Data-driven outbreak forecasting with a simple nonlinear growth model

    PubMed Central

    Lega, Joceline; Brown, Heidi E.

    2016-01-01

    Recent events have thrown the spotlight on infectious disease outbreak response. We developed a data-driven method, EpiGro, which can be applied to cumulative case reports to estimate the order of magnitude of the duration, peak and ultimate size of an ongoing outbreak. It is based on a surprisingly simple mathematical property of many epidemiological data sets, does not require knowledge or estimation of disease transmission parameters, is robust to noise and to small data sets, and runs quickly due to its mathematical simplicity. Using data from historic and ongoing epidemics, we present the model. We also provide modeling considerations that justify this approach and discuss its limitations. In the absence of other information or in conjunction with other models, EpiGro may be useful to public health responders. PMID:27770752
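The flavor of such cumulative-case methods can be sketched with a logistic toy model. This is an illustration in the same spirit, not the EpiGro algorithm itself: for logistic growth the per-case growth rate falls linearly with cumulative cases C, so a line fit to (C, growth rate per case) yields both the growth rate and the final outbreak size; all parameters are invented.

```python
import numpy as np

# Toy cumulative-case curve from exact logistic growth,
# dC/dt = r*C*(1 - C/K), starting from 10 cases.
t = np.arange(0, 30)
r_true, K_true = 0.5, 10000.0
C = K_true / (1 + (K_true / 10 - 1) * np.exp(-r_true * t))

# Per-case growth rate (dC/dt)/C falls linearly with C for a logistic:
# intercept of the fitted line estimates r, its x-intercept estimates K.
dC = np.diff(C)
Cmid = 0.5 * (C[1:] + C[:-1])
slope, intercept = np.polyfit(Cmid, dC / Cmid, 1)
r_est = intercept
K_est = -intercept / slope
print(f"growth rate ~ {r_est:.2f}, final size ~ {K_est:.0f}")
```

No transmission parameters are required: only the reported cumulative counts enter the fit, which is the property that makes this family of methods robust to sparse surveillance data.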

  18. Incremental checking of Master Data Management model based on contextual graphs

    NASA Astrophysics Data System (ADS)

    Lamolle, Myriam; Menet, Ludovic; Le Duc, Chan

    2015-10-01

    The validation of models is a crucial step in distributed heterogeneous systems. In this paper, an incremental validation method is proposed within the scope of a Model Driven Engineering (MDE) approach, applied to the Master Data Management (MDM) field represented by XML Schema models. The MDE approach presented in this paper is based on the definition of an abstraction layer using UML class diagrams. The validation method aims to minimise model errors and to optimise the process of model checking. Therefore, the notion of validation contexts is introduced, allowing the verification of data model views. Description logics specify the constraints that the models have to check. An experimentation of the approach is presented through an application developed in the ArgoUML IDE.

  19. A comparison of Data Driven models of solving the task of gender identification of author in Russian language texts for cases without and with the gender deception

    NASA Astrophysics Data System (ADS)

    Sboev, A.; Moloshnikov, I.; Gudovskikh, D.; Rybka, R.

    2017-12-01

    In this work we compare several data-driven approaches to the task of author gender identification for texts with or without gender imitation. The data corpus was specially gathered with crowdsourcing for this task. The best models are a convolutional neural network with morphological input data (f1-measure: 88% ± 3) for texts without imitation, and a gradient boosting model with a vector of character n-gram frequencies as input data (f1-measure: 64% ± 3) for texts with gender imitation. A method for filtering the crowdsourced corpus using a limited reference sample of texts, to increase the accuracy of the result, is also discussed.
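The character n-gram frequency representation that feeds such a gradient boosting model can be sketched in a few lines; a real pipeline would also fix a shared n-gram vocabulary across the corpus and handle punctuation consistently. The example text is arbitrary.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Relative-frequency vector of character n-grams: the kind of input
    representation used by the gradient boosting model (sketch only)."""
    text = text.lower()
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

v = char_ngrams("Data-driven gender identification", n=3)
print(sorted(v.items())[:3])
```

Character n-grams are attractive for Russian-language stylometry because they capture inflectional endings without requiring a morphological analyzer.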

  20. Lessons learned while integrating habitat, dispersal, disturbance, and life-history traits into species habitat models under climate change

    Treesearch

    Louis R. Iverson; Anantha M. Prasad; Stephen N. Matthews; Matthew P. Peters

    2011-01-01

    We present an approach to modeling potential climate-driven changes in habitat for tree and bird species in the eastern United States. First, we took an empirical-statistical modeling approach, using randomForest, with species abundance data from national inventories combined with soil, climate, and landscape variables, to build abundance-based habitat models for 134...

  1. Dynamic Data Driven Methods for Self-aware Aerospace Vehicles

    DTIC Science & Technology

    2015-04-08

    structural response model that incorporates multiple degradation or failure modes including damaged panel strength (BVID, thru-hole), damaged panel...stiffness (BVID, thru-hole), loose fastener, fretted fastener hole, and disbonded surface. • A new data-driven approach for the online updating of the flight...between the first and second plies. The panels were reinforced around the borders of the panel with through holes to simulate mounting the wing skins to

  2. Simplified subsurface modelling: data assimilation and violated model assumptions

    NASA Astrophysics Data System (ADS)

    Erdal, Daniel; Lange, Natascha; Neuweiler, Insa

    2017-04-01

    Integrated models are gaining more and more attention in hydrological modelling as they can better represent the interaction between different compartments. Naturally, these models come with larger numbers of unknowns and greater requirements on computational resources compared to stand-alone models. If large model domains are to be represented, e.g. on the catchment scale, the resolution of the numerical grid needs to be reduced or the model itself needs to be simplified. Both approaches lead to a reduced ability to reproduce the present processes. This lack of model accuracy may be compensated by using data assimilation methods. In these methods observations are used to update the model states, and optionally model parameters as well, in order to reduce the model error induced by the imposed simplifications. What is unclear is whether these methods combined with strongly simplified models result in completely data-driven models, or whether they can even be used to make adequate predictions of the model state for times when no observations are available. In the current work we consider the combined groundwater and unsaturated zone, which can be modelled in a physically consistent way using 3D models solving the Richards equation. For use in simple predictions, however, simpler approaches may be considered. The question investigated here is whether a simpler model, in which the groundwater is modelled as a horizontal 2D model and the unsaturated zone as a few sparse 1D columns, can be used within an Ensemble Kalman filter to give predictions of groundwater levels and unsaturated fluxes. This is tested under conditions where the feedback between the two model compartments is large (e.g. a shallow groundwater table) and the simplification assumptions are clearly violated.
Such a case may be a steep hill-slope or pumping wells, creating lateral fluxes in the unsaturated zone, or strong heterogeneous structures creating unaccounted flows in both the saturated and unsaturated compartments. Under such circumstances, direct modelling using a simplified model will not provide good results. However, a more data-driven (e.g. grey-box) approach, driven by the filter, may still provide an improved understanding of the system. Comparisons between full 3D simulations and simplified filter-driven models will be shown, and the resulting benefits and drawbacks will be discussed.
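The filter update at the heart of such an approach can be sketched with a textbook stochastic Ensemble Kalman filter analysis step on a toy problem; the state, observation operator, and error levels below are invented and unrelated to the subsurface models discussed.

```python
import numpy as np

# Textbook (stochastic) EnKF analysis step: the ensemble is nudged toward
# a perturbed observation using a gain built from ensemble covariances.
rng = np.random.default_rng(4)

n_state, n_ens = 5, 50
truth = np.linspace(1.0, 5.0, n_state)
ens = truth[:, None] + rng.normal(scale=1.0, size=(n_state, n_ens))

H = np.zeros((1, n_state)); H[0, 2] = 1.0     # observe state component 2
R = np.array([[0.01]])                        # observation error variance
y = H @ truth                                 # a (perfect) observation

# Ensemble anomaly and covariance terms
A = ens - ens.mean(axis=1, keepdims=True)
HA = H @ A
P_xy = A @ HA.T / (n_ens - 1)
P_yy = HA @ HA.T / (n_ens - 1) + R
K = P_xy @ np.linalg.inv(P_yy)                # Kalman gain

# Perturbed-observation update of every ensemble member
y_pert = y[:, None] + rng.normal(scale=np.sqrt(R[0, 0]), size=(1, n_ens))
ens_a = ens + K @ (y_pert - H @ ens)

err_b = abs(ens.mean(axis=1)[2] - truth[2])
err_a = abs(ens_a.mean(axis=1)[2] - truth[2])
print(f"observed-component error: before={err_b:.3f} after={err_a:.3f}")
```

Because the gain is built from ensemble covariances, unobserved state components are also corrected through their sampled correlations with the observed one, which is exactly how sparse observations can compensate for violated model assumptions.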

  3. Multidimensional Data Modeling for Business Process Analysis

    NASA Astrophysics Data System (ADS)

    Mansmann, Svetlana; Neumuth, Thomas; Scholl, Marc H.

    The emerging area of business process intelligence attempts to enhance the analytical capabilities of business process management systems by employing data warehousing and mining technologies. This paper presents an approach to re-engineering business process modeling in conformity with the multidimensional data model. Since the business process model and the multidimensional model are driven by rather different objectives and assumptions, there is no straightforward way to converge the two.

  4. Catchments as non-linear filters: evaluating data-driven approaches for spatio-temporal predictions in ungauged basins

    NASA Astrophysics Data System (ADS)

    Bellugi, D. G.; Tennant, C.; Larsen, L.

    2016-12-01

    Catchment and climate heterogeneity complicate the prediction of runoff across time and space, and the resulting parameter uncertainty can lead to large accumulated errors in hydrologic models, particularly in ungauged basins. Recently, data-driven modeling approaches have been shown to avoid the accumulated uncertainty associated with many physically-based models, providing an appealing alternative for hydrologic prediction. However, the effectiveness of different methods in hydrologically and geomorphically distinct catchments, and the robustness of these methods to changing climate and changing hydrologic processes, remain to be tested. Here, we evaluate the use of machine learning techniques to predict daily runoff across time and space using only essential climatic forcing (e.g. precipitation, temperature, and potential evapotranspiration) time series as model input. Model training and testing were done using a high-quality dataset of daily runoff and climate forcing data spanning 25+ years for 600+ minimally-disturbed catchments (drainage area range 5-25,000 km2, median size 336 km2) that cover a wide range of climatic and physical characteristics. Preliminary results using Support Vector Regression (SVR) suggest that in some catchments this nonlinear regression technique can accurately predict daily runoff, while the same approach fails in other catchments, indicating that the representation of climate inputs and/or catchment filter characteristics in the model structure needs further refinement to increase performance. We bolster this analysis by using Sparse Identification of Nonlinear Dynamics (a sparse symbolic regression technique) to uncover the governing equations that describe runoff processes in catchments where SVR performed well and in ones where it performed poorly, thereby enabling inference about governing processes.
This provides a robust means of examining how catchment complexity influences runoff prediction skill, and represents a contribution towards the integration of data-driven inference and physically-based models.
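    A minimal sketch of the SVR setup described above, with synthetic forcing standing in for the multi-catchment dataset: daily runoff is generated from a toy linear-store "catchment filter", and an SVR is trained on lagged climate inputs. All coefficients, lag counts, and distributions are illustrative assumptions:

    ```python
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    rng = np.random.default_rng(1)

    # Synthetic daily forcing (stand-ins for the study's climate inputs)
    n_days = 2000
    precip = rng.gamma(0.5, 5.0, n_days)
    temp = 10 + 8 * np.sin(2 * np.pi * np.arange(n_days) / 365) \
           + rng.normal(0, 2, n_days)
    pet = np.clip(0.3 * temp + rng.normal(0, 0.5, n_days), 0, None)

    # Toy "catchment filter": runoff from an exponential store minus demand
    kernel = np.exp(-np.arange(30) / 7.0)
    stored = np.convolve(precip, kernel)[:n_days]
    runoff = np.maximum(stored - 0.5 * pet, 0.05)

    # Features: current and lagged precipitation plus temperature and PET
    lags = 14
    feats = [np.roll(precip, k) for k in range(lags)] + [temp, pet]
    X = np.column_stack(feats)[lags:]      # drop rows affected by wrap-around
    y = runoff[lags:]

    split = 1500
    model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1))
    model.fit(X[:split], y[:split])
    r2 = model.score(X[split:], y[split:])  # skill on held-out days
    ```

    Because the toy "catchment" has memory, the lagged-precipitation features matter; dropping them mimics the failure mode the abstract describes, where the model structure under-represents the catchment filter.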

  5. Knowledge-guided golf course detection using a convolutional neural network fine-tuned on temporally augmented data

    NASA Astrophysics Data System (ADS)

    Chen, Jingbo; Wang, Chengyi; Yue, Anzhi; Chen, Jiansheng; He, Dongxu; Zhang, Xiuyan

    2017-10-01

    The tremendous success of deep learning models such as convolutional neural networks (CNNs) in computer vision provides a method for similar problems in the field of remote sensing. Although research on repurposing pretrained CNNs for remote sensing tasks is emerging, the scarcity of labeled samples and the complexity of remote sensing imagery still pose challenges. We developed a knowledge-guided golf course detection approach using a CNN fine-tuned on temporally augmented data. The proposed approach combines knowledge-driven region proposal, data-driven detection based on a CNN, and knowledge-driven postprocessing. To confront data complexity, knowledge-derived co-occurrence, composition, and area-based rules are applied sequentially to propose candidate golf regions. To confront sample scarcity, we employed data augmentation in the temporal domain, which extracts samples from multitemporal images. The augmented samples were then used to fine-tune a pretrained CNN for golf course detection. Finally, commission error was further suppressed by postprocessing. Experiments conducted on GF-1 imagery prove the effectiveness of the proposed approach.

  6. Data to Decisions: Creating a Culture of Model-Driven Drug Discovery.

    PubMed

    Brown, Frank K; Kopti, Farida; Chang, Charlie Zhenyu; Johnson, Scott A; Glick, Meir; Waller, Chris L

    2017-09-01

    Merck & Co., Inc., Kenilworth, NJ, USA, is undergoing a transformation in the way that it prosecutes R&D programs. Through the adoption of a "model-driven" culture, enhanced R&D productivity is anticipated, both in the form of decreased attrition at each stage of the process and by providing a rational framework for understanding and learning from the data generated along the way. This new approach focuses on the concept of a "Design Cycle" that makes use of all the data possible, internally and externally, to drive decision-making. These data can take the form of bioactivity, 3D structures, genomics, pathway, PK/PD, safety data, etc. Synthesis of high-quality data into models utilizing both well-established and cutting-edge methods has been shown to yield high confidence predictions to prioritize decision-making and efficiently reposition resources within R&D. The goal is to design an adaptive research operating plan that uses both modeled data and experiments, rather than just testing, to drive project decision-making. To support this emerging culture, an ambitious information management (IT) program has been initiated to implement a harmonized platform to facilitate the construction of cross-domain workflows to enable data-driven decision-making and the construction and validation of predictive models. These goals are achieved through depositing model-ready data, agile persona-driven access to data, a unified cross-domain predictive model lifecycle management platform, and support for flexible scientist-developed workflows that simplify data manipulation and consume model services. The end-to-end nature of the platform, in turn, not only supports but also drives the culture change by enabling scientists to apply predictive sciences throughout their work and over the lifetime of a project. 
This shift in mindset for both scientists and IT was driven by an early impactful demonstration of the potential benefits of the platform, in which expert-level early discovery predictive models were made available from familiar desktop tools, such as ChemDraw. This was built using a workflow-driven service-oriented architecture (SOA) on top of the rigorous registration of all underlying model entities.

  7. Model-Based Approach to Predict Adherence to Protocol During Antiobesity Trials.

    PubMed

    Sharma, Vishnu D; Combes, François P; Vakilynejad, Majid; Lahu, Gezim; Lesko, Lawrence J; Trame, Mirjam N

    2018-02-01

    Development of antiobesity drugs is continuously challenged by high dropout rates during clinical trials. The objective was to develop a population pharmacodynamic model that describes the temporal changes in body weight, considering disease progression, lifestyle intervention, and drug effects. Markov modeling (MM) was applied for quantification and characterization of responders and nonresponders as key drivers of dropout rates, to ultimately support clinical trial simulations and the outcome in terms of trial adherence. Subjects (n = 4591) from 6 Contrave® trials were included in this analysis. An indirect-response model developed by van Wart et al. was used as a starting point. Inclusion of drug effect was dose driven, using a population dose- and time-dependent pharmacodynamic (DTPD) model. Additionally, a population-pharmacokinetic parameter- and data (PPPD)-driven model was developed using the final DTPD model structure and final parameter estimates from a previously developed population pharmacokinetic model based on available Contrave® pharmacokinetic concentrations. Last, MM was developed to predict transition probabilities among responder, nonresponder, and dropout states, driven by the pharmacodynamic effect resulting from the DTPD or PPPD model. Covariates included in the models were diabetes mellitus and race. The linked DTPD-MM and PPPD-MM predicted transition rates among responder, nonresponder, and dropout states well. The analysis concluded that body-weight change is an important factor influencing dropout rates, and the MM showed that overall a DTPD model-driven approach provides a reasonable prediction of clinical trial outcome probabilities, similar to a pharmacokinetic-driven approach. © 2017, The Authors. The Journal of Clinical Pharmacology published by Wiley Periodicals, Inc. on behalf of American College of Clinical Pharmacology.
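    The Markov-model layer can be illustrated with a toy three-state chain in which transition probabilities among responder, nonresponder, and dropout states depend on body-weight change. The logistic coefficients below are invented for illustration; the published model estimates its transition parameters from the trial data:

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    STATES = ("responder", "nonresponder", "dropout")

    def transition_matrix(weight_change):
        """Illustrative 3-state transition probabilities; the logistic
        coefficients are invented, and weight_change is the fractional
        body-weight change over the interval (negative = loss)."""
        p_resp = 1.0 / (1.0 + np.exp(8.0 * weight_change + 1.0))
        p_drop = 0.05 + 0.10 / (1.0 + np.exp(-8.0 * weight_change))
        p_non = 1.0 - p_resp - p_drop
        return np.array([
            [p_resp, p_non, p_drop],      # from responder
            [p_resp, p_non, p_drop],      # from nonresponder
            [0.0, 0.0, 1.0],              # dropout is absorbing
        ])

    def simulate(weight_changes, n_subjects=1000):
        """Step a cohort through monthly weight changes; return state counts."""
        state = np.ones(n_subjects, dtype=int)   # everyone starts nonresponder
        for wc in weight_changes:
            cum = transition_matrix(wc)[state].cumsum(axis=1)
            state = (rng.random(n_subjects)[:, None] > cum).sum(axis=1)
        return np.bincount(state, minlength=3)

    counts = simulate([-0.01, -0.02, -0.03, 0.0, 0.01])
    ```

    Feeding the chain from a pharmacodynamic weight-change prediction, as the DTPD-MM and PPPD-MM do, is what turns a weight model into a trial-adherence simulator.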

  8. Data-Driven Modeling and Prediction of Arctic Sea Ice

    NASA Astrophysics Data System (ADS)

    Kondrashov, Dmitri; Chekroun, Mickael; Ghil, Michael

    2016-04-01

    We present results of data-driven predictive analyses of sea ice over the main Arctic regions. Our approach relies on the Multilayer Stochastic Modeling (MSM) framework of Kondrashov, Chekroun and Ghil [Physica D, 2015] and leads to probabilistic prognostic models of sea ice concentration (SIC) anomalies on seasonal time scales. This approach is applied to monthly time series of state-of-the-art data-adaptive decompositions of SIC and selected climate variables over the Arctic. We evaluate the predictive skill of MSM models by performing retrospective forecasts with "no look ahead" for up to 6 months ahead. It will be shown in particular that the memory effects included intrinsically in the formulation of our non-Markovian MSM models allow for improvements of the prediction skill of large-amplitude SIC anomalies in certain Arctic regions on the one hand, and of September Sea Ice Extent on the other. Further improvements within the MSM framework will adopt a nonlinear formulation and explore next-generation data-adaptive decompositions, namely modifications of Principal Oscillation Patterns (POPs) and rotated Multichannel Singular Spectrum Analysis (M-SSA).
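    The non-Markovian ingredient of the MSM idea (the main level's residual is modeled by its own stochastic level, so memory is carried into the forecast) can be caricatured with a two-level linear model fit by least squares. This is a drastic simplification of the actual MSM framework, run on synthetic data:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    # Synthetic monthly anomaly series standing in for one data-adaptive SIC mode
    n = 600
    x = np.zeros(n)
    r = np.zeros(n)
    for t in range(1, n):
        r[t] = 0.7 * r[t - 1] + rng.normal(0, 0.3)   # hidden red-noise forcing
        x[t] = 0.8 * x[t - 1] + r[t]

    # Level 1: fit x_{t+1} ~ a * x_t and keep the residual
    a = (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])
    res = x[1:] - a * x[:-1]

    # Level 2: give the residual its own memory, res_{t+1} ~ b * res_t
    b = (res[:-1] @ res[1:]) / (res[:-1] @ res[:-1])

    def forecast(x0, r0, steps=6):
        """Deterministic multi-step forecast that carries the residual memory,
        the key non-Markovian ingredient of the multilayer idea."""
        out = []
        xt, rt = x0, r0
        for _ in range(steps):
            rt = b * rt
            xt = a * xt + rt
            out.append(xt)
        return out

    sic_forecast = forecast(x[-1], res[-1], steps=6)
    ```

    A memoryless AR(1) forecast would drop the `rt` term; keeping it is what lets the layered model exploit persistence in the forcing, loosely mirroring how MSM memory terms improve skill for large-amplitude anomalies.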

  9. Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method.

    PubMed

    Zhang, Huaguang; Cui, Lili; Zhang, Xin; Luo, Yanhong

    2011-12-01

    In this paper, a novel data-driven robust approximate optimal tracking control scheme is proposed for unknown general nonlinear systems using the adaptive dynamic programming (ADP) method. In the design of the controller, only available input-output data are required instead of known system dynamics. A data-driven model is established by a recurrent neural network (NN) to reconstruct the unknown system dynamics using available input-output data. By adding a novel adjustable term related to the modeling error, the resultant modeling error is first guaranteed to converge to zero. Then, based on the obtained data-driven model, the ADP method is utilized to design the approximate optimal tracking controller, which consists of the steady-state controller and the optimal feedback controller. Further, a robustifying term is developed to compensate for the NN approximation errors introduced by implementing the ADP method. Based on the Lyapunov approach, stability analysis of the closed-loop system is performed to show that the proposed controller guarantees that the system state asymptotically tracks the desired trajectory. Additionally, the obtained control input is proven to be close to the optimal control input within a small bound. Finally, two numerical examples are used to demonstrate the effectiveness of the proposed control scheme.
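    The first stage of such a scheme, building a data-driven model of the unknown dynamics from input-output data alone, can be sketched with a feedforward network on a toy first-order plant. The paper uses a recurrent NN and then runs ADP on top of the identified model; neither of those steps is shown here, and the plant below is invented:

    ```python
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(4)

    # Unknown toy plant: used only to generate input-output data, never
    # handed to the model-building step
    def plant(x, u):
        return 0.8 * np.sin(x) + 0.5 * u

    x = 0.0
    states, inputs, next_states = [], [], []
    for _ in range(2000):
        u = rng.uniform(-1.0, 1.0)            # exploratory input signal
        xn = plant(x, u)
        states.append(x)
        inputs.append(u)
        next_states.append(xn)
        x = xn

    # Data-driven model: reconstruct the dynamics from (state, input) pairs
    features = np.column_stack([states, inputs])
    model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=3000,
                         random_state=0)
    model.fit(features, next_states)
    fit_score = model.score(features, next_states)   # R^2 on the training data
    ```

    Once such a surrogate reproduces the dynamics, the controller design can iterate against the surrogate instead of the unknown plant, which is the enabling step for the ADP machinery.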

  10. Universal Fragment Descriptors for Predicting Electronic and Mechanical Properties of Inorganic Crystals

    NASA Astrophysics Data System (ADS)

    Oses, Corey; Isayev, Olexandr; Toher, Cormac; Curtarolo, Stefano; Tropsha, Alexander

    Historically, materials discovery has been driven by a laborious trial-and-error process. The growth of materials databases and emerging informatics approaches finally offer the opportunity to transform this practice into data- and knowledge-driven rational design, accelerating the discovery of novel materials exhibiting desired properties. Using data from the AFLOW repository for high-throughput ab-initio calculations, we have generated Quantitative Materials Structure-Property Relationship (QMSPR) models to predict critical materials properties, including metal/insulator classification, band gap energy, and bulk modulus. The prediction accuracy obtained with these QMSPR models approaches that of the training data for virtually any stoichiometric inorganic crystalline material. We attribute the success and universality of these models to the construction of new materials descriptors, referred to as universal Property-Labeled Material Fragments (PLMF). This representation affords straightforward model interpretation in terms of simple heuristic design rules that could guide rational materials design. This proof-of-concept study demonstrates the power of materials informatics to dramatically accelerate the search for new materials.

  11. Knowledge-driven genomic interactions: an application in ovarian cancer.

    PubMed

    Kim, Dokyoon; Li, Ruowang; Dudek, Scott M; Frase, Alex T; Pendergrass, Sarah A; Ritchie, Marylyn D

    2014-01-01

    Effective cancer clinical outcome prediction for understanding the mechanisms of various types of cancer has been pursued using molecular-based data such as gene expression profiles, an approach that has promise for providing better diagnostics and supporting further therapies. However, clinical outcome prediction based on gene expression profiles varies between independent data sets. Further, single-gene expression outcome prediction is limited for cancer evaluation since genes do not act in isolation, but rather interact with other genes in complex signaling or regulatory networks. In addition, since pathways are more likely to cooperate, it would be desirable to incorporate expert knowledge to combine pathways in a useful and informative manner. Thus, we propose a novel approach for identifying knowledge-driven genomic interactions and applying it to discover models associated with cancer clinical phenotypes using grammatical evolution neural networks (GENN). To demonstrate the utility of the proposed approach, ovarian cancer data from The Cancer Genome Atlas (TCGA) were used for predicting clinical stage as a pilot project. We identified knowledge-driven genomic interactions associated with cancer stage not only from single knowledge bases, such as sources of pathway-pathway interaction, but also across different sets of knowledge bases, such as pathway-protein family interactions, by integrating different types of information. Notably, an integration model from different sources of biological knowledge achieved 78.82% balanced accuracy and outperformed the top models with gene expression or single knowledge-based data types alone. Furthermore, the results from the models are more interpretable because they are framed in the context of specific biological pathways or other expert knowledge.
The success of the pilot study we have presented herein will allow us to pursue further identification of models predictive of clinical cancer survival and recurrence. Understanding the underlying tumorigenesis and progression in ovarian cancer through the global view of interactions within/between different biological knowledge sources has the potential for providing more effective screening strategies and therapeutic targets for many types of cancer.

  12. Model-driven approach to data collection and reporting for quality improvement

    PubMed Central

    Curcin, Vasa; Woodcock, Thomas; Poots, Alan J.; Majeed, Azeem; Bell, Derek

    2014-01-01

    Continuous data collection and analysis have been shown to be essential to achieving improvement in healthcare. However, the data required for local improvement initiatives are often not readily available from hospital Electronic Health Record (EHR) systems or not routinely collected. Furthermore, improvement teams are often restricted in time and funding, thus requiring inexpensive and rapid tools to support their work. Hence, the informatics challenge in healthcare local improvement initiatives consists of providing a mechanism for rapid modelling of the local domain by non-informatics experts, including performance metric definitions, grounded in established improvement techniques. We investigate the feasibility of a model-driven software approach to address this challenge, whereby an improvement model designed by a team is used to automatically generate the required electronic data collection instruments and reporting tools. To that end, we have designed a generic Improvement Data Model (IDM) to capture the data items and quality measures relevant to a project, and constructed Web Improvement Support in Healthcare (WISH), a prototype tool that takes user-generated IDM models and creates a data schema, data collection web interfaces, and a set of live reports based on Statistical Process Control (SPC) for use by improvement teams. The software has been successfully used in over 50 improvement projects, with more than 700 users. We present in detail the experiences of one of those initiatives, the Chronic Obstructive Pulmonary Disease project in Northwest London hospitals. The specific challenges of improvement in healthcare are analysed, and the benefits and limitations of the approach are discussed. PMID:24874182
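    The core mechanism, generating collection and reporting artifacts from a declarative model, can be sketched in a few lines. The `idm` dictionary below is a hypothetical, heavily reduced stand-in for a real IDM model, and the control limits use a plain mean +/- 3 sigma rather than the moving-range limits a proper SPC individuals chart would use:

    ```python
    import statistics

    # Hypothetical, heavily reduced stand-in for an IDM model; the real IDM
    # captures far more (measure definitions, frequencies, team roles)
    idm = {
        "project": "copd_review",
        "measures": [
            {"name": "length_of_stay", "type": "REAL"},
            {"name": "readmitted_30d", "type": "INTEGER"},
        ],
    }

    def generate_schema(model):
        """Derive a SQL table from the declarative model."""
        cols = ", ".join(f"{m['name']} {m['type']}" for m in model["measures"])
        return f"CREATE TABLE {model['project']} (entry_date TEXT, {cols});"

    def spc_limits(values):
        """Mean +/- 3 sigma limits for a live report (simplified SPC)."""
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)
        return mean - 3 * sd, mean, mean + 3 * sd

    schema = generate_schema(idm)
    lo, center, hi = spc_limits([4.2, 5.1, 3.8, 4.6, 5.0, 4.4])
    ```

    The point of the pattern is that the team edits only the model; schema, forms, and charts are regenerated, which is what makes the approach cheap enough for time-limited improvement teams.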

  13. Condom Use among Immigrant Latino Sexual Minorities: Multilevel Analysis after Respondent-Driven Sampling

    PubMed Central

    Rhodes, Scott D.; McCoy, Thomas P.

    2014-01-01

    This study explored correlates of condom use within a respondent-driven sample of 190 Spanish-speaking immigrant Latino sexual minorities, including gay and bisexual men, other men who have sex with men (MSM), and transgender persons, in North Carolina. Five analytic approaches for modeling data collected using respondent-driven sampling (RDS) were compared. Across most approaches, knowledge of HIV and sexually transmitted infections (STIs) and increased condom use self-efficacy predicted consistent condom use, and increased homophobia predicted decreased consistent condom use. The same correlates were not significant in all analyses but were consistent in most. Clustering due to recruitment chains was low, while clustering due to recruiter was substantial. This highlights the importance of accounting for clustering when analyzing RDS data. PMID:25646728

  14. Process-Structure Linkages Using a Data Science Approach: Application to Simulated Additive Manufacturing Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Popova, Evdokia; Rodgers, Theron M.; Gong, Xinyi

    A novel data science workflow is developed and demonstrated to extract process-structure linkages (i.e., reduced-order model) for microstructure evolution problems when the final microstructure depends on (simulation or experimental) processing parameters. Our workflow consists of four main steps: data pre-processing, microstructure quantification, dimensionality reduction, and extraction/validation of process-structure linkages. The methods that can be employed within each step vary based on the type and amount of available data. In this paper, this data-driven workflow is applied to a set of synthetic additive manufacturing microstructures obtained using the Potts-kinetic Monte Carlo (kMC) approach. Additive manufacturing techniques inherently produce complex microstructures that can vary significantly with processing conditions. Using the developed workflow, a low-dimensional data-driven model was established to correlate process parameters with the predicted final microstructure. In addition, the modular workflows developed and presented in this work facilitate easy dissemination and curation by the broader community.

  15. Process-Structure Linkages Using a Data Science Approach: Application to Simulated Additive Manufacturing Data

    DOE PAGES

    Popova, Evdokia; Rodgers, Theron M.; Gong, Xinyi; ...

    2017-03-13

    A novel data science workflow is developed and demonstrated to extract process-structure linkages (i.e., reduced-order model) for microstructure evolution problems when the final microstructure depends on (simulation or experimental) processing parameters. Our workflow consists of four main steps: data pre-processing, microstructure quantification, dimensionality reduction, and extraction/validation of process-structure linkages. The methods that can be employed within each step vary based on the type and amount of available data. In this paper, this data-driven workflow is applied to a set of synthetic additive manufacturing microstructures obtained using the Potts-kinetic Monte Carlo (kMC) approach. Additive manufacturing techniques inherently produce complex microstructures that can vary significantly with processing conditions. Using the developed workflow, a low-dimensional data-driven model was established to correlate process parameters with the predicted final microstructure. In addition, the modular workflows developed and presented in this work facilitate easy dissemination and curation by the broader community.
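    The four-step workflow can be caricatured end to end with numpy and scikit-learn: synthetic "microstructures" whose grain coarseness depends on one process parameter are quantified with FFT-based two-point statistics, reduced with PCA, and linked back to the parameter by regression. All generators and sizes are invented stand-ins for the Potts-kMC data:

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(5)

    def smooth(field, passes):
        """Neighbor averaging; more passes ~ coarser 'grains'."""
        for _ in range(passes):
            field = (field + np.roll(field, 1, 0) + np.roll(field, -1, 0)
                     + np.roll(field, 1, 1) + np.roll(field, -1, 1)) / 5.0
        return field

    def two_point_stats(m):
        """Periodic autocorrelation via FFT: a simple microstructure statistic."""
        F = np.fft.fft2(m)
        return (np.fft.ifft2(F * np.conj(F)).real / m.size).ravel()

    # Steps 1-2: generate microstructures and quantify them
    params = rng.integers(1, 9, 60)          # stand-in process parameter
    stats = np.array([two_point_stats(smooth(rng.normal(size=(32, 32)), int(p)))
                      for p in params])

    # Step 3: dimensionality reduction
    scores = PCA(n_components=3).fit_transform(stats)

    # Step 4: process-structure linkage (reduced-order model) and validation
    link = LinearRegression().fit(params[:, None], scores[:, 0])
    r2 = link.score(params[:, None], scores[:, 0])
    ```

    Even this crude pipeline shows the leading principal component tracking the process parameter; the published workflow replaces each step with domain-appropriate choices.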

  16. Research needs for developing a commodity-driven freight modeling approach.

    DOT National Transportation Integrated Search

    2003-01-01

    It is well known that better freight forecasting models and data are needed, but the literature does not clearly indicate which components of the modeling methodology are most in need of improvement, which is a critical need in an era of limited rese...

  17. Data-driven modeling reveals cell behaviors controlling self-organization during Myxococcus xanthus development

    PubMed Central

    Cotter, Christopher R.; Schüttler, Heinz-Bernd; Igoshin, Oleg A.; Shimkets, Lawrence J.

    2017-01-01

    Collective cell movement is critical to the emergent properties of many multicellular systems, including microbial self-organization in biofilms, embryogenesis, wound healing, and cancer metastasis. However, even the best-studied systems lack a complete picture of how diverse physical and chemical cues act upon individual cells to ensure coordinated multicellular behavior. Known for its social developmental cycle, the bacterium Myxococcus xanthus uses coordinated movement to generate three-dimensional aggregates called fruiting bodies. Despite extensive progress in identifying genes controlling fruiting body development, cell behaviors and cell–cell communication mechanisms that mediate aggregation are largely unknown. We developed an approach to examine emergent behaviors that couples fluorescent cell tracking with data-driven models. A unique feature of this approach is the ability to identify cell behaviors affecting the observed aggregation dynamics without full knowledge of the underlying biological mechanisms. The fluorescent cell tracking revealed large deviations in the behavior of individual cells. Our modeling method indicated that decreased cell motility inside the aggregates, a biased walk toward aggregate centroids, and alignment among neighboring cells in a radial direction to the nearest aggregate are behaviors that enhance aggregation dynamics. Our modeling method also revealed that aggregation is generally robust to perturbations in these behaviors and identified possible compensatory mechanisms. The resulting approach of directly combining behavior quantification with data-driven simulations can be applied to more complex systems of collective cell movement without prior knowledge of the cellular machinery and behavioral cues. PMID:28533367

  18. Supervised dictionary learning for inferring concurrent brain networks.

    PubMed

    Zhao, Shijie; Han, Junwei; Lv, Jinglei; Jiang, Xi; Hu, Xintao; Zhao, Yu; Ge, Bao; Guo, Lei; Liu, Tianming

    2015-10-01

    Task-based fMRI (tfMRI) has been widely used to explore functional brain networks via predefined stimulus paradigms in the fMRI scan. Traditionally, the general linear model (GLM) has been the dominant approach to detecting task-evoked networks. However, GLM focuses on task-evoked or event-evoked brain responses and possibly ignores intrinsic brain functions. In comparison, dictionary learning and sparse coding methods have attracted much attention recently, and these methods have shown promise for automatically and systematically decomposing fMRI signals into meaningful task-evoked and intrinsic concurrent networks. Nevertheless, two notable limitations of current data-driven dictionary learning methods are that the prior knowledge of the task paradigm is not sufficiently utilized and that establishing correspondences among dictionary atoms in different brains has been challenging. In this paper, we propose a novel supervised dictionary learning and sparse coding method for inferring functional networks from tfMRI data, which combines the advantages of model-driven and data-driven methods. The basic idea is to fix the task stimulus curves as predefined model-driven dictionary atoms and to optimize only the remaining data-driven dictionary atoms. Application of this novel methodology to the publicly available Human Connectome Project (HCP) tfMRI datasets has achieved promising results.
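    The basic idea, fixing the task-paradigm atoms and optimizing only the data-driven ones, can be sketched with a crude alternating scheme. Hard thresholding stands in for proper sparse coding, and the boxcar "task regressors" and synthetic signals are invented:

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    T, V = 120, 400                       # time points, "voxels"

    # Fixed, model-driven atoms: boxcar task regressors for an assumed paradigm
    task = np.zeros((T, 2))
    task[10:30, 0] = task[60:80, 0] = 1.0
    task[40:55, 1] = task[90:105, 1] = 1.0

    # Synthetic signals: task-locked activity plus noise
    signals = task @ rng.normal(size=(2, V)) + 0.5 * rng.normal(size=(T, V))

    K_free = 8                            # data-driven atoms to be learned
    D_free = rng.normal(size=(T, K_free))
    D_free /= np.linalg.norm(D_free, axis=0)

    for _ in range(10):                   # crude alternating minimization
        D = np.column_stack([task, D_free])
        # "sparse-ish" coding: least squares followed by hard thresholding
        A = np.linalg.lstsq(D, signals, rcond=None)[0]
        A[np.abs(A) < 0.05] = 0.0
        # update only the free atoms; the task atoms stay fixed
        R = signals - task @ A[:2]
        D_free = np.linalg.lstsq(A[2:].T, R.T, rcond=None)[0].T
        D_free /= np.linalg.norm(D_free, axis=0) + 1e-12
    ```

    Fixing the first atoms is what anchors the decomposition to the paradigm: the coefficient maps for those atoms are directly comparable across subjects, addressing the correspondence problem the abstract raises.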

  19. Evaluating data distribution and drift vulnerabilities of machine learning algorithms in secure and adversarial environments

    NASA Astrophysics Data System (ADS)

    Nelson, Kevin; Corbin, George; Blowers, Misty

    2014-05-01

    Machine learning is continuing to gain popularity due to its ability to solve problems that are difficult to model using conventional computer programming logic. Much of the current and past work has focused on algorithm development, data processing, and optimization. Lately, a subset of research has emerged which explores issues related to security. This research is gaining traction as systems employing these methods are being applied to both secure and adversarial environments. One of machine learning's biggest benefits, its data-driven versus logic-driven approach, is also a weakness if the data on which the models rely are corrupted. Adversaries could maliciously influence systems which address drift and data distribution changes using re-training and online learning. Our work is focused on exploring the resilience of various machine learning algorithms to these data-driven attacks. In this paper, we present our initial findings using Monte Carlo simulations, and statistical analysis, to explore the maximal achievable shift to a classification model, as well as the required amount of control over the data.
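    A Monte Carlo probe in this spirit can be sketched by letting a simulated adversary flip a fraction of one class's training labels before each retraining and measuring the accuracy shift. The data generator and flip strategy are illustrative assumptions, not the paper's experimental design:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(7)

    def make_data(n):
        X = rng.normal(size=(n, 2))
        y = (X[:, 0] + X[:, 1] > 0).astype(int)   # clean linear concept
        return X, y

    X_test, y_test = make_data(2000)              # held-out clean test set

    def mean_accuracy(flip_frac, trials=20):
        """Monte Carlo estimate of test accuracy when an adversary relabels a
        fraction of positive-class training examples before each retraining."""
        accs = []
        for _ in range(trials):
            X, y = make_data(400)
            ones = np.flatnonzero(y == 1)
            flip = rng.choice(ones, int(flip_frac * len(ones)), replace=False)
            y[flip] = 0                           # targeted label-flip attack
            accs.append(LogisticRegression().fit(X, y).score(X_test, y_test))
        return float(np.mean(accs))

    clean = mean_accuracy(0.0)
    attacked = mean_accuracy(0.4)
    ```

    Sweeping `flip_frac` traces out how much control over the training data is needed to achieve a given boundary shift, which is the quantity the abstract's simulations estimate.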

  20. Data-driven approach to human motion modeling with Lua and gesture description language

    NASA Astrophysics Data System (ADS)

    Hachaj, Tomasz; Koptyra, Katarzyna; Ogiela, Marek R.

    2017-03-01

    The aim of this paper is to present a novel human motion modelling and recognition approach that enables real-time MoCap signal evaluation. By motion (action) recognition we mean classification. The role of this approach is to propose a syntactic description procedure that can be easily understood, learnt, and used in various motion modelling and recognition tasks in all MoCap systems, whether vision-based or wearable-sensor-based. To do so, we have prepared an extension of the Gesture Description Language (GDL) methodology that enables movement description and real-time recognition, so that it can use not only positional coordinates of body joints but virtually any type of discretely measured MoCap output signal, such as accelerometer, magnetometer, or gyroscope data. We have also prepared and evaluated a cross-platform implementation of this approach using the Lua scripting language and JAVA technology. This implementation is called Data-Driven GDL (DD-GDL). In tested scenarios the average execution speed is above 100 frames per second, which matches the acquisition rate of many popular MoCap solutions.
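    The flavor of a GDL-style description, per-frame predicates combined into sequential conclusions, can be conveyed with a plain-Python stand-in. The actual DD-GDL engine embeds Lua scripts, and the joint names and thresholds here are invented:

    ```python
    # Per-frame predicates play the role of GDL rules over a generic MoCap frame
    def hand_above_head(frame):
        return frame["hand_y"] > frame["head_y"]

    def rule_arm_raised_sequence(frames, min_frames=3):
        """The 'gesture' fires when the per-frame rule holds for several
        consecutive frames, mimicking GDL's sequential conclusions."""
        run = 0
        for f in frames:
            run = run + 1 if hand_above_head(f) else 0
            if run >= min_frames:
                return True
        return False

    stream = [
        {"hand_y": 1.0, "head_y": 1.6},
        {"hand_y": 1.5, "head_y": 1.6},
        {"hand_y": 1.7, "head_y": 1.6},
        {"hand_y": 1.8, "head_y": 1.6},
        {"hand_y": 1.9, "head_y": 1.6},
    ]
    detected = rule_arm_raised_sequence(stream)
    ```

    Because the frame is just a dictionary of measured values, the same rule structure works whether the values come from joint positions or from accelerometer, magnetometer, or gyroscope channels, which is the generality the abstract emphasizes.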

  1. Power-Law Modeling of Cancer Cell Fates Driven by Signaling Data to Reveal Drug Effects

    PubMed Central

    Zhang, Fan; Wu, Min; Kwoh, Chee Keong; Zheng, Jie

    2016-01-01

    Extracellular signals are captured and transmitted by signaling proteins inside a cell. An important type of cellular response to the signals is the cell fate decision, e.g., apoptosis. However, the underlying mechanisms of cell fate regulation are still unclear, so comprehensive and detailed kinetic models are not yet available. Alternatively, data-driven models are promising for bridging signaling data with phenotypic measurements of cell fates. The traditional linear model for data-driven modeling of signaling pathways has its limitations because it assumes that a cell fate is proportional to the activities of signaling proteins, which is unlikely in complex biological systems. Therefore, we propose a power-law model to relate the activities of all the measured signaling proteins to the probabilities of cell fates. In our experiments, we compared our nonlinear power-law model with the linear model on three cancer datasets with phosphoproteomics and cell fate measurements, which demonstrated that the nonlinear model has superior performance in cell fate prediction. By in silico simulation of virtual protein knock-down, the proposed model is able to reveal drug effects, which can complement traditional approaches such as binding affinity analysis. Moreover, our model is able to capture cell line specific information to distinguish one cell line from another in cell fate prediction. Our results show that the power-law data-driven model performs better in cell fate prediction and provides more insights into the signaling pathways underlying cancer cell fates than the linear model. PMID:27764199
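    The fitting recipe for such a power-law model is standard: taking logs turns y = g * prod(x_i^a_i) into a linear model in the exponents. A sketch on synthetic data, with all activities, exponents, and noise levels invented:

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(8)

    # Synthetic positive "protein activities" and a fate readout generated
    # from a known power law y = g * prod(x_i ** a_i)
    n, p = 200, 4
    X = rng.uniform(0.5, 2.0, (n, p))
    true_exp = np.array([1.2, -0.8, 0.5, 0.0])
    y = 0.7 * np.prod(X ** true_exp, axis=1) * np.exp(rng.normal(0.0, 0.05, n))

    # Logs turn the power law into a linear regression in the exponents
    reg = LinearRegression().fit(np.log(X), np.log(y))
    a_hat = reg.coef_                 # recovered exponents
    g_hat = np.exp(reg.intercept_)    # recovered multiplicative constant
    ```

    An exponent near zero (the fourth protein above) marks an input the fate readout is insensitive to, and a negative exponent marks an inhibitory relationship; this interpretability is part of what the power-law form adds over a plain linear fit.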

  2. Classroom Strategies Coaching Model: Integration of Formative Assessment and Instructional Coaching

    ERIC Educational Resources Information Center

    Reddy, Linda A.; Dudek, Christopher M.; Lekwa, Adam

    2017-01-01

    This article describes the theory, key components, and empirical support for the Classroom Strategies Coaching (CSC) Model, a data-driven coaching approach that systematically integrates data from multiple observations to identify teacher practice needs and goals, design practice plans, and evaluate progress towards goals. The primary aim of the…

  3. Microbial Community Metabolic Modeling: A Community Data-Driven Network Reconstruction: COMMUNITY DATA-DRIVEN METABOLIC NETWORK MODELING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Henry, Christopher S.; Bernstein, Hans C.; Weisenhorn, Pamela

    Metabolic network modeling of microbial communities provides an in-depth understanding of community-wide metabolic and regulatory processes. Compared to single organism analyses, community metabolic network modeling is more complex because it needs to account for interspecies interactions. To date, most approaches focus on reconstruction of high-quality individual networks so that, when combined, they can predict community behaviors as a result of interspecies interactions. However, this conventional method becomes ineffective for communities whose members are not well characterized and cannot be experimentally interrogated in isolation. Here, we tested a new approach that uses community-level data as a critical input for the network reconstruction process. This method focuses on directly predicting interspecies metabolic interactions in a community, when axenic information is insufficient. We validated our method through the case study of a bacterial photoautotroph-heterotroph consortium that was used to provide data needed for a community-level metabolic network reconstruction. Resulting simulations provided experimentally validated predictions of how a photoautotrophic cyanobacterium supports the growth of an obligate heterotrophic species by providing organic carbon and nitrogen sources.

  4. Data-Driven Haptic Modeling and Rendering of Viscoelastic and Frictional Responses of Deformable Objects.

    PubMed

    Yim, Sunghoon; Jeon, Seokhee; Choi, Seungmoon

    2016-01-01

    In this paper, we present an extended data-driven haptic rendering method capable of reproducing force responses during pushing and sliding interaction on a large surface area. The main part of the approach is a novel input variable set for the training of an interpolation model, which incorporates the position of a proxy - an imaginary contact point on the undeformed surface. This allows us to estimate friction in both sliding and sticking states in a unified framework. Estimating the proxy position is done in real-time based on simulation using a sliding yield surface - a surface defining a border between the sliding and sticking regions in the external force space. During modeling, the sliding yield surface is first identified via an automated palpation procedure. Then, through manual palpation on a target surface, input data and resultant force data are acquired. The data are used to build a radial basis interpolation model. During rendering, this input-output mapping interpolation model is used to estimate force responses in real-time in accordance with the interaction input. Physical performance evaluation demonstrates that our approach achieves reasonably high estimation accuracy. A user study also shows plausible perceptual realism under diverse and extensive exploration.
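
    The core of the method above is an interpolation model mapping interaction inputs to force responses. As a rough illustration, not the authors' implementation, a Gaussian radial-basis-function interpolator can be fitted to palpation-style data in a few lines. The 2-D input set, the surrogate force law, and all parameter values below are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for recorded palpation data: 2-D inputs
# (e.g., proxy position, tool displacement) and a scalar force response.
X = rng.uniform(0.0, 1.0, size=(60, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2      # invented surrogate force law

def rbf_fit(X, y, eps=3.0):
    """Solve Phi w = y for Gaussian radial-basis weights."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    phi = np.exp(-(eps * d) ** 2)
    # Small ridge term keeps the kernel matrix well conditioned.
    return np.linalg.solve(phi + 1e-8 * np.eye(len(X)), y)

def rbf_predict(Xq, X, w, eps=3.0):
    d = np.linalg.norm(Xq[:, None, :] - X[None, :, :], axis=-1)
    return np.exp(-(eps * d) ** 2) @ w

w = rbf_fit(X, y)
fit_err = np.max(np.abs(rbf_predict(X, X, w) - y))      # error at training points
Xq = rng.uniform(0.1, 0.9, size=(50, 2))
gen_err = np.max(np.abs(rbf_predict(Xq, X, w)
                        - (np.sin(3.0 * Xq[:, 0]) + Xq[:, 1] ** 2)))
```

    At render time, only `rbf_predict` needs to run per haptic frame, which is why an interpolation model of this kind suits real-time force estimation.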

  5. Reverse engineering systems models of regulation: discovery, prediction and mechanisms.

    PubMed

    Ashworth, Justin; Wurtmann, Elisabeth J; Baliga, Nitin S

    2012-08-01

    Biological systems can now be understood in comprehensive and quantitative detail using systems biology approaches. Putative genome-scale models can be built rapidly based upon biological inventories and strategic system-wide molecular measurements. Current models combine statistical associations, causative abstractions, and known molecular mechanisms to explain and predict quantitative and complex phenotypes. This top-down 'reverse engineering' approach generates useful organism-scale models despite noise and incompleteness in data and knowledge. Here we review and discuss the reverse engineering of biological systems using top-down data-driven approaches, in order to improve discovery, hypothesis generation, and the inference of biological properties. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Data-Driven Robust RVFLNs Modeling of a Blast Furnace Iron-Making Process Using Cauchy Distribution Weighted M-Estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Ping; Lv, Youbin; Wang, Hong

    Optimal operation of a practical blast furnace (BF) ironmaking process depends largely on a good measurement of molten iron quality (MIQ) indices. However, measuring the MIQ online is not feasible using the available techniques. In this paper, a novel data-driven robust modeling is proposed for online estimation of MIQ using improved random vector functional-link networks (RVFLNs). Since the output weights of traditional RVFLNs are obtained by the least squares approach, a robustness problem may occur when the training dataset is contaminated with outliers. This affects the modeling accuracy of RVFLNs. To solve this problem, a Cauchy distribution weighted M-estimation based robust RVFLNs is proposed. Since the weights of different outlier data are properly determined by the Cauchy distribution, their corresponding contribution to modeling can be properly distinguished. Thus, robust and better modeling results can be achieved. Moreover, given that the BF is a complex nonlinear system with numerous coupling variables, the data-driven canonical correlation analysis is employed to identify the most influential components from multitudinous factors that affect the MIQ indices to reduce the model dimension. Finally, experiments using industrial data and comparative studies have demonstrated that the obtained model produces better modeling and estimation accuracy and stronger robustness than other modeling methods.
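
    The key robustness ingredient above, Cauchy-distribution-weighted M-estimation, can be sketched in isolation. Because RVFLN output weights are themselves obtained by (weighted) least squares, the same iteratively reweighted scheme shown here on a plain linear regression conveys the idea; the data, the 10% outlier contamination, and the tuning constant are all invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training data contaminated with gross outliers, the way
# process measurements often are.
n = 300
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(0.0, 0.1, n)
y[:30] += 15.0                                # 10% gross outliers

def cauchy_irls(X, y, c=1.0, iters=50):
    """Iteratively reweighted least squares with Cauchy weights."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        w = 1.0 / (1.0 + (r / c) ** 2)        # Cauchy weight: large residuals get ~0
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
    return beta

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # plain least squares, biased
beta_rob = cauchy_irls(X, y)                      # Cauchy M-estimate, near the truth
```

    The contaminated points end up with weights near 1/(1 + 15²), so they barely influence the fit, which is the mechanism the abstract describes for distinguishing the contribution of outlier data.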

  7. Modeling urban coastal flood severity from crowd-sourced flood reports using Poisson regression and Random Forest

    NASA Astrophysics Data System (ADS)

    Sadler, J. M.; Goodall, J. L.; Morsy, M. M.; Spencer, K.

    2018-04-01

    Sea level rise has already caused more frequent and severe coastal flooding and this trend will likely continue. Flood prediction is an essential part of a coastal city's capacity to adapt to and mitigate this growing problem. Complex coastal urban hydrological systems, however, do not always lend themselves easily to physically-based flood prediction approaches. This paper presents a method for using a data-driven approach to estimate flood severity in an urban coastal setting using crowd-sourced data, a non-traditional but growing data source, along with environmental observation data. Two data-driven models, Poisson regression and Random Forest regression, are trained to predict the number of flood reports per storm event as a proxy for flood severity, given extensive environmental data (i.e., rainfall, tide, groundwater table level, and wind conditions) as input. The method is demonstrated using data from Norfolk, Virginia USA from September 2010 to October 2016. Quality-controlled, crowd-sourced street flooding reports ranging from 1 to 159 per storm event for 45 storm events are used to train and evaluate the models. Random Forest performed better than Poisson regression at predicting the number of flood reports and had a lower false negative rate. From the Random Forest model, total cumulative rainfall was by far the most dominant input variable in predicting flood severity, followed by low tide and lower low tide. These methods serve as a first step toward using data-driven methods for spatially and temporally detailed coastal urban flood prediction.
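
    One of the two models above, Poisson regression, is natural for report counts because the response is a non-negative integer per storm event. A minimal sketch of a Poisson GLM with a log link fitted by Newton-Raphson follows; the feature set and coefficients are invented stand-ins, not the Norfolk data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical storm-event features: intercept, cumulative rainfall,
# tide anomaly. Response: number of crowd-sourced flood reports.
n = 200
X = np.column_stack([np.ones(n), rng.uniform(0.0, 5.0, n), rng.normal(0.0, 1.0, n)])
beta_true = np.array([0.5, 0.6, 0.3])
y = rng.poisson(np.exp(X @ beta_true))        # simulated report counts

def poisson_fit(X, y, iters=25):
    """Poisson GLM (log link) fitted by Newton-Raphson."""
    beta = np.linalg.lstsq(X, np.log(y + 1.0), rcond=None)[0]  # rough warm start
    for _ in range(iters):
        mu = np.exp(X @ beta)                 # model mean count per event
        grad = X.T @ (y - mu)                 # score vector
        hess = X.T @ (X * mu[:, None])        # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

beta_hat = poisson_fit(X, y)                  # recovers beta_true up to noise
```

    The Random Forest alternative in the paper trades this parametric mean structure for an ensemble of trees, which is how it captures the nonlinear rainfall and tide effects the abstract reports.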

  8. Reconstruction of dynamic image series from undersampled MRI data using data-driven model consistency condition (MOCCO).

    PubMed

    Velikina, Julia V; Samsonov, Alexey A

    2015-11-01

    To accelerate dynamic MR imaging through development of a novel image reconstruction technique using low-rank temporal signal models preestimated from training data. We introduce the model consistency condition (MOCCO) technique, which utilizes temporal models to regularize reconstruction without constraining the solution to be low-rank, as is performed in related techniques. This is achieved by using a data-driven model to design a transform for compressed sensing-type regularization. The enforcement of general compliance with the model without excessively penalizing deviating signal allows recovery of a full-rank solution. Our method was compared with a standard low-rank approach utilizing model-based dimensionality reduction in phantoms and patient examinations for time-resolved contrast-enhanced angiography (CE-MRA) and cardiac CINE imaging. We studied the sensitivity of all methods to rank reduction and temporal subspace modeling errors. MOCCO demonstrated reduced sensitivity to modeling errors compared with the standard approach. Full-rank MOCCO solutions showed significantly improved preservation of temporal fidelity and aliasing/noise suppression in highly accelerated CE-MRA (acceleration up to 27) and cardiac CINE (acceleration up to 15) data. MOCCO overcomes several important deficiencies of previously proposed methods based on pre-estimated temporal models and allows high quality image restoration from highly undersampled CE-MRA and cardiac CINE data. © 2014 Wiley Periodicals, Inc.

  9. RECONSTRUCTION OF DYNAMIC IMAGE SERIES FROM UNDERSAMPLED MRI DATA USING DATA-DRIVEN MODEL CONSISTENCY CONDITION (MOCCO)

    PubMed Central

    Velikina, Julia V.; Samsonov, Alexey A.

    2014-01-01

    Purpose To accelerate dynamic MR imaging through development of a novel image reconstruction technique using low-rank temporal signal models pre-estimated from training data. Theory We introduce the MOdel Consistency COndition (MOCCO) technique that utilizes temporal models to regularize the reconstruction without constraining the solution to be low-rank, as is performed in related techniques. This is achieved by using a data-driven model to design a transform for compressed sensing-type regularization. The enforcement of general compliance with the model without excessively penalizing deviating signal allows recovery of a full-rank solution. Methods Our method was compared to a standard low-rank approach utilizing model-based dimensionality reduction in phantoms and patient examinations for time-resolved contrast-enhanced angiography (CE MRA) and cardiac CINE imaging. We studied the sensitivity of all methods to rank reduction and temporal subspace modeling errors. Results MOCCO demonstrated reduced sensitivity to modeling errors compared to the standard approach. Full-rank MOCCO solutions showed significantly improved preservation of temporal fidelity and aliasing/noise suppression in highly accelerated CE MRA (acceleration up to 27) and cardiac CINE (acceleration up to 15) data. Conclusions MOCCO overcomes several important deficiencies of previously proposed methods based on pre-estimated temporal models and allows high quality image restoration from highly undersampled CE-MRA and cardiac CINE data. PMID:25399724

  10. Lightweight approach to model traceability in a CASE tool

    NASA Astrophysics Data System (ADS)

    Vileiniskis, Tomas; Skersys, Tomas; Pavalkis, Saulius; Butleris, Rimantas; Butkiene, Rita

    2017-07-01

    The term "model-driven" is by no means a new buzzword within the system development community. Nevertheless, the ever-increasing complexity of model-driven approaches keeps fueling discussions around this paradigm and pushes researchers to develop new and more effective ways of building systems. With this increasing complexity, model traceability, and model management as a whole, become indispensable activities of the model-driven system development process. The main goal of this paper is to present the conceptual design and implementation of a practical lightweight approach to model traceability in a CASE tool.

  11. Evaluation of Normalization Methods to Pave the Way Towards Large-Scale LC-MS-Based Metabolomics Profiling Experiments

    PubMed Central

    Valkenborg, Dirk; Baggerman, Geert; Vanaerschot, Manu; Witters, Erwin; Dujardin, Jean-Claude; Burzykowski, Tomasz; Berg, Maya

    2013-01-01

    Abstract Combining liquid chromatography-mass spectrometry (LC-MS)-based metabolomics experiments that were collected over a long period of time remains problematic due to systematic variability between LC-MS measurements. Until now, most normalization methods for LC-MS data are model-driven, based on internal standards or intermediate quality control runs, where an external model is extrapolated to the dataset of interest. In the first part of this article, we evaluate several existing data-driven normalization approaches on LC-MS metabolomics experiments, which do not require the use of internal standards. According to variability measures, each normalization method performs relatively well, showing that the use of any normalization method will greatly improve data-analysis originating from multiple experimental runs. In the second part, we apply cyclic-Loess normalization to a Leishmania sample. This normalization method allows the removal of systematic variability between two measurement blocks over time and maintains the differential metabolites. In conclusion, normalization allows for pooling datasets from different measurement blocks over time and increases the statistical power of the analysis, hence paving the way to increase the scale of LC-MS metabolomics experiments. From our investigation, we recommend data-driven normalization methods over model-driven normalization methods, if only a few internal standards were used. Moreover, data-driven normalization methods are the best option to normalize datasets from untargeted LC-MS experiments. PMID:23808607
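
    As a minimal illustration of a data-driven normalization that needs no internal standards, the sketch below uses simple per-run median scaling, a much simpler relative of the cyclic-Loess method evaluated above. The intensity matrix and the per-run "drift" factors are synthetic inventions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic intensity matrix: rows = metabolite features, columns = LC-MS runs;
# each run is distorted by an invented systematic scale factor ("drift").
true_intensity = rng.lognormal(mean=5.0, sigma=1.0, size=(300, 1))
drift = np.array([1.0, 1.4, 0.7, 2.1, 0.9])
raw = true_intensity * drift * rng.lognormal(0.0, 0.05, size=(300, 5))

# Data-driven normalization: rescale every run so its median feature
# intensity matches the grand median; no internal standards required.
run_medians = np.median(raw, axis=0)
normalized = raw * (np.median(raw) / run_medians)

cv_before = np.std(run_medians) / np.mean(run_medians)
after_medians = np.median(normalized, axis=0)
cv_after = np.std(after_medians) / np.mean(after_medians)
```

    Median scaling removes a single multiplicative offset per run; cyclic Loess additionally removes intensity-dependent (nonlinear) bias, which is why it is preferred for pooling measurement blocks collected over long periods.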

  12. Using Two Different Approaches to Assess Dietary Patterns: Hypothesis-Driven and Data-Driven Analysis.

    PubMed

    Previdelli, Ágatha Nogueira; de Andrade, Samantha Caesar; Fisberg, Regina Mara; Marchioni, Dirce Maria

    2016-09-23

    The use of dietary patterns to assess dietary intake has become increasingly common in nutritional epidemiology studies due to the complexity and multidimensionality of the diet. Currently, two main approaches have been widely used to assess dietary patterns: data-driven and hypothesis-driven analysis. Since the methods explore different angles of dietary intake, using both approaches simultaneously might yield complementary and useful information; thus, we aimed to use both approaches to gain knowledge of adolescents' dietary patterns. Food intake from a cross-sectional survey with 295 adolescents was assessed by 24 h dietary recall (24HR). In hypothesis-driven analysis, based on the American National Cancer Institute method, the usual intake of Brazilian Healthy Eating Index Revised components was estimated. In the data-driven approach, the usual intake of foods/food groups was estimated by the Multiple Source Method. Hypothesis-driven analysis showed low scores for Whole grains, Total vegetables, Total fruit and Whole fruits, while, in data-driven analysis, fruits and whole grains did not appear in any pattern. In agreement, hypothesis-driven analysis indicated high intakes of sodium, fats and sugars, with low total scores for the Sodium, Saturated fat and SoFAA (calories from solid fat, alcohol and added sugar) components, while the data-driven approach showed the intake of several foods/food groups rich in these nutrients, such as butter/margarine, cookies, chocolate powder, whole milk, cheese, processed meat/cold cuts and candies. In this study, using both approaches at the same time provided consistent and complementary information with regard to assessing overall dietary habits, which will be important in order to drive public health programs and improve their efficiency to monitor and evaluate the dietary patterns of populations.

  13. A multi-source satellite data approach for modelling Lake Turkana water level: Calibration and validation using satellite altimetry data

    USGS Publications Warehouse

    Velpuri, N.M.; Senay, G.B.; Asante, K.O.

    2012-01-01

    Lake Turkana is one of the largest desert lakes in the world and is characterized by high degrees of inter- and intra-annual fluctuations. The hydrology and water balance of this lake have not been well understood due to its remote location and the unavailability of reliable ground truth datasets. Managing surface water resources is a great challenge in areas where in-situ data are either limited or unavailable. In this study, multi-source satellite-driven data such as satellite-based rainfall estimates, modelled runoff, evapotranspiration, and a digital elevation dataset were used to model Lake Turkana water levels from 1998 to 2009. Due to the unavailability of reliable lake level data, an approach is presented to calibrate and validate the water balance model of Lake Turkana using a composite lake level product of TOPEX/Poseidon, Jason-1, and ENVISAT satellite altimetry data. Model validation results showed that the satellite-driven water balance model can satisfactorily capture the patterns and seasonal variations of the Lake Turkana water level fluctuations, with a Pearson's correlation coefficient of 0.90 and a Nash-Sutcliffe Coefficient of Efficiency (NSCE) of 0.80 during the validation period (2004-2009). Model error estimates were within 10% of the natural variability of the lake. Our analysis indicated that fluctuations in Lake Turkana water levels are mainly driven by lake inflows and over-the-lake evaporation. Over-the-lake rainfall contributes only up to 30% of lake evaporative demand. During the modelling time period, Lake Turkana showed seasonal variations of 1-2 m. The lake level fluctuated over a range of up to 4 m between 1998 and 2009. This study demonstrated the usefulness of satellite altimetry data to calibrate and validate the satellite-driven hydrological model for Lake Turkana without using any in-situ data. Furthermore, for Lake Turkana, we identified and outlined opportunities and challenges of using a calibrated satellite-driven water balance model for (i) quantitative assessment of the impact of basin developmental activities on lake levels and for (ii) forecasting lake level changes and their impact on fisheries. From this study, we suggest that globally available satellite altimetry data provide a unique opportunity for calibration and validation of hydrologic models in ungauged basins. © Author(s) 2012.

  14. Gray-Box Approach for Thermal Modelling of Buildings for Applications in District Heating and Cooling Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saurav, Kumar; Chandan, Vikas

    District-heating-and-cooling (DHC) systems are a proven energy solution that has been deployed for many years in a growing number of urban areas worldwide. They comprise a variety of technologies that seek to develop synergies between the production and supply of heat, cooling, domestic hot water and electricity. Although the benefits of DHC systems are significant and have been widely acclaimed, the full potential of modern DHC systems remains largely untapped. There are several opportunities for the development of energy-efficient DHC systems, which will enable the effective exploitation of alternative renewable resources, waste heat recovery, etc., in order to increase the overall efficiency and facilitate the transition towards the next generation of DHC systems. This motivates the need for modelling these complex systems. Large-scale modelling of DHC networks is challenging, as it has several components such as buildings, pipes, valves, heating source, etc., interacting with each other. In this paper, we focus on building modelling. In particular, we present a gray-box methodology for thermal modelling of buildings. Gray-box modelling is a hybrid of data-driven and physics-based models where the coefficients of the equations from physics-based models are learned using data. This approach allows us to capture the dynamics of the buildings more effectively compared to a pure data-driven approach. Additionally, it results in simpler models compared to pure physics-based models. We first develop the individual components of the building, such as temperature evolution, flow controller, etc. These individual models are then integrated into the complete gray-box model for the building. The model is validated using data collected from one of the buildings at Luleå, a city on the coast of northern Sweden.
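
    The gray-box idea, a physics-derived model structure whose coefficients are estimated from data, can be sketched with a lumped thermal balance. A discrete form such as T[k+1] = a·T[k] + b·T_out[k] + c·Q[k] comes from physics, while a, b, c are fitted by least squares. Everything below (data, "true" coefficients, noise level) is synthetic, not the paper's building model:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic indoor temperature T, outdoor temperature T_out, heating power Q.
n = 500
T_out = 5.0 + 3.0 * np.sin(np.linspace(0.0, 8.0 * np.pi, n))
Q = rng.uniform(0.0, 1.0, n)
a, b, c = 0.90, 0.10, 2.0                    # invented "true" physics coefficients
T = np.empty(n)
T[0] = 20.0
for k in range(n - 1):
    T[k + 1] = a * T[k] + b * T_out[k] + c * Q[k] + rng.normal(0.0, 0.05)

# Gray-box step: the regression structure is fixed by the heat balance,
# but its coefficients are identified from data by least squares.
A = np.column_stack([T[:-1], T_out[:-1], Q[:-1]])
coef, *_ = np.linalg.lstsq(A, T[1:], rcond=None)   # ~ [a, b, c]
```

    Compared to a pure black-box fit, the learned coefficients retain a physical reading (a thermal inertia term, an envelope-loss term, a heating-gain term), which is the interpretability advantage the abstract alludes to.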

  15. Data-driven train set crash dynamics simulation

    NASA Astrophysics Data System (ADS)

    Tang, Zhao; Zhu, Yunrui; Nie, Yinyu; Guo, Shihui; Liu, Fengjia; Chang, Jian; Zhang, Jianjun

    2017-02-01

    Traditional finite element (FE) methods are computationally expensive for simulating train crashes. This high computational cost limits their direct application in investigating the dynamic behaviours of an entire train set for crashworthiness design and structural optimisation. In contrast, multi-body modelling is widely used because of its low computational cost, with a trade-off in accuracy. In this study, a data-driven train crash modelling method is proposed to improve the performance of a multi-body dynamics simulation of a train set crash without increasing the computational burden. This is achieved by the parallel random forest algorithm, a machine learning approach that extracts useful patterns from force-displacement curves and predicts a force-displacement relation for a given collision condition from a collection of offline FE simulation data on various collision conditions, namely different crash velocities in our analysis. Using the FE simulation results as a benchmark, we compared our method with traditional multi-body modelling methods; the results show that our data-driven method improves the accuracy over traditional multi-body models in train crash simulation while running at the same level of efficiency.

  16. Boosting Bayesian parameter inference of nonlinear stochastic differential equation models by Hamiltonian scale separation.

    PubMed

    Albert, Carlo; Ulzega, Simone; Stoop, Ruedi

    2016-04-01

    Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.

  17. Cryo-EM Data Are Superior to Contact and Interface Information in Integrative Modeling.

    PubMed

    de Vries, Sjoerd J; Chauvot de Beauchêne, Isaure; Schindler, Christina E M; Zacharias, Martin

    2016-02-23

    Protein-protein interactions carry out a large variety of essential cellular processes. Cryo-electron microscopy (cryo-EM) is a powerful technique for the modeling of protein-protein interactions at a wide range of resolutions, and recent developments have caused a revolution in the field. At low resolution, cryo-EM maps can drive integrative modeling of the interaction, assembling existing structures into the map. Other experimental techniques can provide information on the interface or on the contacts between the monomers in the complex. This inevitably raises the question regarding which type of data is best suited to drive integrative modeling approaches. Systematic comparison of the prediction accuracy and specificity of the different integrative modeling paradigms is unavailable to date. Here, we compare EM-driven, interface-driven, and contact-driven integrative modeling paradigms. Models were generated for the protein docking benchmark using the ATTRACT docking engine and evaluated using the CAPRI two-star criterion. At 20 Å resolution, EM-driven modeling achieved a success rate of 100%, outperforming the other paradigms even with perfect interface and contact information. Therefore, even very low resolution cryo-EM data is superior in predicting heterodimeric and heterotrimeric protein assemblies. Our study demonstrates that a force field is not necessary, cryo-EM data alone is sufficient to accurately guide the monomers into place. The resulting rigid models successfully identify regions of conformational change, opening up perspectives for targeted flexible remodeling. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  18. Cryo-EM Data Are Superior to Contact and Interface Information in Integrative Modeling

    PubMed Central

    de Vries, Sjoerd J.; Chauvot de Beauchêne, Isaure; Schindler, Christina E.M.; Zacharias, Martin

    2016-01-01

    Protein-protein interactions carry out a large variety of essential cellular processes. Cryo-electron microscopy (cryo-EM) is a powerful technique for the modeling of protein-protein interactions at a wide range of resolutions, and recent developments have caused a revolution in the field. At low resolution, cryo-EM maps can drive integrative modeling of the interaction, assembling existing structures into the map. Other experimental techniques can provide information on the interface or on the contacts between the monomers in the complex. This inevitably raises the question regarding which type of data is best suited to drive integrative modeling approaches. Systematic comparison of the prediction accuracy and specificity of the different integrative modeling paradigms is unavailable to date. Here, we compare EM-driven, interface-driven, and contact-driven integrative modeling paradigms. Models were generated for the protein docking benchmark using the ATTRACT docking engine and evaluated using the CAPRI two-star criterion. At 20 Å resolution, EM-driven modeling achieved a success rate of 100%, outperforming the other paradigms even with perfect interface and contact information. Therefore, even very low resolution cryo-EM data is superior in predicting heterodimeric and heterotrimeric protein assemblies. Our study demonstrates that a force field is not necessary, cryo-EM data alone is sufficient to accurately guide the monomers into place. The resulting rigid models successfully identify regions of conformational change, opening up perspectives for targeted flexible remodeling. PMID:26846888

  19. Bending of Euler-Bernoulli nanobeams based on the strain-driven and stress-driven nonlocal integral models: a numerical approach

    NASA Astrophysics Data System (ADS)

    Oskouie, M. Faraji; Ansari, R.; Rouhi, H.

    2018-04-01

    Eringen's nonlocal elasticity theory is extensively employed for the analysis of nanostructures because it is able to capture nanoscale effects. Previous studies have revealed that using the differential form of the strain-driven version of this theory leads to paradoxical results in some cases, such as bending analysis of cantilevers, and recourse must be made to the integral version. In this article, a novel numerical approach is developed for the bending analysis of Euler-Bernoulli nanobeams in the context of strain- and stress-driven integral nonlocal models. This numerical approach is proposed for the direct solution to bypass the difficulties related to converting the integral governing equation into a differential equation. First, the governing equation is derived based on both strain-driven and stress-driven nonlocal models by means of the minimum total potential energy. Also, in each case, the governing equation is obtained in both strong and weak forms. To solve numerically the derived equations, matrix differential and integral operators are constructed based upon the finite difference technique and trapezoidal integration rule. It is shown that the proposed numerical approach can be efficiently applied to the strain-driven nonlocal model with the aim of resolving the mentioned paradoxes. Also, it is able to solve the problem based on the strain-driven model without inconsistencies of the application of this model that are reported in the literature.
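
    The numerical machinery described above, matrix operators built from finite differences and the trapezoidal rule, can be illustrated on a known function. This is a generic sketch of such operators, not the authors' beam formulation; the grid size and test function are chosen for the example:

```python
import numpy as np

n = 201
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# Second-derivative operator: central finite differences on interior rows.
D2 = np.zeros((n, n))
for i in range(1, n - 1):
    D2[i, i - 1] = D2[i, i + 1] = 1.0
    D2[i, i] = -2.0
D2 /= h ** 2

# Cumulative integration operator from the trapezoidal rule:
# (INT @ f)[i] approximates the integral of f from 0 to x_i.
INT = np.zeros((n, n))
for i in range(1, n):
    INT[i, 0] = INT[i, i] = 0.5 * h
    INT[i, 1:i] = h

f = np.sin(np.pi * x)
err_d2 = np.max(np.abs((D2 @ f)[1:-1] + np.pi ** 2 * np.sin(np.pi * x[1:-1])))
integral = (INT @ f)[-1]                      # exact value: 2/pi
```

    Once differentiation and integration are both matrices, an integro-differential governing equation reduces to one linear system in the nodal unknowns, which is what makes the direct solution of the nonlocal integral model tractable.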

  20. Data-driven outbreak forecasting with a simple nonlinear growth model.

    PubMed

    Lega, Joceline; Brown, Heidi E

    2016-12-01

    Recent events have thrown the spotlight on infectious disease outbreak response. We developed a data-driven method, EpiGro, which can be applied to cumulative case reports to estimate the order of magnitude of the duration, peak and ultimate size of an ongoing outbreak. It is based on a surprisingly simple mathematical property of many epidemiological data sets, does not require knowledge or estimation of disease transmission parameters, is robust to noise and to small data sets, and runs quickly due to its mathematical simplicity. Using data from historic and ongoing epidemics, we present the model. We also provide modeling considerations that justify this approach and discuss its limitations. In the absence of other information or in conjunction with other models, EpiGro may be useful to public health responders. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
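
    A sketch in the spirit of the approach above, not the EpiGro code itself: for logistic-type growth, the per-step increment is quadratic in the cumulative count, dC = r·C - (r/K)·C², so two regression coefficients recover the growth rate and the final outbreak size without any transmission parameters. The case counts below are synthetic and noise-free:

```python
import numpy as np

# Synthetic, noise-free cumulative case counts from a discrete logistic map
# (invented parameters: growth rate r, final size K).
r, K = 0.2, 10000.0
C = [10.0]
for _ in range(119):
    C.append(C[-1] + r * C[-1] * (1.0 - C[-1] / K))
C = np.array(C)

dC = np.diff(C)                               # per-step incidence
A = np.column_stack([C[:-1], C[:-1] ** 2])    # incidence as a quadratic in C
(a1, a2), *_ = np.linalg.lstsq(A, dC, rcond=None)
r_hat = a1                                    # recovered growth rate
K_hat = -a1 / a2                              # recovered final outbreak size
```

    On real outbreak data the same regression gives order-of-magnitude estimates of duration, peak, and ultimate size, with noise and early-epidemic scarcity limiting precision, as the abstract's caveats note.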

  1. A more rational, theory-driven approach to analysing the factor structure of the Edinburgh Postnatal Depression Scale.

    PubMed

    Kozinszky, Zoltan; Töreki, Annamária; Hompoth, Emőke A; Dudas, Robert B; Németh, Gábor

    2017-04-01

    We endeavoured to analyze the factor structure of the Edinburgh Postnatal Depression Scale (EPDS) during a screening programme in Hungary, using exploratory (EFA) and confirmatory factor analysis (CFA), testing both previously published models and newly developed theory-driven ones, after a critical analysis of the literature. Between April 2011 and January 2015, a sample of 2967 pregnant women (between 12th and 30th weeks of gestation) and 714 women 6 weeks after delivery completed the Hungarian version of the EPDS in South-East Hungary. EFAs suggested unidimensionality in both samples. 33 out of 42 previously published models showed good and 6 acceptable fit with our antepartum data in CFAs, whilst 10 of them showed good and 28 acceptable fit in our postpartum sample. Using multiple fit indices, our theory-driven anhedonia (items 1,2) - anxiety (items 4,5) - low mood (items 8,9) model provided the best fit in the antepartum sample. In the postpartum sample, our theory-driven models were again among the best performing models, including an anhedonia and an anxiety factor together with either a low mood or a suicidal risk factor (items 3,6,10). The EPDS showed moderate within- and between-culture invariability, although this would also need to be re-examined with a theory-driven approach. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  2. Model-free information-theoretic approach to infer leadership in pairs of zebrafish.

    PubMed

    Butail, Sachit; Mwaffo, Violet; Porfiri, Maurizio

    2016-04-01

    Collective behavior affords several advantages to fish in avoiding predators, foraging, mating, and swimming. Although fish schools have been traditionally considered egalitarian superorganisms, a number of empirical observations suggest the emergence of leadership in gregarious groups. Detecting and classifying leader-follower relationships is central to elucidate the behavioral and physiological causes of leadership and understand its consequences. Here, we demonstrate an information-theoretic approach to infer leadership from positional data of fish swimming. In this framework, we measure social interactions between fish pairs through the mathematical construct of transfer entropy, which quantifies the predictive power of a time series to anticipate another, possibly coupled, time series. We focus on the zebrafish model organism, which is rapidly emerging as a species of choice in preclinical research for its genetic similarity to humans and reduced neurobiological complexity with respect to mammals. To overcome experimental confounds and generate test data sets on which we can thoroughly assess our approach, we adapt and calibrate a data-driven stochastic model of zebrafish motion for the simulation of a coupled dynamical system of zebrafish pairs. In this synthetic data set, the extent and direction of the coupling between the fish are systematically varied across a wide parameter range to demonstrate the accuracy and reliability of transfer entropy in inferring leadership. Our approach is expected to aid in the analysis of collective behavior, providing a data-driven perspective to understand social interactions.
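
    The central construct above, transfer entropy, can be sketched for a binary leader-follower pair, a toy surrogate for discretized fish-motion data. The coupling probability, series length, and binary alphabet are invented for the illustration:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical leader-follower pair: the "leader" X is random, and the
# "follower" Y copies X's previous value 80% of the time.
n = 20000
X = rng.integers(0, 2, n)
Y = np.empty(n, dtype=int)
Y[0] = 0
for t in range(n - 1):
    Y[t + 1] = X[t] if rng.random() < 0.8 else rng.integers(0, 2)

def transfer_entropy(src, dst):
    """Plug-in estimate of TE(src -> dst) in bits, for binary series."""
    idx = dst[1:] * 4 + dst[:-1] * 2 + src[:-1]   # code (y_next, y_now, x_now)
    p3 = np.bincount(idx, minlength=8) / len(idx)
    te = 0.0
    for a in (0, 1):                 # y_next
        for b in (0, 1):             # y_now
            for c in (0, 1):         # x_now
                pabc = p3[a * 4 + b * 2 + c]
                if pabc == 0.0:
                    continue
                p_bc = p3[b * 2 + c] + p3[4 + b * 2 + c]          # p(y_now, x_now)
                p_ab = p3[a * 4 + b * 2] + p3[a * 4 + b * 2 + 1]  # p(y_next, y_now)
                p_b = p_ab + p3[(1 - a) * 4 + b * 2] + p3[(1 - a) * 4 + b * 2 + 1]
                te += pabc * np.log2(pabc * p_b / (p_bc * p_ab))
    return te

te_xy = transfer_entropy(X, Y)    # large: X predicts Y's next state
te_yx = transfer_entropy(Y, X)    # ~0: Y carries no information about X's future
```

    The asymmetry te_xy >> te_yx is exactly the leader-follower signature the study exploits; on continuous trajectories the same computation is applied after discretizing (or kernel-estimating) the motion variables.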

  3. Model-free information-theoretic approach to infer leadership in pairs of zebrafish

    NASA Astrophysics Data System (ADS)

    Butail, Sachit; Mwaffo, Violet; Porfiri, Maurizio

    2016-04-01

    Collective behavior affords several advantages to fish in avoiding predators, foraging, mating, and swimming. Although fish schools have been traditionally considered egalitarian superorganisms, a number of empirical observations suggest the emergence of leadership in gregarious groups. Detecting and classifying leader-follower relationships is central to elucidating the behavioral and physiological causes of leadership and understanding its consequences. Here, we demonstrate an information-theoretic approach to infer leadership from positional data of fish swimming. In this framework, we measure social interactions between fish pairs through the mathematical construct of transfer entropy, which quantifies the predictive power of a time series to anticipate another, possibly coupled, time series. We focus on the zebrafish model organism, which is rapidly emerging as a species of choice in preclinical research for its genetic similarity to humans and reduced neurobiological complexity with respect to mammals. To overcome experimental confounds and generate test data sets on which we can thoroughly assess our approach, we adapt and calibrate a data-driven stochastic model of zebrafish motion for the simulation of a coupled dynamical system of zebrafish pairs. In this synthetic data set, the extent and direction of the coupling between the fish are systematically varied across a wide parameter range to demonstrate the accuracy and reliability of transfer entropy in inferring leadership. Our approach is expected to aid in the analysis of collective behavior, providing a data-driven perspective to understand social interactions.

  4. Protein-Protein Interface Predictions by Data-Driven Methods: A Review

    PubMed Central

    Xue, Li C; Dobbs, Drena; Bonvin, Alexandre M.J.J.; Honavar, Vasant

    2015-01-01

    Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction. PMID:26460190

  5. Development of a Stochastically-driven, Forward Predictive Performance Model for PEMFCs

    NASA Astrophysics Data System (ADS)

    Harvey, David Benjamin Paul

    A one-dimensional, multi-scale, coupled, transient, and mechanistic performance model for a PEMFC membrane electrode assembly has been developed. The model explicitly includes each of the five layers within a membrane electrode assembly and solves for the transport of charge, heat, mass, species, dissolved water, and liquid water. Key features of the model include the use of a multi-step implementation of the HOR reaction on the anode, agglomerate catalyst sub-models for both the anode and cathode catalyst layers, a unique approach that links the composition of the catalyst layer to key properties within the agglomerate model, and the implementation of a stochastic input-based approach for component material properties. The model employs a new methodology for validation using statistically varying input parameters and statistically-based experimental performance data; it represents the first stochastic-input-driven unit cell performance model. The stochastic-input-driven performance model was used to identify optimal ionomer content within the cathode catalyst layer, demonstrate the role of material variation in potentially low-performing MEA materials, explain the performance of low-Pt-loaded MEAs, and investigate the validity of transient-sweep experimental diagnostic methods.

  6. Dynamical prediction of flu seasonality driven by ambient temperature: influenza vs. common cold

    NASA Astrophysics Data System (ADS)

    Postnikov, Eugene B.

    2016-01-01

    This work presents a comparative analysis of Influenzanet data for influenza itself and the common cold in the Netherlands during the last 5 years, from the point of view of modelling by linearised SIRS equations parametrically driven by the ambient temperature. It is argued that this approach allows for the forecast of the common cold, but not of influenza in a strict sense. The difference in their kinetic models is discussed with reference to the clinical background.
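    The idea of an SIRS model parametrically driven by ambient temperature can be sketched with a forward Euler integration; all parameter values and the seasonal temperature cycle below are illustrative assumptions, not the fitted Influenzanet model (which is linearised about an equilibrium).

```python
import numpy as np

days = 2 * 365
dt = 1.0
gamma = 1 / 7           # recovery rate (1/day), illustrative
delta = 1 / 365         # rate of immunity loss (1/day), illustrative
beta0, beta1 = 0.35, 0.1

t = np.arange(days)
# Seasonal ambient temperature (deg C), coldest at day 0 / 365
temperature = 10 - 12 * np.cos(2 * np.pi * t / 365)
# Transmission rate decreases with (standardised) temperature
beta = beta0 - beta1 * (temperature - temperature.mean()) / temperature.std()

S, I, R = [0.9], [0.01], [0.09]
for k in range(days - 1):
    s, i, r = S[-1], I[-1], R[-1]
    new_inf = beta[k] * s * i
    S.append(s + dt * (delta * r - new_inf))
    I.append(i + dt * (new_inf - gamma * i))
    R.append(r + dt * (gamma * i - delta * r))

I = np.array(I)
print("peak prevalence:", round(float(I.max()), 4))
```

The update rules conserve S + I + R exactly, and the temperature-dependent beta is what couples the epidemic curve to the seasonal forcing discussed in the abstract.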

  7. Model-driven approach to data collection and reporting for quality improvement.

    PubMed

    Curcin, Vasa; Woodcock, Thomas; Poots, Alan J; Majeed, Azeem; Bell, Derek

    2014-12-01

    Continuous data collection and analysis have been shown to be essential to achieving improvement in healthcare. However, the data required for local improvement initiatives are often not readily available from hospital Electronic Health Record (EHR) systems or not routinely collected. Furthermore, improvement teams are often restricted in time and funding, thus requiring inexpensive and rapid tools to support their work. Hence, the informatics challenge in healthcare local improvement initiatives consists of providing a mechanism for rapid modelling of the local domain by non-informatics experts, including performance metric definitions, grounded in established improvement techniques. We investigate the feasibility of a model-driven software approach to address this challenge, whereby an improvement model designed by a team is used to automatically generate the required electronic data collection instruments and reporting tools. To that end, we have designed a generic Improvement Data Model (IDM) to capture the data items and quality measures relevant to the project, and constructed Web Improvement Support in Healthcare (WISH), a prototype tool that takes user-generated IDM models and creates a data schema, data collection web interfaces, and a set of live reports, based on Statistical Process Control (SPC), for use by improvement teams. The software has been successfully used in over 50 improvement projects, with more than 700 users. We present in detail the experiences of one of those initiatives, the Chronic Obstructive Pulmonary Disease project in Northwest London hospitals. The specific challenges of improvement in healthcare are analysed and the benefits and limitations of the approach are discussed. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
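    The SPC reports mentioned above typically centre on control charts. As a concrete illustration, the sketch below computes limits for an individuals (XmR) chart, a chart type widely used for improvement-project metrics; the weekly values are hypothetical and the code is not part of IDM/WISH.

```python
import numpy as np

def xmr_limits(values):
    """Control limits for an individuals (XmR) chart.

    Uses the standard constant 2.66 = 3 / d2, where d2 = 1.128 is the
    bias-correction factor for moving ranges of size 2."""
    values = np.asarray(values, dtype=float)
    mr = np.abs(np.diff(values))       # moving ranges of consecutive points
    centre = values.mean()
    half_width = 2.66 * mr.mean()
    return centre - half_width, centre, centre + half_width

# Hypothetical metric: weekly proportion of patients receiving a care bundle
weeks = [0.62, 0.58, 0.64, 0.61, 0.59, 0.63, 0.60, 0.62, 0.85, 0.61]
lcl, centre, ucl = xmr_limits(weeks)
signals = [i for i, v in enumerate(weeks) if v < lcl or v > ucl]
print(f"centre={centre:.3f}  limits=({lcl:.3f}, {ucl:.3f})  signals at {signals}")
```

Points falling outside the limits (here the week-9 jump) are flagged as special-cause variation; a production tool would also apply run rules and recompute limits after verified process changes.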

  8. Assessing Coupled Social Ecological Flood Vulnerability from Uttarakhand, India, to the State of New York with Google Earth Engine

    NASA Astrophysics Data System (ADS)

    Tellman, B.; Schwarz, B.

    2014-12-01

    This talk describes the development of a web application to predict and communicate vulnerability to floods given publicly available data, disaster science, and geotech cloud capabilities. The proof of concept in the Google Earth Engine API, with initial testing on case studies in New York and Uttarakhand, India, demonstrates the potential of highly parallelized cloud computing to model socio-ecological disaster vulnerability at high spatial and temporal resolution and in near real time. Cloud computing facilitates statistical modeling with variables derived from large public social and ecological data sets, including census data, nighttime lights (NTL), and WorldPop to derive social parameters, together with elevation, satellite imagery, rainfall, and observed flood data from the Dartmouth Flood Observatory to derive biophysical parameters. While more traditional, physically based hydrological models that rely on flow algorithms and numerical methods are currently unavailable in parallelized computing platforms like Google Earth Engine, there is high potential to explore "data-driven" modeling that trades physics for statistics in a parallelized environment. A data-driven approach to flood modeling with geographically weighted logistic regression has been initially tested on Hurricane Irene in southeastern New York. Comparison of model results with observed flood data shows that the model predicts flooded pixels with 97% accuracy. Testing on multiple storms is required to further validate this initially promising approach. A statistical social-ecological flood model that could produce rapid vulnerability assessments to predict who might require immediate evacuation, and where, could serve as an early warning. This type of early warning system would be especially relevant in data-poor places lacking the computing power, high-resolution data such as LiDAR and stream gauges, or hydrologic expertise to run physically based models in real time. As the data-driven model presented relies on globally available data, the only real-time data input required would be typical data from a weather service, e.g. precipitation or coarse-resolution flood prediction. However, model uncertainty will vary locally depending upon the resolution and frequency of observed flood and socio-economic damage impact data.
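    The geographically weighted logistic regression idea can be sketched in a few lines: fit a logistic model local to a focal location, weighting observations by a Gaussian distance kernel. Everything below (features, bandwidth, synthetic pixels) is an illustrative assumption, not the authors' Earth Engine implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for per-pixel predictors (the real model draws on
# census data, nighttime lights, rainfall, DFO flood observations, etc.)
n = 2000
coords = rng.uniform(0, 100, size=(n, 2))      # pixel locations (km)
elev = rng.uniform(0, 50, n)
rain = rng.uniform(0, 200, n)
logit = 0.05 * rain - 0.2 * elev - 1.0
flooded = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(float)

def gw_logistic(focal, coords, X, y, bandwidth=30.0, steps=500, lr=0.1):
    """Fit a logistic model local to `focal` via gradient ascent on the
    distance-weighted log-likelihood (minimal GW logistic regression)."""
    d = np.linalg.norm(coords - focal, axis=1)
    w = np.exp(-(d / bandwidth) ** 2)          # Gaussian distance kernel
    Xb = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-Xb @ beta))
        beta += lr * Xb.T @ (w * (y - p)) / w.sum()
    return beta

X = np.column_stack([(rain - rain.mean()) / rain.std(),
                     (elev - elev.mean()) / elev.std()])
beta = gw_logistic(np.array([50.0, 50.0]), coords, X, flooded)
pred = 1 / (1 + np.exp(-np.column_stack([np.ones(n), X]) @ beta)) > 0.5
accuracy = (pred == flooded.astype(bool)).mean()
print("local coefficients:", np.round(beta, 2), " accuracy:", round(accuracy, 3))
```

Repeating the fit over a grid of focal points yields spatially varying coefficients, which is what distinguishes the geographically weighted variant from a single global logistic model.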

  9. Sensory processing during viewing of cinematographic material: Computational modeling and functional neuroimaging

    PubMed Central

    Bordier, Cecile; Puja, Francesco; Macaluso, Emiliano

    2013-01-01

    The investigation of brain activity using naturalistic, ecologically-valid stimuli is becoming an important challenge for neuroscience research. Several approaches have been proposed, primarily relying on data-driven methods (e.g. independent component analysis, ICA). However, data-driven methods often require some post-hoc interpretation of the imaging results to draw inferences about the underlying sensory, motor or cognitive functions. Here, we propose using a biologically-plausible computational model to extract (multi-)sensory stimulus statistics that can be used for standard hypothesis-driven analyses (general linear model, GLM). We ran two separate fMRI experiments, which both involved subjects watching an episode of a TV-series. In Exp 1, we manipulated the presentation by switching on-and-off color, motion and/or sound at variable intervals, whereas in Exp 2, the video was played in the original version, with all the consequent continuous changes of the different sensory features intact. Both for vision and audition, we extracted stimulus statistics corresponding to spatial and temporal discontinuities of low-level features, as well as a combined measure related to the overall stimulus saliency. Results showed that activity in occipital visual cortex and the superior temporal auditory cortex co-varied with changes of low-level features. Visual saliency was found to further boost activity in extra-striate visual cortex plus posterior parietal cortex, while auditory saliency was found to enhance activity in the superior temporal cortex. Data-driven ICA analyses of the same datasets also identified “sensory” networks comprising visual and auditory areas, but without providing specific information about the possible underlying processes, e.g., these processes could relate to modality, stimulus features and/or saliency. 
We conclude that the combination of computational modeling and GLM enables the tracking of the impact of bottom–up signals on brain activity during viewing of complex and dynamic multisensory stimuli, beyond the capability of purely data-driven approaches. PMID:23202431

  10. Exploring galaxy evolution with latent space walks

    NASA Astrophysics Data System (ADS)

    Schawinski, Kevin; Turp, Dennis; Zhang, Ce

    2018-01-01

    We present a new approach using artificial intelligence to perform data-driven forward models of astrophysical phenomena. We describe how a variational autoencoder can be used to encode galaxies to latent space, independently manipulate properties such as the specific star formation rate, and return them to real space. Such transformations can be used for forward modeling phenomena using data as the only constraints. We demonstrate the utility of this approach using the question of the quenching of star formation in galaxies.
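    The encode-shift-decode pattern behind a "latent space walk" can be illustrated with a linear stand-in for the variational autoencoder: PCA gives an encoder and decoder, and a property direction in latent space is estimated by regression. The "galaxy" feature vectors, the property, and the step size below are all synthetic assumptions, not the paper's model or data.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "galaxy" feature vectors (e.g. fluxes in many bands) generated from
# two underlying properties; a stand-in for real imaging data.
n, d = 500, 20
props = rng.normal(size=(n, 2))                 # latent properties (e.g. sSFR)
mix = rng.normal(size=(2, d))
data = props @ mix + 0.05 * rng.normal(size=(n, d))

# Linear "autoencoder": PCA projection (encoder) and reconstruction (decoder)
mean = data.mean(axis=0)
_, _, Vt = np.linalg.svd(data - mean, full_matrices=False)
k = 2
encode = lambda x: (x - mean) @ Vt[:k].T
decode = lambda z: z @ Vt[:k] + mean

# Latent space walk: estimate the latent direction associated with a
# property (props[:, 0]), then shift one object along it and decode.
z = encode(data)
direction = np.linalg.lstsq(z, props[:, 0], rcond=None)[0]
direction /= np.linalg.norm(direction)

z0 = encode(data[0])
walked = decode(z0 + 2.0 * direction)           # object with property increased
recon_err = np.linalg.norm(decode(z0) - data[0]) / np.linalg.norm(data[0])
print("relative reconstruction error:", round(float(recon_err), 3))
```

A VAE replaces the linear encoder/decoder with neural networks and regularises the latent space, but the manipulation step (add a direction vector, decode) is the same.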

  11. qPortal: A platform for data-driven biomedical research.

    PubMed

    Mohr, Christopher; Friedrich, Andreas; Wojnar, David; Kenar, Erhan; Polatkan, Aydin Can; Codrea, Marius Cosmin; Czemmel, Stefan; Kohlbacher, Oliver; Nahnsen, Sven

    2018-01-01

    Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. It has become common practice to make extensive use of high-throughput technologies that produce large amounts of heterogeneous data. In addition to the ever-improving accuracy, methods are getting faster and cheaper, resulting in a steadily increasing need for scalable data management and easily accessible means of analysis. We present qPortal, a platform providing users with an intuitive way to manage and analyze quantitative biological data. The backend leverages a variety of concepts and technologies, such as relational databases, data stores, data models and means of data transfer, as well as front-end solutions to give users access to data management and easy-to-use analysis options. Users are empowered to conduct their experiments from the experimental design to the visualization of their results through the platform. Here, we illustrate the feature-rich portal by simulating a biomedical study based on publicly available data. We demonstrate the software's strength in supporting the entire project life cycle. The software supports the project design and registration, empowers users to do all-digital project management and finally provides means to perform analysis. We compare our approach to Galaxy, one of the most widely used scientific workflow and analysis platforms in computational biology. Application of both systems to a small case study shows the differences between a data-driven approach (qPortal) and a workflow-driven approach (Galaxy). qPortal, a one-stop-shop solution for biomedical projects, offers up-to-date analysis pipelines, quality control workflows, and visualization tools. Through intensive user interactions, appropriate data models have been developed. These models form the foundation of our biological data management system and provide possibilities to annotate data, query metadata for statistics and future re-analysis on high-performance computing systems via coupling of workflow management systems. Integration of project and data management as well as workflow resources in one place presents clear advantages over existing solutions.

  12. A Data-Driven Approach to Reverse Engineering Customer Engagement Models: Towards Functional Constructs

    PubMed Central

    de Vries, Natalie Jane; Carlson, Jamie; Moscato, Pablo

    2014-01-01

    Online consumer behavior in general, and online customer engagement with brands in particular, has become a major focus of research activity, fuelled by the exponential increase of interactive functions of the internet and social media platforms and applications. Current research in this area is mostly hypothesis-driven, and much debate about the concept of Customer Engagement and its related constructs persists in the literature. In this paper, we aim to propose a novel methodology for reverse engineering a consumer behavior model for online customer engagement, based on a computational and data-driven perspective. This methodology could be generalized and prove useful for future research in the fields of consumer behaviors using questionnaire data or studies investigating other types of human behaviors. The method we propose contains five main stages: symbolic regression analysis, graph building, community detection, evaluation of results and finally, investigation of directed cycles and common feedback loops. The ‘communities’ of questionnaire items that emerge from our community detection method form possible ‘functional constructs’ inferred from data rather than assumed from literature and theory. Our results show consistent partitioning of questionnaire items into such ‘functional constructs’, suggesting the method proposed here could be adopted as a new data-driven way of human behavior modeling. PMID:25036766
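    The graph-building and community-detection stages can be sketched as follows. Thresholded item correlations and connected components stand in for the paper's symbolic-regression edges and modularity-based community detection, and the questionnaire is synthetic, so this is only a structural illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic questionnaire: 9 items generated by three latent "constructs"
# (three items each), mimicking the item-community structure sought above.
n = 400
constructs = rng.normal(size=(n, 3))
loadings = np.zeros((3, 9))
for c in range(3):
    loadings[c, 3 * c:3 * c + 3] = 0.8
items = constructs @ loadings + 0.4 * rng.normal(size=(n, 9))

# Graph building: connect items whose absolute correlation clears a threshold
corr = np.abs(np.corrcoef(items, rowvar=False))
adj = (corr > 0.5) & ~np.eye(9, dtype=bool)

def components(adj):
    """Connected components by depth-first search (a simple stand-in for
    modularity-based community detection)."""
    unseen, comps = set(range(len(adj))), []
    while unseen:
        stack, comp = [unseen.pop()], set()
        while stack:
            v = stack.pop()
            comp.add(v)
            for u in np.flatnonzero(adj[v]):
                if u in unseen:
                    unseen.remove(u)
                    stack.append(int(u))
        comps.append(sorted(comp))
    return sorted(comps)

communities = components(adj)
print("recovered item communities:", communities)
```

Each recovered community of items corresponds to a candidate 'functional construct' inferred from the data rather than assumed from theory.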

  13. A data-driven approach to reverse engineering customer engagement models: towards functional constructs.

    PubMed

    de Vries, Natalie Jane; Carlson, Jamie; Moscato, Pablo

    2014-01-01

    Online consumer behavior in general, and online customer engagement with brands in particular, has become a major focus of research activity, fuelled by the exponential increase of interactive functions of the internet and social media platforms and applications. Current research in this area is mostly hypothesis-driven, and much debate about the concept of Customer Engagement and its related constructs persists in the literature. In this paper, we aim to propose a novel methodology for reverse engineering a consumer behavior model for online customer engagement, based on a computational and data-driven perspective. This methodology could be generalized and prove useful for future research in the fields of consumer behaviors using questionnaire data or studies investigating other types of human behaviors. The method we propose contains five main stages: symbolic regression analysis, graph building, community detection, evaluation of results and finally, investigation of directed cycles and common feedback loops. The 'communities' of questionnaire items that emerge from our community detection method form possible 'functional constructs' inferred from data rather than assumed from literature and theory. Our results show consistent partitioning of questionnaire items into such 'functional constructs', suggesting the method proposed here could be adopted as a new data-driven way of human behavior modeling.

  14. Flood probability quantification for road infrastructure: Data-driven spatial-statistical approach and case study applications.

    PubMed

    Kalantari, Zahra; Cavalli, Marco; Cantone, Carolina; Crema, Stefano; Destouni, Georgia

    2017-03-01

    Climate-driven increase in the frequency of extreme hydrological events is expected to impose greater strain on the built environment and major transport infrastructure, such as roads and railways. This study develops a data-driven spatial-statistical approach to quantifying and mapping the probability of flooding at critical road-stream intersection locations, where water flow and sediment transport may accumulate and cause serious road damage. The approach is based on novel integration of key watershed and road characteristics, including measures of sediment connectivity. The approach is concretely applied to and quantified for two specific study cases in southwest Sweden, with documented road flooding effects of recorded extreme rainfall. The novel contributions of this study, namely combining a sediment connectivity account with soil type, land use, spatial precipitation-runoff variability and road drainage in catchments, and extending the use of the connectivity measure to different types of catchments, improve the accuracy of model results for road flood probability. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. Heterogeneous postsurgical data analytics for predictive modeling of mortality risks in intensive care units.

    PubMed

    Yun Chen; Hui Yang

    2014-01-01

    The rapid advancements of biomedical instrumentation and healthcare technology have resulted in data-rich environments in hospitals. However, the meaningful information extracted from rich datasets is limited. There is a dire need to go beyond current medical practices and develop data-driven methods and tools that will enable and help (i) the handling of big data, (ii) the extraction of data-driven knowledge, and (iii) the exploitation of acquired knowledge for optimizing clinical decisions. The present study focuses on the prediction of mortality rates in Intensive Care Units (ICU) using patient-specific healthcare recordings. It is worth mentioning that postsurgical monitoring in ICU leads to massive datasets with unique properties, e.g., variable heterogeneity, patient heterogeneity, and time asynchronization. To cope with the challenges in ICU datasets, we developed a postsurgical decision support system with a series of analytical tools, including data categorization, data pre-processing, feature extraction, feature selection, and predictive modeling. Experimental results show that the proposed data-driven methodology outperforms traditional approaches and yields better results based on the evaluation of real-world ICU data from 4000 subjects in the database. This research shows great potential for the use of data-driven analytics to improve the quality of healthcare services.

  16. Moving Beyond ERP Components: A Selective Review of Approaches to Integrate EEG and Behavior

    PubMed Central

    Bridwell, David A.; Cavanagh, James F.; Collins, Anne G. E.; Nunez, Michael D.; Srinivasan, Ramesh; Stober, Sebastian; Calhoun, Vince D.

    2018-01-01

    Relationships between neuroimaging measures and behavior provide important clues about brain function and cognition in healthy and clinical populations. While electroencephalography (EEG) provides a portable, low cost measure of brain dynamics, it has been somewhat underrepresented in the emerging field of model-based inference. We seek to address this gap in this article by highlighting the utility of linking EEG and behavior, with an emphasis on approaches for EEG analysis that move beyond focusing on peaks or “components” derived from averaging EEG responses across trials and subjects (generating the event-related potential, ERP). First, we review methods for deriving features from EEG in order to enhance the signal within single-trials. These methods include filtering based on user-defined features (i.e., frequency decomposition, time-frequency decomposition), filtering based on data-driven properties (i.e., blind source separation, BSS), and generating more abstract representations of data (e.g., using deep learning). We then review cognitive models which extract latent variables from experimental tasks, including the drift diffusion model (DDM) and reinforcement learning (RL) approaches. Next, we discuss ways to access associations among these measures, including statistical models, data-driven joint models and cognitive joint modeling using hierarchical Bayesian models (HBMs). We think that these methodological tools are likely to contribute to theoretical advancements, and will help inform our understandings of brain dynamics that contribute to moment-to-moment cognitive function. PMID:29632480
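    As a concrete example of the frequency-decomposition features discussed above, a single-trial band-power estimate can be computed directly from the FFT. The sketch below uses a synthetic one-second "trial" (an alpha-band oscillation plus noise); it is a generic illustration, not code from the reviewed studies.

```python
import numpy as np

def band_power(signal, fs, band):
    """Average spectral power of `signal` within a frequency band (Hz),
    a basic single-trial EEG feature."""
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[mask].mean()

# Synthetic one-second trial: a 10 Hz (alpha) oscillation plus noise
fs = 250
t = np.arange(fs) / fs
rng = np.random.default_rng(5)
trial = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.5 * rng.normal(size=fs)

alpha = band_power(trial, fs, (8, 13))   # dominates, by construction
beta = band_power(trial, fs, (13, 30))
print(f"alpha power {alpha:.2f} vs beta power {beta:.2f}")
```

Such band powers, computed per trial and channel, are exactly the kind of single-trial features that can then be related to latent variables from cognitive models such as the DDM.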

  17. Paving the COWpath: data-driven design of pediatric order sets

    PubMed Central

    Zhang, Yiye; Padman, Rema; Levin, James E

    2014-01-01

    Objective Evidence indicates that users incur significant physical and cognitive costs in the use of order sets, a core feature of computerized provider order entry systems. This paper develops data-driven approaches for automating the construction of order sets that match closely with user preferences and workflow while minimizing physical and cognitive workload. Materials and methods We developed and tested optimization-based models embedded with clustering techniques using physical and cognitive click cost criteria. By judiciously learning from users’ actual actions, our methods identify items for constituting order sets that are relevant according to historical ordering data and grouped on the basis of order similarity and ordering time. We evaluated performance of the methods using 47,099 orders from the year 2011 for asthma, appendectomy and pneumonia management in a pediatric inpatient setting. Results In comparison with existing order sets, those developed using the new approach significantly reduce the physical and cognitive workload associated with usage by 14–52%. This approach is also capable of accommodating variations in clinical conditions that affect order set usage and development. Discussion There is a critical need to investigate the cognitive complexity imposed on users by complex clinical information systems, and to design their features according to ‘human factors’ best practices. Optimizing order set generation using cognitive cost criteria introduces a new approach that can potentially improve ordering efficiency, reduce unintended variations in order placement, and enhance patient safety. Conclusions We demonstrate that data-driven methods offer a promising approach for designing order sets that are generalizable, data-driven, condition-based, and up to date with current best practices. PMID:24674844

  18. Computational neuroscience approach to biomarkers and treatments for mental disorders.

    PubMed

    Yahata, Noriaki; Kasai, Kiyoto; Kawato, Mitsuo

    2017-04-01

    Psychiatry research has long experienced a stagnation stemming from a lack of understanding of the neurobiological underpinnings of phenomenologically defined mental disorders. Recently, the application of computational neuroscience to psychiatry research has shown great promise in establishing a link between phenomenological and pathophysiological aspects of mental disorders, thereby recasting current nosology in more biologically meaningful dimensions. In this review, we highlight recent investigations into computational neuroscience that have undertaken either theory- or data-driven approaches to quantitatively delineate the mechanisms of mental disorders. The theory-driven approach, including reinforcement learning models, plays an integrative role in this process by enabling correspondence between behavior and disorder-specific alterations at multiple levels of brain organization, ranging from molecules to cells to circuits. Previous studies have explicated a plethora of defining symptoms of mental disorders, including anhedonia, inattention, and poor executive function. The data-driven approach, on the other hand, is an emerging field in computational neuroscience seeking to identify disorder-specific features among high-dimensional big data. Remarkably, various machine-learning techniques have been applied to neuroimaging data, and the extracted disorder-specific features have been used for automatic case-control classification. For many disorders, the reported accuracies have reached 90% or more. However, we note that rigorous tests on independent cohorts are critically required to translate this research into clinical applications. Finally, we discuss the utility of the disorder-specific features found by the data-driven approach to psychiatric therapies, including neurofeedback. 
Such developments will allow simultaneous diagnosis and treatment of mental disorders using neuroimaging, thereby establishing 'theranostics' for the first time in clinical psychiatry. © 2016 The Authors. Psychiatry and Clinical Neurosciences © 2016 Japanese Society of Psychiatry and Neurology.

  19. AOP-driven Predictive Models for Carcinogenicity: an exercise in interoperable data application.

    EPA Science Inventory

    Traditional methods and data sources for risk assessment are resource-intensive, retrospective, and not a feasible approach to address the tremendous regulatory burden of unclassified chemicals. As a result, the adverse outcome pathway (AOP) concept was developed to facilitate a ...

  20. The Potential of Knowing More: A Review of Data-Driven Urban Water Management.

    PubMed

    Eggimann, Sven; Mutzner, Lena; Wani, Omar; Schneider, Mariane Yvonne; Spuhler, Dorothee; Moy de Vitry, Matthew; Beutler, Philipp; Maurer, Max

    2017-03-07

    The promise of collecting and utilizing large amounts of data has never been greater in the history of urban water management (UWM). This paper reviews several data-driven approaches which play a key role in bringing forward a sea change. It critically investigates whether data-driven UWM offers a promising foundation for addressing current challenges and supporting fundamental changes in UWM. We discuss the examples of better rain-data management, urban pluvial flood-risk management and forecasting, drinking water and sewer network operation and management, integrated design and management, increasing water productivity, wastewater-based epidemiology and on-site water and wastewater treatment. The accumulated evidence from literature points toward a future UWM that offers significant potential benefits thanks to increased collection and utilization of data. The findings show that data-driven UWM allows us to develop and apply novel methods, to optimize the efficiency of the current network-based approach, and to extend functionality of today's systems. However, generic challenges related to data-driven approaches (e.g., data processing, data availability, data quality, data costs) and the specific challenges of data-driven UWM need to be addressed, namely data access and ownership, current engineering practices and the difficulty of assessing the cost benefits of data-driven UWM.

  1. Improved spatial accuracy of functional maps in the rat olfactory bulb using supervised machine learning approach.

    PubMed

    Murphy, Matthew C; Poplawsky, Alexander J; Vazquez, Alberto L; Chan, Kevin C; Kim, Seong-Gi; Fukuda, Mitsuhiro

    2016-08-15

    Functional MRI (fMRI) is a popular and important tool for noninvasive mapping of neural activity. As fMRI measures the hemodynamic response, the resulting activation maps do not perfectly reflect the underlying neural activity. The purpose of this work was to design a data-driven model to improve the spatial accuracy of fMRI maps in the rat olfactory bulb. This system is an ideal choice for this investigation since the bulb circuit is well characterized, allowing for an accurate definition of activity patterns in order to train the model. We generated models for both cerebral blood volume weighted (CBVw) and blood oxygen level dependent (BOLD) fMRI data. The results indicate that the spatial accuracy of the activation maps is either significantly improved or at worst not significantly different when using the learned models compared to a conventional general linear model approach, particularly for BOLD images and activity patterns involving deep layers of the bulb. Furthermore, the activation maps computed by CBVw and BOLD data show increased agreement when using the learned models, lending more confidence to their accuracy. The models presented here could have an immediate impact on studies of the olfactory bulb, but perhaps more importantly, demonstrate the potential for similar flexible, data-driven models to improve the quality of activation maps calculated using fMRI data. Copyright © 2016 Elsevier Inc. All rights reserved.
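    The conventional general linear model baseline that the learned models are compared against can be sketched as mass-univariate ordinary least squares with a per-voxel t-statistic. The data below are synthetic (no hemodynamic convolution, no multiple-comparison correction), so this is only a structural illustration of the GLM step.

```python
import numpy as np

rng = np.random.default_rng(6)

# Synthetic "voxel" time series: 100 volumes, 50 voxels, with a boxcar task
# regressor that truly activates the first 10 voxels.
T, V = 100, 50
boxcar = np.tile([1.0] * 10 + [0.0] * 10, 5)
X = np.column_stack([np.ones(T), boxcar])       # design matrix: intercept + task
betas_true = np.zeros(V)
betas_true[:10] = 1.5
Y = np.outer(boxcar, betas_true) + rng.normal(size=(T, V))

# GLM: ordinary least squares per voxel, then a t-statistic for the task
# regressor (the map a learned model would be compared against).
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
dof = T - X.shape[1]
sigma2 = ((Y - X @ beta_hat) ** 2).sum(axis=0) / dof
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
t_stats = beta_hat[1] / se

active = np.flatnonzero(t_stats > 3.0)          # crude threshold, no correction
print("voxels flagged active:", active)
```

A learned model, as in the abstract, would replace this fixed design-matrix mapping with a classifier trained on activity patterns defined from the known bulb circuitry.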

  2. An Adaptation of the Distance Driven Projection Method for Single Pinhole Collimators in SPECT Imaging

    NASA Astrophysics Data System (ADS)

    Ihsani, Alvin; Farncombe, Troy

    2016-02-01

    The modelling of the projection operator in tomographic imaging is of critical importance, especially when working with algebraic methods of image reconstruction. This paper proposes a distance-driven projection method which is targeted to single-pinhole single-photon emission computed tomography (SPECT) imaging, since it accounts for the finite size of the pinhole and the possible tilting of the detector surface, in addition to other collimator-specific factors such as geometric sensitivity. The accuracy and execution time of the proposed method are evaluated by comparing to a ray-driven approach where the pinhole is sub-sampled with various sampling schemes. A point-source phantom whose projections were generated using OpenGATE was first used to compare the resolution of reconstructed images with each method using the full width at half maximum (FWHM). Furthermore, a high-activity Mini Deluxe Phantom (Data Spectrum Corp., Durham, NC, USA) SPECT resolution phantom was scanned using a Gamma Medica X-SPECT system and the signal-to-noise ratio (SNR) and structural similarity of reconstructed images were compared at various projection counts. Based on the reconstructed point-source phantom, the proposed distance-driven approach results in a lower FWHM than the ray-driven approach even when using a smaller detector resolution. Furthermore, based on the Mini Deluxe Phantom, it is shown that the distance-driven approach has consistently higher SNR and structural similarity compared to the ray-driven approach as the counts in measured projections deteriorate.
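    The FWHM figure of merit used to compare resolution can be computed from a reconstructed point-source profile by locating the half-maximum crossings with linear interpolation. The Gaussian test profile below is generic, not tied to the OpenGATE data.

```python
import numpy as np

def fwhm(x, y):
    """Full width at half maximum of a single-peaked profile, with linear
    interpolation of the two half-maximum crossings."""
    half = y.max() / 2.0
    above = np.flatnonzero(y >= half)
    i, j = above[0], above[-1]
    # Interpolate the left and right crossings between neighbouring samples
    left = x[i - 1] + (half - y[i - 1]) / (y[i] - y[i - 1]) * (x[i] - x[i - 1])
    right = x[j] + (half - y[j]) / (y[j + 1] - y[j]) * (x[j + 1] - x[j])
    return right - left

# Gaussian test profile: analytic FWHM = 2*sqrt(2*ln 2)*sigma ≈ 2.355*sigma
x = np.linspace(-10, 10, 401)
sigma = 1.7
y = np.exp(-x**2 / (2 * sigma**2))
print("measured FWHM:", round(fwhm(x, y), 3),
      "analytic:", round(2 * np.sqrt(2 * np.log(2)) * sigma, 3))
```

Applied to line profiles through the reconstructed point source, a lower FWHM indicates the better-resolved projector model.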

  3. Quantitative metrics for evaluating the phased roll-out of clinical information systems.

    PubMed

    Wong, David; Wu, Nicolas; Watkinson, Peter

    2017-09-01

    We introduce a novel quantitative approach for evaluating the order of roll-out during phased introduction of clinical information systems. Such roll-outs are associated with unavoidable risk due to patients transferring between clinical areas that use both the old and new systems. We proposed a simple graphical model of patient flow through a hospital. Using a simple instance of the model, we showed how a roll-out order can be generated by minimising the flow of patients from the new system to the old system. The model was applied to admission and discharge data from 37,080 patient journeys at the Churchill Hospital, Oxford, between April 2013 and April 2014. The resulting orders were evaluated empirically and found to be acceptable. The development of data-driven approaches to clinical information system roll-out provides insights that may not be ascertained through clinical judgment alone. Such methods could make a significant contribution to the smooth running of an organisation during the roll-out of a potentially disruptive technology. Unlike previous approaches, which are based on clinical opinion, the approach described here quantitatively assesses the appropriateness of competing roll-out strategies. The data-driven approach was shown to produce strategies that matched clinical intuition and provides a flexible framework that may be used to plan and monitor clinical information system roll-out. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
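The minimisation at the heart of this approach can be illustrated with a toy instance (the ward count and flow matrix below are hypothetical, not the Churchill Hospital data): given counts of patient transfers between clinical areas, search for the migration order that minimises transfers from areas already on the new system to areas still on the old one, summed over all roll-out phases.

```python
from itertools import permutations

def rollout_cost(order, flow):
    """Total patients transferred from new-system wards to old-system wards,
    summed over all phases of the roll-out."""
    n = len(flow)
    cost, done = 0, set()
    for ward in order:
        done.add(ward)
        cost += sum(flow[i][j] for i in done for j in range(n) if j not in done)
    return cost

def best_rollout(flow):
    """Exhaustive search for the order minimising new-to-old patient flow."""
    return min(permutations(range(len(flow))), key=lambda o: rollout_cost(o, flow))

# Hypothetical transfer counts: flow[i][j] = patients moving from ward i to ward j.
flow = [
    [0, 10, 1],
    [2,  0, 8],
    [1,  1, 0],
]
print(best_rollout(flow))
```

Exhaustive search over permutations is only feasible for a handful of clinical areas; a real deployment would need a heuristic or the graphical-model formulation described in the paper.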

  4. Multilingual Phoneme Models for Rapid Speech Processing System Development

    DTIC Science & Technology

    2006-09-01

    ...processes are used to develop an Arabic speech recognition system starting from monolingual English models and International Phonetic Association (IPA) clusters. It was found that multilingual bootstrapping methods outperform monolingual English bootstrapping methods on the Arabic evaluation data initially...

  5. Livestock Helminths in a Changing Climate: Approaches and Restrictions to Meaningful Predictions

    PubMed Central

    Fox, Naomi J.; Marion, Glenn; Davidson, Ross S.; White, Piran C. L.; Hutchings, Michael R.

    2012-01-01

    Simple Summary Parasitic helminths represent one of the most pervasive challenges to livestock, and their intensity and distribution will be influenced by climate change. There is a need for long-term predictions to identify potential risks and highlight opportunities for control. We explore approaches to modelling future helminth risk to livestock under climate change. One of the limitations to model creation is the lack of purpose-driven data collection. We also conclude that models need to take a broad view of the livestock system to generate meaningful predictions. Abstract Climate change is a driving force for livestock parasite risk. This is especially true for helminths including the nematodes Haemonchus contortus, Teladorsagia circumcincta, Nematodirus battus, and the trematode Fasciola hepatica, since the survival and development of free-living stages are chiefly affected by temperature and moisture. The paucity of long-term predictions of helminth risk under climate change has driven us to explore optimal modelling approaches and identify current bottlenecks to generating meaningful predictions. We classify approaches as correlative or mechanistic, exploring their strengths and limitations. Climate is one aspect of a complex system and, at the farm level, husbandry has a dominant influence on helminth transmission. Continuing environmental change will necessitate the adoption of mitigation and adaptation strategies in husbandry. Long-term predictive models need to have the architecture to incorporate these changes. Ultimately, an optimal modelling approach is likely to combine mechanistic processes and physiological thresholds with correlative bioclimatic modelling, incorporating changes in livestock husbandry and disease control. Irrespective of approach, the principal limitation to parasite predictions is the availability of active surveillance data and empirical data on physiological responses to climate variables.
By combining improved empirical data and refined models with a broad view of the livestock system, robust projections of helminth risk can be developed. PMID:26486780

  6. The application of domain-driven design in NMS

    NASA Astrophysics Data System (ADS)

    Zhang, Jinsong; Chen, Yan; Qin, Shengjun

    2011-12-01

    In the traditional data-model-driven design approach, the system analysis and design phases are often separated, which means that requirements information cannot be expressed explicitly. The approach also tends to lead developers toward process-oriented programming, leaving the code within and between modules and hierarchy layers disordered, so it is hard to meet system scalability requirements. This paper proposes a software hierarchy based on a rich domain model according to domain-driven design, named FHRDM, and then determines the Webwork + Spring + Hibernate (WSH) framework. Domain-driven design aims to construct a domain model that meets both the demands of the field in which the software operates and the needs of software development. In this way, problems in Navigational Maritime System (NMS) development, such as large business volumes, difficult requirement elicitation, high development costs and long development cycles, can be resolved successfully.

  7. Conducting requirements analyses for research using routinely collected health data: a model driven approach.

    PubMed

    de Lusignan, Simon; Cashman, Josephine; Poh, Norman; Michalakidis, Georgios; Mason, Aaron; Desombre, Terry; Krause, Paul

    2012-01-01

    Medical research increasingly requires the linkage of data from different sources. Conducting a requirements analysis for a new application is an established part of software engineering, but it is rarely reported in the biomedical literature, and no generic approaches have been published for how to link heterogeneous health data. A literature review was followed by a consensus process to define how requirements for research using multiple data sources might be modeled. We have developed a requirements analysis method, i-ScheDULEs. The first components of the modeling process are indexing and creating a rich picture of the research study. Secondly, we developed a series of reference models of progressive complexity: data flow diagrams (DFDs) to define data requirements; unified modeling language (UML) use case diagrams to capture study-specific and governance requirements; and finally business process models, using business process modeling notation (BPMN). These requirements and their associated models should become part of research study protocols.

  8. Developing a Data Driven Process-Based Model for Remote Sensing of Ecosystem Production

    NASA Astrophysics Data System (ADS)

    Elmasri, B.; Rahman, A. F.

    2010-12-01

    Estimating ecosystem carbon fluxes at various spatial and temporal scales is essential for quantifying the global carbon cycle. Numerous models have been developed for this purpose using several environmental variables as well as vegetation indices derived from remotely sensed data. Here we present a data-driven modeling approach for gross primary production (GPP) that is based on the process-based model BIOME-BGC. The proposed model was run using available remote sensing data and does not depend on look-up tables. Furthermore, this approach combines the merits of both empirical and process models; empirical models were used to estimate certain input variables such as light use efficiency (LUE). This was achieved by applying remotely sensed data to the mathematical equations that represent the biophysical photosynthesis processes in the BIOME-BGC model. Moreover, a new spectral index for estimating maximum photosynthetic activity, the maximum photosynthetic rate index (MPRI), is also developed and presented here. This new index is based on the ratio between the near-infrared and green bands (ρ858.5/ρ555). The model was tested and validated against the MODIS GPP product and flux measurements from two eddy covariance flux towers located at Morgan Monroe State Forest (MMSF) in Indiana and Harvard Forest in Massachusetts. Satellite data acquired by the Advanced Microwave Scanning Radiometer (AMSR-E) and MODIS were used. The data-driven model showed a strong correlation between predicted and measured GPP at the two eddy covariance flux tower sites. This methodology produced better predictions of GPP than did the MODIS GPP product. Moreover, the error in the predicted GPP for MMSF and Harvard Forest was dominated by unsystematic errors, suggesting that the results are unbiased. The analysis indicated that maintenance respiration is one of the main factors dominating the overall model outcome errors, and improved maintenance respiration estimation will result in improved GPP predictions. Although there might be room for improvement in our model outcomes through improved parameterization, our results suggest that such a methodology for running the BIOME-BGC model based entirely on routinely available data can produce good predictions of GPP.

  9. Data-driven methods towards learning the highly nonlinear inverse kinematics of tendon-driven surgical manipulators.

    PubMed

    Xu, Wenjun; Chen, Jie; Lau, Henry Y K; Ren, Hongliang

    2017-09-01

    Accurate motion control of flexible surgical manipulators is crucial in tissue manipulation tasks. The tendon-driven serpentine manipulator (TSM) is one of the most widely adopted flexible mechanisms in minimally invasive surgery because of its enhanced maneuverability in tortuous environments. The TSM, however, exhibits high nonlinearities, and conventional analytical kinematics models are insufficient to achieve high accuracy. To account for the system nonlinearities, we applied a data-driven approach to encode the system inverse kinematics. Three regression methods, extreme learning machine (ELM), Gaussian mixture regression (GMR) and K-nearest neighbors regression (KNNR), were implemented to learn a nonlinear mapping from the robot 3D position states to the control inputs. The performance of the three algorithms was evaluated in both simulation and physical trajectory tracking experiments. KNNR performed the best in the tracking experiments, with the lowest RMSE of 2.1275 mm. The proposed inverse kinematics learning methods provide an alternative and efficient way to accurately model the tendon-driven flexible manipulator. Copyright © 2016 John Wiley & Sons, Ltd.
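The KNN variant of such an inverse-kinematics learner is simple to sketch. The toy forward model and data below are invented stand-ins for the real TSM kinematics, which would come from recorded robot trajectories:

```python
import numpy as np

def knn_inverse_kinematics(positions, controls, query, k=3):
    """Predict control inputs for a desired tip position by averaging the
    controls of the k nearest recorded positions (plain KNN regression)."""
    d = np.linalg.norm(positions - query, axis=1)
    idx = np.argsort(d)[:k]
    return controls[idx].mean(axis=0)

# Hypothetical training data: tendon displacements and the (x, y, z) tip
# positions they produced under a toy forward model.
rng = np.random.default_rng(0)
controls = rng.uniform(-1, 1, size=(200, 2))  # two tendon inputs
positions = np.column_stack([controls, controls.sum(axis=1, keepdims=True)])
query = positions[0]
pred = knn_inverse_kinematics(positions, controls, query, k=1)
print(np.allclose(pred, controls[0]))
```

With k=1 and a query taken from the training set, the method simply returns the recorded control; in practice k>1 averages out sensor noise at the cost of some bias.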

  10. Bridging paradigms: hybrid mechanistic-discriminative predictive models.

    PubMed

    Doyle, Orla M; Tsaneva-Atansaova, Krasimira; Harte, James; Tiffin, Paul A; Tino, Peter; Díaz-Zuccarini, Vanessa

    2013-03-01

    Many disease processes are extremely complex and characterized by multiple stochastic processes interacting simultaneously. Current analytical approaches have included mechanistic models and machine learning (ML), which are often treated as orthogonal viewpoints. However, to facilitate truly personalized medicine, new perspectives may be required. This paper reviews the use of both mechanistic models and ML in healthcare as well as emerging hybrid methods, which are an exciting and promising approach for biologically based, yet data-driven advanced intelligent systems.

  11. Quantifying model-structure- and parameter-driven uncertainties in spring wheat phenology prediction with Bayesian analysis

    DOE PAGES

    Alderman, Phillip D.; Stanfill, Bryan

    2016-10-06

    Recent international efforts have brought renewed emphasis on the comparison of different agricultural systems models. Thus far, analysis of model-ensemble simulated results has not clearly differentiated between ensemble prediction uncertainties due to model structural differences per se and those due to parameter value uncertainties. Additionally, despite increasing use of Bayesian parameter estimation approaches with field-scale crop models, inadequate attention has been given to the full posterior distributions for estimated parameters. The objectives of this study were to quantify the impact of parameter value uncertainty on prediction uncertainty for modeling spring wheat phenology using Bayesian analysis and to assess the relative contributions of model-structure-driven and parameter-value-driven uncertainty to overall prediction uncertainty. This study used a random walk Metropolis algorithm to estimate parameters for 30 spring wheat genotypes using nine phenology models based on multi-location trial data for days to heading and days to maturity. Across all cases, parameter-driven uncertainty accounted for between 19 and 52% of predictive uncertainty, while model-structure-driven uncertainty accounted for between 12 and 64%. Here, this study demonstrated the importance of quantifying both model-structure- and parameter-value-driven uncertainty when assessing overall prediction uncertainty in modeling spring wheat phenology. More generally, Bayesian parameter estimation provided a useful framework for quantifying and analyzing sources of prediction uncertainty.
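A random walk Metropolis sampler of the kind used in this study can be sketched in a few lines. The days-to-heading data and single-parameter Gaussian likelihood below are simulated stand-ins, not the multi-location trial data or the nine phenology models:

```python
import numpy as np

def random_walk_metropolis(log_post, x0, step, n_iter, rng):
    """Random walk Metropolis: propose x' = x + N(0, step), accept with
    probability min(1, exp(log_post(x') - log_post(x)))."""
    x, lp = x0, log_post(x0)
    chain = []
    for _ in range(n_iter):
        prop = x + rng.normal(0.0, step)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain.append(x)
    return np.array(chain)

# Toy analogue of the setting: estimate the mean of simulated days-to-heading
# observations under a flat prior and known observation spread.
rng = np.random.default_rng(1)
data = rng.normal(60.0, 5.0, size=100)  # hypothetical days to heading
log_post = lambda mu: -0.5 * np.sum((data - mu) ** 2) / 5.0 ** 2
chain = random_walk_metropolis(log_post, x0=50.0, step=1.0, n_iter=5000, rng=rng)
print(round(chain[1000:].mean(), 1))
```

Discarding the first 1000 draws as burn-in, the remaining chain approximates the full posterior, from which the study's parameter-driven prediction uncertainties would be derived.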

  12. Prognostics of Power Mosfets Under Thermal Stress Accelerated Aging Using Data-Driven and Model-Based Methodologies

    NASA Technical Reports Server (NTRS)

    Celaya, Jose; Saxena, Abhinav; Saha, Sankalita; Goebel, Kai F.

    2011-01-01

    An approach for predicting the remaining useful life of power MOSFET (metal-oxide-semiconductor field-effect transistor) devices has been developed. Power MOSFETs are semiconductor switching devices that are instrumental in electronics equipment such as those used in the operation and control of modern aircraft and spacecraft. The MOSFETs examined here were aged under thermal overstress in a controlled experiment, and continuous performance degradation data were collected from the accelerated aging experiment. Die-attach degradation was determined to be the primary failure mode. The collected run-to-failure data were analyzed, revealing that ON-state resistance increased as the die-attach degraded under high thermal stresses. Results from finite element simulation analysis support the observations from the experimental data. Data-driven and model-based prognostics algorithms were investigated, with ON-state resistance used as the primary precursor-of-failure feature. A Gaussian process regression algorithm was explored as an example of a data-driven technique, and an extended Kalman filter and a particle filter were used as examples of model-based techniques. Both methods were able to provide valid results. Prognostic performance metrics were employed to evaluate and compare the algorithms.
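As a rough sketch of the data-driven side, Gaussian process regression can extrapolate the ON-resistance degradation feature toward a failure threshold. The degradation trend and numbers below are idealized stand-ins, not the experimental run-to-failure data:

```python
import numpy as np

def gp_predict(x_train, y_train, x_test, length=50.0, noise=1e-4):
    """Gaussian process regression mean with an RBF kernel, used here to
    extrapolate a degradation feature beyond the observed record."""
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K = k(x_train, x_train) + noise * np.eye(len(x_train))
    return k(x_test, x_train) @ np.linalg.solve(K, y_train)

# Idealized run-to-failure feature: ON-state resistance creeping up with age.
t = np.linspace(0.0, 100.0, 40)     # aging time (arbitrary units)
r_on = 0.05 + 0.0004 * t            # ohms, noiseless toy trend
mu = r_on.mean()
pred = gp_predict(t, r_on - mu, np.array([110.0])) + mu
# The extrapolated resistance would be compared against a failure threshold
# (e.g. a fixed percentage rise) to estimate remaining useful life.
print(float(pred[0]) > 0.05)
```

Centering the feature before fitting (and adding the mean back) is a common trick, since a zero-mean GP otherwise reverts extrapolations toward zero.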

  13. Dynamically adaptive data-driven simulation of extreme hydrological flows

    NASA Astrophysics Data System (ADS)

    Kumar Jain, Pushkar; Mandli, Kyle; Hoteit, Ibrahim; Knio, Omar; Dawson, Clint

    2018-02-01

    Hydrological hazards such as storm surges, tsunamis, and rainfall-induced flooding are physically complex events that are costly in loss of human life and economic productivity. Many such disasters could be mitigated through improved emergency evacuation in real-time and through the development of resilient infrastructure based on knowledge of how systems respond to extreme events. Data-driven computational modeling is a critical technology underpinning these efforts. This investigation focuses on the novel combination of methodologies in forward simulation and data assimilation. The forward geophysical model utilizes adaptive mesh refinement (AMR), a process by which a computational mesh can adapt in time and space based on the current state of a simulation. The forward solution is combined with ensemble based data assimilation methods, whereby observations from an event are assimilated into the forward simulation to improve the veracity of the solution, or used to invert for uncertain physical parameters. The novelty in our approach is the tight two-way coupling of AMR and ensemble filtering techniques. The technology is tested using actual data from the Chile tsunami event of February 27, 2010. These advances offer the promise of significantly transforming data-driven, real-time modeling of hydrological hazards, with potentially broader applications in other science domains.

  14. Extreme learning machine for reduced order modeling of turbulent geophysical flows.

    PubMed

    San, Omer; Maulik, Romit

    2018-04-01

    We investigate the application of artificial neural networks to stabilize proper orthogonal decomposition-based reduced order models for quasistationary geophysical turbulent flows. An extreme learning machine concept is introduced for computing an eddy-viscosity closure dynamically to incorporate the effects of the truncated modes. We consider a four-gyre wind-driven ocean circulation problem as our prototype setting to assess the performance of the proposed data-driven approach. Our framework provides a significant reduction in computational time and effectively retains the dynamics of the full-order model during the forward simulation period beyond the training data set. Furthermore, we show that the method is robust for larger choices of time steps and can be used as an efficient and reliable tool for long time integration of general circulation models.

  15. Extreme learning machine for reduced order modeling of turbulent geophysical flows

    NASA Astrophysics Data System (ADS)

    San, Omer; Maulik, Romit

    2018-04-01

    We investigate the application of artificial neural networks to stabilize proper orthogonal decomposition-based reduced order models for quasistationary geophysical turbulent flows. An extreme learning machine concept is introduced for computing an eddy-viscosity closure dynamically to incorporate the effects of the truncated modes. We consider a four-gyre wind-driven ocean circulation problem as our prototype setting to assess the performance of the proposed data-driven approach. Our framework provides a significant reduction in computational time and effectively retains the dynamics of the full-order model during the forward simulation period beyond the training data set. Furthermore, we show that the method is robust for larger choices of time steps and can be used as an efficient and reliable tool for long time integration of general circulation models.

  16. Improving the Fitness of High-Dimensional Biomechanical Models via Data-Driven Stochastic Exploration

    PubMed Central

    Bustamante, Carlos D.; Valero-Cuevas, Francisco J.

    2010-01-01

    The field of complex biomechanical modeling has begun to rely on Monte Carlo techniques to investigate the effects of parameter variability and measurement uncertainty on model outputs, search for optimal parameter combinations, and define model limitations. However, advanced stochastic methods to perform data-driven explorations, such as Markov chain Monte Carlo (MCMC), become necessary as the number of model parameters increases. Here, we demonstrate the feasibility of, and what is to our knowledge the first use of, an MCMC approach to improve the fitness of realistically large biomechanical models. We used a Metropolis–Hastings algorithm to search increasingly complex parameter landscapes (3, 8, 24, and 36 dimensions) to uncover underlying distributions of anatomical parameters of a “truth model” of the human thumb on the basis of simulated kinematic data (thumbnail location, orientation, and linear and angular velocities) polluted by zero-mean, uncorrelated multivariate Gaussian “measurement noise.” Driven by these data, ten Markov chains searched each model parameter space for the subspace that best fit the data (posterior distribution). As expected, the convergence time increased, more local minima were found, and marginal distributions broadened as the parameter space complexity increased. In the 36-D scenario, some chains found local minima but the majority of chains converged to the true posterior distribution (confirmed using a cross-validation dataset), thus demonstrating the feasibility and utility of these methods for realistically large biomechanical problems. PMID:19272906

  17. Facial First Impressions Across Culture: Data-Driven Modeling of Chinese and British Perceivers' Unconstrained Facial Impressions.

    PubMed

    Sutherland, Clare A M; Liu, Xizi; Zhang, Lingshan; Chu, Yingtung; Oldmeadow, Julian A; Young, Andrew W

    2018-04-01

    People form first impressions from facial appearance rapidly, and these impressions can have considerable social and economic consequences. Three dimensions can explain Western perceivers' impressions of Caucasian faces: approachability, youthful-attractiveness, and dominance. Impressions along these dimensions are theorized to be based on adaptive cues to threat detection or sexual selection, making it likely that they are universal. We tested whether the same dimensions of facial impressions emerge across culture by building data-driven models of first impressions of Asian and Caucasian faces derived from Chinese and British perceivers' unconstrained judgments. We then cross-validated the dimensions with computer-generated average images. We found strong evidence for common approachability and youthful-attractiveness dimensions across perceiver and face race, with some evidence of a third dimension akin to capability. The models explained ~75% of the variance in facial impressions. In general, the findings demonstrate substantial cross-cultural agreement in facial impressions, especially on the most salient dimensions.

  18. A Model-Driven Development Method for Management Information Systems

    NASA Astrophysics Data System (ADS)

    Mizuno, Tomoki; Matsumoto, Keinosuke; Mori, Naoki

    Traditionally, a Management Information System (MIS) has been developed without using formal methods. With such informal methods, the MIS is developed over its lifecycle without any models, which causes many problems, such as unreliable system design specifications. In order to overcome these problems, a model theory approach was proposed. The approach is based on the idea that a system can be modeled by automata and set theory. However, it is very difficult to generate automata of the system to be developed right from the start. On the other hand, there is a model-driven development method that can flexibly accommodate changes in business logic or implementation technologies. In model-driven development, a system is modeled using a modeling language such as UML. This paper proposes a new development method for management information systems that applies the model-driven development method to a component of the model theory approach. The experiment showed that the effort saved amounts to more than 30% of the total development effort.

  19. A data-driven emulation framework for representing water-food nexus in a changing cold region

    NASA Astrophysics Data System (ADS)

    Nazemi, A.; Zandmoghaddam, S.; Hatami, S.

    2017-12-01

    Water resource systems are under increasing pressure globally. Growing population, competition between water demands, and the emerging effects of climate change have caused enormous vulnerabilities in water resource management across many regions. Diagnosing such vulnerabilities and providing effective adaptation strategies requires simulation tools that can adequately represent the interactions between competing demands for limited water resources and inform decision makers about critical vulnerability thresholds under a range of potential natural and anthropogenic conditions. Despite significant progress in integrated modeling of water resource systems, regional models are often unable to fully represent the interacting dynamics within the key elements of water resource systems locally. Here we propose a data-driven approach to emulate a complex regional water resource system model developed for the Oldman River Basin in southern Alberta, Canada. The aim of the emulation is to provide a detailed understanding of the trade-offs and interactions at the Oldman Reservoir, which is key to flood control and irrigated agriculture in this over-allocated, semi-arid cold region. Different surrogate models are developed to represent the dynamics of irrigation demand and withdrawal as well as reservoir evaporation and release individually. The non-falsified offline models are then integrated through the water balance equation at the reservoir location to provide a coupled model representing the dynamics of reservoir operation and water allocation at the local scale. The performance of the individual and integrated models is rigorously examined and sources of uncertainty are highlighted. To demonstrate the practical utility of this surrogate modeling approach, we use the integrated data-driven model to examine the trade-offs in irrigation water supply, reservoir storage and release under a range of changing climate, upstream streamflow and local irrigation conditions.

  20. Inter-subject phase synchronization for exploratory analysis of task-fMRI.

    PubMed

    Bolt, Taylor; Nomi, Jason S; Vij, Shruti G; Chang, Catie; Uddin, Lucina Q

    2018-08-01

    Analysis of task-based fMRI data is conventionally carried out using a hypothesis-driven approach, where blood-oxygen-level dependent (BOLD) time courses are correlated with a hypothesized temporal structure. In some experimental designs, this temporal structure can be difficult to define. In other cases, experimenters may wish to take a more exploratory, data-driven approach to detecting task-driven BOLD activity. In this study, we demonstrate the efficiency and power of an inter-subject synchronization approach for exploratory analysis of task-based fMRI data. Combining the tools of instantaneous phase synchronization and independent component analysis, we characterize whole-brain task-driven responses in terms of group-wise similarity in temporal signal dynamics of brain networks. We applied this framework to fMRI data collected during performance of a simple motor task and a social cognitive task. Analyses using an inter-subject phase synchronization approach revealed a large number of brain networks that dynamically synchronized to various features of the task, often not predicted by the hypothesized temporal structure of the task. We suggest that this methodological framework, along with readily available tools in the fMRI community, provides a powerful exploratory, data-driven approach for analysis of task-driven BOLD activity. Copyright © 2018 Elsevier Inc. All rights reserved.
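The core of the instantaneous phase synchronization computation can be sketched with an FFT-based Hilbert transform. The three "subjects" below are synthetic signals sharing one oscillation plus independent noise, not fMRI time courses:

```python
import numpy as np

def instantaneous_phase(x):
    """Phase of the analytic signal via the FFT-based Hilbert transform
    (valid for even-length input)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:n // 2] = 2
    h[n // 2] = 1
    return np.angle(np.fft.ifft(X * h))

def phase_locking(signals):
    """Inter-subject phase synchronization at each time point: length of the
    mean resultant vector of the subjects' instantaneous phases (0..1)."""
    phases = np.array([instantaneous_phase(s) for s in signals])
    return np.abs(np.exp(1j * phases).mean(axis=0))

# Hypothetical data: three "subjects" with a shared task-driven oscillation.
rng = np.random.default_rng(2)
t = np.arange(256) / 64.0
shared = np.sin(2 * np.pi * 1.5 * t)
subjects = [shared + 0.1 * rng.normal(size=t.size) for _ in range(3)]
sync = phase_locking(subjects)
print(sync.mean() > 0.9)
```

Values near 1 indicate that all subjects' signals are phase-locked at that time point; in the study, this statistic is computed on network time courses rather than raw voxel data.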

  1. Effective Rating Scale Development for Speaking Tests: Performance Decision Trees

    ERIC Educational Resources Information Center

    Fulcher, Glenn; Davidson, Fred; Kemp, Jenny

    2011-01-01

    Rating scale design and development for testing speaking is generally conducted using one of two approaches: the measurement-driven approach or the performance data-driven approach. The measurement-driven approach prioritizes the ordering of descriptors onto a single scale. Meaning is derived from the scaling methodology and the agreement of…

  2. Redefining the Practice of Peer Review Through Intelligent Automation Part 2: Data-Driven Peer Review Selection and Assignment.

    PubMed

    Reiner, Bruce I

    2017-12-01

    In conventional radiology peer review practice, a small number of exams (routinely 5% of the total volume) is randomly selected, which may significantly underestimate the true error rate within a given radiology practice. An alternative and preferable approach would be to create a data-driven model which mathematically quantifies a peer review risk score for each individual exam and uses this data to identify high risk exams and readers, and selectively target these exams for peer review. An analogous model can also be created to assist in the assignment of these peer review cases in keeping with specific priorities of the service provider. An additional option to enhance the peer review process would be to assign the peer review cases in a truly blinded fashion. In addition to eliminating traditional peer review bias, this approach has the potential to better define exam-specific standard of care, particularly when multiple readers participate in the peer review process.

  3. A New Multivariate Approach for Prognostics Based on Extreme Learning Machine and Fuzzy Clustering.

    PubMed

    Javed, Kamran; Gouriveau, Rafael; Zerhouni, Noureddine

    2015-12-01

    Prognostics is a core process of the prognostics and health management (PHM) discipline that estimates the remaining useful life (RUL) of degrading machinery to optimize its service delivery potential. However, machinery operates in a dynamic environment and the acquired condition monitoring data are usually noisy and subject to a high level of uncertainty/unpredictability, which complicates prognostics. The complexity increases further when there is no prior knowledge about ground truth (or failure definition). For such issues, data-driven prognostics can be a valuable solution without deep understanding of system physics. This paper contributes a new data-driven prognostics approach, namely "enhanced multivariate degradation modeling," which enables modeling the degrading states of machinery without assuming a homogeneous pattern. In brief, a predictability scheme is introduced to reduce the dimensionality of the data. Following that, the proposed prognostics model is achieved by integrating two new algorithms, namely the summation wavelet-extreme learning machine and subtractive-maximum entropy fuzzy clustering, to show the evolution of machine degradation by simultaneous predictions and discrete state estimation. The prognostics model is equipped with a dynamic failure threshold assignment procedure to estimate RUL in a realistic manner. To validate the proposition, a case study is performed on turbofan engine data from the PHM challenge 2008 (NASA), and results are compared with recent publications.

  4. A Unified Data-Driven Approach for Programming In Situ Analysis and Visualization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aiken, Alex

    The placement and movement of data is becoming the key limiting factor on both performance and energy efficiency of high performance computations. As systems generate more data, it is becoming increasingly difficult to actually move that data elsewhere for post-processing, as the rate of improvements in supporting I/O infrastructure is not keeping pace. Together, these trends are creating a shift in how we think about exascale computations, from a viewpoint that focuses on FLOPS to one that focuses on data and data-centric operations as fundamental to the reasoning about, and optimization of, scientific workflows on extreme-scale architectures. The overarching goal of our effort was the study of a unified data-driven approach for programming applications and in situ analysis and visualization. Our work was to understand the interplay between data-centric programming model requirements at extreme-scale and the overall impact of those requirements on the design, capabilities, flexibility, and implementation details for both applications and the supporting in situ infrastructure. In this context, we made many improvements to the Legion programming system (one of the leading data-centric models today) and demonstrated in situ analyses on real application codes using these improvements.

  5. A hybrid PCA-CART-MARS-based prognostic approach of the remaining useful life for aircraft engines.

    PubMed

    Sánchez Lasheras, Fernando; García Nieto, Paulino José; de Cos Juez, Francisco Javier; Mayo Bayón, Ricardo; González Suárez, Victor Manuel

    2015-03-23

    Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with principal component analysis (PCA), dendrograms and classification and regression trees (CART). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, the hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The main advantage of the proposed hybrid model is that it does not require information about the previous operation states of the engine's input variables. The performance of this model was compared with that of other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years to the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of RUL prognostics for aircraft engines.
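    The pipeline described can be approximated in scikit-learn. MARS is not part of scikit-learn, so this sketch substitutes a plain regression tree after the PCA step (PCA + CART only); the synthetic sensor data, the latent degradation structure, and all parameters are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
# Synthetic sensor matrix: 200 engine snapshots x 8 channels driven by
# two latent degradation factors that also determine the RUL target.
z = rng.normal(size=(200, 2))                       # latent degradation state
loadings = rng.normal(size=(2, 8))
X = z @ loadings + 0.1 * rng.normal(size=(200, 8))  # observed sensor channels
rul = 100 - 30 * z[:, 0] - 20 * z[:, 1] + rng.normal(0, 1, 200)

# Dimensionality reduction followed by a regression tree on the components.
model = make_pipeline(PCA(n_components=3),
                      DecisionTreeRegressor(max_depth=4, random_state=0))
model.fit(X[:150], rul[:150])
preds = model.predict(X[150:])
```

    A MARS stage (e.g., via the third-party py-earth package) could replace the tree to reproduce the full hybrid.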

  6. A Hybrid PCA-CART-MARS-Based Prognostic Approach of the Remaining Useful Life for Aircraft Engines

    PubMed Central

    Lasheras, Fernando Sánchez; Nieto, Paulino José García; de Cos Juez, Francisco Javier; Bayón, Ricardo Mayo; Suárez, Victor Manuel González

    2015-01-01

    Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with principal component analysis (PCA), dendrograms and classification and regression trees (CART). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, the hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The main advantage of the proposed hybrid model is that it does not require information about the previous operation states of the engine's input variables. The performance of this model was compared with that of other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years to the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of RUL prognostics for aircraft engines. PMID:25806876

  7. Multi-Parametric Analysis and Modeling of Relationships between Mitochondrial Morphology and Apoptosis

    PubMed Central

    Reis, Yara; Wolf, Thomas; Brors, Benedikt; Hamacher-Brady, Anne; Eils, Roland; Brady, Nathan R.

    2012-01-01

    Mitochondria exist as a network of interconnected organelles undergoing constant fission and fusion. Current approaches to study mitochondrial morphology are limited by low data sampling coupled with manual identification and classification of complex morphological phenotypes. Here we propose an integrated mechanistic and data-driven modeling approach to analyze heterogeneous, quantified datasets and infer relations between mitochondrial morphology and apoptotic events. We initially performed high-content, multi-parametric measurements of mitochondrial morphological, apoptotic, and energetic states by high-resolution imaging of human breast carcinoma MCF-7 cells. Subsequently, decision tree-based analysis was used to automatically classify networked, fragmented, and swollen mitochondrial subpopulations, at the single-cell level and within cell populations. Our results revealed subtle but significant differences in morphology class distributions in response to various apoptotic stimuli. Furthermore, key mitochondrial functional parameters, including mitochondrial membrane potential and Bax activation, were measured under matched conditions. Data-driven fuzzy logic modeling was used to explore the non-linear relationships between mitochondrial morphology and apoptotic signaling, combining morphological and functional data in a single model. Modeling results are in accordance with previous studies, in which Bax regulates mitochondrial fragmentation and mitochondrial morphology influences mitochondrial membrane potential. In summary, we established and validated a platform for mitochondrial morphological and functional analysis that can be readily extended with additional datasets. We further discuss the benefits of a flexible systematic approach for elucidating specific and general relationships between mitochondrial morphology and apoptosis. PMID:22272225

  8. A Model-Driven Approach to e-Course Management

    ERIC Educational Resources Information Center

    Savic, Goran; Segedinac, Milan; Milenkovic, Dušica; Hrin, Tamara; Segedinac, Mirjana

    2018-01-01

    This paper presents research on using a model-driven approach to the development and management of electronic courses. We propose a course management system which stores a course model represented as distinct machine-readable components containing domain knowledge of different course aspects. Based on this formally defined platform-independent…

  9. Data-Driven Mechanistic Modeling of Influence of Microstructure on High-Cycle Fatigue Life of Nickel Titanium

    NASA Astrophysics Data System (ADS)

    Kafka, Orion L.; Yu, Cheng; Shakoor, Modesar; Liu, Zeliang; Wagner, Gregory J.; Liu, Wing Kam

    2018-04-01

    A data-driven mechanistic modeling technique is applied to a system representative of a broken-up inclusion ("stringer") within drawn nickel-titanium wire or tube, e.g., as used for arterial stents. The approach uses a decomposition of the problem into a training stage and a prediction stage. It is applied to compute the fatigue crack incubation life of a microstructure of interest under high-cycle fatigue. A parametric study of a matrix-inclusion-void microstructure is conducted. The results indicate that, within the range studied, a larger void between halves of the inclusion increases fatigue life, while larger inclusion diameter reduces fatigue life.

  10. As above, so below? Towards understanding inverse models in BCI

    NASA Astrophysics Data System (ADS)

    Lindgren, Jussi T.

    2018-02-01

    Objective. In brain-computer interfaces (BCI), measurements of the user's brain activity are classified into commands for the computer. With EEG-based BCIs, the origins of the classified phenomena are often considered to be spatially localized in the cortical volume and mixed in the EEG. We investigate whether more accurate BCIs can be obtained by reconstructing the source activities in the volume. Approach. We contrast physiology-driven source reconstruction with data-driven representations obtained by statistical machine learning. We explain these approaches in a common linear dictionary framework and review the different ways to obtain the dictionary parameters. We consider the effect of source reconstruction on some major difficulties in BCI classification, namely information loss, feature selection and nonstationarity of the EEG. Main results. Our analysis suggests that the approaches differ mainly in their parameter estimation. Physiological source reconstruction may thus be expected to improve BCI accuracy if machine learning is not used or where it produces less optimal parameters. We argue that the considered difficulties of surface EEG classification can remain in the reconstructed volume and that data-driven techniques are still necessary. Finally, we provide some suggestions for comparing approaches. Significance. The present work illustrates the relationships between source reconstruction and machine learning-based approaches for EEG data representation. The provided analysis and discussion should help in understanding, applying, comparing and improving such techniques in the future.
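    The "common linear dictionary framework" can be made concrete: measurements x are modelled as a dictionary (forward model) A applied to sources s, x = As + noise, and source reconstruction amounts to inverting A. A minimal numpy sketch with an assumed random dictionary, standing in for a physiological forward model:

```python
import numpy as np

rng = np.random.default_rng(2)
n_sources, n_channels = 4, 16
A = rng.normal(size=(n_channels, n_sources))    # dictionary / forward model
s_true = rng.normal(size=(n_sources, 100))      # source time courses
x = A @ s_true + 0.01 * rng.normal(size=(n_channels, 100))  # "EEG" channels

# Physiology-driven route: invert a known dictionary by least squares.
s_hat, *_ = np.linalg.lstsq(A, x, rcond=None)
err = np.linalg.norm(s_hat - s_true) / np.linalg.norm(s_true)
```

    In the data-driven route the same A would instead be estimated from data (e.g., by ICA or supervised spatial filtering), which is where, per the abstract, the two approaches mainly differ.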

  11. CP violation in multibody B decays from QCD factorization

    NASA Astrophysics Data System (ADS)

    Klein, Rebecca; Mannel, Thomas; Virto, Javier; Vos, K. Keri

    2017-10-01

    We test a data-driven approach based on QCD factorization for charmless three-body B-decays by confronting it with measurements of CP violation in B⁻ → π⁻π⁺π⁻. While some of the needed non-perturbative objects can be directly extracted from data, others can, so far, only be modelled. Although this approach is currently model dependent, we comment on the prospects for reducing this model dependence. While our model naturally accommodates the gross features of the Dalitz distribution, it cannot quantitatively explain the details seen in the current experimental data on local CP asymmetries. We comment on possible refinements of our simple model and conclude by briefly discussing a possible extension of the model to large invariant masses, where large local CP asymmetries have been measured.

  12. A standard-driven approach for electronic submission to pharmaceutical regulatory authorities.

    PubMed

    Lin, Ching-Heng; Chou, Hsin-I; Yang, Ueng-Cheng

    2018-03-01

    Using standards is not only useful for data interchange during the process of a clinical trial, but also for analyzing data during a review process. Any step that speeds up the approval of new drugs may benefit patients. As a result, adopting standards for regulatory submission has become mandatory in some countries. However, preparing standard-compliant documents, such as an annotated case report form (aCRF), requires a great deal of knowledge and experience, and the process is complex and labor-intensive. There is therefore a need to use information technology to facilitate this process. Instead of standardizing data after the completion of a clinical trial, this study proposes a standard-driven approach, achieved by implementing a computer-assisted "standard-driven pipeline (SDP)" in an existing clinical data management system. The SDP uses CDISC standards to drive all processes of a clinical trial, such as design, data acquisition, and tabulation. A completed phase I/II trial was used to prove the concept and to evaluate the effects of this approach. By using the CDISC-compliant question library, aCRFs were generated automatically when the eCRFs were completed. For comparison purposes, the data collection process was simulated and the collected data were transformed by the SDP. This new approach reduced the number of missing data fields from sixty-two to eight, and the number of controlled-term mismatch fields from eight to zero, during data tabulation. This standard-driven approach accelerated CRF annotation and assured data tabulation integrity. Its benefits include improved use of standards during the clinical trial and a reduction in missing and unexpected data during tabulation. The standard-driven approach is an advanced design idea that can be used for future clinical information system development.
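    The tabulation checks described (missing fields and controlled-term mismatches) can be sketched as a simple validation pass. The field names and controlled terms below are hypothetical stand-ins, not actual CDISC terminology:

```python
# Check collected fields against a controlled-term dictionary, counting
# missing required fields and controlled-term mismatches (tabulation QC).
CONTROLLED_TERMS = {"SEX": {"M", "F", "U"}, "ROUTE": {"ORAL", "IV"}}
REQUIRED_FIELDS = ["SUBJID", "SEX", "ROUTE"]

def tabulation_issues(record):
    """Return (missing required fields, controlled-term mismatches)."""
    missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
    mismatched = [f for f, allowed in CONTROLLED_TERMS.items()
                  if record.get(f) and record[f] not in allowed]
    return missing, mismatched

# "Male" is not a controlled term (should be "M"); ROUTE was left blank.
missing, mismatched = tabulation_issues({"SUBJID": "001", "SEX": "Male", "ROUTE": ""})
```

    Running such checks at data-entry time, rather than after trial completion, is the essence of the standard-driven approach.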

  13. A predictive modeling approach to increasing the economic effectiveness of disease management programs.

    PubMed

    Bayerstadler, Andreas; Benstetter, Franz; Heumann, Christian; Winter, Fabian

    2014-09-01

    Predictive Modeling (PM) techniques are gaining importance in the worldwide health insurance business. Modern PM methods are used for customer relationship management, risk evaluation and medical management. This article illustrates a PM approach that enables the economic potential of (cost-)effective disease management programs (DMPs) to be fully exploited through optimized candidate selection, as an example of successful data-driven business management. The approach is based on a Generalized Linear Model (GLM) that is easy for health insurance companies to apply. By means of a small portfolio from an emerging country, we show that our GLM approach is stable compared with more sophisticated regression techniques in spite of the difficult data environment. Additionally, we demonstrate for this setting that our model can compete with the expensive solutions offered by professional PM vendors and outperforms the non-predictive standard approaches for DMP selection commonly used in the market.
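    GLM-based candidate selection of this kind can be sketched with logistic regression (one member of the GLM family) on a synthetic portfolio; the features, coefficients, and top-50 cutoff are all assumptions for illustration, not the article's model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 500
# Synthetic portfolio: age, prior annual cost (in 1000s), chronic-condition flag.
age = rng.integers(20, 80, n).astype(float)
cost = rng.gamma(2.0, 0.5, n)                 # prior cost in thousands
chronic = rng.integers(0, 2, n).astype(float)
X = np.column_stack([age, cost, chronic])

# Members who would benefit from a DMP: older, costlier, chronically ill.
logit = -5 + 0.05 * age + 0.8 * cost + 1.0 * chronic
y = rng.random(n) < 1 / (1 + np.exp(-logit))

glm = LogisticRegression(max_iter=1000).fit(X, y)
scores = glm.predict_proba(X)[:, 1]              # predicted enrolment benefit
top_candidates = np.argsort(scores)[::-1][:50]   # optimized candidate selection
```

    The economic lever is in the last two lines: enrolment capacity is spent on the members with the highest predicted benefit rather than on a non-predictive rule.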

  14. Measuring Experiential Avoidance: A Preliminary Test of a Working Model

    ERIC Educational Resources Information Center

    Hayes, Steven C.; Strosahl, Kirk; Wilson, Kelly G.; Bissett, Richard T.; Pistorello, Jacqueline; Toarmino, Dosheen; Polusny, Melissa A.; Dykstra, Thane A.; Batten, Sonja V.; Bergan, John; Stewart, Sherry H.; Zvolensky, Michael J.; Eifert, Georg H.; Bond, Frank W.; Forsyth, John P.; Karekla, Maria; Mccurry, Susan M.

    2004-01-01

    The present study describes the development of a short, general measure of experiential avoidance, based on a specific theoretical approach to this process. A theoretically driven iterative exploratory analysis using structural equation modeling on data from a clinical sample yielded a single factor comprising 9 items. A fully confirmatory factor…

  15. Improving the modelling of irradiation-induced brain activation for in vivo PET verification of proton therapy.

    PubMed

    Bauer, Julia; Chen, Wenjing; Nischwitz, Sebastian; Liebl, Jakob; Rieken, Stefan; Welzel, Thomas; Debus, Juergen; Parodi, Katia

    2018-04-24

    A reliable Monte Carlo prediction of proton-induced brain tissue activation for comparison to particle therapy positron-emission-tomography (PT-PET) measurements is crucial for in vivo treatment verification. Major limitations of current approaches include the CT-based patient model and the description of activity washout due to tissue perfusion. Two approaches were studied to improve the activity prediction for brain irradiation: (i) a refined patient model using tissue classification based on MR information and (ii) a PT-PET data-driven refinement of washout model parameters. Improvements of the activity predictions compared to post-treatment PT-PET measurements were assessed in terms of activity profile similarity for six patients treated with a single field or two almost parallel fields delivered by active proton beam scanning. The refined patient model yields a generally higher similarity for most of the patients, except in highly pathological areas that lead to tissue misclassification. Using washout model parameters deduced from clinical patient data considerably improved the activity profile similarity for all patients. Current methods used to predict proton-induced brain tissue activation can thus be improved with MR-based tissue classification and data-driven washout parameters, providing a more reliable basis for PT-PET verification.

  16. Assessment of cardiovascular risk based on a data-driven knowledge discovery approach.

    PubMed

    Mendes, D; Paredes, S; Rocha, T; Carvalho, P; Henriques, J; Cabiddu, R; Morais, J

    2015-01-01

    The cardioRisk project addresses the development of personalized risk assessment tools for patients who have been admitted to hospital with acute myocardial infarction. Although models are available that assess the short-term risk of death/new events for such patients, these models were established under circumstances that do not take present clinical interventions into account and, in some cases, rely on risk factors that are not easily available in clinical practice. The integration of existing risk tools (applied in the clinician's daily practice) with data-driven knowledge discovery mechanisms, based on data routinely collected during hospitalizations, would be a breakthrough in overcoming some of these difficulties. In this context, the development of simple and interpretable models (based on recent datasets) will facilitate this integration process and build confidence in it. In this work, a simple and interpretable model based on a real dataset is proposed. It consists of a decision tree model structure that uses a reduced set of six binary risk factors. The validation is performed using a recent dataset provided by the Portuguese Society of Cardiology (11113 patients), which originally comprised 77 risk factors. A sensitivity, specificity and accuracy of 80.42%, 77.25% and 78.80%, respectively, were achieved, showing the effectiveness of the approach.
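    A decision tree over a handful of binary risk factors, evaluated by sensitivity, specificity and accuracy, can be sketched as follows; the six factors and event probabilities are synthetic, not the Portuguese Society of Cardiology data:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
# Six binary risk factors for 2000 synthetic patients.
X = rng.integers(0, 2, size=(2000, 6))
# Event probability rises with the number of positive factors.
p_event = 0.05 + 0.12 * X.sum(axis=1)
y = rng.random(2000) < p_event

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X[:1500], y[:1500])
pred = tree.predict(X[1500:])
truth = y[1500:]

# Confusion-matrix-based metrics, as reported in the abstract.
tp = np.sum(pred & truth); tn = np.sum(~pred & ~truth)
fp = np.sum(pred & ~truth); fn = np.sum(~pred & truth)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / len(truth)
```

    The shallow depth keeps the tree interpretable: each leaf corresponds to a short, clinically readable rule over the binary factors.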

  17. Robust PLS approach for KPI-related prediction and diagnosis against outliers and missing data

    NASA Astrophysics Data System (ADS)

    Yin, Shen; Wang, Guang; Yang, Xu

    2014-07-01

    In practical industrial applications, key performance indicator (KPI)-related prediction and diagnosis are quite important for product quality and economic benefit. To meet these requirements, many advanced prediction and monitoring approaches have been developed, which can be classified into model-based or data-driven techniques. Among these approaches, partial least squares (PLS) is one of the most popular data-driven methods due to its simplicity and easy implementation in large-scale industrial processes. As PLS is based entirely on the measured process data, the characteristics of the process data are critical for its success. Outliers and missing values are two common characteristics of measured data that can severely affect the effectiveness of PLS. To ensure the applicability of PLS in practical industrial applications, this paper introduces a robust version of PLS that deals with outliers and missing values simultaneously. The effectiveness of the proposed method is demonstrated by the application results of KPI-related prediction and diagnosis on an industrial benchmark, the Tennessee Eastman process.

  18. Mutational landscape of EGFR-, MYC-, and Kras-driven genetically engineered mouse models of lung adenocarcinoma

    PubMed Central

    McFadden, David G.; Politi, Katerina; Bhutkar, Arjun; Chen, Frances K.; Song, Xiaoling; Pirun, Mono; Santiago, Philip M.; Kim-Kiselak, Caroline; Platt, James T.; Lee, Emily; Hodges, Emily; Rosebrock, Adam P.; Bronson, Roderick T.; Socci, Nicholas D.; Hannon, Gregory J.; Jacks, Tyler; Varmus, Harold

    2016-01-01

    Genetically engineered mouse models (GEMMs) of cancer are increasingly being used to assess putative driver mutations identified by large-scale sequencing of human cancer genomes. To accurately interpret experiments that introduce additional mutations, an understanding of the somatic genetic profile and evolution of GEMM tumors is necessary. Here, we performed whole-exome sequencing of tumors from three GEMMs of lung adenocarcinoma driven by mutant epidermal growth factor receptor (EGFR), mutant Kirsten rat sarcoma viral oncogene homolog (Kras), or overexpression of MYC proto-oncogene. Tumors from EGFR- and Kras-driven models exhibited, respectively, 0.02 and 0.07 nonsynonymous mutations per megabase, a dramatically lower average mutational frequency than observed in human lung adenocarcinomas. Tumors from models driven by strong cancer drivers (mutant EGFR and Kras) harbored few mutations in known cancer genes, whereas tumors driven by MYC, a weaker initiating oncogene in the murine lung, acquired recurrent clonal oncogenic Kras mutations. In addition, although EGFR- and Kras-driven models both exhibited recurrent whole-chromosome DNA copy number alterations, the specific chromosomes altered by gain or loss were different in each model. These data demonstrate that GEMM tumors exhibit relatively simple somatic genotypes compared with human cancers of a similar type, making these autochthonous model systems useful for additive engineering approaches to assess the potential of novel mutations on tumorigenesis, cancer progression, and drug sensitivity. PMID:27702896

  19. Mutational landscape of EGFR-, MYC-, and Kras-driven genetically engineered mouse models of lung adenocarcinoma.

    PubMed

    McFadden, David G; Politi, Katerina; Bhutkar, Arjun; Chen, Frances K; Song, Xiaoling; Pirun, Mono; Santiago, Philip M; Kim-Kiselak, Caroline; Platt, James T; Lee, Emily; Hodges, Emily; Rosebrock, Adam P; Bronson, Roderick T; Socci, Nicholas D; Hannon, Gregory J; Jacks, Tyler; Varmus, Harold

    2016-10-18

    Genetically engineered mouse models (GEMMs) of cancer are increasingly being used to assess putative driver mutations identified by large-scale sequencing of human cancer genomes. To accurately interpret experiments that introduce additional mutations, an understanding of the somatic genetic profile and evolution of GEMM tumors is necessary. Here, we performed whole-exome sequencing of tumors from three GEMMs of lung adenocarcinoma driven by mutant epidermal growth factor receptor (EGFR), mutant Kirsten rat sarcoma viral oncogene homolog (Kras), or overexpression of MYC proto-oncogene. Tumors from EGFR- and Kras-driven models exhibited, respectively, 0.02 and 0.07 nonsynonymous mutations per megabase, a dramatically lower average mutational frequency than observed in human lung adenocarcinomas. Tumors from models driven by strong cancer drivers (mutant EGFR and Kras) harbored few mutations in known cancer genes, whereas tumors driven by MYC, a weaker initiating oncogene in the murine lung, acquired recurrent clonal oncogenic Kras mutations. In addition, although EGFR- and Kras-driven models both exhibited recurrent whole-chromosome DNA copy number alterations, the specific chromosomes altered by gain or loss were different in each model. These data demonstrate that GEMM tumors exhibit relatively simple somatic genotypes compared with human cancers of a similar type, making these autochthonous model systems useful for additive engineering approaches to assess the potential of novel mutations on tumorigenesis, cancer progression, and drug sensitivity.

  20. Automatic ICD-10 multi-class classification of cause of death from plaintext autopsy reports through expert-driven feature selection.

    PubMed

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali

    2017-01-01

    Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models. Accident-related autopsy reports were obtained from one of the largest hospitals in Kuala Lumpur. These reports belong to nine different accident-related causes of death. A master feature vector was prepared by extracting features from the collected autopsy reports using unigrams with lexical categorization. This master feature vector was used to detect cause of death [according to the International Classification of Diseases version 10 (ICD-10) classification system] through five automated feature selection schemes, the proposed expert-driven approach, five feature subset sizes, and five machine learning classifiers. Model performance was evaluated using precisionM, recallM, F-measureM, accuracy, and area under the ROC curve. Four baselines were used to compare the results with the proposed system. Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measures (approaching 85% to 90%) for most metrics with a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in overall accuracy compared with the existing techniques and four baselines. The proposed system is feasible and practical to use for automatic classification of ICD-10-related cause of death from autopsy reports.
The proposed system assists pathologists to accurately and rapidly determine underlying cause of death based on autopsy findings. Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports.
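    Expert-driven feature selection in this sense (restricting the classifier to features flagged by a domain expert) can be sketched with a fixed vocabulary and a random forest. The reports, terms, and labels below are toy stand-ins for the autopsy data, and the expert term list is hypothetical:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Toy autopsy-style snippets with two causes of death.
reports = [
    "blunt force trauma to the head after road traffic collision",
    "drowning with water in lungs found near river",
    "skull fracture and brain injury from vehicle collision",
    "asphyxia due to submersion drowning in lake water",
] * 25
labels = ["traffic", "drowning", "traffic", "drowning"] * 25

# Expert-driven feature selection: restrict the vocabulary to unigrams a
# pathologist flags as discriminative (hypothetical list).
expert_terms = ["trauma", "collision", "fracture", "drowning", "submersion", "water"]
vec = CountVectorizer(vocabulary=expert_terms)
X = vec.transform(reports).toarray()

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)
acc = clf.score(X, labels)
```

    In the study this fixed vocabulary replaces automated schemes such as chi-squared or information-gain ranking; here it simply constrains the columns of the feature matrix.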

  1. Automatic ICD-10 multi-class classification of cause of death from plaintext autopsy reports through expert-driven feature selection

    PubMed Central

    Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali

    2017-01-01

    Objectives Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models. Methods Accident-related autopsy reports were obtained from one of the largest hospitals in Kuala Lumpur. These reports belong to nine different accident-related causes of death. A master feature vector was prepared by extracting features from the collected autopsy reports using unigrams with lexical categorization. This master feature vector was used to detect cause of death [according to the International Classification of Diseases version 10 (ICD-10) classification system] through five automated feature selection schemes, the proposed expert-driven approach, five feature subset sizes, and five machine learning classifiers. Model performance was evaluated using precisionM, recallM, F-measureM, accuracy, and area under ROC curve. Four baselines were used to compare the results with the proposed system. Results Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measures (approaching 85% to 90%) for most metrics with a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in overall accuracy compared with the existing techniques and four baselines. Conclusion The proposed system is feasible and practical to use for automatic classification of ICD-10-related cause of death from autopsy reports.
The proposed system assists pathologists to accurately and rapidly determine underlying cause of death based on autopsy findings. Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports. PMID:28166263

  2. Teachers Develop CLIL Materials in Argentina: A Workshop Experience

    ERIC Educational Resources Information Center

    Banegas, Darío Luis

    2016-01-01

    Content and language integrated learning (CLIL) is a Europe-born approach. Nevertheless, CLIL as a language learning approach has been implemented in Latin America in different ways and models: content-driven models and language-driven models. As regards the latter, new school curricula demand that CLIL be used in secondary education in Argentina…

  3. Data-Driven Learning of Q-Matrix

    PubMed Central

    Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2013-01-01

    The recent surge of interest in cognitive assessment has led to the development of novel statistical models for diagnostic classification. Central to many such models is the well-known Q-matrix, which specifies the item–attribute relationships. This article proposes a data-driven approach to identification of the Q-matrix and estimation of related model parameters. A key ingredient is a flexible T-matrix that relates the Q-matrix to response patterns. The flexibility of the T-matrix allows the construction of a natural criterion function as well as a computationally amenable algorithm. Simulation results are presented to demonstrate the usefulness and applicability of the proposed method. An extension to handling of the Q-matrix with partial information is presented. The proposed method also provides a platform on which important statistical issues, such as hypothesis testing and model selection, may be formally addressed. PMID:23926363
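    The role of the Q-matrix can be illustrated with a toy, noiseless version of data-driven Q-matrix selection: under a DINA-style ideal-response rule, the correct Q-matrix reproduces every observed response pattern exactly, while a misspecified one leaves residual misfit. This simplified criterion is a stand-in for the article's T-matrix construction, which is not reproduced here:

```python
import numpy as np

def ideal_responses(Q, alphas):
    """DINA ideal response: an item is answered correctly iff the person
    masters every attribute the Q-matrix marks as required for that item."""
    return np.all(alphas[:, None, :] >= Q[None, :, :], axis=2).astype(int)

def misfit(Q, R):
    """Sum over respondents of the distance to the nearest ideal pattern."""
    profiles = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # all mastery profiles
    ideals = ideal_responses(Q, profiles)                  # (4, n_items)
    return np.abs(R[:, None, :] - ideals[None, :, :]).sum(axis=2).min(axis=1).sum()

rng = np.random.default_rng(6)
Q_true = np.array([[1, 0], [0, 1], [1, 1], [1, 0], [0, 1]])  # 5 items x 2 attributes
alphas = rng.integers(0, 2, size=(500, 2))                   # mastery profiles
R = ideal_responses(Q_true, alphas)                          # noiseless responses

Q_wrong = Q_true.copy()
Q_wrong[0, 1] = 1            # misspecify item 1 as requiring both attributes
m_true, m_wrong = misfit(Q_true, R), misfit(Q_wrong, R)
```

    With noisy responses one would minimize an expected misfit under slip/guess parameters, which is essentially what the T-matrix criterion formalizes.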

  4. Bayesian Model Development for Analysis of Open Source Information to Support the Assessment of Nuclear Programs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gastelum, Zoe N.; Whitney, Paul D.; White, Amanda M.

    2013-07-15

    Pacific Northwest National Laboratory has spent several years researching, developing, and validating large Bayesian network models to support integration of open source data sets for nuclear proliferation research. Our current work focuses on generating a set of interrelated models for multi-source assessment of nuclear programs, as opposed to a single comprehensive model. By using this approach, we can break down the models to cover logical sub-problems that can utilize different expertise and data sources. This approach allows researchers to utilize the models individually or in combination to detect and characterize a nuclear program and identify data gaps. The models operate at various levels of granularity, covering a combination of state-level assessments with more detailed models of site or facility characteristics. This paper will describe the current open source-driven, nuclear nonproliferation models under development, the pros and cons of the analytical approach, and areas for additional research.

  5. Data-driven inference of network connectivity for modeling the dynamics of neural codes in the insect antennal lobe

    PubMed Central

    Shlizerman, Eli; Riffell, Jeffrey A.; Kutz, J. Nathan

    2014-01-01

    The antennal lobe (AL), the olfactory processing center in insects, is able to process stimuli into distinct neural activity patterns, called olfactory neural codes. To model their dynamics we perform multichannel recordings from the projection neurons in the AL driven by different odorants. We then derive a dynamic neuronal network from the electrophysiological data. The network consists of lateral-inhibitory neurons and excitatory neurons (modeled as firing-rate units), and is capable of producing unique olfactory neural codes for the tested odorants. To construct the network, we (1) design a projection, an odor space, for the neural recordings from the AL, which discriminates between the trajectories of distinct odorants; (2) characterize scent recognition, i.e., decision-making based on olfactory signals; and (3) infer the wiring of the neural circuit, the connectome of the AL. We show that the constructed model is consistent with biological observations, such as contrast enhancement and robustness to noise. The study suggests a data-driven approach to a key biological question: identifying how lateral inhibitory neurons can be wired to excitatory neurons to permit robust activity patterns. PMID:25165442
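    A firing-rate network with excitatory projection neurons and a lateral-inhibitory unit, as described, can be sketched in a few lines. The weights here are hand-set for illustration rather than inferred from recordings, but they already show the contrast-enhancement effect (the weakly driven unit is suppressed):

```python
import numpy as np

def simulate(W, stim, steps=200, dt=0.05):
    """Euler-integrate firing-rate dynamics r' = -r + relu(W r + stim)."""
    r = np.zeros(W.shape[0])
    for _ in range(steps):
        r += dt * (-r + np.maximum(W @ r + stim, 0.0))
    return r

# Tiny AL-like circuit: 4 excitatory projection neurons plus one
# lateral-inhibitory unit (hypothetical weights, not an inferred connectome).
n = 5
W = np.zeros((n, n))
W[4, :4] = 0.5          # excitatory units drive the inhibitory unit
W[:4, 4] = -1.0         # inhibition feeds back onto all excitatory units

odor_a = np.array([1.0, 0.2, 0.0, 0.0, 0.0])
odor_b = np.array([0.0, 0.0, 0.2, 1.0, 0.0])
code_a = simulate(W, odor_a)   # distinct steady-state "neural codes"
code_b = simulate(W, odor_b)
```

    In the steady state the weakly stimulated neuron is silenced by the shared inhibition while the strongly stimulated one stays active, which is the contrast enhancement the abstract refers to; the data-driven step of the paper is inferring W from recordings rather than fixing it by hand.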

  6. Extended-Range Prediction with Low-Dimensional, Stochastic-Dynamic Models: A Data-driven Approach

    DTIC Science & Technology

    2012-09-30

    characterization of extratropical storms and extremes and link these to LFV modes. Mingfang Ting, Yochanan Kushnir, Andrew W. Robertson... simulating and predicting a wide range of climate phenomena including ENSO, tropical Atlantic sea surface temperatures (SSTs), storm track variability... into empirical prediction models. Use observations to improve low-order dynamical MJO models. Adam Sobel, Daehyun Kim. Extratropical variability

  7. An Observation-Driven Agent-Based Modeling and Analysis Framework for C. elegans Embryogenesis.

    PubMed

    Wang, Zi; Ramsey, Benjamin J; Wang, Dali; Wong, Kwai; Li, Husheng; Wang, Eric; Bao, Zhirong

    2016-01-01

    With cutting-edge live microscopy and image analysis, biologists can now systematically track individual cells in complex tissues and quantify cellular behavior over extended time windows. Computational approaches that utilize the systematic and quantitative data are needed to understand how cells interact in vivo to give rise to the different cell types and 3D morphology of tissues. An agent-based, minimum descriptive modeling and analysis framework is presented in this paper to study C. elegans embryogenesis. The framework is designed to incorporate the large amounts of experimental observations on cellular behavior and reserve data structures/interfaces that allow regulatory mechanisms to be added as more insights are gained. Observed cellular behaviors are organized into lineage identity, timing and direction of cell division, and path of cell movement. The framework also includes global parameters such as the eggshell and a clock. Division and movement behaviors are driven by statistical models of the observations. Data structures/interfaces are reserved for gene list, cell-cell interaction, cell fate and landscape, and other global parameters until the descriptive model is replaced by a regulatory mechanism. This approach provides a framework to handle the ongoing experiments of single-cell analysis of complex tissues where mechanistic insights lag data collection and need to be validated on complex observations.
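
    The event-driven flavor of such an agent-based framework can be sketched with a toy lineage whose division times come from a hard-coded, hypothetical stand-in for the statistical timing model; the real framework draws these from experimental distributions.

```python
import heapq

division_time = {"P0": 10, "AB": 15, "P1": 20}   # minutes; illustrative stand-ins
daughters = {"P0": ("AB", "P1"), "AB": ("ABa", "ABp"), "P1": ("EMS", "P2")}

def simulate(founder="P0"):
    """Advance a global clock event by event; each division spawns two agents."""
    events = [(division_time[founder], founder)]
    lineage, cells = [], {founder}
    while events:
        t, cell = heapq.heappop(events)
        cells.remove(cell)
        d1, d2 = daughters[cell]
        lineage.append((t, cell, d1, d2))
        for d in (d1, d2):
            cells.add(d)
            if d in division_time and d in daughters:   # non-terminal cell
                heapq.heappush(events, (t + division_time[d], d))
    return lineage, cells

lineage, final_cells = simulate()
```

    Reserved interfaces for regulatory mechanisms would replace the hard-coded tables here with models conditioned on cell fate and cell-cell interaction.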

  8. Knowledge-driven computational modeling in Alzheimer's disease research: Current state and future trends.

    PubMed

    Geerts, Hugo; Hofmann-Apitius, Martin; Anastasio, Thomas J

    2017-11-01

    Neurodegenerative diseases such as Alzheimer's disease (AD) follow a slowly progressing dysfunctional trajectory, with a large presymptomatic component and many comorbidities. Using preclinical models and large-scale omics studies ranging from genetics to imaging, a large number of processes that might be involved in AD pathology at different stages and levels have been identified. The sheer number of putative hypotheses makes it almost impossible to estimate their contribution to the clinical outcome and to develop a comprehensive view on the pathological processes driving the clinical phenotype. Traditionally, bioinformatics approaches have provided correlations and associations between processes and phenotypes. Focusing on causality, a new breed of advanced and more quantitative modeling approaches that use formalized domain expertise offer new opportunities to integrate these different modalities and outline possible paths toward new therapeutic interventions. This article reviews three different computational approaches and their possible complementarities. Process algebras, implemented using declarative programming languages such as Maude, facilitate simulation and analysis of complicated biological processes on a comprehensive but coarse-grained level. A model-driven Integration of Data and Knowledge, based on the OpenBEL platform and using reverse causative reasoning and network jump analysis, can generate mechanistic knowledge and a new, mechanism-based taxonomy of disease. Finally, Quantitative Systems Pharmacology is based on formalized implementation of domain expertise in a more fine-grained, mechanism-driven, quantitative, and predictive humanized computer model. We propose a strategy to combine the strengths of these individual approaches for developing powerful modeling methodologies that can provide actionable knowledge for rational development of preventive and therapeutic interventions. 
Development of these computational approaches is likely to be required for further progress in understanding and treating AD. Copyright © 2017 the Alzheimer's Association. Published by Elsevier Inc. All rights reserved.

  9. Handling Real-World Context Awareness, Uncertainty and Vagueness in Real-Time Human Activity Tracking and Recognition with a Fuzzy Ontology-Based Hybrid Method

    PubMed Central

    Díaz-Rodríguez, Natalia; Cadahía, Olmo León; Cuéllar, Manuel Pegalajar; Lilius, Johan; Calvo-Flores, Miguel Delgado

    2014-01-01

    Human activity recognition is a key task in ambient intelligence applications to achieve proper ambient assisted living. There has been remarkable progress in this domain, but some challenges still remain to obtain robust methods. Our goal in this work is to provide a system that allows the modeling and recognition of a set of complex activities in real life scenarios involving interaction with the environment. The proposed framework is a hybrid model that comprises two main modules: a low level sub-activity recognizer, based on data-driven methods, and a high-level activity recognizer, implemented with a fuzzy ontology to include the semantic interpretation of actions performed by users. The fuzzy ontology is fed by the sub-activities recognized by the low level data-driven component and provides fuzzy ontological reasoning to recognize both the activities and their influence in the environment with semantics. An additional benefit of the approach is the ability to handle vagueness and uncertainty in the knowledge-based module, which substantially outperforms the treatment of incomplete and/or imprecise data with respect to classic crisp ontologies. We validate these advantages with the public CAD-120 dataset (Cornell Activity Dataset), achieving an accuracy of 90.1% and 91.07% for low-level and high-level activities, respectively. This entails an improvement over fully data-driven or ontology-based approaches. PMID:25268914

  10. Biased Competition in Visual Processing Hierarchies: A Learning Approach Using Multiple Cues.

    PubMed

    Gepperth, Alexander R T; Rebhan, Sven; Hasler, Stephan; Fritsch, Jannik

    2011-03-01

    In this contribution, we present a large-scale hierarchical system for object detection fusing bottom-up (signal-driven) processing results with top-down (model or task-driven) attentional modulation. Specifically, we focus on the question of how the autonomous learning of invariant models can be embedded into a performing system and how such models can be used to define object-specific attentional modulation signals. Our system implements bi-directional data flow in a processing hierarchy. The bottom-up data flow proceeds from a preprocessing level to the hypothesis level where object hypotheses created by exhaustive object detection algorithms are represented in a roughly retinotopic way. A competitive selection mechanism is used to determine the most confident hypotheses, which are used on the system level to train multimodal models that link object identity to invariant hypothesis properties. The top-down data flow originates at the system level, where the trained multimodal models are used to obtain space- and feature-based attentional modulation signals, providing biases for the competitive selection process at the hypothesis level. This results in object-specific hypothesis facilitation/suppression in certain image regions which we show to be applicable to different object detection mechanisms. In order to demonstrate the benefits of this approach, we apply the system to the detection of cars in a variety of challenging traffic videos. Evaluating our approach on a publicly available dataset containing approximately 3,500 annotated video images from more than 1 h of driving, we can show strong increases in performance and generalization when compared to object detection in isolation. Furthermore, we compare our results to a late hypothesis rejection approach, showing that early coupling of top-down and bottom-up information is a favorable approach especially when processing resources are constrained.

  11. Air-Breathing Hypersonic Vehicle Tracking Control Based on Adaptive Dynamic Programming.

    PubMed

    Mu, Chaoxu; Ni, Zhen; Sun, Changyin; He, Haibo

    2017-03-01

    In this paper, we propose a data-driven supplementary control approach with adaptive learning capability for air-breathing hypersonic vehicle tracking control based on action-dependent heuristic dynamic programming (ADHDP). The control action is generated by the combination of sliding mode control (SMC) and the ADHDP controller to track the desired velocity and the desired altitude. In particular, the ADHDP controller observes the differences between the actual velocity/altitude and the desired velocity/altitude, and then provides a supplementary control action accordingly. The ADHDP controller does not rely on an accurate mathematical model function and is data driven. Meanwhile, it is capable of adjusting its parameters online over time under various working conditions, which makes it well suited to hypersonic vehicle systems with parameter uncertainties and disturbances. We verify the adaptive supplementary control approach against the traditional SMC in cruising flight, and provide three simulation studies to illustrate the improved performance with the proposed approach.
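
    The sliding-mode component of such a scheme can be sketched on a one-dimensional toy plant. The plant, gains, boundary layer and reference below are invented for illustration, and the ADHDP supplementary action is omitted.

```python
def simulate(steps=4000, dt=0.001, lam=4.0, k=30.0, ref=1.0):
    """Track `ref` with u = -k*sat(s), s = de + lam*e (boundary layer avoids chattering)."""
    x, v = 0.0, 0.0                                   # position, velocity
    sat = lambda s: max(-1.0, min(1.0, s / 0.05))     # saturated switching function
    for _ in range(steps):
        e, de = x - ref, v
        s = de + lam * e                              # sliding surface
        u = -k * sat(s)
        v += dt * (u - 0.5 * v)                       # double integrator, mild damping
        x += dt * v
    return x

final = simulate()
```

    On the surface s = 0 the error decays as exp(-lam*t), which is what makes the tracking insensitive to matched uncertainties; the ADHDP term would add a learned correction on top of u.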

  12. Data driven propulsion system weight prediction model

    NASA Astrophysics Data System (ADS)

    Gerth, Richard J.

    1994-10-01

    The objective of the research was to develop a method to predict the weight of paper engines, i.e., engines that are in the early stages of development. The impetus for the project was the Single Stage To Orbit (SSTO) project, where engineers need to evaluate alternative engine designs. Since the SSTO is a performance-driven project, the performance models for alternative designs were well understood. The next tradeoff is weight. Since it is known that engine weight varies with thrust levels, a model is required that would allow discrimination between engines that produce the same thrust. Above all, the model had to be rooted in data, with assumptions that could be justified based on the data. The general approach was to collect data on as many existing engines as possible and build a statistical model of engine weight as a function of various component performance parameters. This was considered a reasonable level to begin the project because the data would be readily available, and it would be at the level of most paper engines, prior to detailed component design.
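
    The kind of statistical weight model described can be sketched as an ordinary least-squares fit of weight against component performance parameters. The predictors, coefficients and data below are synthetic placeholders, not real engine records.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
thrust = rng.uniform(100, 2000, n)            # kN
chamber_p = rng.uniform(5, 25, n)             # MPa
# Synthetic "true" relation plus noise, for illustration only.
weight = 50 + 1.2 * thrust + 8.0 * chamber_p + rng.normal(0, 20, n)

X = np.column_stack([np.ones(n), thrust, chamber_p])
coef, *_ = np.linalg.lstsq(X, weight, rcond=None)

def predict(thrust_kn, chamber_mpa):
    """Predicted weight for a paper engine with the given parameters."""
    return float(coef @ np.array([1.0, thrust_kn, chamber_mpa]))
```

    Because such a model is rooted in data on existing engines, its assumptions (linearity, choice of predictors) can be checked against residuals rather than merely asserted.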

  13. Indicators of ecosystem function identify alternate states in the sagebrush steppe.

    PubMed

    Kachergis, Emily; Rocca, Monique E; Fernandez-Gimenez, Maria E

    2011-10-01

    Models of ecosystem change that incorporate nonlinear dynamics and thresholds, such as state-and-transition models (STMs), are increasingly popular tools for land management decision-making. However, few models are based on systematic collection and documentation of ecological data, and of these, most rely solely on structural indicators (species composition) to identify states and transitions. As STMs are adopted as an assessment framework throughout the United States, finding effective and efficient ways to create data-driven models that integrate ecosystem function and structure is vital. This study aims to (1) evaluate the utility of functional indicators (indicators of rangeland health, IRH) as proxies for more difficult ecosystem function measurements and (2) create a data-driven STM for the sagebrush steppe of Colorado, USA, that incorporates both ecosystem structure and function. We sampled soils, plant communities, and IRH at 41 plots with similar clayey soils but different site histories to identify potential states and infer the effects of management practices and disturbances on transitions. We found that many IRH were correlated with quantitative measures of functional indicators, suggesting that the IRH can be used to approximate ecosystem function. In addition to a reference state that functions as expected for this soil type, we identified four biotically and functionally distinct potential states, consistent with the theoretical concept of alternate states. Three potential states were related to management practices (chemical and mechanical shrub treatments and seeding history) while one was related only to ecosystem processes (erosion). IRH and potential states were also related to environmental variation (slope, soil texture), suggesting that there are environmental factors within areas with similar soils that affect ecosystem dynamics and should be noted within STMs. 
Our approach generated an objective, data-driven model of ecosystem dynamics for rangeland management. Our findings suggest that the IRH approximate ecosystem processes and can distinguish between alternate states and communities and identify transitions when building data-driven STMs. Functional indicators are a simple, efficient way to create data-driven models that are consistent with alternate state theory. Managers can use them to improve current model-building methods and thus apply state-and-transition models more broadly for land management decision-making.

  14. Metabolic network reconstruction of Chlamydomonas offers insight into light-driven algal metabolism

    PubMed Central

    Chang, Roger L; Ghamsari, Lila; Manichaikul, Ani; Hom, Erik F Y; Balaji, Santhanam; Fu, Weiqi; Shen, Yun; Hao, Tong; Palsson, Bernhard Ø; Salehi-Ashtiani, Kourosh; Papin, Jason A

    2011-01-01

    Metabolic network reconstruction encompasses existing knowledge about an organism's metabolism and genome annotation, providing a platform for omics data analysis and phenotype prediction. The model alga Chlamydomonas reinhardtii is employed to study diverse biological processes from photosynthesis to phototaxis. Recent heightened interest in this species results from an international movement to develop algal biofuels. Integrating biological and optical data, we reconstructed a genome-scale metabolic network for this alga and devised a novel light-modeling approach that enables quantitative growth prediction for a given light source, resolving wavelength and photon flux. We experimentally verified transcripts accounted for in the network and physiologically validated model function through simulation and generation of new experimental growth data, providing high confidence in network contents and predictive applications. The network offers insight into algal metabolism and potential for genetic engineering and efficient light source design, a pioneering resource for studying light-driven metabolism and quantitative systems biology. PMID:21811229

  15. Genetic Programming for Automatic Hydrological Modelling

    NASA Astrophysics Data System (ADS)

    Chadalawada, Jayashree; Babovic, Vladan

    2017-04-01

    One of the recent challenges for the hydrologic research community is the need for the development of coupled systems that involve the integration of hydrologic, atmospheric and socio-economic relationships. This poses a requirement for novel modelling frameworks that can accurately represent complex systems, given the limited understanding of underlying processes, the increasing volume of data and high levels of uncertainty. Existing hydrological models vary in terms of conceptualization and process representation, and each is best suited to capture the environmental dynamics of a particular hydrological system. Data driven approaches can be used in the integration of alternative process hypotheses in order to achieve a unified theory at catchment scale. The key steps in the implementation of an integrated modelling framework that is informed by prior understanding and data include the choice of technique for the induction of knowledge from data, the identification of alternative structural hypotheses, the definition of rules and constraints for the meaningful, intelligent combination of model component hypotheses, and the definition of evaluation metrics. This study aims at defining a Genetic Programming based modelling framework that tests different conceptual model constructs against a wide range of objective functions and evolves accurate and parsimonious models that capture dominant hydrological processes at catchment scale. In this paper, GP initializes the evolutionary process using the modelling decisions inspired by the Superflex framework [Fenicia et al., 2011] and automatically combines them into model structures that are scrutinized against observed data using statistical, hydrological and flow duration curve based performance metrics. The collaboration between data driven and physical, conceptual modelling paradigms improves the ability to model and manage hydrologic systems. Fenicia, F., D. Kavetski, and H. H. Savenije (2011), Elements of a flexible approach for conceptual hydrological modeling: 1. Motivation and theoretical development, Water Resources Research, 47(11).

  16. Data-Driven Modeling of Complex Systems by means of a Dynamical ANN

    NASA Astrophysics Data System (ADS)

    Seleznev, A.; Mukhin, D.; Gavrilov, A.; Loskutov, E.; Feigin, A.

    2017-12-01

    The data-driven methods for modeling and prognosis of complex dynamical systems become more and more popular in various fields due to the growth of high-resolution data. We distinguish two basic steps in such an approach: (i) determining the phase subspace of the system, or embedding, from available time series and (ii) constructing an evolution operator acting in this reduced subspace. In this work we suggest a novel approach combining these two steps by means of construction of an artificial neural network (ANN) with a special topology. The proposed ANN-based model, on the one hand, projects the data onto a low-dimensional manifold, and, on the other hand, models a dynamical system on this manifold. In effect, this is a recurrent multilayer ANN which has internal dynamics and is capable of generating time series. A key point of the proposed methodology is the optimization of the model, which allows us to avoid overfitting: we use a Bayesian criterion to optimize the ANN structure and estimate both the degree of evolution operator nonlinearity and the complexity of the nonlinear manifold onto which the data are projected. The proposed modeling technique will be applied to the analysis of high-dimensional dynamical systems: the Lorenz'96 model of atmospheric turbulence, producing high-dimensional space-time chaos, and a quasi-geostrophic three-layer model of the Earth's atmosphere with natural orography, describing the dynamics of synoptic vortexes as well as mesoscale blocking systems. The possibility of applying the proposed methodology to analyze real measured data is also discussed. The study was supported by the Russian Science Foundation (grant #16-12-10198).
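
    A linear stand-in for the two steps, (i) embedding and (ii) evolution operator, can be written with SVD in place of the ANN. The synthetic data and the purely linear operator are illustrative simplifications of the nonlinear recurrent model described above.

```python
import numpy as np

rng = np.random.default_rng(2)
# High-dimensional series driven by two latent oscillatory modes.
t = np.arange(500) * 0.1
latent = np.stack([np.sin(t), np.cos(t)], axis=1)
mix = rng.normal(size=(2, 20))
X = latent @ mix + 0.01 * rng.normal(size=(500, 20))

# (i) embedding: project onto the two leading principal components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T                                  # reduced coordinates

# (ii) evolution operator A with Z[k+1] ≈ Z[k] @ A, fitted by least squares.
A, *_ = np.linalg.lstsq(Z[:-1], Z[1:], rcond=None)
err = np.mean((Z[:-1] @ A - Z[1:]) ** 2) / np.mean(Z[1:] ** 2)
```

    The ANN-based method replaces both linear maps with trainable nonlinear ones and selects their complexity with a Bayesian criterion.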

  17. A Model for Facilitating Curriculum Development in Higher Education: A Faculty-Driven, Data-Informed, and Educational Developer-Supported Approach

    ERIC Educational Resources Information Center

    Wolf, Peter

    2007-01-01

    In the fall of 2003, Teaching Support Services (TSS), a department at the University of Guelph, was approached by a faculty member in the department of food sciences. Professor Art Hill was interested in seeking support in systematically assessing the department's undergraduate curriculum and using that assessment to trigger further improvement of…

  18. The Elementary School Success Profile Model of Assessment and Prevention: Balancing Effective Practice Standards and Feasibility

    ERIC Educational Resources Information Center

    Bowen, Natasha K.; Powers, Joelle D.

    2011-01-01

    Evidence-based practice and data-driven decision making (DDDM) are two approaches to accountability that have been promoted in the school literature. In spite of the push to promote these approaches in schools, barriers to their widespread, appropriate, and effective use have limited their impact on practice and student outcomes. This article…

  19. Theoretical Framework for Interaction Game Design

    DTIC Science & Technology

    2016-05-19

    modeling. We take a data-driven quantitative approach to understand conversational behaviors by measuring conversational behaviors using advanced sensing... current state of the art, human computing is considered to be a reasonable approach to break through the current limitation. To solicit high quality and... proper resources in conversation to enable smooth and effective interaction. The last technique is about conversation measurement, analysis, and

  20. Improving Forecasts Through Realistic Uncertainty Estimates: A Novel Data Driven Method for Model Uncertainty Quantification in Data Assimilation

    NASA Astrophysics Data System (ADS)

    Pathiraja, S. D.; Moradkhani, H.; Marshall, L. A.; Sharma, A.; Geenens, G.

    2016-12-01

    Effective combination of model simulations and observations through Data Assimilation (DA) depends heavily on uncertainty characterisation. Many traditional methods for quantifying model uncertainty in DA require some level of subjectivity (by way of tuning parameters or by assuming Gaussian statistics). Furthermore, the focus is typically on only estimating the first and second moments. We propose a data-driven methodology to estimate the full distributional form of model uncertainty, i.e. the transition density p(xt|xt-1). All sources of uncertainty associated with the model simulations are considered collectively, without needing to devise stochastic perturbations for individual components (such as model input, parameter and structural uncertainty). A training period is used to derive the distribution of errors in observed variables conditioned on hidden states. Errors in hidden states are estimated from the conditional distribution of observed variables using non-linear optimization. The theory behind the framework and case study applications are discussed in detail. Results demonstrate improved predictions and more realistic uncertainty bounds compared to a standard perturbation approach.
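
    The training-period idea, estimating the full distributional form of model error rather than assuming Gaussianity, can be sketched with a Gaussian kernel density estimate over a set of errors. The skewed "training errors" and the bandwidth rule are illustrative choices, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical training-period model-minus-observation errors: skewed, non-Gaussian.
train_errors = rng.gamma(shape=2.0, scale=0.5, size=1000) - 1.0

def kde_pdf(x, samples, bandwidth=None):
    """Gaussian kernel density estimate of the error distribution at points x."""
    samples = np.asarray(samples)
    if bandwidth is None:                      # Silverman's rule of thumb
        bandwidth = 1.06 * samples.std() * len(samples) ** (-1 / 5)
    diffs = (np.asarray(x)[:, None] - samples[None, :]) / bandwidth
    k = np.exp(-0.5 * diffs ** 2) / np.sqrt(2 * np.pi)
    return k.mean(axis=1) / bandwidth

grid = np.linspace(-2, 4, 200)
pdf = kde_pdf(grid, train_errors)
```

    In a DA setting, samples from such a non-parametric density would perturb the forecast states in place of hand-tuned Gaussian noise.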

  1. Extracting business vocabularies from business process models: SBVR and BPMN standards-based approach

    NASA Astrophysics Data System (ADS)

    Skersys, Tomas; Butleris, Rimantas; Kapocius, Kestutis

    2013-10-01

    Approaches for the analysis and specification of business vocabularies and rules are very relevant topics in both the Business Process Management and Information Systems Development disciplines. However, in the common practice of Information Systems Development, business modeling activities are still largely empirical in nature. In this paper, basic aspects of an approach for the semi-automated extraction of business vocabularies from business process models are presented. The approach is based on the novel business modeling-level OMG standards "Business Process Model and Notation" (BPMN) and "Semantics for Business Vocabularies and Business Rules" (SBVR), thus contributing to OMG's vision of Model-Driven Architecture (MDA) and to model-driven development in general.
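
    The extraction step can be sketched on plain task labels. Real BPMN models are XML and SBVR imposes much richer structure; the verb+noun split below is a deliberately naive, hypothetical heuristic.

```python
# Hypothetical BPMN task labels standing in for a parsed process model.
task_labels = ["Approve purchase order", "Reject purchase order",
               "Notify customer", "Archive invoice"]

def extract_vocabulary(labels):
    """Split each task label into a candidate business term and fact type."""
    terms, fact_types = set(), set()
    for label in labels:
        words = label.lower().split()
        verb, noun = words[0], " ".join(words[1:])
        terms.add(noun)                      # candidate SBVR business term
        fact_types.add(f"{verb} {noun}")     # candidate SBVR fact type
    return terms, fact_types

terms, fact_types = extract_vocabulary(task_labels)
```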

  2. Data-driven parameterization of the generalized Langevin equation

    DOE PAGES

    Lei, Huan; Baker, Nathan A.; Li, Xiantao

    2016-11-29

    We present a data-driven approach to determine the memory kernel and random noise of the generalized Langevin equation. To facilitate practical implementations, we parameterize the kernel function in the Laplace domain by a rational function, with coefficients directly linked to the equilibrium statistics of the coarse-grain variables. Further, we show that such an approximation can be constructed to arbitrarily high order. Within these approximations, the generalized Langevin dynamics can be embedded in an extended stochastic model without memory. We demonstrate how to introduce the stochastic noise so that the fluctuation-dissipation theorem is exactly satisfied.
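
    The lowest-order case of such a rational approximation can be simulated directly: a one-pole kernel K(s) = c/(s + a) in the Laplace domain, i.e. K(t) = c*exp(-a*t) in time, embeds the generalized Langevin equation in a Markovian system with one auxiliary variable. The coefficients a and c below are illustrative, not values fitted to equilibrium statistics.

```python
import numpy as np

def simulate(steps=400000, dt=1e-3, a=2.0, c=1.0, kT=1.0, seed=4):
    """GLE in a harmonic well via Markovian embedding; noise amplitude obeys the FDT."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=steps) * np.sqrt(2 * a * c * kT * dt)
    x = v = z = 0.0
    vs = np.empty(steps)
    for i in range(steps):
        z += dt * (-a * z - c * v) + noise[i]   # auxiliary variable carries the memory
        v += dt * (-x + z)                      # harmonic force plus memory force
        x += dt * v
        vs[i] = v
    return vs

vs = simulate()
kinetic = float(np.mean(vs[len(vs) // 2:] ** 2))   # should approach kT
```

    With noise amplitude sqrt(2*a*c*kT), eliminating z recovers exactly the kernel c*exp(-a*t) and the fluctuation-dissipation theorem, so the stationary velocity variance approaches kT; higher-order rational kernels add one auxiliary variable per pole.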

  3. Interpretable Deep Models for ICU Outcome Prediction

    PubMed Central

    Che, Zhengping; Purushotham, Sanjay; Khemani, Robinder; Liu, Yan

    2016-01-01

    Exponential surge in health care data, such as longitudinal data from electronic health records (EHR), sensor data from intensive care units (ICU), etc., is providing new opportunities to discover meaningful data-driven characteristics and patterns of diseases. Recently, deep learning models have been employed for many computational phenotyping and healthcare prediction tasks to achieve state-of-the-art performance. However, deep models lack interpretability, which is crucial for wide adoption in medical research and clinical decision-making. In this paper, we introduce a simple yet powerful knowledge-distillation approach called interpretable mimic learning, which uses gradient boosting trees to learn interpretable models while achieving prediction performance comparable to deep learning models. Experiment results on a Pediatric ICU dataset for acute lung injury (ALI) show that our proposed method not only outperforms state-of-the-art approaches for mortality and ventilator-free days prediction tasks but can also provide interpretable models to clinicians. PMID:28269832
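
    The distillation step can be sketched with a hand-rolled gradient-boosted ensemble of depth-1 trees fitted to a teacher's soft outputs. The "teacher" here is a stand-in nonlinear function rather than a trained deep network, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(-2, 2, size=(400, 3))
teacher = np.tanh(1.5 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2])   # soft labels

def fit_stump(X, residual):
    """Best single-split regression stump over all features and candidate thresholds."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)):
            left = X[:, j] <= thr
            lv, rv = residual[left].mean(), residual[~left].mean()
            sse = np.sum((np.where(left, lv, rv) - residual) ** 2)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lv, rv)
    return best[1:]

def boost(X, y, n_rounds=60, lr=0.2):
    """Gradient boosting on squared loss: each stump fits the current residual."""
    pred = np.full(len(y), y.mean())
    ensemble = []
    for _ in range(n_rounds):
        j, thr, lv, rv = fit_stump(X, y - pred)
        pred += lr * np.where(X[:, j] <= thr, lv, rv)
        ensemble.append((j, thr, lv, rv))
    return ensemble, pred

ensemble, student = boost(X, teacher)
mse = float(np.mean((student - teacher) ** 2))
```

    The ensemble of (feature, threshold, value) triples is directly inspectable, which is the interpretability payoff of mimic learning.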

  4. The role of data and inference in the development and application of ecological site concepts and state-and-transition models

    USDA-ARS?s Scientific Manuscript database

    Information embodied in ecological site descriptions and their state-and-transition models is crucial to effective land management, and as such is needed now. There is not time (or money) to employ a traditional research-based approach (i.e., inductive/deductive, hypothesis driven inference) to addr...

  5. Towards predictive models of the human gut microbiome

    PubMed Central

    2014-01-01

    The intestinal microbiota is an ecosystem susceptible to external perturbations such as dietary changes and antibiotic therapies. Mathematical models of microbial communities could be of great value in the rational design of microbiota-tailoring diets and therapies. Here, we discuss how advances in another field, engineering of microbial communities for wastewater treatment bioreactors, could inspire development of mechanistic mathematical models of the gut microbiota. We review the current state-of-the-art in bioreactor modeling and current efforts in modeling the intestinal microbiota. Mathematical modeling could benefit greatly from the deluge of data emerging from metagenomic studies, but data-driven approaches such as network inference that aim to predict microbiome dynamics without explicit mechanistic knowledge seem better suited to model these data. Finally, we discuss how the integration of microbiome shotgun sequencing and metabolic modeling approaches such as flux balance analysis may fulfill the promise of a mechanistic model of the intestinal microbiota. PMID:24727124
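
    The flux-balance-analysis style of metabolic modeling mentioned above reduces to a linear program. This three-reaction toy network is an invented illustration, not a gut-microbiome reconstruction.

```python
import numpy as np
from scipy.optimize import linprog

# Metabolites: A, B. Reactions: R1 uptake->A, R2 A->B, R3 B->biomass.
S = np.array([
    [1, -1,  0],    # A: produced by R1, consumed by R2
    [0,  1, -1],    # B: produced by R2, consumed by R3
])
bounds = [(0, 10), (0, None), (0, None)]   # uptake capped at 10 units
c = np.array([0.0, 0.0, -1.0])             # linprog minimizes, so negate biomass

# Maximize biomass flux subject to steady state S v = 0 and flux bounds.
res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds)
growth = -res.fun                          # maximal biomass flux
```

    At genome scale the same structure holds with thousands of reactions, and the uptake bounds become the interface to diet or environment.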

  6. Model-Free Estimation of Tuning Curves and Their Attentional Modulation, Based on Sparse and Noisy Data.

    PubMed

    Helmer, Markus; Kozyrev, Vladislav; Stephan, Valeska; Treue, Stefan; Geisel, Theo; Battaglia, Demian

    2016-01-01

    Tuning curves are the functions that relate the responses of sensory neurons to various values within one continuous stimulus dimension (such as the orientation of a bar in the visual domain or the frequency of a tone in the auditory domain). They are commonly determined by fitting a model, e.g., a Gaussian or another bell-shaped curve, to the measured responses to a small subset of discrete stimuli in the relevant dimension. However, as neuronal responses are irregular and experimental measurements noisy, it is often difficult to determine reliably the appropriate model from the data. We illustrate this general problem by fitting diverse models to representative recordings from area MT in rhesus monkey visual cortex during multiple attentional tasks involving complex composite stimuli. We find that all models can be well-fitted, that the best model generally varies between neurons and that statistical comparisons between neuronal responses across different experimental conditions are affected quantitatively and qualitatively by specific model choices. As a robust alternative to an often arbitrary model selection, we introduce a model-free approach, in which features of interest are extracted directly from the measured response data without the need of fitting any model. In our attentional datasets, we demonstrate that data-driven methods provide descriptions of tuning curve features such as preferred stimulus direction or attentional gain modulations which are in agreement with fit-based approaches when a good fit exists. Furthermore, these methods naturally extend to the frequent cases of uncertain model selection. We show that model-free approaches can identify attentional modulation patterns, such as general alterations of the irregular shape of tuning curves, which cannot be captured by fitting stereotyped conventional models. 
Finally, by comparing datasets across different conditions, we demonstrate effects of attention that are cell- and even stimulus-specific. Based on these proofs-of-concept, we conclude that our data-driven methods can reliably extract relevant tuning information from neuronal recordings, including cells whose seemingly haphazard response curves defy conventional fitting approaches.
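
    One concrete model-free feature extraction is a circular, response-weighted vector average for the preferred direction, computed without fitting any curve. The synthetic "neuron" below (tuning shape, noise level, sampling grid) is an invented illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
directions = np.deg2rad(np.arange(0, 360, 30))       # 12 tested stimulus directions
true_pref = np.deg2rad(140.0)
# Noisy bell-shaped responses around the true preferred direction.
responses = 5 + 20 * np.exp(np.cos(directions - true_pref) / 0.5)
responses = responses + rng.normal(0, 2, size=directions.shape)

def preferred_direction(dirs, resp):
    """Circular mean of stimulus directions weighted by (shifted) responses."""
    w = resp - resp.min()                  # shift so weights are nonnegative
    vec = np.sum(w * np.exp(1j * dirs))
    return np.angle(vec) % (2 * np.pi)

est = float(np.rad2deg(preferred_direction(directions, responses)))
```

    Unlike a fitted Gaussian, this estimator is defined for any unimodal response profile, which is why such features extend to cells whose tuning defies conventional models.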

  7. Model-Free Estimation of Tuning Curves and Their Attentional Modulation, Based on Sparse and Noisy Data

    PubMed Central

    Helmer, Markus; Kozyrev, Vladislav; Stephan, Valeska; Treue, Stefan; Geisel, Theo; Battaglia, Demian

    2016-01-01

    Tuning curves are the functions that relate the responses of sensory neurons to various values within one continuous stimulus dimension (such as the orientation of a bar in the visual domain or the frequency of a tone in the auditory domain). They are commonly determined by fitting a model, e.g., a Gaussian or another bell-shaped curve, to the measured responses to a small subset of discrete stimuli in the relevant dimension. However, as neuronal responses are irregular and experimental measurements noisy, it is often difficult to determine reliably the appropriate model from the data. We illustrate this general problem by fitting diverse models to representative recordings from area MT in rhesus monkey visual cortex during multiple attentional tasks involving complex composite stimuli. We find that all models can be well-fitted, that the best model generally varies between neurons and that statistical comparisons between neuronal responses across different experimental conditions are affected quantitatively and qualitatively by specific model choices. As a robust alternative to an often arbitrary model selection, we introduce a model-free approach, in which features of interest are extracted directly from the measured response data without the need of fitting any model. In our attentional datasets, we demonstrate that data-driven methods provide descriptions of tuning curve features such as preferred stimulus direction or attentional gain modulations which are in agreement with fit-based approaches when a good fit exists. Furthermore, these methods naturally extend to the frequent cases of uncertain model selection. We show that model-free approaches can identify attentional modulation patterns, such as general alterations of the irregular shape of tuning curves, which cannot be captured by fitting stereotyped conventional models. 
Finally, by comparing datasets across different conditions, we demonstrate effects of attention that are cell- and even stimulus-specific. Based on these proofs-of-concept, we conclude that our data-driven methods can reliably extract relevant tuning information from neuronal recordings, including cells whose seemingly haphazard response curves defy conventional fitting approaches. PMID:26785378
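
    One way to extract a tuning feature such as preferred direction without fitting any model is a circular (vector) average of the raw responses. The sketch below is a generic illustration of that idea, not the authors' exact procedure; the function name and synthetic data are invented.

```python
import numpy as np

def preferred_direction(angles_deg, responses):
    """Model-free estimate of a neuron's preferred direction:
    a circular (vector) average of measured responses, with no
    tuning-curve model fitted."""
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))
    # Each stimulus direction contributes a unit vector weighted
    # by the neuron's mean response to that direction.
    z = np.sum(np.asarray(responses, dtype=float) * np.exp(1j * theta))
    return np.rad2deg(np.angle(z)) % 360.0

# Synthetic responses peaking at 90 degrees
dirs = np.arange(0, 360, 45)
resp = np.array([2.0, 5.0, 12.0, 5.0, 2.0, 1.0, 0.5, 1.0])
print(preferred_direction(dirs, resp))  # ~90 degrees
```

    Because no bell-shaped model is assumed, the estimate remains well defined even for the irregular tuning curves that defy conventional fitting.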

  8. Moving university hydrology education forward with geoinformatics, data and modeling approaches

    NASA Astrophysics Data System (ADS)

    Merwade, V.; Ruddell, B. L.

    2012-02-01

    In this opinion paper, we review recent literature related to data and modeling driven instruction in hydrology, and present our findings from surveying the hydrology education community in the United States. This paper presents an argument that Data and Modeling Driven Geoscience Cybereducation (DMDGC) approaches are valuable for teaching the conceptual and applied aspects of hydrology, as a part of the broader effort to improve Science, Technology, Engineering, and Mathematics (STEM) education at the university level. The authors have undertaken a series of surveys and a workshop involving the community of university hydrology educators to determine the state of the practice of DMDGC approaches to hydrology. We identify the most common tools and approaches currently utilized, quantify the extent of the adoption of DMDGC approaches in the university hydrology classroom, and explain the community's views on the challenges and barriers preventing DMDGC approaches from wider use. DMDGC approaches are currently emphasized at the graduate level of the curriculum, and only the most basic modeling and visualization tools are in widespread use. The community identifies the greatest barriers to greater adoption as a lack of access to easily adoptable curriculum materials and a lack of time and training to learn constantly changing tools and methods. The community's current consensus is that DMDGC approaches should emphasize conceptual learning, and should be used to complement rather than replace lecture-based pedagogies. Inadequate online material-publication and sharing systems, and a lack of incentives for faculty to develop and publish materials via such systems, are also identified as challenges.
Based on these findings, we suggest that a number of steps should be taken by the community to develop the potential of DMDGC in university hydrology education, including formal development and assessment of curriculum materials integrating lecture-format and DMDGC approaches, incentivizing the publication by faculty of excellent DMDGC curriculum materials, and implementing the publication and dissemination cyberinfrastructure necessary to support the unique DMDGC digital curriculum materials.

  9. Dynamic Data Driven Experiment Control Coordinated with Anisotropic Elastic Material Characterization

    Treesearch

    John G. Michopoulos; Tomonari Furukawa; John C. Hermanson; Samuel G. Lambrakos

    2011-01-01

    The goal of this paper is to propose and demonstrate a multi-level design optimization approach for determining a material constitutive model in coordination with the design of the experimental procedure needed to acquire the necessary data. The methodology achieves both online (real-time) and offline design of optimum experiments required for...

  10. Functional language and data flow architectures

    NASA Technical Reports Server (NTRS)

    Ercegovac, M. D.; Patel, D. R.; Lang, T.

    1983-01-01

    This is a tutorial article about language and architecture approaches for highly concurrent computer systems based on the functional style of programming. The discussion concentrates on the basic aspects of functional languages, and sequencing models such as data-flow, demand-driven and reduction which are essential at the machine organization level. Several examples of highly concurrent machines are described.

  11. Dynamic Emulation Modelling (DEMo) of large physically-based environmental models

    NASA Astrophysics Data System (ADS)

    Galelli, S.; Castelletti, A.

    2012-12-01

    In environmental modelling, large, spatially-distributed, physically-based models are widely adopted to describe the dynamics of physical, social and economic processes. Such an accurate process characterization comes, however, at a price: the computational requirements of these models are considerably high and prevent their use in any problem requiring hundreds or thousands of model runs to be satisfactorily solved. Typical examples include optimal planning and management, data assimilation, inverse modelling and sensitivity analysis. An effective approach to overcome this limitation is to perform a top-down reduction of the physically-based model by identifying a simplified, computationally efficient emulator, constructed from and then used in place of the original model in highly resource-demanding tasks. The underlying idea is that not all the process details in the original model are equally important and relevant to the dynamics of the outputs of interest for the type of problem considered. Emulation modelling has been successfully applied in many environmental applications; however, most of the literature considers non-dynamic emulators (e.g. metamodels, response surfaces and surrogate models), where the original dynamical model is reduced to a static map between the inputs and the output of interest. In this study we focus on Dynamic Emulation Modelling (DEMo), a methodological approach that preserves the dynamic nature of the original physically-based model, with consequent advantages in a wide variety of problem areas. In particular, we propose a new data-driven DEMo approach that combines the many advantages of data-driven modelling in representing complex, non-linear relationships while preserving the state-space representation typical of process-based models, which is both particularly effective in some applications (e.g.
optimal management and data assimilation) and facilitates the ex-post physical interpretation of the emulator structure, thus enhancing the credibility of the model to stakeholders and decision-makers. Numerical results from the application of the approach to the reduction of 3D coupled hydrodynamic-ecological models in several real world case studies, including Marina Reservoir (Singapore) and Googong Reservoir (Australia), are illustrated.
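
    As a toy illustration of an emulator that keeps a state-space form, one can identify a low-order linear model x[t+1] = A x[t] + B u[t] from trajectories simulated by the original model. This linear sketch is far simpler than the DEMo methodology itself; all names are illustrative.

```python
import numpy as np

def fit_linear_emulator(X, U):
    """Fit a reduced linear state-space emulator
    x[t+1] = A @ x[t] + B @ u[t] by least squares, from state
    trajectories X (T x nx) and forcings U (T x nu) produced by
    the original physically-based model."""
    Z = np.hstack([X[:-1], U[:-1]])        # regressors [x[t], u[t]]
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    nx = X.shape[1]
    return Theta[:nx].T, Theta[nx:].T      # A (nx x nx), B (nx x nu)
```

    Unlike a static input-output map, the identified A and B retain an interpretable state, which is what makes such an emulator usable inside management and data-assimilation loops.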

  12. A General and Efficient Method for Incorporating Precise Spike Times in Globally Time-Driven Simulations

    PubMed Central

    Hanuschkin, Alexander; Kunkel, Susanne; Helias, Moritz; Morrison, Abigail; Diesmann, Markus

    2010-01-01

    Traditionally, event-driven simulations have been limited to the very restricted class of neuronal models for which the timing of future spikes can be expressed in closed form. Recently, the class of models that is amenable to event-driven simulation has been extended by the development of techniques to accurately calculate firing times for some integrate-and-fire neuron models that do not enable the prediction of future spikes in closed form. The motivation of this development is the general perception that time-driven simulations are imprecise. Here, we demonstrate that a globally time-driven scheme can calculate firing times that cannot be discriminated from those calculated by an event-driven implementation of the same model; moreover, the time-driven scheme incurs lower computational costs. The key insight is that time-driven methods are based on identifying a threshold crossing in the recent past, which can be implemented by a much simpler algorithm than the techniques for predicting future threshold crossings that are necessary for event-driven approaches. As run time is dominated by the cost of the operations performed at each incoming spike, which includes spike prediction in the case of event-driven simulation and retrospective detection in the case of time-driven simulation, the simple time-driven algorithm outperforms the event-driven approaches. Additionally, our method is generally applicable to all commonly used integrate-and-fire neuronal models; we show that a non-linear model employing a standard adaptive solver can reproduce a reference spike train with a high degree of precision. PMID:21031031
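
    The retrospective detection that gives the time-driven scheme its speed can be sketched as follows. This is a simplified illustration using linear interpolation on a precomputed voltage trace, not the integrate-and-fire implementation described in the paper.

```python
import numpy as np

def spike_times_time_driven(v, dt, theta=1.0):
    """Retrospective threshold-crossing detection on a fixed grid:
    after each step, check whether the membrane potential crossed
    the threshold during the *past* interval, and interpolate the
    crossing time linearly within that step."""
    spikes = []
    for k in range(1, len(v)):
        if v[k - 1] < theta <= v[k]:
            frac = (theta - v[k - 1]) / (v[k] - v[k - 1])
            spikes.append((k - 1 + frac) * dt)
    return spikes
```

    Locating a crossing in the recent past needs only one comparison per step, whereas event-driven schemes must predict future crossings, which is the costly operation the paper identifies.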

  13. A data-driven approach for retrieving temperatures and abundances in brown dwarf atmospheres

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Line, Michael R.; Fortney, Jonathan J.; Marley, Mark S.

    2014-09-20

    Brown dwarf spectra contain a wealth of information about their molecular abundances, temperature structure, and gravity. We present a new data-driven retrieval approach, previously used in planetary atmosphere studies, to extract the molecular abundances and temperature structure from brown dwarf spectra. The approach makes few a priori physical assumptions about the state of the atmosphere. The feasibility of the approach is first demonstrated on a synthetic brown dwarf spectrum. Given typical spectral resolutions, wavelength coverage, and noise, precisions of tens of percent can be obtained for the molecular abundances and tens to hundreds of K on the temperature profile. The technique is then applied to the well-studied brown dwarf, Gl 570D. From this spectral retrieval, the spectroscopic radius is constrained to be 0.75-0.83 R_J, log(g) to be 5.13-5.46, and T_eff to be between 804 and 849 K. Estimates for the range of abundances and allowed temperature profiles are also derived. The results from our retrieval approach are in agreement with the self-consistent grid modeling results of Saumon et al. This new approach will allow us to address issues of compositional differences between brown dwarfs and possibly their formation environments, disequilibrium chemistry, and missing physics in current grid modeling approaches, as well as many other issues.

  14. Network Model-Assisted Inference from Respondent-Driven Sampling Data

    PubMed Central

    Gile, Krista J.; Handcock, Mark S.

    2015-01-01

    Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population. PMID:26640328

  15. Network Model-Assisted Inference from Respondent-Driven Sampling Data.

    PubMed

    Gile, Krista J; Handcock, Mark S

    2015-06-01

    Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population.

  16. Estimating daily forest carbon fluxes using a combination of ground and remotely sensed data

    NASA Astrophysics Data System (ADS)

    Chirici, Gherardo; Chiesi, Marta; Corona, Piermaria; Salvati, Riccardo; Papale, Dario; Fibbi, Luca; Sirca, Costantino; Spano, Donatella; Duce, Pierpaolo; Marras, Serena; Matteucci, Giorgio; Cescatti, Alessandro; Maselli, Fabio

    2016-02-01

    Several studies have demonstrated that Monteith's approach can efficiently predict forest gross primary production (GPP), while the modeling of net ecosystem production (NEP) is more critical, requiring the additional simulation of forest respirations. Here, the NEP of different forest ecosystems in Italy is simulated by the combined use of a remote-sensing-driven parametric model (modified C-Fix) and a biogeochemical model (BIOME-BGC). The outputs of the two models, which simulate forests in quasi-equilibrium conditions, are combined to estimate the carbon fluxes of actual conditions using information regarding the existing woody biomass. The estimates derived from the methodology have been tested against daily reference GPP and NEP data collected through the eddy correlation technique at five study sites in Italy. The first test concerned the theoretical validity of the simulation approach at both annual and daily time scales and was performed using optimal model drivers (i.e., collected or calibrated over the site measurements). Next, the test was repeated to assess the operational applicability of the methodology, which was driven by spatially extended data sets (i.e., data derived from existing wall-to-wall digital maps). A good estimation accuracy was generally obtained for GPP and NEP when using optimal model drivers. The use of spatially extended data sets worsens the accuracy to a varying degree, which is properly characterized. The model drivers with the most influence on the flux modeling strategy are, in increasing order of importance, forest type, soil features, meteorology, and forest woody biomass (growing stock volume).
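
    Monteith's light-use-efficiency logic, which underlies the GPP part of the approach, can be written compactly. The parameter values below are illustrative placeholders, not those used by modified C-Fix.

```python
def monteith_gpp(par, fapar, eps_max=1.2, stress=1.0):
    """Monteith light-use-efficiency estimate of gross primary
    production: GPP = eps_max * stress * fAPAR * PAR, where
    `stress` (0-1) bundles temperature and water limitations,
    fAPAR is the fraction of absorbed PAR, and PAR is incident
    photosynthetically active radiation."""
    return eps_max * stress * fapar * par

# Example: 10 MJ m-2 d-1 of PAR, half of it absorbed, no stress
print(monteith_gpp(10.0, 0.5))  # 6.0 gC m-2 d-1
```

    The NEP estimate then additionally requires subtracting the simulated autotrophic and heterotrophic respirations, which is where the biogeochemical model enters.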

  17. Automated adaptive inference of phenomenological dynamical models.

    PubMed

    Daniels, Bryan C; Nemenman, Ilya

    2015-08-21

    Dynamics of complex systems is often driven by large and intricate networks of microscopic interactions, whose sheer size obfuscates understanding. With limited experimental data, many parameters of such dynamics are unknown, and thus detailed, mechanistic models risk overfitting and making faulty predictions. At the other extreme, simple ad hoc models often miss defining features of the underlying systems. Here we develop an approach that instead constructs phenomenological, coarse-grained models of network dynamics that automatically adapt their complexity to the available data. Such adaptive models produce accurate predictions even when microscopic details are unknown. The approach is computationally tractable, even for a relatively large number of dynamical variables. Using simulated data, it correctly infers the phase space structure for planetary motion, avoids overfitting in a biological signalling system and produces accurate predictions for yeast glycolysis with tens of data points and over half of the interacting species unobserved.

  18. Automated adaptive inference of phenomenological dynamical models

    PubMed Central

    Daniels, Bryan C.; Nemenman, Ilya

    2015-01-01

    Dynamics of complex systems is often driven by large and intricate networks of microscopic interactions, whose sheer size obfuscates understanding. With limited experimental data, many parameters of such dynamics are unknown, and thus detailed, mechanistic models risk overfitting and making faulty predictions. At the other extreme, simple ad hoc models often miss defining features of the underlying systems. Here we develop an approach that instead constructs phenomenological, coarse-grained models of network dynamics that automatically adapt their complexity to the available data. Such adaptive models produce accurate predictions even when microscopic details are unknown. The approach is computationally tractable, even for a relatively large number of dynamical variables. Using simulated data, it correctly infers the phase space structure for planetary motion, avoids overfitting in a biological signalling system and produces accurate predictions for yeast glycolysis with tens of data points and over half of the interacting species unobserved. PMID:26293508

  19. The Impact of Heterogeneity and Awareness in Modeling Epidemic Spreading on Multiplex Networks

    PubMed Central

    Scatà, Marialisa; Di Stefano, Alessandro; Liò, Pietro; La Corte, Aurelio

    2016-01-01

    In the real world, dynamic processes involving human beings are not disjoint. To capture the real complexity of such dynamics, we propose a novel model of the coevolution of epidemic and awareness spreading processes on a multiplex network, also introducing a preventive isolation strategy. Our aim is to evaluate and quantify the joint impact of heterogeneity and awareness under different socioeconomic conditions. Considering, as a case study, an emerging public health threat, the Zika virus, we introduce a data-driven analysis by exploiting multiple sources and different types of data, ranging from Big Five personality traits to Google Trends, related to different world countries where there is an ongoing epidemic outbreak. Our findings demonstrate how the proposed model allows delaying the epidemic outbreak and increasing the resilience of nodes, especially under critical economic conditions. Simulation results based on the data-driven approach for the Zika virus are consistent with the proposed analytic model. PMID:27848978

  20. Modeling the Structure of Helical Assemblies with Experimental Constraints in Rosetta.

    PubMed

    André, Ingemar

    2018-01-01

    Determining high-resolution structures of proteins with helical symmetry can be challenging due to limitations in experimental data. In such instances, structure-based protein simulations driven by experimental data can provide a valuable approach for building models of helical assemblies. This chapter describes how the Rosetta macromolecular package can be used to model homomeric protein assemblies with helical symmetry in a range of modeling scenarios including energy refinement, symmetrical docking, comparative modeling, and de novo structure prediction. Data-guided structure modeling of helical assemblies with experimental information from electron density, X-ray fiber diffraction, solid-state NMR, and chemical cross-linking mass spectrometry is also described.

  1. Using the Time-Driven Activity-Based Costing Model in the Eye Clinic at The Hospital for Sick Children: A Case Study and Lessons Learned.

    PubMed

    Gulati, Sanchita; During, David; Mainland, Jeff; Wong, Agnes M F

    2018-01-01

    One of the key challenges to healthcare organizations is the development of relevant and accurate cost information. In this paper, we used the time-driven activity-based costing (TDABC) method to calculate the costs of treating individual patients with specific medical conditions over their full cycle of care. We discussed how TDABC provides a critical, systematic and data-driven approach to estimating costs accurately and dynamically, as well as its potential to enable structural and rational cost reduction to bring about a sustainable healthcare system. © 2018 Longwoods Publishing.
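
    The TDABC calculation itself is simple: each resource's capacity cost rate (cost per minute of available capacity) is multiplied by the minutes the patient consumes, and the products are summed over the full cycle of care. The resources and rates below are hypothetical, purely to illustrate the arithmetic.

```python
def tdabc_cost(minutes_used, rate_per_minute):
    """Time-driven activity-based cost of one care episode:
    sum over resources of (time consumed) * (capacity cost rate)."""
    return sum(minutes_used[r] * rate_per_minute[r] for r in minutes_used)

# Hypothetical clinic visit: 20 nurse-minutes at $1.50/min
# and 10 physician-minutes at $4.00/min
visit = tdabc_cost({"nurse": 20, "physician": 10},
                   {"nurse": 1.5, "physician": 4.0})
print(visit)  # 70.0
```

    Because times and rates are kept separate, the same model supports capacity analysis: unused minutes of each resource surface directly as unabsorbed cost.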

  2. Data-driven discovery of new Dirac semimetal materials

    NASA Astrophysics Data System (ADS)

    Yan, Qimin; Chen, Ru; Neaton, Jeffrey

    In recent years, a significant amount of materials property data from high-throughput computations based on density functional theory (DFT), together with the application of database technologies, has enabled the rise of data-driven materials discovery. In this work, we extend the data-driven materials discovery framework to the realm of topological semimetal materials in order to accelerate the discovery of novel Dirac semimetals. We implement currently available workflows, and develop new ones, to data-mine the Materials Project database for novel Dirac semimetals with desirable band structures and symmetry-protected topological properties. This data-driven effort relies on the successful development of several automatic data generation and analysis tools, including a workflow for the automatic identification of topological invariants and pattern recognition techniques to find specific features in a massive number of computed band structures. Utilizing this approach, we successfully identified more than 15 novel Dirac point and Dirac nodal line systems that have not been theoretically predicted or experimentally identified. This work is supported by the Materials Project Predictive Modeling Center through the U.S. Department of Energy, Office of Basic Energy Sciences, Materials Sciences and Engineering Division, under Contract No. DE-AC02-05CH11231.

  3. Formalism Challenges of the Cougaar Model Driven Architecture

    NASA Technical Reports Server (NTRS)

    Bohner, Shawn A.; George, Boby; Gracanin, Denis; Hinchey, Michael G.

    2004-01-01

    The Cognitive Agent Architecture (Cougaar) is one of the most sophisticated distributed agent architectures developed to date. As part of its research and evolution, Cougaar is being studied for application to large, logistics-based applications for the Department of Defense (DoD). Anticipating future complex applications of Cougaar, we are investigating the Model Driven Architecture (MDA) approach to understand how effective it would be for increasing productivity in Cougaar-based development efforts. Recognizing the sophistication of the Cougaar development environment and the limitations of transformation technologies for agents, we have systematically developed an approach that combines component assembly in the large and transformation in the small. This paper describes some of the key elements that went into the Cougaar Model Driven Architecture approach and the characteristics that drove the approach.

  4. Learning clinically useful information from images: Past, present and future.

    PubMed

    Rueckert, Daniel; Glocker, Ben; Kainz, Bernhard

    2016-10-01

    Over the last decade, research in medical imaging has made significant progress in addressing challenging tasks such as image registration and image segmentation. In particular, the use of model-based approaches has been key in numerous, successful advances in methodology. The advantage of model-based approaches is that they allow the incorporation of prior knowledge acting as a regularisation that favours plausible solutions over implausible ones. More recently, medical imaging has moved away from hand-crafted, and often explicitly designed models towards data-driven, implicit models that are constructed using machine learning techniques. This has led to major improvements in all stages of the medical imaging pipeline, from acquisition and reconstruction to analysis and interpretation. As more and more imaging data is becoming available, e.g., from large population studies, this trend is likely to continue and accelerate. At the same time new developments in machine learning, e.g., deep learning, as well as significant improvements in computing power, e.g., parallelisation on graphics hardware, offer new potential for data-driven, semantic and intelligent medical imaging. This article outlines the work of the BioMedIA group in this area and highlights some of the challenges and opportunities for future work. Copyright © 2016 Elsevier B.V. All rights reserved.

  5. Stepwise group sparse regression (SGSR): gene-set-based pharmacogenomic predictive models with stepwise selection of functional priors.

    PubMed

    Jang, In Sock; Dienstmann, Rodrigo; Margolin, Adam A; Guinney, Justin

    2015-01-01

    Complex mechanisms involving genomic aberrations in numerous proteins and pathways are believed to be a key cause of many diseases such as cancer. With recent advances in genomics, elucidating the molecular basis of cancer at a patient level is now feasible, and has led to personalized treatment strategies whereby a patient is treated according to his or her genomic profile. However, there is growing recognition that existing treatment modalities are overly simplistic, and do not fully account for the deep genomic complexity associated with sensitivity or resistance to cancer therapies. To overcome these limitations, large-scale pharmacogenomic screens of cancer cell lines--in conjunction with modern statistical learning approaches--have been used to explore the genetic underpinnings of drug response. While these analyses have demonstrated the ability to infer genetic predictors of compound sensitivity, to date most modeling approaches have been data-driven, i.e. they do not explicitly incorporate domain-specific knowledge (priors) in the process of learning a model. While a purely data-driven approach offers an unbiased perspective of the data--and may yield unexpected or novel insights--this strategy introduces challenges for both model interpretability and accuracy. In this study, we propose a novel prior-incorporated sparse regression model in which the choice of informative predictor sets is carried out by knowledge-driven priors (gene sets) in a stepwise fashion. Under regularization in a linear regression model, our algorithm is able to incorporate prior biological knowledge across the predictive variables thereby improving the interpretability of the final model with no loss--and often an improvement--in predictive performance. 
We evaluate the performance of our algorithm compared to well-known regularization methods such as LASSO, Ridge and Elastic net regression in the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (Sanger) pharmacogenomics datasets, demonstrating that incorporation of the biological priors selected by our model confers improved predictability and interpretability, despite much fewer predictors, over existing state-of-the-art methods.
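
    The stepwise use of group priors can be caricatured as greedy forward selection over predictor groups. The sketch below uses plain ridge regression and invented group names; it is a deliberate simplification of the published SGSR algorithm, which uses a cross-validated criterion and group-regularized fitting.

```python
import numpy as np

def stepwise_group_select(X, y, groups, n_steps, lam=1.0):
    """Greedy forward selection over predictor groups (e.g. gene
    sets): at each step, add the group whose inclusion most
    reduces the ridge-regression training error.
    `groups` maps a group name to a list of column indices of X."""
    def ridge_err(cols):
        A = X[:, cols]
        beta = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
        r = y - A @ beta
        return r @ r
    chosen = []
    for _ in range(n_steps):
        candidates = [g for g in groups if g not in chosen]
        best = min(candidates, key=lambda g: ridge_err(
            sum((groups[h] for h in chosen + [g]), [])))
        chosen.append(best)
    return chosen
```

    Selecting whole gene sets rather than individual genes is what makes the final model interpretable in biological terms.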

  6. Moving university hydrology education forward with community-based geoinformatics, data and modeling resources

    NASA Astrophysics Data System (ADS)

    Merwade, V.; Ruddell, B. L.

    2012-08-01

    In this opinion paper, we review recent literature related to data and modeling driven instruction in hydrology, and present our findings from surveying the hydrology education community in the United States. This paper presents an argument that data and modeling driven geoscience cybereducation (DMDGC) approaches are essential for teaching the conceptual and applied aspects of hydrology, as a part of the broader effort to improve science, technology, engineering, and mathematics (STEM) education at the university level. The authors have undertaken a series of surveys and a workshop involving university hydrology educators to determine the state of the practice of DMDGC approaches to hydrology. We identify the most common tools and approaches currently utilized, quantify the extent of the adoption of DMDGC approaches in the university hydrology classroom, and explain the community's views on the challenges and barriers preventing DMDGC approaches from wider use. DMDGC approaches are currently emphasized at the graduate level of the curriculum, and only the most basic modeling and visualization tools are in widespread use. The community identifies the greatest barriers to greater adoption as a lack of access to easily adoptable curriculum materials and a lack of time and training to learn constantly changing tools and methods. The community's current consensus is that DMDGC approaches should emphasize conceptual learning, and should be used to complement rather than replace lecture-based pedagogies. Inadequate online material publication and sharing systems, and a lack of incentives for faculty to develop and publish materials via such systems, are also identified as challenges.
    Based on these findings, we suggest that a number of steps should be taken by the community to develop the potential of DMDGC in university hydrology education, including formal development and assessment of curriculum materials integrating lecture-format and DMDGC approaches, incentivizing the publication by faculty of excellent DMDGC curriculum materials, and implementing the publication and dissemination cyberinfrastructure necessary to support the unique DMDGC digital curriculum materials.

  7. Assumption- versus data-based approaches to summarizing species' ranges.

    PubMed

    Peterson, A Townsend; Navarro-Sigüenza, Adolfo G; Gordillo, Alejandro

    2018-06-01

    For conservation decision making, species' geographic distributions are mapped using various approaches. Some such efforts have downscaled versions of coarse-resolution extent-of-occurrence maps to fine resolutions for conservation planning. We examined the quality of the extent-of-occurrence maps as range summaries and the utility of refining those maps into fine-resolution distributional hypotheses. Extent-of-occurrence maps tend to be overly simple, omit many known and well-documented populations, and likely frequently include many areas not holding populations. Refinement steps involve typological assumptions about habitat preferences and elevational ranges of species, which can introduce substantial error in estimates of species' true areas of distribution. However, no model-evaluation steps are taken to assess the predictive ability of these models, so model inaccuracies are not noticed. Whereas range summaries derived by these methods may be useful in coarse-grained, global-extent studies, their continued use in on-the-ground conservation applications at fine spatial resolutions is not advisable in light of reliance on assumptions, lack of real spatial resolution, and lack of testing. In contrast, data-driven techniques that integrate primary data on biodiversity occurrence with remotely sensed data that summarize environmental dimensions (i.e., ecological niche modeling or species distribution modeling) offer data-driven solutions based on a minimum of assumptions that can be evaluated and validated quantitatively to offer a well-founded, widely accepted method for summarizing species' distributional patterns for conservation applications. © 2016 Society for Conservation Biology.

  8. The "Village" Model: A Consumer-Driven Approach for Aging in Place

    ERIC Educational Resources Information Center

    Scharlach, Andrew; Graham, Carrie; Lehning, Amanda

    2012-01-01

    Purpose of the Study: This study examines the characteristics of the "Village" model, an innovative consumer-driven approach that aims to promote aging in place through a combination of member supports, service referrals, and consumer engagement. Design and Methods: Thirty of 42 fully operational Villages completed 2 surveys. One survey examined…

  9. Data-driven approaches in the investigation of social perception

    PubMed Central

    Adolphs, Ralph; Nummenmaa, Lauri; Todorov, Alexander; Haxby, James V.

    2016-01-01

    The complexity of social perception poses a challenge to traditional approaches to understand its psychological and neurobiological underpinnings. Data-driven methods are particularly well suited to tackling the often high-dimensional nature of stimulus spaces and of neural representations that characterize social perception. Such methods are more exploratory, capitalize on rich and large datasets, and attempt to discover patterns often without strict hypothesis testing. We present four case studies here: behavioural studies on face judgements, two neuroimaging studies of movies, and eyetracking studies in autism. We conclude with suggestions for particular topics that seem ripe for data-driven approaches, as well as caveats and limitations. PMID:27069045

  10. A data-driven soft sensor for needle deflection in heterogeneous tissue using just-in-time modelling.

    PubMed

    Rossa, Carlos; Lehmann, Thomas; Sloboda, Ronald; Usmani, Nawaid; Tavakoli, Mahdi

    2017-08-01

    Global modelling has traditionally been the approach taken to estimate needle deflection in soft tissue. In this paper, we propose a new method based on local data-driven modelling of needle deflection. External measurement of needle-tissue interactions is collected from several insertions in ex vivo tissue to form a cloud of data. Inputs to the system are the needle insertion depth, axial rotations, and the forces and torques measured at the needle base by a force sensor. When a new insertion is performed, the just-in-time learning method estimates the model outputs given the current inputs to the needle-tissue system and the historical database. The query is compared to every observation in the database, and each observation is given a weight according to similarity criteria. Only the subset of historical data that is most relevant to the query is selected, and a local linear model is fit to the selected points to estimate the query output. The model outputs the 3D deflection of the needle tip and the needle insertion force. The proposed approach is validated in ex vivo multilayered biological tissue in different needle insertion scenarios. Experimental results in five different case studies indicate an accuracy in predicting needle deflection of 0.81 and 1.24 mm in the horizontal and vertical planes, respectively, and an accuracy of 0.5 N in predicting the needle insertion force over 216 needle insertions.
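    The just-in-time idea above (weight historical observations by similarity to the query, then fit a local linear model to the most relevant subset) can be sketched in a few lines. The data, Gaussian kernel, and parameter values below are invented for illustration and are far simpler than the paper's needle-tissue setup:

    ```python
    import numpy as np

    def jit_predict(X_hist, y_hist, x_query, bandwidth=1.0, k=50):
        """Just-in-time estimate: weight historical observations by similarity
        to the query, keep the k most relevant, and fit a local weighted
        linear model to that subset."""
        d = np.linalg.norm(X_hist - x_query, axis=1)   # distance to query
        w = np.exp(-(d / bandwidth) ** 2)              # Gaussian similarity weight
        idx = np.argsort(w)[-k:]                       # most relevant subset
        Xs, ys, ws = X_hist[idx], y_hist[idx], w[idx]
        A = np.hstack([Xs, np.ones((len(idx), 1))])    # affine local model
        W = np.diag(np.sqrt(ws))                       # weighted least squares
        theta, *_ = np.linalg.lstsq(W @ A, W @ ys, rcond=None)
        return np.append(x_query, 1.0) @ theta

    # Toy "historical database": output varies nonlinearly with the inputs
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(200, 2))
    y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]
    pred = jit_predict(X, y, np.array([0.4, 0.2]), bandwidth=0.3)
    ```

    Although the global function is nonlinear, the locally weighted linear fit recovers it well near the query.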

  11. Gaussian Processes for Data-Efficient Learning in Robotics and Control.

    PubMed

    Deisenroth, Marc Peter; Fox, Dieter; Rasmussen, Carl Edward

    2015-02-01

    Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning reduces the amount of engineering knowledge otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.
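    The core ingredient described above is a probabilistic transition model whose predictive uncertainty can be inspected and propagated. A minimal Gaussian-process regression sketch illustrates the idea; the squared-exponential kernel, hyperparameters, and toy data are assumptions for illustration, not the authors' setup:

    ```python
    import numpy as np

    def rbf(a, b, ell=0.5, sf=1.0):
        """Squared-exponential kernel between two 1-D input arrays."""
        d2 = (a[:, None] - b[None, :]) ** 2
        return sf ** 2 * np.exp(-0.5 * d2 / ell ** 2)

    def gp_posterior(X, y, Xs, noise=0.1):
        """GP posterior mean and variance at test inputs Xs."""
        K = rbf(X, X) + noise ** 2 * np.eye(len(X))
        Ks, Kss = rbf(X, Xs), rbf(Xs, Xs)
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        mu = Ks.T @ alpha                       # predictive mean
        v = np.linalg.solve(L, Ks)
        var = np.diag(Kss - v.T @ v)            # predictive variance
        return mu, var

    X = np.linspace(0, 3, 20)
    y = np.sin(X)                               # toy "transition" observations
    mu, var = gp_posterior(X, y, np.array([1.5, 6.0]))
    # near the data the model is confident; far from it the variance grows,
    # which is exactly the uncertainty a planner can take into account
    ```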

  12. Quantifying allometric model uncertainty for plot-level live tree biomass stocks with a data-driven, hierarchical framework

    Treesearch

    Brian J. Clough; Matthew B. Russell; Grant M. Domke; Christopher W. Woodall

    2016-01-01

    Accurate uncertainty assessments of plot-level live tree biomass stocks are an important precursor to estimating uncertainty in annual national greenhouse gas inventories (NGHGIs) developed from forest inventory data. However, current approaches employed within the United States’ NGHGI do not specifically incorporate methods to address error in tree-scale biomass...

  13. Climate-driven vital rates do not always mean climate-driven population.

    PubMed

    Tavecchia, Giacomo; Tenan, Simone; Pradel, Roger; Igual, José-Manuel; Genovart, Meritxell; Oro, Daniel

    2016-12-01

    Current climatic changes have increased the need to forecast population responses to climate variability. A common approach to address this question is through models that project current population state using the functional relationship between demographic rates and climatic variables. We argue that this approach can lead to erroneous conclusions when interpopulation dispersal is not considered. We found that immigration can release the population from climate-driven trajectories even when local vital rates are climate dependent. We illustrated this using individual-based data on a trans-equatorial migratory seabird, the Scopoli's shearwater Calonectris diomedea, in which the variation of vital rates has been associated with large-scale climatic indices. We compared the population annual growth rate λi, estimated using local climate-driven parameters, with ρi, a population growth rate directly estimated from individual information that accounts for immigration. While λi varied as a function of climatic variables, reflecting the climate-dependent parameters, ρi did not, indicating that dispersal decouples the relationship between population growth and climate variables from that between climatic variables and vital rates. Our results suggest caution when assessing demographic effects of climatic variability, especially in open populations of very mobile organisms such as fish, marine mammals, bats, or birds. When a population model cannot be validated or is not detailed enough, ignoring immigration might lead to misleading climate-driven projections. © 2016 John Wiley & Sons Ltd.

  14. The eGo grid model: An open source approach towards a model of German high and extra-high voltage power grids

    NASA Astrophysics Data System (ADS)

    Mueller, Ulf Philipp; Wienholt, Lukas; Kleinhans, David; Cussmann, Ilka; Bunke, Wolf-Dieter; Pleßmann, Guido; Wendiggensen, Jochen

    2018-02-01

    There are several power grid modelling approaches suitable for simulations in the field of power grid planning. The restrictive policies of grid operators, regulators and research institutes concerning their original data and models lead to an increased interest in open source approaches to grid models based on open data. By including all voltage levels between 60 kV (high voltage) and 380 kV (extra-high voltage), we dissolve the common distinction between transmission and distribution grid in energy system models and utilize a single, integrated model instead. An open data set, primarily for Germany, which can be used for non-linear, linear and linear-optimal power flow methods, was developed. This data set consists of an electrically parameterised grid topology as well as allocated generation and demand characteristics for present and future scenarios at high spatial and temporal resolution. The usability of the grid model was demonstrated through exemplary power flow optimizations. Based on a marginal-cost-driven power plant dispatch subject to grid restrictions, congested power lines were identified. Continuous validation of the model is necessary in order to reliably model storage and grid expansion as research progresses.

  15. Set-membership fault detection under noisy environment with application to the detection of abnormal aircraft control surface positions

    NASA Astrophysics Data System (ADS)

    El Houda Thabet, Rihab; Combastel, Christophe; Raïssi, Tarek; Zolghadri, Ali

    2015-09-01

    The paper develops a set membership detection methodology which is applied to the detection of abnormal positions of aircraft control surfaces. Robust and early detection of such abnormal positions is an important issue for early system reconfiguration and overall optimisation of aircraft design. In order to improve fault sensitivity while ensuring a high level of robustness, the method combines a data-driven characterisation of noise and a model-driven approach based on interval prediction. The efficiency of the proposed methodology is illustrated through simulation results obtained based on data recorded in several flight scenarios of a highly representative aircraft benchmark.
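    The combination described above, an interval prediction whose bounds absorb the characterised noise, with an alarm raised when a measurement escapes the predicted set, can be sketched for a scalar toy system. The dynamics, noise bound, and injected fault below are invented and far simpler than the aircraft benchmark:

    ```python
    import numpy as np

    # Toy set-membership detector for x_{k+1} = a*x_k + u_k + w_k, |w_k| <= wbar.
    # [lo, hi] are guaranteed bounds on the fault-free state, so any measurement
    # outside them is inconsistent with the model and flagged as abnormal.
    a, wbar = 0.9, 0.05
    rng = np.random.default_rng(5)
    x, lo, hi = 0.0, 0.0, 0.0
    alarms = []
    for k in range(100):
        u = np.sin(0.1 * k)
        lo, hi = a * lo + u - wbar, a * hi + u + wbar   # interval prediction
        x = a * x + u + rng.uniform(-wbar, wbar)        # true (noisy) state
        if k == 60:
            x += 1.0                                    # injected abnormal jump
        alarm = not (lo <= x <= hi)
        alarms.append(alarm)
        if alarm:                                       # re-initialise after detection
            lo = hi = x
    # the jump at k = 60 is detected, with no false alarms elsewhere
    ```

    Because the bounds are propagated from the worst-case noise, false alarms are excluded by construction; sensitivity is then set by how tightly the noise is characterised from data.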

  16. Big data to smart data in Alzheimer's disease: The brain health modeling initiative to foster actionable knowledge.

    PubMed

    Geerts, Hugo; Dacks, Penny A; Devanarayan, Viswanath; Haas, Magali; Khachaturian, Zaven S; Gordon, Mark Forrest; Maudsley, Stuart; Romero, Klaus; Stephenson, Diane

    2016-09-01

    Massive investment and technological advances in the collection of extensive and longitudinal information on thousands of Alzheimer patients results in large amounts of data. These "big-data" databases can potentially advance CNS research and drug development. However, although necessary, they are not sufficient, and we posit that they must be matched with analytical methods that go beyond retrospective data-driven associations with various clinical phenotypes. Although these empirically derived associations can generate novel and useful hypotheses, they need to be organically integrated in a quantitative understanding of the pathology that can be actionable for drug discovery and development. We argue that mechanism-based modeling and simulation approaches, where existing domain knowledge is formally integrated using complexity science and quantitative systems pharmacology can be combined with data-driven analytics to generate predictive actionable knowledge for drug discovery programs, target validation, and optimization of clinical development. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  17. Switching Kalman filter for failure prognostic

    NASA Astrophysics Data System (ADS)

    Lim, Chi Keong Reuben; Mba, David

    2015-02-01

    The use of condition monitoring (CM) data to predict remaining useful life has been growing with the increasing use of health and usage monitoring systems on aircraft. Many data-driven methodologies are available for such prediction; popular ones include artificial intelligence and statistics-based approaches. The drawback of such approaches is that they require a lot of failure data for training, which can be scarce in practice. In lieu of this, methods using state-space and regression-based models that extract information from the data history itself have been explored. However, such methods have their own limitations, as they utilize a single time-invariant model which does not represent a changing degradation path well. This causes most degradation modeling studies to focus only on segments of their CM data that behave close to the assumed model. In this paper, a state-space-based method, the Switching Kalman Filter (SKF), is adopted for model estimation and life prediction. The SKF approach, however, uses multiple models, from which the most probable model is inferred from the CM data using Bayesian estimation before it is applied for prediction. At the same time, the inference of the degradation model itself can provide maintainers with more information for their planning. This SKF approach is demonstrated with a case study on gearbox bearings from Republic of Singapore Air Force AH64D helicopters that were found defective. The use of in-service CM data allows the approach to be applied in a practical scenario, and results showed that the developed SKF approach is a promising tool to support maintenance decision-making.
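    The multiple-model idea can be sketched with a simplified bank of Kalman filters whose model probabilities are updated from innovation likelihoods (the full SKF additionally mixes state estimates across models at every step). The two candidate drift models and noise levels below are invented:

    ```python
    import numpy as np

    def model_bank(z, drifts, q=1e-4, r=0.05):
        """Run one 1-D Kalman filter per candidate drift model and update
        Bayesian model probabilities from each filter's innovation likelihood."""
        n = len(drifts)
        x, P, prob = np.zeros(n), np.ones(n), np.full(n, 1.0 / n)
        for zk in z:
            for i, d in enumerate(drifts):
                xp, Pp = x[i] + d, P[i] + q                  # predict
                S = Pp + r                                   # innovation variance
                nu = zk - xp                                 # innovation
                lik = np.exp(-0.5 * nu**2 / S) / np.sqrt(2 * np.pi * S)
                K = Pp / S
                x[i], P[i] = xp + K * nu, (1 - K) * Pp       # update
                prob[i] *= lik
            prob /= prob.sum()                               # Bayesian model posterior
        return x, prob

    # Toy degradation signal with a fast drift; candidates are slow vs fast
    rng = np.random.default_rng(1)
    z = np.cumsum(np.full(50, 0.3)) + rng.normal(0, 0.2, 50)
    x, prob = model_bank(z, drifts=[0.05, 0.3])
    # the posterior concentrates on the fast-drift model
    ```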

  18. A data-driven prediction method for fast-slow systems

    NASA Astrophysics Data System (ADS)

    Groth, Andreas; Chekroun, Mickael; Kondrashov, Dmitri; Ghil, Michael

    2016-04-01

    In this work, we present a prediction method for processes that exhibit a mixture of variability on slow and fast time scales. The method relies on combining empirical model reduction (EMR) with singular spectrum analysis (SSA). EMR is a data-driven methodology for constructing stochastic low-dimensional models that account for nonlinearity and serial correlation in the estimated noise, while SSA provides a decomposition of the complex dynamics into low-order components that capture spatio-temporal behavior on different time scales. Our study focuses on the data-driven modeling of partial observations from dynamical systems that exhibit power spectra with broad peaks. The main result in this talk is that the combination of SSA pre-filtering with EMR modeling improves, under certain circumstances, the modeling and prediction skill of such a system, as compared to a standard EMR prediction based on raw data. Specifically, it is the separation into "fast" and "slow" temporal scales by the SSA pre-filtering that achieves the improvement. We show, in particular, that the resulting EMR-SSA emulators help predict intermittent behavior such as rapid transitions between specific regions of the system's phase space. This capability of the EMR-SSA prediction will be demonstrated on two low-dimensional models: the Rössler system and a Lotka-Volterra model for interspecies competition. In either case, the chaotic dynamics is produced through a Shilnikov-type mechanism, and we argue that the latter seems to be an important ingredient for the good prediction skills of EMR-SSA emulators. Shilnikov-type behavior has been shown to arise in various complex geophysical fluid models, such as baroclinic quasi-geostrophic flows in the mid-latitude atmosphere and wind-driven double-gyre ocean circulation models. This pervasiveness of the Shilnikov mechanism of fast-slow transition opens interesting perspectives for the extension of the proposed EMR-SSA approach to more realistic situations.
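    The SSA pre-filtering step described above (embed the series into a trajectory matrix, take an SVD, and reconstruct one additive component per singular vector by diagonal averaging) can be sketched as follows; the toy two-scale signal and window length are assumptions for illustration:

    ```python
    import numpy as np

    def ssa(x, L):
        """Singular spectrum analysis: embed, SVD, and reconstruct one
        additive component per singular vector via diagonal averaging."""
        N, K = len(x), len(x) - L + 1
        X = np.column_stack([x[i:i + L] for i in range(K)])   # trajectory matrix (L x K)
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        comps = []
        for i in range(len(s)):
            Xi = s[i] * np.outer(U[:, i], Vt[i])              # rank-1 piece
            # average anti-diagonals to map the matrix back to a series
            comp = np.array([Xi[::-1].diagonal(k).mean() for k in range(-L + 1, K)])
            comps.append(comp)
        return np.array(comps)

    t = np.arange(400)
    slow = np.sin(2 * np.pi * t / 100)          # slow oscillation
    fast = 0.3 * np.sin(2 * np.pi * t / 7)      # fast oscillation
    comps = ssa(slow + fast, L=50)
    slow_hat = comps[0] + comps[1]              # leading pair captures the slow mode
    ```

    The components sum back to the original series exactly, so the decomposition loses nothing; separating the leading pair is what isolates the slow scale before any subsequent modelling.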

  19. Integrated Workforce Planning Model: A Proof of Concept

    NASA Technical Reports Server (NTRS)

    Guruvadoo, Eranna K.

    2001-01-01

    Recently, the Workforce and Diversity Management Office at KSC has launched a major initiative to develop and implement a competency/skill approach to Human Resource management. As the competency/skill dictionary is being elaborated, the need for a competency-based workforce-planning model is recognized. A proof of concept for such a model is presented using a multidimensional data model that can provide the data infrastructure necessary to drive intelligent decision support systems for workforce planning. The components of the competency-driven workforce-planning model are explained. The data model is presented, together with several schemes that would support the workforce-planning model. Some directions and recommendations for future work are given.

  20. Data-Driven Model Uncertainty Estimation in Hydrologic Data Assimilation

    NASA Astrophysics Data System (ADS)

    Pathiraja, S.; Moradkhani, H.; Marshall, L.; Sharma, A.; Geenens, G.

    2018-02-01

    The increasing availability of earth observations necessitates mathematical methods to optimally combine such data with hydrologic models. Several algorithms exist for such purposes, under the umbrella of data assimilation (DA). However, DA methods are often applied in a suboptimal fashion for complex real-world problems, due largely to several practical implementation issues. One such issue is error characterization, which is known to be critical for a successful assimilation. Mischaracterized errors lead to suboptimal forecasts, and in the worst case, to degraded estimates even compared to the no assimilation case. Model uncertainty characterization has received little attention relative to other aspects of DA science. Traditional methods rely on subjective, ad hoc tuning factors or parametric distribution assumptions that may not always be applicable. We propose a novel data-driven approach (named SDMU) to model uncertainty characterization for DA studies where (1) the system states are partially observed and (2) minimal prior knowledge of the model error processes is available, except that the errors display state dependence. It includes an approach for estimating the uncertainty in hidden model states, with the end goal of improving predictions of observed variables. The SDMU is therefore suited to DA studies where the observed variables are of primary interest. Its efficacy is demonstrated through a synthetic case study with low-dimensional chaotic dynamics and a real hydrologic experiment for one-day-ahead streamflow forecasting. In both experiments, the proposed method leads to substantial improvements in the hidden states and observed system outputs over a standard method involving perturbation with Gaussian noise.

  1. A new concept for simulation of vegetated land surface dynamics - Part 1: The event driven phenology model

    NASA Astrophysics Data System (ADS)

    Kovalskyy, V.; Henebry, G. M.

    2012-01-01

    Phenologies of the vegetated land surface are being used increasingly for diagnosis and prognosis of climate change consequences. Current prospective and retrospective phenological models stand far apart in their approaches to the subject. We report on an exploratory attempt to implement a phenological model based on a new event driven concept which has both diagnostic and prognostic capabilities in the same modeling framework. This Event Driven Phenological Model (EDPM) is shown to simulate land surface phenologies and phenophase transition dates in agricultural landscapes based on assimilation of weather data and land surface observations from spaceborne sensors. The model enables growing season phenologies to develop in response to changing environmental conditions and disturbance events. It also has the ability to ingest remotely sensed data to adjust its output to improve representation of the modeled variable. We describe the model and report results of initial testing of the EDPM using Level 2 flux tower records from the Ameriflux sites at Mead, Nebraska, USA, and at Bondville, Illinois, USA. Simulating the dynamics of normalized difference vegetation index based on flux tower data, the predictions by the EDPM show good agreement (RMSE < 0.08; r2 > 0.8) for maize and soybean during several growing seasons at different locations. This study presents the EDPM used in the companion paper (Kovalskyy and Henebry, 2011) in a coupling scheme to estimate daily actual evapotranspiration over multiple growing seasons.

  3. Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma.

    PubMed

    Caldas, José; Gehlenborg, Nils; Kettunen, Eeva; Faisal, Ali; Rönty, Mikko; Nicholson, Andrew G; Knuutila, Sakari; Brazma, Alvis; Kaski, Samuel

    2012-01-15

    Genome-wide measurement of transcript levels is a ubiquitous tool in biomedical research. As experimental data continue to be deposited in public databases, it is becoming important to develop search engines that enable the retrieval of relevant studies given a query study. While retrieval systems based on meta-data already exist, data-driven approaches that retrieve studies based on similarities in the expression data itself have greater potential to uncover novel biological insights. We propose an information retrieval method based on differential expression. Our method deals with arbitrary experimental designs and performs competitively with alternative approaches, while making the search results interpretable in terms of differential expression patterns. We show that our model yields meaningful connections between biological conditions from different studies. Finally, we validate a previously unknown connection between malignant pleural mesothelioma and SIM2s suggested by our method, via real-time polymerase chain reaction in an independent set of mesothelioma samples. Supplementary data and source code are available from http://www.ebi.ac.uk/fg/research/rex.

  4. Defining datasets and creating data dictionaries for quality improvement and research in chronic disease using routinely collected data: an ontology-driven approach.

    PubMed

    de Lusignan, Simon; Liaw, Siaw-Teng; Michalakidis, Georgios; Jones, Simon

    2011-01-01

    The burden of chronic disease is increasing, and research and quality improvement will be less effective if case finding strategies are suboptimal. To describe an ontology-driven approach to case finding in chronic disease and how this approach can be used to create a data dictionary and make the codes used in case finding transparent. A five-step process: (1) identifying a reference coding system or terminology; (2) using an ontology-driven approach to identify cases; (3) developing metadata that can be used to identify the extracted data; (4) mapping the extracted data to the reference terminology; and (5) creating the data dictionary. Hypertension is presented as an exemplar. A patient with hypertension can be represented by a range of codes including diagnostic, history and administrative. Metadata can link the coding system and data extraction queries to the correct data mapping and translation tool, which then maps it to the equivalent code in the reference terminology. The code extracted, the term, its domain and subdomain, and the name of the data extraction query can then be automatically grouped and published online as a readily searchable data dictionary. An exemplar is online at www.clininf.eu/qickd-data-dictionary.html. Adopting an ontology-driven approach to case finding could improve the quality of disease registers and of research based on routine data. It would offer considerable advantages over using limited datasets to define cases. This approach should be considered by those involved in research and quality improvement projects which utilise routine data.
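    Steps 4 and 5 of the process above can be sketched as a small lookup that maps local codes to a reference terminology and emits data-dictionary rows. All codes, names, and mappings below are illustrative stand-ins, not taken from the paper:

    ```python
    # Hypothetical reference terminology and local-to-reference mappings,
    # for illustration only (real systems would use curated mapping tables).
    reference = {"38341003": "Hypertensive disorder (SNOMED CT)"}
    local_to_ref = {"Read:G20..": "38341003", "ICD10:I10": "38341003"}

    def dictionary_entry(local_code, query_name, domain="Diagnosis"):
        ref = local_to_ref[local_code]          # step 4: map to reference terminology
        return {                                # step 5: one data-dictionary row
            "code": local_code,
            "term": reference[ref],
            "domain": domain,
            "query": query_name,
        }

    entry = dictionary_entry("ICD10:I10", "hypertension_case_finding")
    ```

    Collecting such rows across every extraction query yields the searchable dictionary, making the codes behind case finding transparent.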

  5. High Performance Visualization using Query-Driven Visualization and Analytics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bethel, E. Wes; Campbell, Scott; Dart, Eli

    2006-06-15

    Query-driven visualization and analytics is a unique approach for high-performance visualization that offers new capabilities for knowledge discovery and hypothesis testing. The new capabilities, akin to finding needles in haystacks, are the result of combining technologies from the fields of scientific visualization and scientific data management. This approach is crucial for rapid data analysis and visualization in the petascale regime. This article describes how query-driven visualization is applied to a hero-sized network traffic analysis problem.

  6. On the data-driven inference of modulatory networks in climate science: an application to West African rainfall

    NASA Astrophysics Data System (ADS)

    González, D. L., II; Angus, M. P.; Tetteh, I. K.; Bello, G. A.; Padmanabhan, K.; Pendse, S. V.; Srinivas, S.; Yu, J.; Semazzi, F.; Kumar, V.; Samatova, N. F.

    2014-04-01

    Decades of hypothesis-driven and/or first-principles research have been applied towards the discovery and explanation of the mechanisms that drive climate phenomena, such as western African Sahel summer rainfall variability. Although connections between various climate factors have been theorized, not all of the key relationships are fully understood. We propose a data-driven approach to identify candidate players in this climate system, which can help explain underlying mechanisms and/or even suggest new relationships, to facilitate building a more comprehensive and predictive model of the modulatory relationships influencing a climate phenomenon of interest. We applied coupled heterogeneous association rule mining (CHARM), Lasso multivariate regression, and Dynamic Bayesian networks to find relationships within a complex system, and explored means with which to obtain a consensus result from the application of such varied methodologies. Using this fusion of approaches, we identified relationships among climate factors that modulate Sahel rainfall, including well-known associations from prior climate knowledge, as well as promising discoveries that invite further research by the climate science community.

  7. The dynamics of information-driven coordination phenomena: A transfer entropy analysis

    PubMed Central

    Borge-Holthoefer, Javier; Perra, Nicola; Gonçalves, Bruno; González-Bailón, Sandra; Arenas, Alex; Moreno, Yamir; Vespignani, Alessandro

    2016-01-01

    Data from social media provide unprecedented opportunities to investigate the processes that govern the dynamics of collective social phenomena. We consider an information theoretical approach to define and measure the temporal and structural signatures typical of collective social events as they arise and gain prominence. We use the symbolic transfer entropy analysis of microblogging time series to extract directed networks of influence among geolocalized subunits in social systems. This methodology captures the emergence of system-level dynamics close to the onset of socially relevant collective phenomena. The framework is validated against a detailed empirical analysis of five case studies. In particular, we identify a change in the characteristic time scale of the information transfer that flags the onset of information-driven collective phenomena. Furthermore, our approach identifies an order-disorder transition in the directed network of influence between social subunits. In the absence of clear exogenous driving, social collective phenomena can be represented as endogenously driven structural transitions of the information transfer network. This study provides results that can help define models and predictive algorithms for the analysis of societal events based on open source data. PMID:27051875
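    The symbolic transfer entropy at the heart of this analysis can be sketched in a few lines: each series is mapped to ordinal patterns, and the transfer entropy is a plug-in estimate over pattern triples. The driven pair below is an invented stand-in for microblogging time series:

    ```python
    import numpy as np
    from collections import Counter
    from itertools import permutations

    def symbolize(x, m=3):
        """Map each length-m window to the index of its ordinal pattern."""
        pats = {p: i for i, p in enumerate(permutations(range(m)))}
        return np.array([pats[tuple(np.argsort(x[i:i + m]))]
                         for i in range(len(x) - m + 1)])

    def transfer_entropy(src, dst, m=3):
        """Plug-in symbolic transfer entropy src -> dst, in bits."""
        s, d = symbolize(src, m), symbolize(dst, m)
        trip = Counter(zip(d[1:], d[:-1], s[:-1]))    # (d_next, d_now, s_now)
        pair = Counter(zip(d[1:], d[:-1]))
        marg = Counter(zip(d[:-1], s[:-1]))
        single = Counter(d[:-1])
        n = len(d) - 1
        return sum((c / n) * np.log2(c * single[dp] / (pair[(dn, dp)] * marg[(dp, sp)]))
                   for (dn, dp, sp), c in trip.items())

    rng = np.random.default_rng(2)
    x = rng.normal(size=2000)
    y = np.roll(x, 1) + 0.1 * rng.normal(size=2000)   # y is driven by x with a lag
    te_xy = transfer_entropy(x, y)
    te_yx = transfer_entropy(y, x)
    # information flows predominantly from the driver x to the driven y
    ```

    The asymmetry te_xy >> te_yx is what identifies a directed edge of influence; repeating this over all pairs of geolocalized subunits yields the directed influence network.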

  9. Fault Diagnosis in HVAC Chillers

    NASA Technical Reports Server (NTRS)

    Choi, Kihoon; Namuru, Setu M.; Azam, Mohammad S.; Luo, Jianhui; Pattipati, Krishna R.; Patterson-Hine, Ann

    2005-01-01

    Modern buildings are being equipped with increasingly sophisticated power and control systems with substantial capabilities for monitoring and controlling the amenities. Operational problems associated with heating, ventilation, and air-conditioning (HVAC) systems plague many commercial buildings, often the result of degraded equipment, failed sensors, improper installation, poor maintenance, and improperly implemented controls. Most existing HVAC fault-diagnostic schemes are based on analytical models and knowledge bases. These schemes are adequate for generic systems. However, real-world systems significantly differ from the generic ones and necessitate modifications of the models and/or customization of the standard knowledge bases, which can be labor intensive. Data-driven techniques for fault detection and isolation (FDI) have a close relationship with pattern recognition, wherein one seeks to categorize the input-output data into normal or faulty classes. Owing to their simplicity and adaptability, customization of data-driven FDI approaches does not require in-depth knowledge of the HVAC system. It enables building system operators to improve energy efficiency and maintain the desired comfort level at a reduced cost. In this article, we consider a data-driven approach for FDI of chillers in HVAC systems. To diagnose the faults of interest in the chiller, we employ multiway dynamic principal component analysis (MPCA), multiway partial least squares (MPLS), and support vector machines (SVMs). The simulation of a chiller under various fault conditions is conducted using a standard chiller simulator from the American Society of Heating, Refrigerating, and Air-conditioning Engineers (ASHRAE). We validated our FDI scheme using experimental data obtained from different types of chiller faults.
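    One family of techniques named above, PCA-based monitoring, can be sketched with a squared-prediction-error (SPE, or Q-statistic) monitor: PCA is fit on normal operating data, and a fault is flagged when a sample's residual off the retained subspace is large. The plant, noise level, and sensor-bias fault below are invented, and the paper's MPCA/MPLS/SVM pipeline is far richer:

    ```python
    import numpy as np

    def fit_pca_monitor(X_normal, n_comp=2):
        """Fit PCA on normal data; return a scorer for the squared
        prediction error (SPE) of new samples off the retained subspace."""
        mu = X_normal.mean(axis=0)
        _, _, Vt = np.linalg.svd(X_normal - mu, full_matrices=False)
        P = Vt[:n_comp].T                                  # retained loadings
        def spe(x):
            r = (x - mu) - P @ (P.T @ (x - mu))            # residual off the model
            return float(r @ r)
        return spe

    # Toy plant: 3 correlated sensors driven by 2 latent factors + small noise
    rng = np.random.default_rng(3)
    t = rng.normal(size=(500, 2))
    X = t @ np.array([[1.0, 0.5, 0.2], [0.0, 1.0, 0.7]]) + 0.05 * rng.normal(size=(500, 3))
    spe = fit_pca_monitor(X)

    normal_sample = X[0]
    faulty_sample = X[0] + np.array([0.0, 0.0, 2.0])       # sensor-bias fault
    # the fault violates the learned correlation structure, so its SPE is large
    ```

    In practice a control limit on the SPE (and on Hotelling's T²) is estimated from the normal data to set the alarm threshold.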

  10. Modeling the sustainable development of innovation in transport construction based on the communication approach

    NASA Astrophysics Data System (ADS)

    Revunova, Svetlana; Vlasenko, Vyacheslav; Bukreev, Anatoly

    2017-10-01

    The article proposes models of innovative activity development driven by the formation of “points of innovation-driven growth”. The models are based on the analysis of the current state and dynamics of innovative development of construction enterprises in the transport sector and take into account a number of essential organizational and economic changes in management. The authors substantiate implementing such development models as an organizational innovation that has a communication genesis. The use of the communication approach to the formation of “points of innovation-driven growth” allowed the authors to apply the mathematical tools of graph theory in order to activate the innovative activity of the transport industry in the region. As a result, the authors have proposed models that allow constructing an optimal mechanism for the formation of “points of innovation-driven growth”.

  11. Systematic Analysis of Challenge-Driven Improvements in Molecular Prognostic Models for Breast Cancer

    PubMed Central

    Margolin, Adam A.; Bilal, Erhan; Huang, Erich; Norman, Thea C.; Ottestad, Lars; Mecham, Brigham H.; Sauerwine, Ben; Kellen, Michael R.; Mangravite, Lara M.; Furia, Matthew D.; Vollan, Hans Kristian Moen; Rueda, Oscar M.; Guinney, Justin; Deflaux, Nicole A.; Hoff, Bruce; Schildwachter, Xavier; Russnes, Hege G.; Park, Daehoon; Vang, Veronica O.; Pirtle, Tyler; Youseff, Lamia; Citro, Craig; Curtis, Christina; Kristensen, Vessela N.; Hellerstein, Joseph; Friend, Stephen H.; Stolovitzky, Gustavo; Aparicio, Samuel; Caldas, Carlos; Børresen-Dale, Anne-Lise

    2013-01-01

    Although molecular prognostics in breast cancer are among the most successful examples of translating genomic analysis to clinical applications, optimal approaches to breast cancer clinical risk prediction remain controversial. The Sage Bionetworks–DREAM Breast Cancer Prognosis Challenge (BCC) is a crowdsourced research study for breast cancer prognostic modeling using genome-scale data. The BCC provided a community of data analysts with a common platform for data access and blinded evaluation of model accuracy in predicting breast cancer survival on the basis of gene expression data, copy number data, and clinical covariates. This approach offered the opportunity to assess whether a crowdsourced community Challenge would generate models of breast cancer prognosis commensurate with or exceeding current best-in-class approaches. The BCC comprised multiple rounds of blinded evaluations on held-out portions of data on 1981 patients, resulting in more than 1400 models submitted as open source code. Participants then retrained their models on the full data set of 1981 samples and submitted up to five models for validation in a newly generated data set of 184 breast cancer patients. Analysis of the BCC results suggests that the best-performing modeling strategy outperformed previously reported methods in blinded evaluations; model performance was consistent across several independent evaluations; and aggregating community-developed models achieved performance on par with the best-performing individual models. PMID:23596205

  12. Features of Cross-Correlation Analysis in a Data-Driven Approach for Structural Damage Assessment

    PubMed Central

    Camacho Navarro, Jhonatan; Ruiz, Magda; Villamizar, Rodolfo; Mujica, Luis

    2018-01-01

    This work discusses the advantage of using cross-correlation analysis in a data-driven approach based on principal component analysis (PCA) and piezodiagnostics to obtain successful diagnosis of events in structural health monitoring (SHM). In this sense, the identification of noisy data and outliers, as well as the management of data cleansing stages can be facilitated through the implementation of a preprocessing stage based on cross-correlation functions. Additionally, this work evidences an improvement in damage detection when the cross-correlation is included as part of the whole damage assessment approach. The proposed methodology is validated by processing data measurements from piezoelectric devices (PZT), which are used in a piezodiagnostics approach based on PCA and baseline modeling. Thus, the influence of cross-correlation analysis used in the preprocessing stage is evaluated for damage detection by means of statistical plots and self-organizing maps. Three laboratory specimens were used as test structures in order to demonstrate the validity of the methodology: (i) a carbon steel pipe section with leak and mass damage types, (ii) an aircraft wing specimen, and (iii) a blade of a commercial aircraft turbine, where damages are specified as mass-added. As the main concluding remark, the suitability of cross-correlation features combined with a PCA-based piezodiagnostic approach in order to achieve a more robust damage assessment algorithm is verified for SHM tasks. PMID:29762505
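    The preprocessing idea, comparing each new measurement against a baseline record via normalized cross-correlation, can be sketched as follows. The signal shapes, noise levels, and the two-feature summary (peak similarity and its lag) are invented stand-ins for the PZT records, not the authors' actual pipeline.

```python
import numpy as np

def xcorr_features(signal, baseline):
    """Normalized cross-correlation of a measurement against a baseline record."""
    s = (signal - signal.mean()) / (signal.std() * len(signal))
    b = (baseline - baseline.mean()) / baseline.std()
    c = np.correlate(s, b, mode="full")
    lag = int(np.argmax(c)) - (len(signal) - 1)
    return float(c.max()), lag                     # peak similarity and its lag

t = np.linspace(0, 1, 1000)
baseline = np.sin(2 * np.pi * 50 * t)
# A "healthy" record is a slightly delayed, noisy copy of the baseline;
# a "damaged" one has its structure partially destroyed.
healthy = np.roll(baseline, 3) + 0.05 * np.random.default_rng(1).normal(size=1000)
damaged = 0.4 * baseline + 0.6 * np.random.default_rng(2).normal(size=1000)

peak_h, lag_h = xcorr_features(healthy, baseline)
peak_d, _ = xcorr_features(damaged, baseline)
print(peak_h > peak_d)    # healthy record correlates more strongly with baseline
```

Features like these would then feed the PCA baseline model and self-organizing maps described in the abstract.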

  13. Features of Cross-Correlation Analysis in a Data-Driven Approach for Structural Damage Assessment.

    PubMed

    Camacho Navarro, Jhonatan; Ruiz, Magda; Villamizar, Rodolfo; Mujica, Luis; Quiroga, Jabid

    2018-05-15

    This work discusses the advantage of using cross-correlation analysis in a data-driven approach based on principal component analysis (PCA) and piezodiagnostics to obtain successful diagnosis of events in structural health monitoring (SHM). In this sense, the identification of noisy data and outliers, as well as the management of data cleansing stages can be facilitated through the implementation of a preprocessing stage based on cross-correlation functions. Additionally, this work evidences an improvement in damage detection when the cross-correlation is included as part of the whole damage assessment approach. The proposed methodology is validated by processing data measurements from piezoelectric devices (PZT), which are used in a piezodiagnostics approach based on PCA and baseline modeling. Thus, the influence of cross-correlation analysis used in the preprocessing stage is evaluated for damage detection by means of statistical plots and self-organizing maps. Three laboratory specimens were used as test structures in order to demonstrate the validity of the methodology: (i) a carbon steel pipe section with leak and mass damage types, (ii) an aircraft wing specimen, and (iii) a blade of a commercial aircraft turbine, where damages are specified as mass-added. As the main concluding remark, the suitability of cross-correlation features combined with a PCA-based piezodiagnostic approach in order to achieve a more robust damage assessment algorithm is verified for SHM tasks.

  14. A manifold learning approach to data-driven computational materials and processes

    NASA Astrophysics Data System (ADS)

    Ibañez, Ruben; Abisset-Chavanne, Emmanuelle; Aguado, Jose Vicente; Gonzalez, David; Cueto, Elias; Duval, Jean Louis; Chinesta, Francisco

    2017-10-01

    Standard simulation in classical mechanics is based on the use of two very different types of equations. The first one, of axiomatic character, is related to balance laws (momentum, mass, energy, …), whereas the second one consists of models that scientists have extracted from collected, natural or synthetic data. In this work we propose a new method, able to directly link data to computers in order to perform numerical simulations. These simulations will employ universal laws while minimizing the need for explicit, often phenomenological, models. They are based on manifold learning methodologies.

  15. A data-driven modeling approach to stochastic computation for low-energy biomedical devices.

    PubMed

    Lee, Kyong Ho; Jang, Kuk Jin; Shoeb, Ali; Verma, Naveen

    2011-01-01

    Low-power devices that can detect clinically relevant correlations in physiologically-complex patient signals can enable systems capable of closed-loop response (e.g., controlled actuation of therapeutic stimulators, continuous recording of disease states, etc.). In ultra-low-power platforms, however, hardware error sources are becoming increasingly limiting. In this paper, we present how data-driven methods, which allow us to accurately model physiological signals, also allow us to effectively model and overcome prominent hardware error sources with nearly no additional overhead. Two applications, EEG-based seizure detection and ECG-based arrhythmia-beat classification, are synthesized to a logic-gate implementation, and two prominent error sources are introduced: (1) SRAM bit-cell errors and (2) logic-gate switching errors ('stuck-at' faults). Using patient data from the CHB-MIT and MIT-BIH databases, performance similar to error-free hardware is achieved even for very high fault rates (up to 0.5 for SRAMs and 7 × 10⁻² for logic) that cause computational bit error rates as high as 50%.
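    A back-of-the-envelope sketch of why low-order SRAM bit-cell errors can be tolerable in such classifiers: flipping the least-significant bit of every stored 8-bit weight perturbs a binary-feature dot product by at most the number of active features, so a decision with a healthy margin survives. The weight and feature values below are invented; this is not the paper's hardware model.

```python
import numpy as np

def flip_bit(word, bit):
    """Model an SRAM bit-cell error: toggle one bit of a stored 8-bit weight."""
    return word ^ (1 << bit)

rng = np.random.default_rng(0)
w = rng.integers(0, 256, size=64)                # stored classifier weights (8-bit)
x = rng.integers(0, 2, size=64)                  # binary feature vector

clean_score = int(w @ x)
w_faulty = np.array([flip_bit(int(v), 0) for v in w])   # LSB error in every cell
faulty_score = int(w_faulty @ x)

# Each LSB flip changes a weight by +/-1 and contributes only where x = 1,
# so the score moves by at most the number of active features.
print(abs(faulty_score - clean_score) <= int(x.sum()))
```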

  16. Reconstruction of fire regimes through integrated paleoecological proxy data and ecological modeling.

    PubMed

    Iglesias, Virginia; Yospin, Gabriel I; Whitlock, Cathy

    2014-01-01

    Fire is a key ecological process affecting vegetation dynamics and land cover. The characteristic frequency, size, and intensity of fire are driven by interactions between top-down climate-driven and bottom-up fuel-related processes. Disentangling climatic from non-climatic drivers of past fire regimes is a grand challenge in Earth systems science, and a topic where both paleoecology and ecological modeling have made substantial contributions. In this manuscript, we (1) review the use of sedimentary charcoal as a fire proxy and the methods used in charcoal-based fire history reconstructions; (2) identify existing techniques for paleoecological modeling; and (3) evaluate opportunities for coupling of paleoecological and ecological modeling approaches to better understand the causes and consequences of past, present, and future fire activity.

  17. Reconstruction of fire regimes through integrated paleoecological proxy data and ecological modeling

    PubMed Central

    Iglesias, Virginia; Yospin, Gabriel I.; Whitlock, Cathy

    2015-01-01

    Fire is a key ecological process affecting vegetation dynamics and land cover. The characteristic frequency, size, and intensity of fire are driven by interactions between top-down climate-driven and bottom-up fuel-related processes. Disentangling climatic from non-climatic drivers of past fire regimes is a grand challenge in Earth systems science, and a topic where both paleoecology and ecological modeling have made substantial contributions. In this manuscript, we (1) review the use of sedimentary charcoal as a fire proxy and the methods used in charcoal-based fire history reconstructions; (2) identify existing techniques for paleoecological modeling; and (3) evaluate opportunities for coupling of paleoecological and ecological modeling approaches to better understand the causes and consequences of past, present, and future fire activity. PMID:25657652

  18. Global retrieval of soil moisture and vegetation properties using data-driven methods

    NASA Astrophysics Data System (ADS)

    Rodriguez-Fernandez, Nemesio; Richaume, Philippe; Kerr, Yann

    2017-04-01

    Data-driven methods such as neural networks (NNs) are a powerful tool to retrieve soil moisture from multi-wavelength remote sensing observations at global scale. In this presentation we will review a number of recent results regarding the retrieval of soil moisture with the Soil Moisture and Ocean Salinity (SMOS) satellite, either using SMOS brightness temperatures as input data for the retrieval or using SMOS soil moisture retrievals as the reference dataset for the training. The presentation will discuss several possibilities for both the input datasets and the datasets to be used as reference for the supervised learning phase. Regarding the input datasets, it will be shown that NNs take advantage of the synergy of SMOS data and data from other sensors such as the Advanced Scatterometer (ASCAT, active microwaves) and MODIS (visible and infrared). NNs have also been successfully used to construct long time series of soil moisture from the Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E) and SMOS. A NN with input data from AMSR-E observations and SMOS soil moisture as reference for the training was used to construct a dataset sharing a similar climatology and without a significant bias with respect to SMOS soil moisture. Regarding the reference data to train the data-driven retrievals, we will show different possibilities depending on the application. Using actual in situ measurements is challenging at global scale due to the scarce distribution of sensors. In contrast, in situ measurements have been successfully used to retrieve SM at continental scale in North America, where the density of in situ measurement stations is high. Using global land surface models to train the NN constitutes an interesting alternative to implement new remote sensing surface datasets. In addition, these datasets can be used to perform data assimilation into the model used as reference for the training. This approach has recently been tested at the European Centre for Medium-Range Weather Forecasts (ECMWF). Finally, retrievals using radiative transfer models can also be used as a reference SM dataset for the training phase. This approach was used to retrieve soil moisture from AMSR-E, as mentioned above, and also to implement the official European Space Agency (ESA) SMOS soil moisture product in Near-Real-Time. We will finish with a discussion of the retrieval of vegetation parameters from SMOS observations using data-driven methods.
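    The supervised setup, training a retrieval against a reference dataset and evaluating on held-out samples, can be sketched with a linear model standing in for the neural network. All predictor values, the noise level, and the three-channel "multi-sensor" input are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic multi-sensor inputs: e.g. two SMOS-like brightness temperatures and
# an ASCAT-like backscatter value per sample (all values invented).
X = rng.uniform(size=(2000, 3))
true_sm = 0.05 + 0.4 * X[:, 0] - 0.2 * X[:, 1] + 0.1 * X[:, 2]
reference = true_sm + 0.01 * rng.normal(size=2000)   # model SM used as training target

# Supervised "retrieval": fit on the reference dataset, evaluate on held-out samples.
A = np.c_[X[:1500], np.ones(1500)]
coef, *_ = np.linalg.lstsq(A, reference[:1500], rcond=None)
pred = np.c_[X[1500:], np.ones(500)] @ coef
rmse = float(np.sqrt(np.mean((pred - true_sm[1500:]) ** 2)))
print(rmse < 0.02)   # retrieval error close to the reference noise floor
```

The same train-on-reference, evaluate-held-out structure applies whether the reference comes from in situ stations, land surface models, or radiative transfer retrievals.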

  19. A Bayesian approach to model structural error and input variability in groundwater modeling

    NASA Astrophysics Data System (ADS)

    Xu, T.; Valocchi, A. J.; Lin, Y. F. F.; Liang, F.

    2015-12-01

    Effective water resource management typically relies on numerical models to analyze groundwater flow and solute transport processes. Model structural error (due to simplification and/or misrepresentation of the "true" environmental system) and input forcing variability (which commonly arises since some inputs are uncontrolled or estimated with high uncertainty) are ubiquitous in groundwater models. Calibration that overlooks errors in model structure and input data can lead to biased parameter estimates and compromised predictions. We present a fully Bayesian approach for a complete assessment of uncertainty for spatially distributed groundwater models. The approach explicitly recognizes stochastic input and uses data-driven error models based on nonparametric kernel methods to account for model structural error. We employ exploratory data analysis to assist in specifying informative priors for the error models to improve identifiability. The inference is facilitated by an efficient sampling algorithm based on DREAM-ZS and a parameter subspace multiple-try strategy to reduce the required number of forward simulations of the groundwater model. We demonstrate the Bayesian approach through a synthetic case study of surface-ground water interaction under changing pumping conditions. It is found that explicit treatment of errors in model structure and input data (groundwater pumping rate) has a substantial impact on the posterior distribution of groundwater model parameters. Using error models reduces predictive bias caused by parameter compensation. In addition, input variability increases parametric and predictive uncertainty. The Bayesian approach allows for a comparison among the contributions from various error sources, which could inform future model improvement and data collection efforts on how to best direct resources towards reducing predictive uncertainty.

  20. A Multi-layer, Data-driven Advanced Reasoning Tool for Intelligent Data Mining and Analysis for Smart Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lu, Ning; Du, Pengwei; Greitzer, Frank L.

    2012-12-31

    This paper presents the multi-layer, data-driven advanced reasoning tool (M-DART), a proof-of-principle decision support tool for improved power system operation. M-DART will cross-correlate and examine different data sources to assess anomalies, infer root causes, and anneal data into actionable information. By performing higher-level reasoning “triage” of diverse data sources, M-DART focuses on early detection of emerging power system events and identifies highest priority actions for the human decision maker. M-DART represents a significant advancement over today’s grid monitoring technologies that apply offline analyses to derive model-based guidelines for online real-time operations and use isolated data processing mechanisms focusing on individual data domains. The development of the M-DART will bridge these gaps by reasoning about results obtained from multiple data sources that are enabled by the smart grid infrastructure. This hybrid approach integrates a knowledge base that is trained offline but tuned online to capture model-based relationships while revealing complex causal relationships among data from different domains.

  1. Challenges in Managing Trustworthy Large-scale Digital Science

    NASA Astrophysics Data System (ADS)

    Evans, B. J. K.

    2017-12-01

    The increased use of large-scale international digital science has opened a number of challenges for managing, handling, using and preserving scientific information. The large volumes of information are driven by three main categories: model outputs, including coupled models and ensembles; data products that have been processed to a level of usability; and increasingly heuristically driven data analysis. These data products are increasingly the ones that are usable by the broad communities, far in excess of the raw instrument outputs. The data, software and workflows are then shared and replicated to allow broad use at an international scale, which places further demands on infrastructure to support managing the information reliably across distributed resources. Users necessarily rely on these underlying "black boxes" to remain productive in producing new scientific outcomes. The software for these systems depends on computational infrastructure, interconnected software systems, and information capture systems, ranging from the fundamental reliability of the compute hardware, through system software stacks and libraries, to the model software itself. Due to these complexities and the capacity of the infrastructure, there is an increased emphasis on transparency of approach and robustness of methods over full reproducibility. Furthermore, with large-volume data management it is increasingly difficult to store historical versions of all model and derived data. Instead, the emphasis is on the ability to access updated products and on confidence that previous outcomes remain relevant and can be updated with new information. We will discuss these challenges and some of the approaches underway to address these issues.

  2. Reinforcement Learning with Autonomous Small Unmanned Aerial Vehicles in Cluttered Environments

    NASA Technical Reports Server (NTRS)

    Tran, Loc; Cross, Charles; Montague, Gilbert; Motter, Mark; Neilan, James; Qualls, Garry; Rothhaar, Paul; Trujillo, Anna; Allen, B. Danette

    2015-01-01

    We present ongoing work in the Autonomy Incubator at NASA Langley Research Center (LaRC) exploring the efficacy of a data set aggregation approach to reinforcement learning for small unmanned aerial vehicle (sUAV) flight in dense and cluttered environments with reactive obstacle avoidance. The goal is to learn an autonomous flight model using training experiences from a human piloting a sUAV around static obstacles. The training approach uses video data from a forward-facing camera that records the human pilot's flight. Various computer-vision-based features are extracted from the video relating to edge and gradient information. The recorded human-controlled inputs are used to train an autonomous control model that correlates the extracted feature vector to a yaw command. As part of the reinforcement learning approach, the autonomous control model is iteratively updated with feedback from a human agent who corrects undesired model output. This data-driven approach to autonomous obstacle avoidance is explored for simulated forest environments, furthering research on autonomous flight under the tree canopy. This enables flight in previously inaccessible environments which are of interest to NASA researchers in Earth and Atmospheric sciences.
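    The iterative correction loop described here resembles dataset-aggregation imitation learning: fly the learned controller, let the human relabel the states it actually visits, retrain on the growing dataset. A minimal one-dimensional sketch, with an invented linear "pilot" standing in for the human expert and least squares standing in for the learned control model:

```python
import numpy as np

def expert_policy(state):
    """Stand-in for the human pilot: steer yaw toward zero cross-track error."""
    return -0.5 * state

def train(features, labels):
    """Fit a 1-D least-squares controller mapping state -> yaw command."""
    A = np.c_[features, np.ones(len(features))]
    coef, *_ = np.linalg.lstsq(A, labels, rcond=None)
    return lambda s: coef[0] * s + coef[1]

rng = np.random.default_rng(0)
# Round 0: behavior cloning on expert-piloted states.
states = list(rng.normal(size=50))
labels = [expert_policy(s) for s in states]
policy = train(np.array(states), np.array(labels))

# Aggregation rounds: fly the learned policy, let the "human agent" relabel
# the states it visits, and retrain on the aggregated dataset.
for _ in range(3):
    s = 2.0
    for _ in range(20):
        states.append(s)
        labels.append(expert_policy(s))   # corrective label from the expert
        s = s + policy(s)                 # but follow the learned policy
    policy = train(np.array(states), np.array(labels))

print(abs(policy(1.0) - expert_policy(1.0)) < 0.05)
```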

  3. Model-Driven Theme/UML

    NASA Astrophysics Data System (ADS)

    Carton, Andrew; Driver, Cormac; Jackson, Andrew; Clarke, Siobhán

    Theme/UML is an existing approach to aspect-oriented modelling that supports the modularisation and composition of concerns, including crosscutting ones, in design. To date, its lack of integration with model-driven engineering (MDE) techniques has limited its benefits across the development lifecycle. Here, we describe our work on facilitating the use of Theme/UML as part of an MDE process. We have developed a transformation tool that adopts model-driven architecture (MDA) standards. It defines a concern composition mechanism, implemented as a model transformation, to support the enhanced modularisation features of Theme/UML. We evaluate our approach by applying it to the development of mobile, context-aware applications, an application area characterised by many non-functional requirements that manifest themselves as crosscutting concerns.

  4. Combinatorial and High Throughput Discovery of High Temperature Piezoelectric Ceramics

    DTIC Science & Technology

    2011-10-10

    the known candidate piezoelectric ferroelectric perovskites. Unlike most computational studies on crystal chemistry, where the starting point is some form of electronic structure calculation, we use a data-driven approach to initiate our...experimental measurements reported in the literature. Given that our models are based solely on crystal and electronic structure data and did not

  5. Model Driven Engineering

    NASA Astrophysics Data System (ADS)

    Gaševic, Dragan; Djuric, Dragan; Devedžic, Vladan

    A relevant initiative from the software engineering community called Model Driven Engineering (MDE) is being developed in parallel with the Semantic Web (Mellor et al. 2003a). The MDE approach to software development suggests that one should first develop a model of the system under study, which is then transformed into the real thing (i.e., an executable software entity). The most important research initiative in this area is the Model Driven Architecture (MDA), which is being developed under the umbrella of the Object Management Group (OMG). This chapter describes the basic concepts of this software engineering effort.

  6. Data-Driven Hint Generation from Peer Debugging Solutions

    ERIC Educational Resources Information Center

    Liu, Zhongxiu

    2015-01-01

    Data-driven methods have been a successful approach to generating hints for programming problems. However, the majority of previous studies are focused on procedural hints that aim at moving students to the next closest state to the solution. In this paper, I propose a data-driven method to generate remedy hints for BOTS, a game that teaches…

  7. Biomimetics and the case of the remarkable ragworms.

    PubMed

    Hesselberg, Thomas

    2007-08-01

    Biomimetics is a rapidly growing field both as an academic and as an applied discipline. This paper gives a short introduction to the current status of the discipline before it describes three approaches to biomimetics: the mechanism-driven, which is based on the study of a specific mechanism; the focused organism-driven, which is based on the study of one function in a model organism; and the integrative organism-driven approach, where multiple functions of a model organism provide inspiration. The first two are established approaches and include many modern studies and the famous biomimetic discoveries of Velcro and the Lotus-Effect, whereas the last approach is not yet well recognized. The advantages of the integrative organism-driven approach are discussed using the ragworms as a case study. A morphological and locomotory study of these marine polychaetes reveals their biomimetic potential, which includes using their ability to move in slippery substrates as inspiration for novel endoscopes, using their compound setae as models for passive friction structures and using their three gaits, slow crawling, fast crawling, and swimming as well as their rapid burrowing technique to provide inspiration for the design of displacement pumps and multifunctional robots.

  8. Biomimetics and the case of the remarkable ragworms

    NASA Astrophysics Data System (ADS)

    Hesselberg, Thomas

    2007-08-01

    Biomimetics is a rapidly growing field both as an academic and as an applied discipline. This paper gives a short introduction to the current status of the discipline before it describes three approaches to biomimetics: the mechanism-driven, which is based on the study of a specific mechanism; the focused organism-driven, which is based on the study of one function in a model organism; and the integrative organism-driven approach, where multiple functions of a model organism provide inspiration. The first two are established approaches and include many modern studies and the famous biomimetic discoveries of Velcro and the Lotus-Effect, whereas the last approach is not yet well recognized. The advantages of the integrative organism-driven approach are discussed using the ragworms as a case study. A morphological and locomotory study of these marine polychaetes reveals their biomimetic potential, which includes using their ability to move in slippery substrates as inspiration for novel endoscopes, using their compound setae as models for passive friction structures and using their three gaits, slow crawling, fast crawling, and swimming as well as their rapid burrowing technique to provide inspiration for the design of displacement pumps and multifunctional robots.

  9. Predicting the stochastic guiding of kinesin-driven microtubules in microfabricated tracks: a statistical-mechanics-based modeling approach.

    PubMed

    Lin, Chih-Tin; Meyhofer, Edgar; Kurabayashi, Katsuo

    2010-01-01

    Directional control of microtubule shuttles via microfabricated tracks is key to the development of controlled nanoscale mass transport by kinesin motor molecules. Here we develop and test a model to quantitatively predict the stochastic behavior of microtubule guiding when they mechanically collide with the sidewalls of lithographically patterned tracks. By taking into account appropriate probability distributions of microscopic states of the microtubule system, the model allows us to theoretically analyze the roles of collision conditions and kinesin surface densities in determining how the motion of microtubule shuttles is controlled. In addition, we experimentally observe the statistics of microtubule collision events and compare our theoretical prediction with experimental data to validate our model. The model will direct the design of future hybrid nanotechnology devices that integrate nanoscale transport systems powered by kinesin-driven molecular shuttles.

  10. Physics-driven Spatiotemporal Regularization for High-dimensional Predictive Modeling: A Novel Approach to Solve the Inverse ECG Problem

    NASA Astrophysics Data System (ADS)

    Yao, Bing; Yang, Hui

    2016-12-01

    This paper presents a novel physics-driven spatiotemporal regularization (STRE) method for high-dimensional predictive modeling in complex healthcare systems. This model not only captures the physics-based interrelationship between time-varying explanatory and response variables that are distributed in space, but also adds spatial and temporal regularizations to improve the prediction performance. The STRE model is implemented to predict the time-varying distribution of electric potentials on the heart surface based on the electrocardiogram (ECG) data from the distributed sensor network placed on the body surface. The model performance is evaluated and validated in both a simulated two-sphere geometry and a realistic torso-heart geometry. Experimental results show that the STRE model significantly outperforms other regularization models that are widely used in current practice, such as Tikhonov zero-order, Tikhonov first-order, and L1 first-order regularization methods.
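    The Tikhonov regularization used here as a baseline can be written in closed form. This toy diagonal system is invented and far simpler than the torso-heart geometry, but it shows why the penalty matters for ill-posed inversion: one measurement channel is nearly blind, so naive inversion amplifies a small data error a thousandfold.

```python
import numpy as np

def tikhonov(A, b, L, lam):
    """Closed-form solution of min ||A x - b||^2 + lam * ||L x||^2."""
    return np.linalg.solve(A.T @ A + lam * (L.T @ L), A.T @ b)

# Toy ill-posed problem: the second channel barely responds to its unknown.
A = np.diag([1.0, 1e-3])
x_true = np.array([1.0, 1.0])
b = A @ x_true + np.array([0.01, 0.01])        # fixed measurement error

x_plain = np.linalg.solve(A, b)                # naive inversion blows up
x_reg = tikhonov(A, b, np.eye(2), 1e-3)        # zero-order Tikhonov damps it

err = lambda x: float(np.linalg.norm(x - x_true))
print(err(x_reg) < err(x_plain))
```

First-order variants replace the identity penalty `L` with a difference operator, penalizing roughness rather than magnitude; the STRE method of the paper adds physics-based spatiotemporal structure on top of this idea.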

  11. Globally-Applicable Predictive Wildfire Model: a Temporal-Spatial GIS-Based Risk Analysis Using Data-Driven Fuzzy Logic Functions

    NASA Astrophysics Data System (ADS)

    van den Dool, G.

    2017-11-01

    This study (van den Dool, 2017) is a proof of concept for a global predictive wildfire model, in which the temporal-spatial characteristics of wildfires are placed in a Geographical Information System (GIS), and the risk analysis is based on data-driven fuzzy logic functions. The data sources used in this model are available as global datasets, but subdivided into three pilot areas: North America (California/Nevada), Europe (Spain), and Asia (Mongolia), and are downscaled to the highest resolution (3-arc second). The GIS is constructed around three themes: topography, fuel availability and climate. From the topographical data, six derived sub-themes are created and converted to a fuzzy membership based on the catchment area statistics. The fuel availability score is a composite of four data layers: land cover, wood loads, biomass, and biovolumes. As input for the climatological sub-model, reanalysed daily averaged weather-related data is used, which is accumulated to a global weekly time-window (to account for the uncertainty within the climatological model) and forms the temporal component of the model. The final product is a wildfire risk score (from 0 to 1) by week, representing the average wildfire risk in an area. To compute the potential wildfire risk the sub-models are combined using a Multi-Criteria Approach, and the model results are validated against the area under the Receiver Operating Characteristic curve.
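    The fuzzy-membership conversion and multi-criteria combination can be sketched as below. The ramp thresholds, the three inputs, and the weights are invented for illustration, not the study's calibrated values.

```python
def fuzzy_membership(value, low, high):
    """Linear ("ramp") membership: 0 at or below low, 1 at or above high."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)

def wildfire_risk(slope_deg, fuel_load, dryness):
    """Combine sub-model scores with illustrative weights (not the study's)."""
    scores = [
        fuzzy_membership(slope_deg, 0.0, 30.0),   # topography sub-model
        fuzzy_membership(fuel_load, 0.5, 5.0),    # fuel availability sub-model
        fuzzy_membership(dryness, 0.2, 0.9),      # climate sub-model
    ]
    weights = [0.25, 0.35, 0.40]
    # Weighted-sum multi-criteria combination keeps the score in [0, 1].
    return sum(w * s for w, s in zip(weights, scores))

print(round(wildfire_risk(15.0, 4.0, 0.8), 3))
```

Evaluating this per grid cell and per week yields the kind of 0-to-1 temporal-spatial risk surface the abstract describes.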

  12. 3D finite element model of the diabetic neuropathic foot: a gait analysis driven approach.

    PubMed

    Guiotto, Annamaria; Sawacha, Zimi; Guarneri, Gabriella; Avogaro, Angelo; Cobelli, Claudio

    2014-09-22

    Diabetic foot is an invalidating complication of diabetes that can lead to foot ulcers. Three-dimensional (3D) finite element analysis (FEA) allows characterizing the loads developed in the different anatomical structures of the foot in dynamic conditions. The aim of this study was to develop a subject-specific 3D foot FE model (FEM) of a diabetic neuropathic (DNS) and a healthy (HS) subject, whose subject specificity can be found in terms of foot geometry and boundary conditions. Kinematics, kinetics and plantar pressure (PP) data were extracted from the gait analysis trials of the two subjects for this purpose. The FEMs were developed by segmenting bones, cartilage and skin from MRI and drawing a horizontal plate as ground support. Material properties were adopted from previous literature. FE simulations were run with the kinematics and kinetics data of four different phases of the stance phase of gait (heel strike, loading response, midstance and push off). FEMs were then driven by group gait data of 10 neuropathic and 10 healthy subjects. Model validation focused on agreement between FEM-simulated and experimental PP. The peak values and the total distribution of the pressures were compared for this purpose. Results showed that the models were less robust when driven from group data and underestimated the PP in each foot subarea. In particular, in the case of the neuropathic subject's model the mean errors between experimental and simulated data were around 20% of the peak values. This knowledge is crucial in understanding the aetiology of diabetic foot. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. HMM-based lexicon-driven and lexicon-free word recognition for online handwritten Indic scripts.

    PubMed

    Bharath, A; Madhvanath, Sriganesh

    2012-04-01

    Research for recognizing online handwritten words in Indic scripts is at its early stages when compared to Latin and Oriental scripts. In this paper, we address this problem specifically for two major Indic scripts--Devanagari and Tamil. In contrast to previous approaches, the techniques we propose are largely data driven and script independent. We propose two different techniques for word recognition based on Hidden Markov Models (HMM): lexicon driven and lexicon free. The lexicon-driven technique models each word in the lexicon as a sequence of symbol HMMs according to a standard symbol writing order derived from the phonetic representation. The lexicon-free technique uses a novel Bag-of-Symbols representation of the handwritten word that is independent of symbol order and allows rapid pruning of the lexicon. On handwritten Devanagari word samples featuring both standard and nonstandard symbol writing orders, a combination of lexicon-driven and lexicon-free recognizers significantly outperforms either of them used in isolation. In contrast, most Tamil word samples feature the standard symbol order, and the lexicon-driven recognizer outperforms the lexicon free one as well as their combination. The best recognition accuracies obtained for 20,000 word lexicons are 87.13 percent for Devanagari when the two recognizers are combined, and 91.8 percent for Tamil using the lexicon-driven technique.
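    The lexicon-driven idea, scoring the observation sequence against each lexicon word's chain of symbol HMMs and picking the best-scoring word, can be sketched with toy two-symbol "words". All states, symbols, and probabilities here are invented; real systems work with feature codes from the handwriting trajectory and score in log space.

```python
def forward_score(obs, start_p, trans_p, emit_p):
    """Likelihood of an observation sequence under one word's HMM (forward algorithm)."""
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in start_p}
    for o in obs[1:]:
        alpha = {s: sum(alpha[p] * trans_p[p][s] for p in alpha) * emit_p[s][o]
                 for s in alpha}
    return sum(alpha.values())

def word_hmm(second):
    """Left-to-right chain of two symbol states: 'A' followed by `second`."""
    start = {"A": 1.0, second: 0.0}
    trans = {"A": {"A": 0.5, second: 0.5}, second: {"A": 0.0, second: 1.0}}
    emit = {"A": {"a": 0.9, "b": 0.05, "c": 0.05},
            second: {"a": 0.05, "b": 0.9, "c": 0.05} if second == "B"
                    else {"a": 0.05, "b": 0.05, "c": 0.9}}
    return start, trans, emit

obs = ["a", "a", "b"]                       # looks like the word "A then B"
scores = {w: forward_score(obs, *word_hmm(w)) for w in ["B", "C"]}
print(max(scores, key=scores.get))          # lexicon-driven pick
```

A lexicon-free recognizer would instead decode the symbols directly and match them against the lexicon afterwards, as in the Bag-of-Symbols representation the abstract describes.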

  14. Convenience experimentation.

    PubMed

    Krohs, Ulrich

    2012-03-01

    Systems biology aims at explaining life processes by means of detailed models of molecular networks, mainly on the whole-cell scale. The whole-cell perspective distinguishes the new field of systems biology from earlier approaches within molecular cell biology. The shift was made possible by the high-throughput methods that were developed for gathering 'omic' (genomic, proteomic, etc.) data. These new techniques are made commercially available as semi-automatic analytic equipment, ready-made analytic kits and probe arrays. There is a whole industry of supplies for what may be called convenience experimentation. My paper inquires into some epistemic consequences of strong reliance on convenience experimentation in systems biology. In times when experimentation was automated to a lesser degree, modeling and in part even experimentation could be understood fairly well as either driven by hypotheses, thus proceeding by the testing of hypotheses, or as performed in an exploratory mode, intended to sharpen initially vague concepts or phenomena. In systems biology, the situation is dramatically different. Data collection became so easy (though not cheap) that experimentation is, to a high degree, driven by convenience equipment, and model building is driven by the vast amount of data that convenience experimentation produces. This results in a shift in the mode of science. The paper shows that convenience-driven science is not primarily hypothesis-testing, nor is it in an exploratory mode. It rather proceeds in a gathering mode. This shift demands another shift in the mode of evaluation, which now becomes an exploratory endeavor, in response to the superabundance of gathered data. Copyright © 2011 Elsevier Ltd. All rights reserved.

  15. Pragmatic pharmacology: population pharmacokinetic analysis of fentanyl using remnant samples from children after cardiac surgery

    PubMed Central

    Van Driest, Sara L.; Marshall, Matthew D.; Hachey, Brian; Beck, Cole; Crum, Kim; Owen, Jill; Smith, Andrew H.; Kannankeril, Prince J.; Woodworth, Alison; Caprioli, Richard M.

    2016-01-01

    Aims: One barrier contributing to the lack of pharmacokinetic (PK) data in paediatric populations is the need for serial sampling. Analysis of clinically obtained specimens and data may overcome this barrier. To add evidence for the feasibility of this approach, we sought to determine PK parameters for fentanyl in children after cardiac surgery using specimens and data generated in the course of clinical care, without collecting additional blood samples. Methods: We measured fentanyl concentrations in plasma from leftover clinically obtained specimens in 130 paediatric cardiac surgery patients and successfully generated a PK dataset using drug dosing data extracted from electronic medical records. Using a population PK approach, we estimated PK parameters for this population, assessed model goodness-of-fit and internal model validation, and performed subset data analyses. Through simulation studies, we compared predicted fentanyl concentrations using model-driven weight-adjusted per kg vs. fixed per kg fentanyl dosing. Results: Fentanyl clearance for a 6.4 kg child, the median weight in our cohort, is 5.7 l h-1 (2.2-9.2 l h-1), similar to values found in prior formal PK studies. Model assessment and subset analyses indicated the model adequately fit the data. Of the covariates studied, only weight significantly impacted fentanyl kinetics, but substantial inter-individual variability remained. In simulation studies, model-driven weight-adjusted per kg fentanyl dosing led to more consistent therapeutic fentanyl concentrations than fixed per kg dosing. Conclusions: We show here that population PK modelling using sparse remnant samples and electronic medical records data provides a powerful tool for assessment of drug kinetics and generation of individualized dosing regimens. PMID:26861166
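    The reported clearance (5.7 l/h at the 6.4 kg median weight) makes the paper's dosing comparison easy to sketch. The allometric exponent of 0.75 and the target concentration below are conventional/hypothetical assumptions for illustration, not values fitted in the study.

```python
def clearance(weight_kg, cl_ref=5.7, wt_ref=6.4, exponent=0.75):
    """Weight-scaled fentanyl clearance (l/h) around the cohort median.
    The 0.75 allometric exponent is a standard assumption, not a study value."""
    return cl_ref * (weight_kg / wt_ref) ** exponent

def steady_state_conc(rate_ug_per_h, weight_kg):
    """Css = infusion rate / clearance for a continuous infusion."""
    return rate_ug_per_h / clearance(weight_kg)

# Fixed per-kg dosing (1 ug/kg/h) yields concentrations that drift with
# weight, while model-driven dosing back-calculates the rate for a target.
target = 2.0  # ug/l, hypothetical target concentration
for wt in (3.0, 6.4, 12.0):
    css_fixed = steady_state_conc(1.0 * wt, wt)   # fixed per-kg rate
    model_rate = target * clearance(wt)           # rate hitting the target
    print(wt, round(css_fixed, 2), round(model_rate, 1))
```

    The drift of `css_fixed` across weights mirrors the paper's finding that model-driven weight-adjusted dosing gives more consistent concentrations than fixed per-kg dosing.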

  16. A data-driven decomposition approach to model aerodynamic forces on flapping airfoils

    NASA Astrophysics Data System (ADS)

    Raiola, Marco; Discetti, Stefano; Ianiro, Andrea

    2017-11-01

    In this work, we exploit a data-driven decomposition of experimental data from a flapping airfoil experiment with the aim of isolating the main contributions to the aerodynamic force and obtaining a phenomenological model. Experiments are carried out on a NACA 0012 airfoil in forward flight with both heaving and pitching motion. Velocity measurements of the near field are carried out with planar PIV, while force measurements are performed with a load cell. The phase-averaged velocity fields are transformed into the wing-fixed reference frame, allowing for a description of the field in a domain with fixed boundaries. The decomposition of the flow field is performed by means of POD applied to the velocity fluctuations and then extended to the phase-averaged force data by means of the extended POD approach. This choice is justified by the simple consideration that aerodynamic forces determine the largest contributions to the energetic balance in the flow field. Only the first 6 modes have a relevant contribution to the force. A clear relationship can be drawn between the force and the flow field modes. Moreover, the force modes are closely related to (yet slightly different from) the contributions of the classic potential models in the literature, allowing for their correction. This work has been supported by the Spanish MINECO under Grant TRA2013-41103-P.
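    The POD / extended-POD chain described above can be sketched with a thin SVD on synthetic snapshots; the snapshot layout, signal shapes and noise level are invented for illustration, not taken from the experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_snapshots = 200, 64
# Synthetic phase-averaged velocity fluctuations: two coherent modes + noise.
phase = np.linspace(0, 2 * np.pi, n_snapshots, endpoint=False)
x = np.linspace(0, 1, n_points)
U = (np.outer(np.sin(2 * np.pi * x), np.sin(phase))
     + 0.3 * np.outer(np.cos(4 * np.pi * x), np.cos(2 * phase))
     + 0.01 * rng.standard_normal((n_points, n_snapshots)))

# POD of the fluctuation field via thin SVD: U = Phi @ diag(s) @ At
Phi, s, At = np.linalg.svd(U, full_matrices=False)
A = At.T * s                  # temporal coefficients of each spatial mode

# Extended POD: project a synchronized force signal onto the temporal
# coefficients to get each flow mode's contribution to the force.
force = 2.0 * np.sin(phase) + 0.5 * np.cos(2 * phase)
force_modes = A.T @ force / np.sum(A**2, axis=0)  # per-mode LS coefficients

print(np.round(force_modes[:4], 3))
```

    In this toy case the force is carried almost entirely by the first two POD modes, the analogue of the paper's observation that only a handful of modes contribute to the aerodynamic force.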

  17. Community College Dual Enrollment Faculty Orientation: A Utilization-Focused Approach

    ERIC Educational Resources Information Center

    Charlier, Hara D.; Duggan, Molly H.

    2010-01-01

    The current climate of accountability demands that institutions engage in data-driven program evaluation. In order to promote quality dual enrollment (DE) programs, institutions must support the adjunct faculty teaching college courses in high schools. This study uses Patton's utilization-focused model (1997) to conduct a formative evaluation of a…

  18. Data-Driven Approaches for Paraphrasing across Language Variations

    ERIC Educational Resources Information Center

    Xu, Wei

    2014-01-01

    Our language changes very rapidly, accompanying political, social and cultural trends, as well as the evolution of science and technology. The Internet, especially the social media, has accelerated this process of change. This poses a severe challenge for both human beings and natural language processing (NLP) systems, which usually only model a…

  19. Academic Procrastinators, Strategic Delayers and Something Betwixt and Between: An Interview Study

    ERIC Educational Resources Information Center

    Lindblom-Ylänne, Sari; Saariaho, Emmi; Inkinen, Mikko; Haarala-Muhonen, Anne; Hailikari, Telle

    2015-01-01

    The study explored university undergraduates' dilatory behaviour, more precisely, procrastination and strategic delaying. Using qualitative interview data, we applied a theory-driven and person-oriented approach to test the theoretical model of Klingsieck (2013). The sample consisted of 28 Bachelor students whose study pace had been slow during…

  20. Developing Knowledge Creating Technical Education Institutions through the Voice of Teachers: Content Analysis Approach

    ERIC Educational Resources Information Center

    Song, Ji Hoon; Kim, Hye Kyoung; Park, Sunyoung; Bae, Sang Hoon

    2014-01-01

    The purpose of this study was to develop an empirical data-driven model for a knowledge creation school system in career technical education (CTE) by identifying supportive and hindering factors influencing knowledge creation practices in CTE schools. Nonaka and colleagues' (Nonaka & Konno, 1998; Nonaka & Takeuchi, 1995) knowledge…

  1. A Data-Driven Diagnostic Framework for Wind Turbine Structures: A Holistic Approach

    PubMed Central

    Bogoevska, Simona; Spiridonakos, Minas; Chatzi, Eleni; Dumova-Jovanoska, Elena; Höffer, Rudiger

    2017-01-01

    The complex dynamics of operational wind turbine (WT) structures challenges the applicability of existing structural health monitoring (SHM) strategies for condition assessment. At the center of Europe’s renewable energy strategic planning, WT systems call for implementation of strategies that may describe the WT behavior in its complete operational spectrum. The framework proposed in this paper relies on the symbiotic treatment of acting environmental/operational variables and the monitored vibration response of the structure. The approach aims at accurate simulation of the temporal variability characterizing the WT dynamics, and subsequently at the tracking of the evolution of this variability in a longer-term horizon. The bi-component analysis tool is applied on long-term data, collected as part of continuous monitoring campaigns on two actual operating WT structures located in different sites in Germany. The obtained data-driven structural models verify the potential of the proposed strategy for development of an automated SHM diagnostic tool. PMID:28358346

  2. Data-Driven Learning of Speech Acts Based on Corpora of DVD Subtitles

    ERIC Educational Resources Information Center

    Kitao, S. Kathleen; Kitao, Kenji

    2013-01-01

    Data-driven learning (DDL) is an inductive approach to language learning in which students study examples of authentic language and use them to find patterns of language use. This inductive approach to learning has the advantages of being learner-centered, encouraging hypothesis testing and learner autonomy, and helping develop learning skills.…

  3. Analyzing the Discourse of Chais Conferences for the Study of Innovation and Learning Technologies via a Data-Driven Approach

    ERIC Educational Resources Information Center

    Silber-Varod, Vered; Eshet-Alkalai, Yoram; Geri, Nitza

    2016-01-01

    The current rapid technological changes confront researchers of learning technologies with the challenge of evaluating them, predicting trends, and improving their adoption and diffusion. This study utilizes a data-driven discourse analysis approach, namely culturomics, to investigate changes over time in the research of learning technologies. The…

  4. B-spline parameterization of the dielectric function and information criteria: the craft of non-overfitting

    NASA Astrophysics Data System (ADS)

    Likhachev, Dmitriy V.

    2017-06-01

    Johs and Hale developed the Kramers-Kronig consistent B-spline formulation for dielectric function modeling in spectroscopic ellipsometry data analysis. In this article we use the popular Akaike, corrected Akaike and Bayesian Information Criteria (AIC, AICc and BIC, respectively) to determine an optimal number of knots for the B-spline model. These criteria allow a compromise to be found between under- and overfitting of experimental data, since they penalize an increasing number of knots and select the representation that achieves the best fit with the minimal number of knots. The proposed approach provides objective and practical guidance, as opposed to empirically driven or "gut feeling" decisions, for selecting the right number of knots for B-spline models in spectroscopic ellipsometry. The AIC, AICc and BIC selection criteria work remarkably well, as we demonstrate in several real-data applications. This approach formalizes the selection of the optimal knot number and may be useful from a practical perspective in spectroscopic ellipsometry data analysis.
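    The knot-count selection logic can be sketched with a linear spline (truncated-power basis) standing in for the paper's cubic B-splines, to keep the example dependency-light; the data, knot grids and criterion formulas below follow the standard least-squares definitions of AIC/AICc/BIC, not anything specific to the article.

```python
import numpy as np

def spline_design(x, interior_knots):
    """Truncated-power basis for a linear spline: 1, x, (x - t_j)_+."""
    cols = [np.ones_like(x), x] + [np.maximum(0.0, x - t) for t in interior_knots]
    return np.column_stack(cols)

def criteria(x, y, interior_knots):
    """AIC, AICc and BIC for a least-squares spline fit."""
    X = spline_design(x, interior_knots)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    n, k = len(x), X.shape[1] + 1          # +1 parameter for the noise variance
    aic = n * np.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / max(n - k - 1, 1)
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, aicc, bic

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 200)
y = np.abs(x - 0.5) + 0.02 * rng.standard_normal(x.size)  # one true kink at 0.5

# Score models with 0..6 equally spaced interior knots and pick by BIC.
scores = [criteria(x, y, np.linspace(0, 1, m + 2)[1:-1])[2] for m in range(7)]
best = int(np.argmin(scores))
print("interior knots selected by BIC:", best)
```

    The criterion correctly rewards the single-knot model that captures the kink and penalizes superfluous knots, which is exactly the under/overfitting compromise the abstract describes.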

  5. A Fast Surrogate-facilitated Data-driven Bayesian Approach to Uncertainty Quantification of a Regional Groundwater Flow Model with Structural Error

    NASA Astrophysics Data System (ADS)

    Xu, T.; Valocchi, A. J.; Ye, M.; Liang, F.

    2016-12-01

    Due to simplification and/or misrepresentation of the real aquifer system, numerical groundwater flow and solute transport models are usually subject to model structural error. During model calibration, the hydrogeological parameters may be overly adjusted to compensate for unknown structural error. This may result in biased predictions when models are used to forecast aquifer response to new forcing. In this study, we extend a fully Bayesian method [Xu and Valocchi, 2015] to calibrate a real-world, regional groundwater flow model. The method uses a data-driven error model to describe model structural error and jointly infers model parameters and structural error. In this study, Bayesian inference is facilitated using high performance computing and fast surrogate models. The surrogate models are constructed using machine learning techniques to emulate the response simulated by the computationally expensive groundwater model. We demonstrate in this real-world case study that explicitly accounting for model structural error yields parameter posterior distributions that are substantially different from those derived by classical Bayesian calibration that does not account for model structural error. In addition, the Bayesian method with the error model gives significantly more accurate predictions along with reasonable credible intervals.
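    The compensation effect described here can be illustrated with a toy linear model (not the groundwater model): calibrating a structurally deficient model biases its parameter, while jointly fitting a small data-driven discrepancy basis recovers it. The sine discrepancy and the basis choice are hypothetical, and the full method uses MCMC with surrogates rather than this least-squares shortcut.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 100)
a_true = 1.5
# "Reality": the model y = a*x plus a structural discrepancy the model lacks.
y = a_true * x + 2.0 * np.sin(x) + 0.1 * rng.standard_normal(x.size)

# Classical calibration: least squares on the structurally wrong model y = a*x.
a_naive = float(np.sum(x * y) / np.sum(x * x))

# Data-driven error model: augment with a small sine/cosine discrepancy basis
# and infer the parameter and the discrepancy jointly.
X = np.column_stack([x, np.sin(x), np.cos(x)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a_joint = float(coef[0])

print(round(a_naive, 3), round(a_joint, 3))
```

    The naive estimate absorbs part of the unmodeled discrepancy into the parameter, which is precisely the "overly adjusted" calibration the abstract warns about.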

  6. Using field inversion to quantify functional errors in turbulence closures

    NASA Astrophysics Data System (ADS)

    Singh, Anand Pratap; Duraisamy, Karthik

    2016-04-01

    A data-informed approach is presented with the objective of quantifying errors and uncertainties in the functional forms of turbulence closure models. The approach creates modeling information from higher-fidelity simulations and experimental data. Specifically, a Bayesian formalism is adopted to infer discrepancies in the source terms of transport equations. A key enabling idea is the transformation of the functional inversion procedure (which is inherently infinite-dimensional) into a finite-dimensional problem in which the distribution of the unknown function is estimated at discrete mesh locations in the computational domain. This allows for the use of an efficient adjoint-driven inversion procedure. The output of the inversion is a full-field of discrepancy that provides hitherto inaccessible modeling information. The utility of the approach is demonstrated by applying it to a number of problems including channel flow, shock-boundary layer interactions, and flows with curvature and separation. In all these cases, the posterior model correlates well with the data. Furthermore, it is shown that even if limited data (such as surface pressures) are used, the accuracy of the inferred solution is improved over the entire computational domain. The results suggest that, by directly addressing the connection between physical data and model discrepancies, the field inversion approach materially enhances the value of computational and experimental data for model improvement. The resulting information can be used by the modeler as a guiding tool to design more accurate model forms, or serve as input to machine learning algorithms to directly replace deficient modeling terms.

  7. Data-driven modeling, control and tools for cyber-physical energy systems

    NASA Astrophysics Data System (ADS)

    Behl, Madhur

    Energy systems are experiencing a gradual but substantial change in moving away from being non-interactive and manually-controlled systems to utilizing tight integration of both cyber (computation, communications, and control) and physical representations guided by first-principles-based models, at all scales and levels. Furthermore, peak power reduction programs like demand response (DR) are becoming increasingly important as the volatility on the grid continues to increase due to regulation, integration of renewables and extreme weather conditions. In order to shield themselves from the risk of price volatility, end-user electricity consumers must monitor electricity prices and be flexible in the ways they choose to use electricity. This requires the use of control-oriented predictive models of an energy system's dynamics and energy consumption. Such models are needed for understanding and improving the overall energy efficiency and operating costs. However, learning dynamical models using grey/white box approaches is very costly and time-consuming, since it often requires significant financial investments in retrofitting the system with several sensors and hiring domain experts to build the model. We present the use of data-driven methods for making model capture easy and efficient for cyber-physical energy systems. We develop Model-IQ, a methodology for analysis of uncertainty propagation for building inverse modeling and controls. Given a grey-box model structure and real input data from a temporary set of sensors, Model-IQ evaluates the effect of the uncertainty propagation from sensor data to model accuracy and to closed-loop control performance. We also developed a statistical method to quantify the bias in the sensor measurement and to determine near-optimal sensor placement and density for accurate data collection for model training and control.
Using a real building test-bed, we show how performing an uncertainty analysis can reveal trends about inverse model accuracy and control performance, which can be used to make informed decisions about sensor requirements and data accuracy. We also present DR-Advisor, a data-driven demand response recommender system for the building's facilities manager which provides suitable control actions to meet the desired load curtailment while maintaining operations and maximizing the economic reward. We develop a model-based control with regression trees algorithm (mbCRT), which allows us to perform closed-loop control for DR strategy synthesis for large commercial buildings. Our data-driven control synthesis algorithm outperforms rule-based demand response methods for a large DoE commercial reference building and leads to a significant amount of load curtailment (of 380 kW) and over $45,000 in savings, which is 37.9% of the summer energy bill for the building. The performance of DR-Advisor is also evaluated for 8 buildings on Penn's campus, where it achieves 92.8% to 98.9% prediction accuracy. We also compare DR-Advisor with other data-driven methods and rank 2nd on ASHRAE's benchmarking dataset for energy prediction.
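    The regression-tree core of approaches like mbCRT can be hinted at with a minimal one-split ("stump") search; the real algorithm grows full trees over weather and proxy variables and synthesizes control actions from the leaves, so this is only a sketch with invented load data.

```python
def fit_stump(xs, ys):
    """Exhaustive search for the single split minimizing total squared error,
    the basic operation a regression tree repeats recursively at every node."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    def mean(vals):
        return sum(vals) / len(vals) if vals else 0.0

    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = sse(left) + sse(right)
        if best is None or score < best[0]:
            best = (score, t, mean(left), mean(right))
    return best[1:]  # threshold, left-leaf load, right-leaf load

# Hypothetical data: outdoor temperature (C) vs. building load (kW).
temp = [18, 20, 22, 24, 26, 28, 30, 32]
load = [210, 215, 220, 310, 330, 345, 360, 380]
t, lo, hi = fit_stump(temp, load)
print(t, lo, hi)  # → 22 215.0 345.0
```

    Each leaf's predicted load is what a DR recommender would compare against a curtailment target when ranking candidate control actions.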

  8. Sensor modeling and demonstration of a multi-object spectrometer for performance-driven sensing

    NASA Astrophysics Data System (ADS)

    Kerekes, John P.; Presnar, Michael D.; Fourspring, Kenneth D.; Ninkov, Zoran; Pogorzala, David R.; Raisanen, Alan D.; Rice, Andrew C.; Vasquez, Juan R.; Patel, Jeffrey P.; MacIntyre, Robert T.; Brown, Scott D.

    2009-05-01

    A novel multi-object spectrometer (MOS) is being explored for use as an adaptive performance-driven sensor that tracks moving targets. Developed originally for astronomical applications, the instrument utilizes an array of micromirrors to reflect light to a panchromatic imaging array. When an object of interest is detected, the individual micromirrors imaging the object are tilted to reflect the light to a spectrometer to collect a full spectrum. This paper will present example sensor performance from empirical data collected in laboratory experiments, as well as our approach in designing optical and radiometric models of the MOS channels and the micromirror array. Simulation of moving vehicles in a high-fidelity hyperspectral scene is used to generate a dynamic video input for the adaptive sensor. Performance-driven algorithms for feature-aided target tracking and modality selection exploit multiple electromagnetic observables to track moving vehicle targets.

  9. Activity-Centered Domain Characterization for Problem-Driven Scientific Visualization

    PubMed Central

    Marai, G. Elisabeta

    2018-01-01

    Although visualization design models exist in the literature in the form of higher-level methodological frameworks, these models do not present a clear methodological prescription for the domain characterization step. This work presents a framework and end-to-end model for requirements engineering in problem-driven visualization application design. The framework and model are based on the activity-centered design paradigm, which is an enhancement of human-centered design. The proposed activity-centered approach focuses on user tasks and activities, and allows an explicit link between the requirements engineering process and the abstraction stage (and its evaluation) of existing, higher-level visualization design models. In a departure from existing visualization design models, the resulting model: assigns value to a visualization based on user activities; ranks user tasks before the user data; partitions requirements into activity-related capabilities and nonfunctional characteristics and constraints; and explicitly incorporates the user workflows into the requirements process. A further merit of this model is its explicit integration of functional specifications, a concept this work adapts from the software engineering literature, into the visualization design nested model. A quantitative evaluation using two sets of interdisciplinary projects supports the merits of the activity-centered model. The result is a practical roadmap to the domain characterization step of visualization design for problem-driven data visualization. Following this domain characterization model can help remove a number of pitfalls that have been identified multiple times in the visualization design literature. PMID:28866550

  10. A multi-band environment-adaptive approach to noise suppression for cochlear implants.

    PubMed

    Saki, Fatemeh; Mirzahasanloo, Taher; Kehtarnavaz, Nasser

    2014-01-01

    This paper presents an improved environment-adaptive noise suppression solution for the cochlear implant speech processing pipeline. This improvement is achieved by using a multi-band data-driven approach in place of a previously developed single-band data-driven approach. Seven commonly encountered noisy environments (street, car, restaurant, mall, bus, pub and train) are considered to quantify the improvement. The results obtained indicate about a 10% improvement in speech quality measures.
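    The multi-band data-driven idea can be caricatured as learning per-band spectral gains from paired clean/noisy training audio for a given environment and then applying them to incoming frames. The band count, signals and Wiener-like gain rule below are invented for illustration (and training and test use the same pair for brevity); the actual pipeline estimates environment-specific suppression parameters in real time.

```python
import numpy as np

def band_edges(n_bins, n_bands):
    return np.linspace(0, n_bins, n_bands + 1).astype(int)

def learn_band_gains(clean, noisy, n_bands=4):
    """Wiener-like per-band gains from paired clean/noisy training spectra."""
    C = np.abs(np.fft.rfft(clean)) ** 2
    N = np.abs(np.fft.rfft(noisy)) ** 2
    e = band_edges(C.size, n_bands)
    return np.sqrt([np.sum(C[a:b]) / (np.sum(N[a:b]) + 1e-12)
                    for a, b in zip(e[:-1], e[1:])])

def apply_band_gains(noisy, gains):
    spec = np.fft.rfft(noisy)
    e = band_edges(spec.size, len(gains))
    for g, a, b in zip(gains, e[:-1], e[1:]):
        spec[a:b] *= g
    return np.fft.irfft(spec, n=noisy.size)

rng = np.random.default_rng(3)
fs, n = 16000, 1024
t = np.arange(n) / fs
clean = np.sin(2 * np.pi * 500 * t)          # 500 Hz sits exactly on a bin
noisy = clean + 0.3 * rng.standard_normal(n)
gains = learn_band_gains(clean, noisy)
denoised = apply_band_gains(noisy, gains)
mse_in = float(np.mean((noisy - clean) ** 2))
mse_out = float(np.mean((denoised - clean) ** 2))
print(round(mse_in, 4), round(mse_out, 4))
```

    Bands where the training speech carries little energy receive near-zero gains, so broadband noise is suppressed while the speech-bearing band is left largely intact.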

  11. Extended-Range Prediction with Low-Dimensional, Stochastic-Dynamic Models: A Data-driven Approach

    DTIC Science & Technology

    2013-09-30

    statistically extratropical storms and extremes, and link these to LFV modes. Mingfang Ting, Yochanan Kushnir, Andrew W. Robertson, Lei Wang...forecast models, as well as in the understanding they have generated. Adam Sobel, Daehyun Kim and Shuguang Wang. Extratropical variability and...predictability. Determine the extent to which extratropical monthly and seasonal low-frequency variability (LFV, i.e. PNA, NAO, as well as other regional

  12. A general theory of early growth? Comment on: "Mathematical models to characterize early epidemic growth: A review" by Gerardo Chowell et al.

    NASA Astrophysics Data System (ADS)

    House, Thomas

    2016-09-01

    Chowell et al. [1] consider the early growth behaviour of various epidemic models that range from phenomenological approaches driven by data to mechanistic descriptions of complex interactions between individuals. This is particularly timely given the recent Ebola epidemic, although non-exponential early growth may be more common (but less immediately evident) than we realise.
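    One of the phenomenological approaches reviewed by Chowell et al. is the generalized-growth model, dC/dt = r C(t)^p, where p = 1 recovers exponential growth and p < 1 gives the sub-exponential early growth House refers to. A minimal Euler-integration sketch (the parameter values are arbitrary):

```python
def generalized_growth(r, p, c0, days, dt=0.01):
    """Euler integration of dC/dt = r * C**p (p = 1: exponential growth;
    p < 1: sub-exponential, polynomial-like early growth)."""
    c = c0
    for _ in range(int(days / dt)):
        c += dt * r * c ** p
    return c

# With p < 1, cumulative cases grow polynomially rather than exponentially.
print(round(generalized_growth(0.5, 1.0, 1.0, 20), 1),
      round(generalized_growth(0.5, 0.6, 1.0, 20), 1))
```

    After 20 time units the p = 0.6 trajectory is orders of magnitude below the exponential one, illustrating why fitting p from early data matters for outbreak forecasts.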

  13. The Best of All Possible Worlds: Applying the Model Driven Architecture Approach to a JC3IEDM OWL Ontology Modeled in UML

    DTIC Science & Technology

    2014-06-01

    from the ODM standard. Leveraging SPARX EA’s Java application programming interface (API), the team built a tool called OWL2EA that can ingest an OWL...server MySQL creates the physical schema that enables a user to store and retrieve data conforming to the vocabulary of the JC3IEDM. 6. GENERATING AN

  14. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    NASA Astrophysics Data System (ADS)

    Galelli, S.; Castelletti, A.

    2013-02-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modeling. In this paper we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modeling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalization property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally very efficient; and (iii) allows one to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analyzed on two real-world case studies (Marina catchment (Singapore) and Canning River (Western Australia)), representing two different morphoclimatic contexts, in comparison with other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both watersheds, while outperforming the other approaches in terms of computational requirements when adopted on large datasets. In addition, the ranking of the input variables provided can be given a physically meaningful interpretation.
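    What distinguishes Extra-Trees from CART is visible in a single node split: cut-points are drawn uniformly at random within each feature's range rather than searched exhaustively, and the best of a few random candidates is kept. A minimal sketch (data, candidate count and seed are invented):

```python
import random

def extra_tree_split(X, y, n_candidates=5, seed=4):
    """Extra-Trees node split: each candidate picks a random feature and a
    cut-point drawn uniformly within that feature's range (no exhaustive
    cut-point search, no bootstrap); the lowest-variance candidate wins."""
    rng = random.Random(seed)

    def variance(idx):
        if not idx:
            return 0.0
        m = sum(y[i] for i in idx) / len(idx)
        return sum((y[i] - m) ** 2 for i in idx)

    n_features = len(X[0])
    best = None
    for _ in range(n_candidates):
        f = rng.randrange(n_features)
        lo, hi = min(r[f] for r in X), max(r[f] for r in X)
        cut = rng.uniform(lo, hi)
        left = {i for i, r in enumerate(X) if r[f] <= cut}
        right = [i for i in range(len(X)) if i not in left]
        score = variance(left) + variance(right)
        if best is None or score < best[0]:
            best = (score, f, cut)
    return best[1], best[2]

# Hypothetical streamflow data: (rainfall, temperature) -> flow.
X = [(1, 20), (2, 21), (8, 19), (9, 22), (10, 20)]
y = [5, 6, 40, 44, 47]
f, cut = extra_tree_split(X, y)
print("split on feature", f, "at", round(cut, 2))
```

    Averaging many such totally randomized trees trades a little per-tree accuracy for the low variance and low training cost the abstract highlights.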

  15. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    NASA Astrophysics Data System (ADS)

    Galelli, S.; Castelletti, A.

    2013-07-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and (iii) allows one to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both watersheds, while outperforming the other approaches in terms of computational requirements when adopted on large datasets. In addition, the ranking of the input variables provided can be given a physically meaningful interpretation.

  16. Data-driven non-Markovian closure models

    NASA Astrophysics Data System (ADS)

    Kondrashov, Dmitri; Chekroun, Mickaël D.; Ghil, Michael

    2015-03-01

    This paper has two interrelated foci: (i) obtaining stable and efficient data-driven closure models by using a multivariate time series of partial observations from a large-dimensional system; and (ii) comparing these closure models with the optimal closures predicted by the Mori-Zwanzig (MZ) formalism of statistical physics. Multilayer stochastic models (MSMs) are introduced as both a generalization and a time-continuous limit of existing multilevel, regression-based approaches to closure in a data-driven setting; these approaches include empirical model reduction (EMR), as well as more recent multi-layer modeling. It is shown that the multilayer structure of MSMs can provide a natural Markov approximation to the generalized Langevin equation (GLE) of the MZ formalism. A simple correlation-based stopping criterion for an EMR-MSM model is derived to assess how well it approximates the GLE solution. Sufficient conditions are derived on the structure of the nonlinear cross-interactions between the constitutive layers of a given MSM to guarantee the existence of a global random attractor. This existence ensures that no blow-up can occur for a broad class of MSM applications, a class that includes non-polynomial predictors and nonlinearities that do not necessarily preserve quadratic energy invariants. The EMR-MSM methodology is first applied to a conceptual, nonlinear, stochastic climate model of coupled slow and fast variables, in which only slow variables are observed. It is shown that the resulting closure model with energy-conserving nonlinearities efficiently captures the main statistical features of the slow variables, even when there is no formal scale separation and the fast variables are quite energetic. Second, an MSM is shown to successfully reproduce the statistics of a partially observed, generalized Lotka-Volterra model of population dynamics in its chaotic regime. 
The challenges here include the rarity of strange attractors in the model's parameter space and the existence of multiple attractor basins with fractal boundaries. The positivity constraint on the solutions' components replaces here the quadratic-energy-preserving constraint of fluid-flow problems and it successfully prevents blow-up.
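    The multilevel-regression idea behind EMR/MSMs can be caricatured in one dimension: fit a main-level model to the observed variable, then give the residual its own hidden-layer dynamics instead of treating it as white noise. The toy system below (an observed variable driven by an unobserved autocorrelated forcing) is invented for illustration, not the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(5)
# Synthetic scalar series: x is driven by an unobserved slow forcing f.
n = 5000
f = np.zeros(n)
x = np.zeros(n)
for t in range(1, n):
    f[t] = 0.95 * f[t - 1] + 0.1 * rng.standard_normal()
    x[t] = 0.8 * x[t - 1] + f[t - 1]

# Level 0: regress the increment of x on x itself; the residual r0 is the
# unresolved forcing that a one-level Markovian closure would call noise.
dx = x[1:] - x[:-1]
a0 = float(np.sum(x[:-1] * dx) / np.sum(x[:-1] ** 2))
r0 = dx - a0 * x[:-1]

# Level 1 (the "multilayer" step): the residual has strong memory of its
# own, so it deserves its own dynamical equation rather than white noise.
rho = float(np.sum(r0[:-1] * r0[1:]) / np.sum(r0[:-1] ** 2))
print(round(a0, 3), round(rho, 3))
```

    The large lag-1 autocorrelation of the level-0 residual is the signature of non-Markovian memory that motivates adding hidden layers, mirroring the GLE's memory term in the MZ formalism.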

  17. Drought resilience across ecologically dominant species: An experiment-model integration approach

    NASA Astrophysics Data System (ADS)

    Felton, A. J.; Warren, J.; Ricciuto, D. M.; Smith, M. D.

    2017-12-01

    The mechanisms contributing to variability in ecosystem recovery following drought are poorly understood. Grasslands of the central U.S. are ecologically and economically important ecosystems, yet are also highly sensitive to drought. Although characteristics of these ecosystems change across gradients of temperature and precipitation, a consistent feature among these systems is the presence of highly abundant, dominant grass species that control biomass production. As a result, the incorporation of these species' traits into terrestrial biosphere models may constrain predictions amid increases in climatic variability. Here we report the results of a modeling-experiment (MODEX) research approach. We investigated the physiological, morphological and growth responses of the dominant grass species from each of the four major grasslands of the central U.S. (ranging from tallgrass prairie to desert grassland) following severe drought. Despite significant differences in baseline values, full recovery in leaf physiological function was evident across species, and was consistently driven by the production of new leaves. Further, recovery in whole-plant carbon uptake tended to be driven by shifts in allocation from belowground to aboveground structures. However, there was clear variability among species in the magnitude of this dynamic as well as in the relative allocation to stem versus leaf production. As a result, all species harbored the physiological capacity to recover from drought, yet we posit that variability in the recovery of whole-plant carbon uptake is more strongly driven by variability in the sensitivity of species' morphology to soil moisture increases. The next step of this project will be to incorporate these and other existing data on these species and ecosystems into the Community Land Model in an effort to test the sensitivity of this model to these data.

  18. Weather models as virtual sensors to data-driven rainfall predictions in urban watersheds

    NASA Astrophysics Data System (ADS)

    Cozzi, Lorenzo; Galelli, Stefano; Pascal, Samuel Jolivet De Marc; Castelletti, Andrea

    2013-04-01

    Weather and climate predictions are a key element of urban hydrology, where they are used to inform water management and assist in delivering flood warnings. Indeed, the modelling of the very fast dynamics of urbanized catchments can be substantially improved by the use of weather/rainfall predictions. For example, in Singapore's Marina Reservoir catchment, runoff processes have a very short time of concentration (roughly one hour), so observational data alone are nearly useless for runoff prediction and weather predictions are required. Unfortunately, radar nowcasting methods cannot provide long-term weather predictions, whereas numerical models are limited by their coarse spatial scale. Moreover, numerical models are often unreliable because of the fast motion and limited spatial extent of rainfall events. In this study we investigate the combined use of data-driven modelling techniques and weather variables observed/simulated with a numerical model as a way to improve rainfall prediction accuracy and lead time in the Singapore metropolitan area. To explore the feasibility of the approach, we use a Weather Research and Forecasting (WRF) model as a virtual sensor network supplying the input variables (the states of the WRF model) to a machine learning rainfall prediction model. More precisely, we combine an input variable selection method and a non-parametric tree-based model to characterize the empirical relation between the rainfall measured at the catchment level and all possible weather input variables provided by the WRF model. We explore different lead times to evaluate the model's reliability for longer-term predictions, as well as different time lags to see how past information can improve results. Results show that the proposed approach allows a significant improvement in prediction accuracy over the WRF model alone for the Singapore urban area.
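
The pipeline described above, selecting the most informative WRF state variables and then fitting a non-parametric predictor of catchment rainfall, can be sketched in miniature (assumptions: correlation-based ranking stands in for the paper's input variable selection method, and a k-nearest-neighbour average stands in for its tree-based model; all data and variable names are illustrative):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

def select_inputs(candidates, target, k=2):
    """Rank candidate 'virtual sensor' variables by |correlation| with rain."""
    ranked = sorted(candidates,
                    key=lambda name: abs(pearson(candidates[name], target)),
                    reverse=True)
    return ranked[:k]

def knn_predict(X, y, query, k=3):
    """Non-parametric prediction: average of the k nearest training samples."""
    order = sorted(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)))
    return sum(y[i] for i in order[:k]) / k
```

With real WRF output, `candidates` would hold simulated fields sampled at virtual sensor locations and the target would be gauged rainfall at the catchment.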

  19. Novel Formulation of Adaptive MPC as EKF Using ANN Model: Multiproduct Semibatch Polymerization Reactor Case Study.

    PubMed

    Kamesh, Reddi; Rani, Kalipatnapu Yamuna

    2017-12-01

    In this paper, a novel formulation for nonlinear model predictive control (MPC) has been proposed incorporating the extended Kalman filter (EKF) control concept using a purely data-driven artificial neural network (ANN) model based on measurements for supervisory control. The proposed scheme consists of two modules focusing on online parameter estimation based on past measurements and control estimation over control horizon based on minimizing the deviation of model output predictions from set points along the prediction horizon. An industrial case study for temperature control of a multiproduct semibatch polymerization reactor posed as a challenge problem has been considered as a test bed to apply the proposed ANN-EKFMPC strategy at supervisory level as a cascade control configuration along with proportional integral controller [ANN-EKFMPC with PI (ANN-EKFMPC-PI)]. The proposed approach is formulated incorporating all aspects of MPC including move suppression factor for control effort minimization and constraint-handling capability including terminal constraints. The nominal stability analysis and offset-free tracking capabilities of the proposed controller are proved. Its performance is evaluated by comparison with a standard MPC-based cascade control approach using the same adaptive ANN model. The ANN-EKFMPC-PI control configuration has shown better controller performance in terms of temperature tracking, smoother input profiles, as well as constraint-handling ability compared with the ANN-MPC with PI approach for two products in summer and winter. The proposed scheme is found to be versatile although it is based on a purely data-driven model with online parameter estimation.
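
The supervisory loop of a model predictive controller of this general kind, predicting over a horizon with a data-driven model, penalizing tracking error plus control moves, and picking the best input, can be sketched as follows (a minimal sketch: exhaustive search over a few candidate moves replaces the paper's EKF-based formulation, and `model` is any one-step data-driven predictor; this is not the authors' ANN-EKFMPC scheme):

```python
def mpc_step(model, state, setpoint, u_prev, candidates, horizon=5, move_weight=0.1):
    """Pick the control input that minimizes tracking error plus a
    move-suppression penalty over the prediction horizon."""
    best_u, best_cost = None, float("inf")
    for u in candidates:
        x = state
        cost = move_weight * (u - u_prev) ** 2   # penalize large control moves
        for _ in range(horizon):
            x = model(x, u)                      # data-driven one-step predictor
            cost += (x - setpoint) ** 2          # tracking error along horizon
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u
```

In a cascade configuration like the one described, the returned supervisory move would become the set point passed down to the inner PI loop.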

  20. Combining Domain-driven Design and Mashups for Service Development

    NASA Astrophysics Data System (ADS)

    Iglesias, Carlos A.; Fernández-Villamor, José Ignacio; Del Pozo, David; Garulli, Luca; García, Boni

    This chapter presents the Romulus project's approach to service development using Java-based web technologies. Romulus aims at improving the productivity of service development by providing a tool-supported model to conceive Java-based web applications. This model follows a Domain-Driven Design approach, which states that the primary focus of software projects should be the core domain and domain logic. Romulus proposes a tool-supported model, the Roma Metaframework, that provides an abstraction layer on top of existing web frameworks and automates application generation from the domain model. This metaframework follows an object-centric approach, and complements Domain-Driven Design by identifying the most common cross-cutting concerns (security, service, view, ...) of web applications. The metaframework uses annotations to enrich the domain model with these cross-cutting concerns, so-called aspects. In addition, the chapter presents the usage of mashup technology in the metaframework for service composition, using the web mashup editor MyCocktail. The approach is applied to a scenario from the Mobile Phone Service Portability case study for the development of a new service.

  1. Driven-dissipative quantum Monte Carlo method for open quantum systems

    NASA Astrophysics Data System (ADS)

    Nagy, Alexandra; Savona, Vincenzo

    2018-05-01

    We develop a real-time full configuration-interaction quantum Monte Carlo approach to model driven-dissipative open quantum systems with Markovian system-bath coupling. The method enables stochastic sampling of the Liouville-von Neumann time evolution of the density matrix thanks to a massively parallel algorithm, thus providing estimates of observables on the nonequilibrium steady state. We present the underlying theory and introduce an initiator technique and importance sampling to reduce the statistical error. Finally, we demonstrate the efficiency of our approach by applying it to the driven-dissipative two-dimensional XYZ spin-1/2 model on a lattice.

  2. Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval

    PubMed Central

    Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene

    2018-01-01

    Abstract The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie PMID:29688379
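
The finding that boosting the weights of important keywords in a verbose query outperforms other methods can be illustrated with a toy TF-IDF scorer carrying per-term boosts (an illustrative sketch only: the challenge systems used full probabilistic IR engines, not this scoring function, and all documents and weights below are invented):

```python
import math
from collections import Counter

def score(query_weights, doc_tokens, doc_id):
    """TF-IDF-style relevance score where each query term carries a boost
    that up-weights important keywords in a verbose query."""
    tf = Counter(doc_tokens[doc_id])
    n_docs = len(doc_tokens)
    total = 0.0
    for term, boost in query_weights.items():
        df = sum(1 for toks in doc_tokens.values() if term in toks)
        if df == 0 or tf[term] == 0:
            continue
        idf = math.log(1 + n_docs / df)   # rarer terms count more
        total += boost * tf[term] * idf   # boosted term frequency * rarity
    return total
```

Ranking datasets by this score with well-chosen boosts mimics, in spirit, the keyword-boosting strategy the experiments found most effective.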

  3. Bayesian mixture modeling for blood sugar levels of diabetes mellitus patients (case study in RSUD Saiful Anwar Malang Indonesia)

    NASA Astrophysics Data System (ADS)

    Budi Astuti, Ani; Iriawan, Nur; Irhamah; Kuswanto, Heri; Sasiarini, Laksmi

    2017-10-01

    Bayesian statistics offers an approach that is very flexible with respect to sample size and data distribution. The Bayesian Mixture Model (BMM) is a Bayesian approach for multimodal models. Diabetes Mellitus (DM) is commonly known in the Indonesian community as "sweet urine" disease. It is a chronic non-communicable disease, yet it is very dangerous because of the complications it causes. A WHO report in 2013 showed that DM ranked 6th worldwide among the leading causes of death. In Indonesia, the prevalence of DM continues to increase over time. This research studies the patterns of DM data and builds BMM models of it through simulation studies, where the simulated data are based on the blood sugar levels of DM patients in RSUD Saiful Anwar Malang. The results successfully demonstrate that the DM data follow a normal mixture distribution, and the BMM models succeed in accommodating the real condition of the DM data based on the data-driven concept.
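
A normal mixture of the kind fitted here can be estimated, in miniature, with expectation-maximization (a maximum-likelihood stand-in for the paper's Bayesian estimation; two components and illustrative data only):

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_mixture(data, iters=50):
    """Fit a two-component normal mixture by EM. Returns the weight of
    component 1 and each component's (mean, variance)."""
    mu1, mu2 = min(data), max(data)
    var1 = var2 = (max(data) - min(data)) ** 2 / 4 or 1.0
    w = 0.5
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in data:
            p1 = w * normal_pdf(x, mu1, var1)
            p2 = (1 - w) * normal_pdf(x, mu2, var2)
            r.append(p1 / (p1 + p2))
        # M-step: re-estimate weight, means, and variances
        n1 = sum(r)
        n2 = len(data) - n1
        w = n1 / len(data)
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        var1 = sum(ri * (x - mu1) ** 2 for ri, x in zip(r, data)) / n1 + 1e-6
        var2 = sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, data)) / n2 + 1e-6
    return w, (mu1, var1), (mu2, var2)
```

On bimodal blood sugar readings, the two recovered means would correspond to the two modes the paper reports for the DM data.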

  4. SEAPODYM-LTL: a parsimonious zooplankton dynamic biomass model

    NASA Astrophysics Data System (ADS)

    Conchon, Anna; Lehodey, Patrick; Gehlen, Marion; Titaud, Olivier; Senina, Inna; Séférian, Roland

    2017-04-01

    Mesozooplankton organisms are of critical importance for the understanding of early life history of most fish stocks, as well as the nutrient cycles in the ocean. Ongoing climate change and the need for improved approaches to the management of living marine resources has driven recent advances in zooplankton modelling. The classical modeling approach tends to describe the whole biogeochemical and plankton cycle with increasing complexity. We propose here a different and parsimonious zooplankton dynamic biomass model (SEAPODYM-LTL) that is cost efficient and can be advantageously coupled with primary production estimated either from satellite derived ocean color data or biogeochemical models. In addition, the adjoint code of the model is developed allowing a robust optimization approach for estimating the few parameters of the model. In this study, we run the first optimization experiments using a global database of climatological zooplankton biomass data and we make a comparative analysis to assess the importance of resolution and primary production inputs on model fit to observations. We also compare SEAPODYM-LTL outputs to those produced by a more complex biogeochemical model (PISCES) but sharing the same physical forcings.

  5. Data-Driven Nonlinear Subspace Modeling for Prediction and Control of Molten Iron Quality Indices in Blast Furnace Ironmaking

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Ping; Song, Heda; Wang, Hong

    Blast furnace (BF) ironmaking is a nonlinear dynamic process with complicated physical-chemical reactions, in which multi-phase and multi-field coupling and large time delays occur during operation. In BF operation, the molten iron temperature (MIT) as well as the Si, P and S contents of molten iron are the most essential molten iron quality (MIQ) indices, whose measurement, modeling and control have always been important issues in the metallurgical engineering and automation fields. This paper develops a novel data-driven nonlinear state space modeling approach for the prediction and control of multivariate MIQ indices by integrating hybrid modeling and control techniques. First, to improve modeling efficiency, a data-driven hybrid method combining canonical correlation analysis and correlation analysis is proposed to identify, from the multitudinous factors that affect the MIQ indices, the most influential controllable variables as the modeling inputs. Then, a Hammerstein model for the prediction of MIQ indices is established using an LS-SVM based nonlinear subspace identification method. This model is further simplified by using a piecewise cubic Hermite interpolating polynomial to fit the complex nonlinear kernel function. Compared to the original Hammerstein model, the simplified model not only significantly reduces the computational complexity, but also retains almost the same reliability and accuracy for a stable prediction of the MIQ indices. Finally, to verify the practicability of the developed model, it is applied in designing a genetic algorithm based nonlinear predictive controller for the multivariate MIQ indices by directly taking the established model as a predictor. Industrial experiments show the advantages and effectiveness of the proposed approach.
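
The simplification step, replacing an expensive nonlinear function with a piecewise cubic Hermite interpolant evaluated at a few knots, can be sketched generically (a toy version with finite-difference slopes, not the authors' exact scheme or their LS-SVM kernel):

```python
def hermite_interpolator(xs, ys):
    """Build a piecewise cubic Hermite interpolant through knots (xs, ys),
    using centered finite-difference slopes at interior knots."""
    n = len(xs)
    slopes = [(ys[min(i + 1, n - 1)] - ys[max(i - 1, 0)]) /
              (xs[min(i + 1, n - 1)] - xs[max(i - 1, 0)]) for i in range(n)]

    def f(x):
        # locate the interval containing x (clamped to the knot range)
        i = 0
        while i < n - 2 and x > xs[i + 1]:
            i += 1
        h = xs[i + 1] - xs[i]
        t = (x - xs[i]) / h
        # cubic Hermite basis functions
        h00, h10 = 2 * t**3 - 3 * t**2 + 1, t**3 - 2 * t**2 + t
        h01, h11 = -2 * t**3 + 3 * t**2, t**3 - t**2
        return (h00 * ys[i] + h10 * h * slopes[i]
                + h01 * ys[i + 1] + h11 * h * slopes[i + 1])

    return f
```

Evaluating the interpolant costs a handful of arithmetic operations per call, which is the source of the computational savings the abstract describes.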

  6. Data-Driven Engineering of Social Dynamics: Pattern Matching and Profit Maximization

    PubMed Central

    Peng, Huan-Kai; Lee, Hao-Chih; Pan, Jia-Yu; Marculescu, Radu

    2016-01-01

    In this paper, we define a new problem related to social media, namely, the data-driven engineering of social dynamics. More precisely, given a set of observations from the past, we aim at finding the best short-term intervention that can lead to predefined long-term outcomes. Toward this end, we propose a general formulation that covers two useful engineering tasks as special cases, namely, pattern matching and profit maximization. By incorporating a deep learning model, we derive a solution using convex relaxation and quadratic-programming transformation. Moreover, we propose a data-driven evaluation method in place of the expensive field experiments. Using a Twitter dataset, we demonstrate the effectiveness of our dynamics engineering approach for both pattern matching and profit maximization, and study the multifaceted interplay among several important factors of dynamics engineering, such as solution validity, pattern-matching accuracy, and intervention cost. Finally, the method we propose is general enough to work with multi-dimensional time series, so it can potentially be used in many other applications. PMID:26771830
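
The quadratic-programming step that follows the convex relaxation can be illustrated with a minimal box-constrained solver (projected gradient descent on 0.5 x'Qx + c'x; a generic sketch, not the authors' solver, problem matrices, or intervention model):

```python
def projected_gradient_qp(Q, c, lo, hi, steps=500, lr=0.05):
    """Minimize 0.5 x'Qx + c'x subject to lo <= x <= hi by projected
    gradient descent: step along the negative gradient, then clip each
    coordinate back into its box."""
    n = len(c)
    x = [0.0] * n
    for _ in range(steps):
        grad = [sum(Q[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
        x = [min(hi[i], max(lo[i], x[i] - lr * grad[i])) for i in range(n)]
    return x
```

In the dynamics-engineering setting, `x` would encode the short-term intervention and the box bounds its cost or feasibility limits.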

  7. Data-Driven Engineering of Social Dynamics: Pattern Matching and Profit Maximization.

    PubMed

    Peng, Huan-Kai; Lee, Hao-Chih; Pan, Jia-Yu; Marculescu, Radu

    2016-01-01

    In this paper, we define a new problem related to social media, namely, the data-driven engineering of social dynamics. More precisely, given a set of observations from the past, we aim at finding the best short-term intervention that can lead to predefined long-term outcomes. Toward this end, we propose a general formulation that covers two useful engineering tasks as special cases, namely, pattern matching and profit maximization. By incorporating a deep learning model, we derive a solution using convex relaxation and quadratic-programming transformation. Moreover, we propose a data-driven evaluation method in place of the expensive field experiments. Using a Twitter dataset, we demonstrate the effectiveness of our dynamics engineering approach for both pattern matching and profit maximization, and study the multifaceted interplay among several important factors of dynamics engineering, such as solution validity, pattern-matching accuracy, and intervention cost. Finally, the method we propose is general enough to work with multi-dimensional time series, so it can potentially be used in many other applications.

  8. Framework for a clinical information system.

    PubMed

    Van De Velde, R; Lansiers, R; Antonissen, G

    2002-01-01

    The design and implementation of a Clinical Information System architecture are presented. This architecture has been developed and implemented based on components, following a strong underlying conceptual and technological model. Common Object Request Broker and n-tier technology are used, featuring centralised and departmental clinical information systems as the back-end store for all clinical data. Servers located in the "middle" tier apply the clinical (business) model and application rules. The main characteristics are the focus on modelling and the reuse of both data and business logic. Scalability, as well as adaptability to constantly changing requirements via component-driven computing, are the main reasons for this approach.

  9. Search Pathways: Modeling GeoData Search Behavior to Support Usable Application Development

    NASA Astrophysics Data System (ADS)

    Yarmey, L.; Rosati, A.; Tressel, S.

    2014-12-01

    Recent technical advances have enabled the development of new scientific data discovery systems. Metadata brokering, linked data, and other mechanisms allow users to discover scientific data of interest across growing volumes of heterogeneous content. As this complex content is matched with existing discovery technologies, people looking for scientific data are presented with an ever-growing array of features to sort, filter, subset, and scan through search returns to help them find what they are looking for. This paper examines the applicability of available technologies in connecting searchers with the data of interest. What metrics can be used to track success given shifting baselines of content and technology? How well do existing technologies map to steps in user search patterns? Taking a user-driven development approach, the team behind the Arctic Data Explorer interdisciplinary data discovery application invested heavily in usability testing and user search behavior analysis. Building on earlier library community search behavior work, models were developed to better define the diverse set of thought processes and steps users take to find data of interest, here called 'search pathways'. This research builds a deeper understanding of the user community that seeks to reuse scientific data. This approach ensures that development decisions are driven by clearly articulated user needs instead of ad hoc technology trends. Initial results from this research will be presented along with lessons learned for other discovery platform development and future directions for informatics research into search pathways.

  10. The role of data assimilation in maximizing the utility of geospace observations (Invited)

    NASA Astrophysics Data System (ADS)

    Matsuo, T.

    2013-12-01

    Data assimilation can facilitate maximizing the utility of existing geospace observations by offering an ultimate marriage of inductive (data-driven) and deductive (first-principles based) approaches to addressing critical questions in space weather. Assimilative approaches that incorporate dynamical models are, in particular, capable of making a diverse set of observations consistent with physical processes included in a first-principles model, and allowing unobserved physical states to be inferred from observations. These points will be demonstrated in the context of the application of an ensemble Kalman filter (EnKF) to a thermosphere and ionosphere general circulation model. An important attribute of this approach is that the feedback between plasma and neutral variables is self-consistently treated both in the forecast model as well as in the assimilation scheme. This takes advantage of the intimate coupling between the thermosphere and ionosphere described in general circulation models to enable the inference of unobserved thermospheric states from the relatively plentiful observations of the ionosphere. Given the ever-growing infrastructure for the global navigation satellite system, this is indeed a promising prospect for geospace data assimilation. In principle, similar approaches can be applied to any geospace observing systems to extract more geophysical information from a given set of observations than would otherwise be possible.
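
The key EnKF property described here, observations of one variable updating an unobserved but physically coupled variable through ensemble cross-covariances, can be shown with a two-component toy state (assumptions: perturbed-observation EnKF with the first component observed directly; nothing below is specific to the thermosphere-ionosphere model):

```python
import random

def enkf_update(ensemble, obs, obs_var, h=lambda s: s[0]):
    """Analysis step of a perturbed-observation ensemble Kalman filter for a
    2-component state of which only the first component is observed. The
    sample cross-covariance spreads the update to the unobserved component."""
    n = len(ensemble)
    hx = [h(s) for s in ensemble]
    hbar = sum(hx) / n
    means = [sum(s[k] for s in ensemble) / n for k in range(2)]
    # sample covariance between each state component and the observed quantity
    cov_xh = [sum((s[k] - means[k]) * (hi - hbar)
                  for s, hi in zip(ensemble, hx)) / (n - 1) for k in range(2)]
    var_h = sum((hi - hbar) ** 2 for hi in hx) / (n - 1)
    gain = [c / (var_h + obs_var) for c in cov_xh]   # Kalman gain per component
    updated = []
    for s, hi in zip(ensemble, hx):
        innov = obs + random.gauss(0.0, obs_var ** 0.5) - hi   # perturbed obs
        updated.append([s[k] + gain[k] * innov for k in range(2)])
    return updated
```

In the geospace setting, the observed component plays the role of plentiful ionospheric measurements and the unobserved one of thermospheric state, with the coupling supplied by the general circulation model's ensemble statistics.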

  11. Using Data-Driven Model-Brain Mappings to Constrain Formal Models of Cognition

    PubMed Central

    Borst, Jelmer P.; Nijboer, Menno; Taatgen, Niels A.; van Rijn, Hedderik; Anderson, John R.

    2015-01-01

    In this paper we propose a method to create data-driven mappings from components of cognitive models to brain regions. Cognitive models are notoriously hard to evaluate, especially based on behavioral measures alone. Neuroimaging data can provide additional constraints, but this requires a mapping from model components to brain regions. Although such mappings can be based on the experience of the modeler or on a reading of the literature, a formal method is preferred to prevent researcher-based biases. In this paper we used model-based fMRI analysis to create a data-driven model-brain mapping for five modules of the ACT-R cognitive architecture. We then validated this mapping by applying it to two new datasets with associated models. The new mapping was at least as powerful as an existing mapping that was based on the literature, and indicated where the models were supported by the data and where they have to be improved. We conclude that data-driven model-brain mappings can provide strong constraints on cognitive models, and that model-based fMRI is a suitable way to create such mappings. PMID:25747601

  12. Livestock Helminths in a Changing Climate: Approaches and Restrictions to Meaningful Predictions.

    PubMed

    Fox, Naomi J; Marion, Glenn; Davidson, Ross S; White, Piran C L; Hutchings, Michael R

    2012-03-06

    Climate change is a driving force for livestock parasite risk. This is especially true for helminths including the nematodes Haemonchus contortus, Teladorsagia circumcincta, Nematodirus battus, and the trematode Fasciola hepatica, since survival and development of free-living stages is chiefly affected by temperature and moisture. The paucity of long term predictions of helminth risk under climate change has driven us to explore optimal modelling approaches and identify current bottlenecks to generating meaningful predictions. We classify approaches as correlative or mechanistic, exploring their strengths and limitations. Climate is one aspect of a complex system and, at the farm level, husbandry has a dominant influence on helminth transmission. Continuing environmental change will necessitate the adoption of mitigation and adaptation strategies in husbandry. Long term predictive models need to have the architecture to incorporate these changes. Ultimately, an optimal modelling approach is likely to combine mechanistic processes and physiological thresholds with correlative bioclimatic modelling, incorporating changes in livestock husbandry and disease control. Irrespective of approach, the principal limitation to parasite predictions is the availability of active surveillance data and empirical data on physiological responses to climate variables. By combining improved empirical data and refined models with a broad view of the livestock system, robust projections of helminth risk can be developed.

  13. Comparison of the performance of tracer kinetic model-driven registration for dynamic contrast enhanced MRI using different models of contrast enhancement.

    PubMed

    Buonaccorsi, Giovanni A; Roberts, Caleb; Cheung, Sue; Watson, Yvonne; O'Connor, James P B; Davies, Karen; Jackson, Alan; Jayson, Gordon C; Parker, Geoff J M

    2006-09-01

    The quantitative analysis of dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) data is subject to model fitting errors caused by motion during the time-series data acquisition. However, the time-varying features that occur as a result of contrast enhancement can confound motion correction techniques based on conventional registration similarity measures. We have therefore developed a heuristic, locally controlled tracer kinetic model-driven registration procedure, in which the model accounts for contrast enhancement, and applied it to the registration of abdominal DCE-MRI data at high temporal resolution. Using severely motion-corrupted data sets that had been excluded from analysis in a clinical trial of an antiangiogenic agent, we compared the results obtained when using different models to drive the tracer kinetic model-driven registration with those obtained when using a conventional registration against the time series mean image volume. Using tracer kinetic model-driven registration, it was possible to improve model fitting by reducing the sum of squared errors but the improvement was only realized when using a model that adequately described the features of the time series data. The registration against the time series mean significantly distorted the time series data, as did tracer kinetic model-driven registration using a simpler model of contrast enhancement. When an appropriate model is used, tracer kinetic model-driven registration influences motion-corrupted model fit parameter estimates and provides significant improvements in localization in three-dimensional parameter maps. This has positive implications for the use of quantitative DCE-MRI for example in clinical trials of antiangiogenic or antivascular agents.
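
Model-driven registration hinges on fitting an enhancement model to each voxel's time series and judging alignment by the resulting sum of squared errors. The fitting step can be sketched with a grid search over a simple uptake model C(t) = A(1 - exp(-kt)) (a toy form; the paper fits proper tracer kinetic models, and the grid, parameters, and data below are illustrative):

```python
import math

def fit_enhancement(times, curve):
    """Grid-search fit of a toy uptake model C(t) = A * (1 - exp(-k t)) by
    minimizing the sum of squared errors. Returns (sse, A, k)."""
    best = (float("inf"), None, None)
    for A in [0.5 + 0.1 * i for i in range(20)]:
        for k in [0.1 + 0.1 * j for j in range(20)]:
            sse = sum((A * (1 - math.exp(-k * t)) - c) ** 2
                      for t, c in zip(times, curve))
            if sse < best[0]:
                best = (sse, A, k)
    return best
```

In model-driven registration, the residual `sse` under candidate spatial transformations acts as the similarity measure, so motion correction no longer fights against genuine contrast enhancement.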

  14. Solar wind driven empirical forecast models of the time derivative of the ground magnetic field

    NASA Astrophysics Data System (ADS)

    Wintoft, Peter; Wik, Magnus; Viljanen, Ari

    2015-03-01

    Empirical models are developed to provide 10-30-min forecasts of the magnitude of the time derivative of the local horizontal ground geomagnetic field (|dBh/dt|) over Europe. The models are driven by ACE solar wind data. A major part of the work has been devoted to the search for and selection of datasets to support the model development. To simplify the problem, but at the same time capture sudden changes, 30-min maximum values of |dBh/dt| are forecast with a cadence of 1 min. Models are tested both with and without the use of ACE SWEPAM plasma data. It is shown that the models generally capture sudden increases in |dBh/dt| that are associated with sudden impulses (SI). The SI is the dominant disturbance source for geomagnetic latitudes below 50° N, with only a minor contribution from substorms. However, on occasion, large disturbances can be seen in association with geomagnetic pulsations. For higher latitudes, longer-lasting disturbances associated with substorms are generally also captured. It is also shown that models using only solar wind magnetic field data as input perform in most cases equally well as models that include plasma data. The models have been verified using different approaches, including the extremal dependence index, which is suitable for rare events.
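
The extremal dependence index used for verification can be computed from a 2x2 contingency table (assuming the common definition EDI = (ln F - ln H)/(ln F + ln H), with hit rate H and false-alarm rate F; the paper's exact convention should be checked against its references):

```python
import math

def edi(hits, misses, false_alarms, correct_negatives):
    """Extremal dependence index from a 2x2 forecast contingency table,
    using EDI = (ln F - ln H) / (ln F + ln H)."""
    H = hits / (hits + misses)                       # hit rate
    F = false_alarms / (false_alarms + correct_negatives)  # false-alarm rate
    return (math.log(F) - math.log(H)) / (math.log(F) + math.log(H))
```

EDI is 0 for a random forecast (H = F) and approaches 1 as hits come to dominate false alarms, which is why it stays informative for rare events such as large |dBh/dt| spikes.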

  15. Consistent data-driven computational mechanics

    NASA Astrophysics Data System (ADS)

    González, D.; Chinesta, F.; Cueto, E.

    2018-05-01

    We present a novel method, within the realm of data-driven computational mechanics, to obtain reliable and thermodynamically sound simulation from experimental data. We thus avoid the need to fit any phenomenological model in the construction of the simulation model. This kind of techniques opens unprecedented possibilities in the framework of data-driven application systems and, particularly, in the paradigm of industry 4.0.

  16. Simulating mesoscale coastal evolution for decadal coastal management: A new framework integrating multiple, complementary modelling approaches

    NASA Astrophysics Data System (ADS)

    van Maanen, Barend; Nicholls, Robert J.; French, Jon R.; Barkwith, Andrew; Bonaldo, Davide; Burningham, Helene; Brad Murray, A.; Payo, Andres; Sutherland, James; Thornhill, Gillian; Townend, Ian H.; van der Wegen, Mick; Walkden, Mike J. A.

    2016-03-01

    Coastal and shoreline management increasingly needs to consider morphological change occurring at decadal to centennial timescales, especially that related to climate change and sea-level rise. This requires the development of morphological models operating at a mesoscale, defined by time and length scales of the order of 10^1 to 10^2 years and 10^1 to 10^2 km. So-called 'reduced complexity' models that represent critical processes at scales not much smaller than the primary scale of interest, and are regulated by capturing the critical feedbacks that govern landform behaviour, are proving effective as a means of exploring emergent coastal behaviour at a landscape scale. Such models tend to be computationally efficient and are thus easily applied within a probabilistic framework. At the same time, reductionist models, built upon a more detailed description of hydrodynamic and sediment transport processes, are capable of application at increasingly broad spatial and temporal scales. More qualitative modelling approaches are also emerging that can guide the development and deployment of quantitative models, and these can be supplemented by varied data-driven modelling approaches that can achieve new explanatory insights from observational datasets. Such disparate approaches have hitherto been pursued largely in isolation by mutually exclusive modelling communities. Brought together, they have the potential to facilitate a step change in our ability to simulate the evolution of coastal morphology at scales that are most relevant to managing erosion and flood risk. Here, we advocate and outline a new integrated modelling framework that deploys coupled mesoscale reduced complexity models, reductionist coastal area models, data-driven approaches, and qualitative conceptual models.
Integration of these heterogeneous approaches gives rise to model compositions that can potentially resolve decadal- to centennial-scale behaviour of diverse coupled open coast, estuary and inner shelf settings. This vision is illustrated through an idealised composition of models for a ~ 70 km stretch of the Suffolk coast, eastern England. A key advantage of model linking is that it allows a wide range of real-world situations to be simulated from a small set of model components. However, this process involves more than just the development of software that allows for flexible model coupling. The compatibility of radically different modelling assumptions remains to be carefully assessed and testing as well as evaluating uncertainties of models in composition are areas that require further attention.

  17. Universal fragment descriptors for predicting properties of inorganic crystals

    NASA Astrophysics Data System (ADS)

    Isayev, Olexandr; Oses, Corey; Toher, Cormac; Gossett, Eric; Curtarolo, Stefano; Tropsha, Alexander

    2017-06-01

    Although historically materials discovery has been driven by a laborious trial-and-error process, knowledge-driven materials design can now be enabled by the rational combination of Machine Learning methods and materials databases. Here, data from the AFLOW repository for ab initio calculations is combined with Quantitative Materials Structure-Property Relationship models to predict important properties: metal/insulator classification, band gap energy, bulk/shear moduli, Debye temperature and heat capacities. The prediction's accuracy compares well with the quality of the training data for virtually any stoichiometric inorganic crystalline material, reciprocating the available thermomechanical experimental data. The universality of the approach is attributed to the construction of the descriptors: Property-Labelled Materials Fragments. The representations require only minimal structural input allowing straightforward implementations of simple heuristic design rules.

  18. Universal fragment descriptors for predicting properties of inorganic crystals.

    PubMed

    Isayev, Olexandr; Oses, Corey; Toher, Cormac; Gossett, Eric; Curtarolo, Stefano; Tropsha, Alexander

    2017-06-05

    Although historically materials discovery has been driven by a laborious trial-and-error process, knowledge-driven materials design can now be enabled by the rational combination of Machine Learning methods and materials databases. Here, data from the AFLOW repository for ab initio calculations is combined with Quantitative Materials Structure-Property Relationship models to predict important properties: metal/insulator classification, band gap energy, bulk/shear moduli, Debye temperature and heat capacities. The prediction's accuracy compares well with the quality of the training data for virtually any stoichiometric inorganic crystalline material, reciprocating the available thermomechanical experimental data. The universality of the approach is attributed to the construction of the descriptors: Property-Labelled Materials Fragments. The representations require only minimal structural input allowing straightforward implementations of simple heuristic design rules.

  19. A Predictive Approach to Network Reverse-Engineering

    NASA Astrophysics Data System (ADS)

    Wiggins, Chris

    2005-03-01

A central challenge of systems biology is the "reverse engineering" of transcriptional networks: inferring which genes exert regulatory control over which other genes. Attempting such inference at the genomic scale has only recently become feasible, via data-intensive biological innovations such as DNA microarrays ("DNA chips") and the sequencing of whole genomes. In this talk we present a predictive approach to network reverse-engineering, in which we integrate DNA chip data and sequence data to build a model of the transcriptional network of the yeast S. cerevisiae capable of predicting the response of genes in unseen experiments. The technique can also be used to extract "motifs," sequence elements which act as binding sites for regulatory proteins. We validate by a number of approaches and present comparisons of theoretical predictions vs. experimental data, along with biological interpretations of the resulting model. En route, we illustrate some basic notions in statistical learning theory (fitting vs. over-fitting; cross-validation; assessing statistical significance), highlighting ways in which physicists can make a unique contribution to data-driven approaches to reverse engineering.
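The statistical-learning notions this talk highlights (fitting vs. over-fitting, cross-validation) can be illustrated with a minimal sketch; the ridge model and the synthetic "motif counts vs. expression" data below are illustrative stand-ins, not the talk's actual method or data:

```python
# Minimal cross-validation sketch: score a ridge regression on synthetic
# data and report held-out (rather than training) fit quality.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                       # e.g. motif counts per promoter
y = X[:, 0] * 2.0 + rng.normal(scale=0.5, size=200)  # response of one gene

model = Ridge(alpha=1.0)
cv_scores = cross_val_score(model, X, y, cv=5)       # R^2 on each held-out fold
print(cv_scores.mean())
```

Scoring on held-out folds rather than the training set is what exposes over-fitting: a model that memorizes noise scores well in-sample but poorly across folds.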

  20. Integrative Systems Biology for Data Driven Knowledge Discovery

    PubMed Central

    Greene, Casey S.; Troyanskaya, Olga G.

    2015-01-01

    Integrative systems biology is an approach that brings together diverse high throughput experiments and databases to gain new insights into biological processes or systems at molecular through physiological levels. These approaches rely on diverse high-throughput experimental techniques that generate heterogeneous data by assaying varying aspects of complex biological processes. Computational approaches are necessary to provide an integrative view of these experimental results and enable data-driven knowledge discovery. Hypotheses generated from these approaches can direct definitive molecular experiments in a cost effective manner. Using integrative systems biology approaches, we can leverage existing biological knowledge and large-scale data to improve our understanding of yet unknown components of a system of interest and how its malfunction leads to disease. PMID:21044756

  1. Satellite detection of land-use change and effects on regional forest aboveground biomass estimates

    Treesearch

    Daolan Zheng; Linda S. Heath; Mark J. Ducey

    2008-01-01

We used remote-sensing-driven models to detect land-cover change effects on forest aboveground biomass (AGB) density (Mg·ha−1, dry weight) and total AGB (Tg) in Minnesota, Wisconsin, and Michigan, USA, between 1992 and 2001, and conducted an evaluation of the approach. Inputs included remotely sensed 1992 reflectance data...

  2. On the data-driven inference of modulatory networks in climate science: an application to West African rainfall

    NASA Astrophysics Data System (ADS)

    González, D. L., II; Angus, M. P.; Tetteh, I. K.; Bello, G. A.; Padmanabhan, K.; Pendse, S. V.; Srinivas, S.; Yu, J.; Semazzi, F.; Kumar, V.; Samatova, N. F.

    2015-01-01

Decades of hypothesis-driven and/or first-principles research have been applied towards the discovery and explanation of the mechanisms that drive climate phenomena, such as western African Sahel summer rainfall variability. Although connections between various climate factors have been theorized, not all of the key relationships are fully understood. We propose a data-driven approach to identify candidate players in this climate system, which can help explain underlying mechanisms and even suggest new relationships, to facilitate building a more comprehensive and predictive model of the modulatory relationships influencing a climate phenomenon of interest. We applied coupled heterogeneous association rule mining (CHARM), Lasso multivariate regression, and dynamic Bayesian networks to find relationships within a complex system, and explored means with which to obtain a consensus result from the application of such varied methodologies. Using this fusion of approaches, we identified relationships among climate factors that modulate Sahel rainfall. These relationships fall into two categories: well-known associations from prior climate knowledge, such as the relationship with the El Niño-Southern Oscillation (ENSO), and putative links, such as the North Atlantic Oscillation, that invite further research.

  3. On the data-driven inference of modulatory networks in climate science: An application to West African rainfall

    DOE PAGES

    Gonzalez, II, D. L.; Angus, M. P.; Tetteh, I. K.; ...

    2015-01-13

Decades of hypothesis-driven and/or first-principles research have been applied towards the discovery and explanation of the mechanisms that drive climate phenomena, such as western African Sahel summer rainfall variability. Although connections between various climate factors have been theorized, not all of the key relationships are fully understood. We propose a data-driven approach to identify candidate players in this climate system, which can help explain underlying mechanisms and even suggest new relationships, to facilitate building a more comprehensive and predictive model of the modulatory relationships influencing a climate phenomenon of interest. We applied coupled heterogeneous association rule mining (CHARM), Lasso multivariate regression, and dynamic Bayesian networks to find relationships within a complex system, and explored means with which to obtain a consensus result from the application of such varied methodologies. Using this fusion of approaches, we identified relationships among climate factors that modulate Sahel rainfall. These relationships fall into two categories: well-known associations from prior climate knowledge, such as the relationship with the El Niño–Southern Oscillation (ENSO), and putative links, such as the North Atlantic Oscillation, that invite further research.
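Of the three methods combined in these two records, the Lasso step is the easiest to sketch: its L1 penalty shrinks the coefficients of uninformative predictors to exactly zero, leaving a short list of candidate drivers. The index names and coefficients below are synthetic stand-ins, not the study's data:

```python
# Lasso as a driver-selection sketch: only the first two synthetic indices
# actually drive the "rainfall" response; Lasso should zero out the rest.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n = 300
names = ["ENSO", "NAO", "idx3", "idx4", "idx5"]   # standardized climate indices
X = rng.normal(size=(n, len(names)))
rainfall = 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=n)

lasso = Lasso(alpha=0.1).fit(X, rainfall)
selected = [nm for nm, c in zip(names, lasso.coef_) if abs(c) > 1e-3]
print(selected)
```

In the study's setting the consensus across CHARM, Lasso, and dynamic Bayesian networks is what separates well-known associations from putative links.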

  4. Data warehousing methods and processing infrastructure for brain recovery research.

    PubMed

    Gee, T; Kenny, S; Price, C J; Seghier, M L; Small, S L; Leff, A P; Pacurar, A; Strother, S C

    2010-09-01

    In order to accelerate translational neuroscience with the goal of improving clinical care it has become important to support rapid accumulation and analysis of large, heterogeneous neuroimaging samples and their metadata from both normal control and patient groups. We propose a multi-centre, multinational approach to accelerate the data mining of large samples and facilitate data-led clinical translation of neuroimaging results in stroke. Such data-driven approaches are likely to have an early impact on clinically relevant brain recovery while we simultaneously pursue the much more challenging model-based approaches that depend on a deep understanding of the complex neural circuitry and physiological processes that support brain function and recovery. We present a brief overview of three (potentially converging) approaches to neuroimaging data warehousing and processing that aim to support these diverse methods for facilitating prediction of cognitive and behavioral recovery after stroke, or other types of brain injury or disease.

  5. Data-driven and hybrid coastal morphological prediction methods for mesoscale forecasting

    NASA Astrophysics Data System (ADS)

    Reeve, Dominic E.; Karunarathna, Harshinie; Pan, Shunqi; Horrillo-Caraballo, Jose M.; Różyński, Grzegorz; Ranasinghe, Roshanka

    2016-03-01

It is now common for coastal planning to anticipate changes anywhere from 70 to 100 years into the future. The process models developed and used for scheme design or for large-scale oceanography are currently inadequate for this task. This has prompted the development of a plethora of alternative methods. Some, such as reduced-complexity or hybrid models, simplify the governing equations while retaining the processes considered to govern observed morphological behaviour. The computational cost of these models is low and they have proven effective in exploring morphodynamic trends and improving our understanding of mesoscale behaviour. One drawback is that there is no generally agreed set of principles on which to base the simplifying assumptions, and predictions can vary considerably between models. An alternative approach is data-driven techniques that are based entirely on analysis and extrapolation of observations. Here, we discuss the application of some of the better-known and emerging methods in this category to argue that, with the increasing availability of observations from coastal monitoring programmes and the development of more sophisticated statistical analysis techniques, data-driven models provide a valuable addition to the armoury of methods available for mesoscale prediction. The continuation of established monitoring programmes is paramount, and those that provide contemporaneous records of the driving forces and the shoreline response are the most valuable in this regard. In the second part of the paper we discuss recent research that combines some of the hybrid techniques with data analysis methods in order to synthesise a more consistent means of predicting mesoscale coastal morphological evolution. While encouraging in certain applications, a universally applicable approach has yet to be found. The route to linking different model types is highlighted as a major challenge and requires further research to establish its viability.
We argue that key elements of a successful solution will need to account for dependencies between driving parameters (such as wave height and tide level), and be able to predict step changes in the configuration of coastal systems.

  6. Individualized Prediction of Heat Stress in Firefighters: A Data-Driven Approach Using Classification and Regression Trees.

    PubMed

    Mani, Ashutosh; Rao, Marepalli; James, Kelley; Bhattacharya, Amit

    2015-01-01

The purpose of this study was to explore data-driven models, based on decision trees, to develop practical, easy-to-use predictive models for early identification of firefighters who are likely to cross the threshold of hyperthermia during live-fire training. Predictive models were created for three consecutive live-fire training scenarios. The final predicted outcome was a categorical variable: will a firefighter cross the upper threshold of hyperthermia (yes/no). Two tiers of models were built, one with and one without taking into account the outcome (whether a firefighter crossed hyperthermia or not) from the previous training scenario. The first tier of models included age, baseline heart rate and core body temperature, body mass index, and duration of the training scenario as predictors. The second tier of models included the outcome of the previous scenario in the prediction space, in addition to all the predictors from the first tier. Classification and regression trees were used independently for prediction. The response variable for the regression tree was the quantitative variable: core body temperature at the end of each scenario. The predicted quantitative variable from the regression trees was compared to the upper threshold of hyperthermia (38°C) to predict whether a firefighter would enter hyperthermia. The performance of the classification and regression tree models was satisfactory for the second (success rate = 79%) and third (success rate = 89%) training scenarios but not for the first (success rate = 43%). Data-driven models based on decision trees can be a useful tool for predicting physiological response without modeling the underlying physiological systems. Early prediction of heat stress, coupled with proactive interventions such as pre-cooling, can help reduce heat stress in firefighters.
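The first-tier classification step can be sketched with scikit-learn. The predictor names match the record's first tier, but every value below is synthetic and the generating rule is invented purely for illustration, not taken from the study:

```python
# Decision-tree sketch of the first-tier hyperthermia classifier.
# All data are synthetic; the "true" rule (end temp rises with baseline
# temp and scenario duration) is an invented stand-in.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
n = 400
age   = rng.uniform(20, 55, n)        # years
hr    = rng.uniform(60, 100, n)       # baseline heart rate, bpm
temp0 = rng.uniform(36.5, 37.5, n)    # baseline core temperature, deg C
bmi   = rng.uniform(20, 35, n)
dur   = rng.uniform(10, 30, n)        # scenario duration, min
X = np.column_stack([age, hr, temp0, bmi, dur])

temp_end = temp0 + 0.04 * dur + rng.normal(scale=0.2, size=n)
y = (temp_end > 38.0).astype(int)     # crossed the 38 deg C threshold?

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
acc = tree.score(X, y)
```

A shallow tree like this is readable as a set of if/then rules, which is the "practical and easy to use" property the study is after; the regression-tree variant would instead predict `temp_end` and compare it to 38°C.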

  7. Modelling irradiation-induced softening in BCC iron by crystal plasticity approach

    NASA Astrophysics Data System (ADS)

    Xiao, Xiazi; Terentyev, Dmitry; Yu, Long; Song, Dingkun; Bakaev, A.; Duan, Huiling

    2015-11-01

A crystal plasticity model (CPM) of BCC iron that accounts for radiation-induced strain softening is proposed. The CPM is based on the plastically-driven and thermally-activated removal of dislocation loops. Atomistic simulations are applied to parameterize dislocation-defect interactions. Combining experimental microstructures, defect-hardening/absorption rules from atomistic simulations, and a CPM fitted to the properties of non-irradiated iron, the model achieves good agreement with experimental data on radiation-induced strain softening and flow stress increase under neutron irradiation.

  8. Simulating Flying Insects Using Dynamics and Data-Driven Noise Modeling to Generate Diverse Collective Behaviors

    PubMed Central

    Ren, Jiaping; Wang, Xinjie; Manocha, Dinesh

    2016-01-01

We present a biologically plausible dynamics model to simulate swarms of flying insects. Our formulation, which is based on biological conclusions and experimental observations, is designed to simulate large insect swarms of varying densities. We use a force-based model that captures different interactions between the insects and the environment and computes collision-free trajectories for each individual insect. Furthermore, we model the noise as a constructive force at the collective level and present a technique to generate noise-induced insect movements in a large swarm that are similar to those observed in real-world trajectories. We use a data-driven formulation that is based on pre-recorded insect trajectories. We also present a novel evaluation metric and a statistical validation approach that takes into account various characteristics of insect motions. In practice, the combination of a curl noise function with our dynamics model is used to generate realistic swarm simulations and emergent behaviors. We highlight its performance for simulating large flying swarms of midges, fruit flies, locusts, and moths and demonstrate many collective behaviors, including aggregation, migration, phase transition, and escape responses. PMID:27187068
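The curl-noise ingredient can be sketched numerically in 2D: take a smooth scalar potential and use its perpendicular gradient as the velocity perturbation, which is divergence-free by construction (so the noise stirs trajectories without creating sinks agents pile into). The potential below is a simple analytic stand-in for a procedural noise field, chosen so the property is easy to verify:

```python
# 2D curl-noise sketch: v = (dpsi/dy, -dpsi/dx) is divergence-free.
import numpy as np

n = 64
x = np.linspace(0, 2 * np.pi, n)
X, Y = np.meshgrid(x, x, indexing="ij")
psi = np.sin(X) * np.cos(2 * Y)          # stand-in for a smooth noise field

h = x[1] - x[0]
dpsi_dx, dpsi_dy = np.gradient(psi, h, h)
vx, vy = dpsi_dy, -dpsi_dx               # curl of the out-of-plane field (0, 0, psi)

# Divergence should vanish up to finite-difference error (check the interior,
# where central differences are used).
dvx_dx = np.gradient(vx, h, axis=0)
dvy_dy = np.gradient(vy, h, axis=1)
div = dvx_dx + dvy_dy
print(np.abs(div[2:-2, 2:-2]).max())
```

In a real swarm simulation `psi` would be a time-varying procedural noise function and the curl field would be added to the force-based dynamics as the collective-level noise term.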

  9. A model for water discharge based on energy consumption data (WATEN).

    NASA Astrophysics Data System (ADS)

    Moyano, María Carmen; Tornos, Lucía; Juana, Luis

    2014-05-01

As water conservation becomes a major concern, a lumped model named WATEN is proposed to analyse the water balance in the B-XII Irrigation Sector of the Lower Guadalquivir Irrigated Area, one of the largest irrigated areas in Spain. The aim of this work is to approach the hydrological study of an irrigation district lacking robust data in such a manner that the water balance is performed from less to more process complexity. WATEN's parameters are the total and readily available moisture in the soil, a fixed percentage for effective precipitation, and the irrigation efficiency. The Sector has six different drainage pumping stations, each with its own pumping groups and no water-flow measurement devices. Energy consumption depends on the working pumping stations and groups, and on the variable water level to discharge. Energy consumed in the drainage pumping stations has been used for calibration. The study relied on two monthly series of data: the volume of drainage obtained from the model and the energy consumed in the pumping stations. A double-mass analysis permitted the detection of tendencies in the data. The two resulting series have been compared to assess model performance, in particular via the Pearson product-moment correlation coefficient and the Nash-Sutcliffe coefficient of efficiency, e2, determined for monthly data and for annual and monthly average data. For model calibration, we followed a classical approach based on objective-function optimization, and a robust approach based on a Markov chain Monte Carlo simulation process, driven in a manner similar to genetic algorithms and named Parameters Estimation on Driven Trials (PEDT), which aims to reduce computational requirements. WATEN has been parameterised maintaining its physical and conceptual rationality. The study approach is outlined as a progressive introduction of data.
In this manner, we can observe its effect on the studied objective functions, and visualize whether new data add significant improvements to model results. The model attained an average Nash-Sutcliffe coefficient e2 ≈ 0.90 between energy-based drainage observations and estimated drainage discharge. The study has shown that the Sector crop evapotranspiration is lower than the expected value in pristine conditions. This reduction would be more noticeable at the end of the summer months, reaching up to a 40% reduction. Average drainage in the studied period is about 3700 m3/ha/year. This methodology is intended to be the basis for similar worldwide studies of scarce-data irrigation districts with drainage discharge to receiving water bodies, and to serve as a guide for similar future applications.
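The Nash-Sutcliffe coefficient used to assess WATEN is a one-line computation on paired observed and simulated series; the monthly drainage values below are invented for illustration, not the study's data:

```python
# Nash-Sutcliffe efficiency: 1 means a perfect fit, 0 means the model is
# no better than predicting the observed mean, negative means worse.
import numpy as np

def nash_sutcliffe(obs, sim):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Hypothetical monthly drainage volumes (arbitrary units).
observed  = np.array([3.1, 2.8, 4.0, 5.2, 6.1, 4.9, 3.5, 2.9])
simulated = np.array([3.0, 3.0, 3.8, 5.0, 6.3, 4.7, 3.6, 3.1])
e2 = nash_sutcliffe(observed, simulated)
```

Computing it separately on monthly data and on annual/monthly averages, as the study does, distinguishes timing errors from volume errors.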

  10. A co-clinical approach identifies mechanisms and potential therapies for androgen deprivation resistance in prostate cancer

    PubMed Central

    Lunardi, Andrea; Ala, Ugo; Epping, Mirjam T.; Salmena, Leonardo; Clohessy, John G.; Webster, Kaitlyn A.; Wang, Guocan; Mazzucchelli, Roberta; Bianconi, Maristella; Stack, Edward C.; Lis, Rosina; Patnaik, Akash; Cantley, Lewis C.; Bubley, Glenn; Cordon-Cardo, Carlos; Gerald, William L.; Montironi, Rodolfo; Signoretti, Sabina; Loda, Massimo; Nardella, Caterina; Pandolfi, Pier Paolo

    2013-01-01

    Here we report an integrated analysis that leverages data from treatment of genetic mouse models of prostate cancer along with clinical data from patients to elucidate new mechanisms of castration resistance. We show that castration counteracts tumor progression in a Pten-loss driven mouse model of prostate cancer through the induction of apoptosis and proliferation block. Conversely, this response is bypassed upon deletion of either Trp53 or Lrf together with Pten, leading to the development of castration resistant prostate cancer (CRPC). Mechanistically, the integrated acquisition of data from mouse models and patients identifies the expression patterns of XAF1-XIAP/SRD5A1 as a predictive and actionable signature for CRPC. Importantly, we show that combined inhibition of XIAP, SRD5A1, and AR pathways overcomes castration resistance. Thus, our co-clinical approach facilitates stratification of patients and the development of tailored and innovative therapeutic treatments. PMID:23727860

  11. Strategies for Early Outbreak Detection of Malaria in the Amhara Region of Ethiopia

    NASA Astrophysics Data System (ADS)

    Nekorchuk, D.; Gebrehiwot, T.; Mihretie, A.; Awoke, W.; Wimberly, M. C.

    2017-12-01

Traditional epidemiological approaches to early detection of disease outbreaks are based on relatively straightforward thresholds (e.g., the 75th percentile, standard deviations) estimated from historical case data. For diseases with strong seasonality, these can be modified to create separate thresholds for each seasonal time step. However, for disease processes that are non-stationary, more sophisticated techniques are needed to estimate outbreak threshold values accurately. Early detection for geohealth-related diseases that also have environmental drivers, such as vector-borne diseases, may also benefit from the integration of time-lagged environmental data and disease ecology models into the threshold calculations. The Epidemic Prognosis Incorporating Disease and Environmental Monitoring for Integrated Assessment (EPIDEMIA) project has been integrating malaria case surveillance with remotely sensed environmental data for early detection, warning, and forecasting of malaria epidemics in the Amhara region of Ethiopia, and has five years of weekly time series data from 47 woredas (districts). Efforts to reduce the burden of malaria in Ethiopia have met with notable success in the past two decades, with major reductions in cases and deaths. However, malaria remains a significant public health threat: 60% of the population live in malarious areas, and because transmission is seasonal and unstable with cyclic outbreaks, protective immunity is generally low, which can cause high morbidity and mortality during epidemics. This study compared several approaches for defining outbreak thresholds and for identifying a potential outbreak based on deviations from these thresholds. We found that model-based approaches that accounted for climate-driven seasonality in malaria transmission were most effective, and that incorporating a trend component improved outbreak detection in areas with active malaria elimination efforts.
An advantage of these early detection techniques is that they can detect climate-driven outbreaks as well as outbreaks driven by social factors such as human migration.
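The traditional week-specific percentile threshold that the study uses as a baseline can be sketched directly; the weekly case counts below are synthetic, not Amhara surveillance data:

```python
# Seasonal percentile threshold sketch: one 75th-percentile threshold per
# calendar week, estimated from 5 years of synthetic weekly case counts.
import numpy as np

rng = np.random.default_rng(3)
weeks_per_year, years = 52, 5
# Synthetic seasonal mean case count (hypothetical shape and scale).
season = 20 + 15 * np.sin(2 * np.pi * np.arange(weeks_per_year) / weeks_per_year)
history = rng.poisson(season, size=(years, weeks_per_year))

# Week-specific outbreak thresholds from the historical record.
threshold = np.percentile(history, 75, axis=0)

# Flag a new year's counts against the matching week's threshold.
current = rng.poisson(season * 1.4)          # simulated elevated-transmission year
alerts = np.flatnonzero(current > threshold)
print(len(alerts))
```

The study's point is that this purely historical scheme breaks down when transmission is non-stationary, which is where the model-based thresholds with climate covariates and a trend component do better.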

  12. EXPLORING DATA-DRIVEN SPECTRAL MODELS FOR APOGEE M DWARFS

    NASA Astrophysics Data System (ADS)

Lua Birky, Jessica; Hogg, David; Burgasser, Adam J.

    2018-01-01

The Cannon (Ness et al. 2015; Casey et al. 2016) is a flexible, data-driven spectral modeling and parameter inference framework, demonstrated on high-resolution Apache Point Galactic Evolution Experiment (APOGEE; λ/Δλ~22,500, 1.5-1.7µm) spectra of giant stars to estimate stellar labels (Teff, logg, [Fe/H], and chemical abundances) to precisions higher than the model-grid pipeline. The lack of reliable stellar parameters reported by the APOGEE pipeline for temperatures less than ~3550 K motivates extending this approach to M dwarf stars. Using a training set of 51 M dwarfs with spectral types ranging from M0 to M9 obtained from SDSS optical spectra, we demonstrate that the Cannon can infer spectral types to a precision of +/-0.6 types, making it an effective tool for classifying high-resolution near-infrared spectra. We discuss the potential for extending this work to determine the physical stellar labels Teff, logg, and [Fe/H]. This work is supported by the SDSS Faculty and Student (FAST) initiative.
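The Cannon's core idea is to fit each spectral pixel as a polynomial in the stellar labels from a training set, then invert the fitted model to infer labels for new spectra. A minimal linear-in-labels sketch (the published model is quadratic with per-pixel scatter; all sizes and values here are synthetic, not APOGEE data):

```python
# Cannon-style sketch: per-pixel linear model flux = theta . [1, labels],
# trained by least squares, then inverted for a new spectrum.
import numpy as np

rng = np.random.default_rng(4)
n_stars, n_pix = 80, 300
labels = rng.normal(size=(n_stars, 3))              # e.g. scaled Teff, logg, [Fe/H]
design = np.hstack([np.ones((n_stars, 1)), labels]) # linear-in-labels design matrix

true_coeffs = rng.normal(size=(4, n_pix))           # one coefficient vector per pixel
flux = design @ true_coeffs + 0.01 * rng.normal(size=(n_stars, n_pix))

# Training step: fit per-pixel coefficients from stars with known labels.
coeffs, *_ = np.linalg.lstsq(design, flux, rcond=None)

# Test step: infer the labels of a new spectrum from the fitted model.
new_labels = np.array([0.5, -1.0, 0.2])
new_flux = np.concatenate([[1.0], new_labels]) @ coeffs
est, *_ = np.linalg.lstsq(coeffs[1:].T, new_flux - coeffs[0], rcond=None)
```

Because every pixel constrains the same three labels, the inversion is a heavily overdetermined least-squares problem, which is why the method can beat per-line pipelines at low signal-to-noise.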

  13. A Data-Driven Response Virtual Sensor Technique with Partial Vibration Measurements Using Convolutional Neural Network.

    PubMed

    Sun, Shan-Bin; He, Yuan-Yuan; Zhou, Si-Da; Yue, Zhen-Jiang

    2017-12-12

Measurement of dynamic responses plays an important role in structural health monitoring, damage detection, and other fields of research. In aerospace engineering, however, physical sensors are limited by the operational conditions of spacecraft, owing to the severe environment of outer space. This paper proposes a virtual sensor model with partial vibration measurements using a convolutional neural network. The transmissibility function is employed as prior knowledge. A four-layer neural network with two convolutional layers, one fully connected layer, and an output layer is proposed as the predicting model. Numerical examples of two different structural dynamic systems demonstrate the performance of the proposed approach. The technique is further validated in a simply supported beam experiment against a modal-model-based virtual sensor, which uses modal parameters, such as mode shapes, to estimate the responses of the faulty sensors. The results show that the presented data-driven response virtual sensor technique can predict structural response with high accuracy.
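The transmissibility function used as prior knowledge relates two response channels of the same structure in the frequency domain. A minimal numpy sketch on synthetic two-channel signals (this is only the transmissibility prior, not the paper's CNN):

```python
# Transmissibility sketch: estimate T(w) = X_target(w) / X_reference(w)
# and use it as a "virtual sensor" to reconstruct the target channel.
import numpy as np

rng = np.random.default_rng(5)
fs, n = 1000, 4096
t = np.arange(n) / fs
# Two synthetic response channels from the same (toy) structure.
x_ref = np.sin(2 * np.pi * 12 * t) + 0.5 * np.sin(2 * np.pi * 47 * t)
x_tgt = 0.8 * np.sin(2 * np.pi * 12 * t + 0.3) + 0.2 * np.sin(2 * np.pi * 47 * t + 0.1)

# Transmissibility between the channels, estimated as a spectral ratio
# (small regularizer guards against near-zero reference bins).
T = np.fft.rfft(x_tgt) / (np.fft.rfft(x_ref) + 1e-12)

# Reconstruct the target channel from the reference channel alone.
x_pred = np.fft.irfft(np.fft.rfft(x_ref) * T, n=n)
err = np.max(np.abs(x_pred - x_tgt))
```

With noise-free data the spectral ratio reproduces the target exactly, so this sketch is circular by construction; the paper's contribution is that a CNN trained with this relation as prior knowledge still works with partial, noisy measurements where the plain ratio does not.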

  14. A Data-Driven Response Virtual Sensor Technique with Partial Vibration Measurements Using Convolutional Neural Network

    PubMed Central

    Sun, Shan-Bin; He, Yuan-Yuan; Zhou, Si-Da; Yue, Zhen-Jiang

    2017-01-01

Measurement of dynamic responses plays an important role in structural health monitoring, damage detection, and other fields of research. In aerospace engineering, however, physical sensors are limited by the operational conditions of spacecraft, owing to the severe environment of outer space. This paper proposes a virtual sensor model with partial vibration measurements using a convolutional neural network. The transmissibility function is employed as prior knowledge. A four-layer neural network with two convolutional layers, one fully connected layer, and an output layer is proposed as the predicting model. Numerical examples of two different structural dynamic systems demonstrate the performance of the proposed approach. The technique is further validated in a simply supported beam experiment against a modal-model-based virtual sensor, which uses modal parameters, such as mode shapes, to estimate the responses of the faulty sensors. The results show that the presented data-driven response virtual sensor technique can predict structural response with high accuracy. PMID:29231868

  15. Learning from Learners: A Non-Standard Direct Approach to the Teaching of Writing Skills in EFL in a University Context

    ERIC Educational Resources Information Center

    Fuster-Márquez, Miguel; Gregori-Signes, Carmen

    2018-01-01

    Corpora have been used in English as a foreign language materials for decades, and native corpora have been present in the classroom by means of direct approaches such as Data-Driven Learning (Johns, T., and P. King 1991. "'Should you be Persuaded'- Two Samples of Data-Driven Learning Materials." In "Classroom Concordancing,"…

  16. A Model-Driven Approach for Telecommunications Network Services Definition

    NASA Astrophysics Data System (ADS)

    Chiprianov, Vanea; Kermarrec, Yvon; Alff, Patrick D.

The present-day telecommunications market imposes a short concept-to-market time on service providers. To reduce it, we propose a computer-aided, model-driven, service-specific tool, with support for collaborative work and for checking properties on models. We started by defining a prototype of the Meta-model (MM) of the service domain. Using this prototype, we defined a simple graphical modeling language specific to service designers. We are currently enlarging the MM of the domain using model transformations from Network Abstractions Layers (NALs). In the future, we will investigate approaches to ensure support for collaborative work and for checking properties on models.

  17. Economic analysis of open space box model utilization in spacecraft

    NASA Astrophysics Data System (ADS)

    Mohammad, Atif F.; Straub, Jeremy

    2015-05-01

The amount of stored data about space is growing larger every day, and the utilization of Big Data and related tools to perform ETL (Extract, Transform and Load) applications will soon be pervasive in the space sciences. We have entered a crucial time in which using Big Data can be the difference (for terrestrial applications) between organizations underperforming and outperforming their peers. The same is true for NASA and other space agencies, as well as for individual missions and the highly competitive process of mission data analysis and publication. In most industries, established competitors and new entrants alike will use data-driven approaches to revolutionize and capture the value of Big Data archives. The Open Space Box Model is poised to take the proverbial "giant leap", as it provides autonomic data processing and communications for spacecraft. Economic value generated from such data processing can be found in earthly organizations in every sector, such as healthcare and retail. Retailers, for example, perform research on Big Data by utilizing sensor-driven embedded data in products within their stores and warehouses to determine how these products are actually used in the real world.

  18. Modeling soil evaporation efficiency in a range of soil and atmospheric conditions using a meta-analysis approach

    NASA Astrophysics Data System (ADS)

    Merlin, O.; Stefan, V. G.; Amazirh, A.; Chanzy, A.; Ceschia, E.; Er-Raki, S.; Gentine, P.; Tallec, T.; Ezzahar, J.; Bircher, S.; Beringer, J.; Khabba, S.

    2016-05-01

A meta-analysis data-driven approach is developed to represent the soil evaporative efficiency (SEE), defined as the ratio of actual to potential soil evaporation. The new model is tested across a bare soil database composed of more than 30 sites around the world, a clay fraction range of 0.02-0.56, a sand fraction range of 0.05-0.92, and about 30,000 acquisition times. SEE is modeled using a soil resistance (rss) formulation based on surface soil moisture (θ) and two resistance parameters, rss,ref and θefolding. The data-driven approach aims to express both parameters as a function of observable data, including meteorological forcing, the cut-off soil moisture value θ1/2 at which SEE = 0.5, and the first derivative of SEE at θ1/2, named Δθ1/2-1. An analytical relationship between (rss,ref; θefolding) and (θ1/2; Δθ1/2-1) is first built by running a soil energy balance model for two extreme conditions, rss = 0 and rss → ∞, using meteorological forcing solely, and by approaching the middle point from the two (wet and dry) reference points. Two different methods are then investigated to estimate the pair (θ1/2; Δθ1/2-1): either from the time series of SEE and θ observations for a given site, or using the soil texture information for all sites. The first method is based on an algorithm specifically designed to accommodate strongly nonlinear SEE(θ) relationships and potentially large random deviations of observed SEE from the mean observed SEE(θ). The second method parameterizes θ1/2 as a multi-linear regression of clay and sand percentages, and sets Δθ1/2-1 to a constant mean value for all sites. The new model significantly outperformed the evaporation modules of ISBA (Interaction Sol-Biosphère-Atmosphère), H-TESSEL (Hydrology-Tiled ECMWF Scheme for Surface Exchange over Land), and CLM (Community Land Model). It has potential for integration in various land-surface schemes, and real calibration capabilities using combined thermal and microwave remote sensing data.
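A resistance-based SEE model of this general shape can be sketched as follows. The exponential form for rss(θ), the bulk ratio SEE = ra / (ra + rss), and all parameter values are assumptions made for illustration; they are not the paper's exact equations:

```python
# Hedged sketch of a resistance-based soil evaporative efficiency (SEE)
# curve: resistance decays with soil moisture, so SEE rises from near 0
# (dry soil) toward 1 (wet soil). All forms and values are illustrative.
import numpy as np

def soil_resistance(theta, rss_ref=500.0, theta_efolding=0.08):
    """Assumed exponential soil resistance in s/m; parameter values are invented."""
    return rss_ref * np.exp(-theta / theta_efolding)

def see(theta, ra=50.0):
    """SEE = ra / (ra + rss): ratio of actual to potential soil evaporation."""
    return ra / (ra + soil_resistance(theta))

theta = np.linspace(0.0, 0.4, 5)   # surface soil moisture, m3/m3
print(np.round(see(theta), 2))
```

In the paper's framing, θ1/2 is the moisture at which this curve crosses 0.5 and Δθ1/2-1 measures its steepness there; the two resistance parameters are recovered from those observable quantities.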

  19. BMI cyberworkstation: enabling dynamic data-driven brain-machine interface research through cyberinfrastructure.

    PubMed

    Zhao, Ming; Rattanatamrong, Prapaporn; DiGiovanna, Jack; Mahmoudi, Babak; Figueiredo, Renato J; Sanchez, Justin C; Príncipe, José C; Fortes, José A B

    2008-01-01

Dynamic data-driven brain-machine interfaces (DDDBMI) have great potential to advance the understanding of neural systems and improve the design of brain-inspired rehabilitative systems. This paper presents a novel cyberinfrastructure that couples in vivo neurophysiology experimentation with massive computational resources to provide seamless and efficient support of DDDBMI research. Closed-loop experiments can be conducted with in vivo data acquisition, reliable network transfer, parallel model computation, and real-time robot control. Behavioral experiments with live animals are supported with real-time guarantees. Offline studies can be performed with various configurations for extensive analysis and training. A Web-based portal is also provided to allow users to conveniently interact with the cyberinfrastructure, conducting both experimentation and analysis. New motor control models are developed based on this approach, including recursive least squares (RLS) and reinforcement learning (RLBMI) based algorithms. The results from an online RLBMI experiment show that the cyberinfrastructure can successfully support DDDBMI experiments and meet the desired real-time requirements.
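The recursive least squares component can be sketched as the standard RLS update, which revises the decoder weights one sample at a time and so fits naturally into a real-time closed loop. The features and target below are synthetic stand-ins for neural firing rates and a motor coordinate:

```python
# Standard recursive least squares (RLS) sketch with a forgetting factor.
import numpy as np

rng = np.random.default_rng(6)
n, d = 500, 4
X = rng.normal(size=(n, d))                    # e.g. neural firing-rate features
w_true = np.array([0.5, -1.0, 2.0, 0.3])
y = X @ w_true + 0.05 * rng.normal(size=n)     # e.g. one motor coordinate

lam = 0.999                                    # forgetting factor (near 1 = long memory)
w = np.zeros(d)                                # decoder weights
P = np.eye(d) * 1e3                            # inverse-covariance estimate
for x_t, y_t in zip(X, y):
    k = P @ x_t / (lam + x_t @ P @ x_t)        # gain vector
    w = w + k * (y_t - x_t @ w)                # prediction-error update
    P = (P - np.outer(k, x_t @ P)) / lam
```

The forgetting factor lets the decoder track slow drifts in the neural mapping, one reason recursive formulations are preferred over batch least squares in closed-loop BMI work.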

  20. Use case driven approach to develop simulation model for PCS of APR1400 simulator

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dong Wook, Kim; Hong Soo, Kim; Hyeon Tae, Kang

    2006-07-01

    The full-scope simulator is being developed to evaluate specific design features and to support the iterative design and validation in the Man-Machine Interface System (MMIS) design of Advanced Power Reactor (APR) 1400. The simulator consists of the process model, control logic model, and MMI for the APR1400 as well as the Power Control System (PCS). In this paper, a use case driven approach is proposed to develop a simulation model for the PCS. In this approach, a system is considered from the point of view of its users. The user's view of the system is based on interactions with the system and the resultant responses. In the use case driven approach, we initially consider the system as a black box and look at its interactions with the users. From these interactions, use cases of the system are identified. Then the system is modeled using these use cases as functions. Lower levels expand the functionalities of each of these use cases. Hence, starting from the topmost level view of the system, we proceed down to the lowest level (the internal view of the system). The model of the system thus developed is use case driven. This paper introduces the functionality of the PCS simulation model, including a requirement analysis based on use cases and the validation results of the PCS model development. The use-case-based PCS simulation model will first be used during full-scope simulator development for a nuclear power plant and will then be supplied to the Shin-Kori 3 and 4 plants. Use-case-based simulation model development can be useful for the design and implementation of simulation models. (authors)

  1. Towards a generalized energy prediction model for machine tools

    PubMed Central

    Bhinge, Raunak; Park, Jinkyoo; Law, Kincho H.; Dornfeld, David A.; Helu, Moneer; Rachuri, Sudarsan

    2017-01-01

    Energy prediction of machine tools can deliver many advantages to a manufacturing enterprise, ranging from energy-efficient process planning to machine tool monitoring. Physics-based, energy prediction models have been proposed in the past to understand the energy usage pattern of a machine tool. However, uncertainties in both the machine and the operating environment make it difficult to predict the energy consumption of the target machine reliably. Taking advantage of the opportunity to collect extensive, contextual, energy-consumption data, we discuss a data-driven approach to develop an energy prediction model of a machine tool in this paper. First, we present a methodology that can efficiently and effectively collect and process data extracted from a machine tool and its sensors. We then present a data-driven model that can be used to predict the energy consumption of the machine tool for machining a generic part. Specifically, we use Gaussian Process (GP) Regression, a non-parametric machine-learning technique, to develop the prediction model. The energy prediction model is then generalized over multiple process parameters and operations. Finally, we apply this generalized model with a method to assess uncertainty intervals to predict the energy consumed to machine any part using a Mori Seiki NVD1500 machine tool. Furthermore, the same model can be used during process planning to optimize the energy-efficiency of a machining process. PMID:28652687
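
    Gaussian Process regression, the technique named in the abstract, can be sketched in plain NumPy; the one-dimensional toy input below stands in for real process parameters (feed rate, spindle speed, depth of cut), which are assumptions here, not the paper's feature set.

```python
import numpy as np

def rbf_kernel(A, B, length=0.3, var=1.0):
    """Squared-exponential kernel between row-stacked input sets."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_predict(Xtr, ytr, Xte, noise=1e-4):
    """Plain GP regression: posterior mean and one-sigma band at Xte."""
    K = rbf_kernel(Xtr, Xtr) + noise * np.eye(len(Xtr))
    Ks = rbf_kernel(Xte, Xtr)
    mean = Ks @ np.linalg.solve(K, ytr)
    cov = rbf_kernel(Xte, Xte) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Toy 1-D stand-in for (process parameters -> energy) observations.
Xtr = np.linspace(0.0, 1.0, 20)[:, None]
ytr = np.sin(4.0 * Xtr[:, 0])
mean, std = gp_predict(Xtr, ytr, np.array([[0.5], [2.0]]))
```

The posterior standard deviation is what makes the paper's uncertainty intervals possible: it stays near the noise level inside the training range and grows back to the prior far outside it.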

  2. Towards a generalized energy prediction model for machine tools.

    PubMed

    Bhinge, Raunak; Park, Jinkyoo; Law, Kincho H; Dornfeld, David A; Helu, Moneer; Rachuri, Sudarsan

    2017-04-01

    Energy prediction of machine tools can deliver many advantages to a manufacturing enterprise, ranging from energy-efficient process planning to machine tool monitoring. Physics-based, energy prediction models have been proposed in the past to understand the energy usage pattern of a machine tool. However, uncertainties in both the machine and the operating environment make it difficult to predict the energy consumption of the target machine reliably. Taking advantage of the opportunity to collect extensive, contextual, energy-consumption data, we discuss a data-driven approach to develop an energy prediction model of a machine tool in this paper. First, we present a methodology that can efficiently and effectively collect and process data extracted from a machine tool and its sensors. We then present a data-driven model that can be used to predict the energy consumption of the machine tool for machining a generic part. Specifically, we use Gaussian Process (GP) Regression, a non-parametric machine-learning technique, to develop the prediction model. The energy prediction model is then generalized over multiple process parameters and operations. Finally, we apply this generalized model with a method to assess uncertainty intervals to predict the energy consumed to machine any part using a Mori Seiki NVD1500 machine tool. Furthermore, the same model can be used during process planning to optimize the energy-efficiency of a machining process.

  3. Modeling Uncertainties in EEG Microstates: Analysis of Real and Imagined Motor Movements Using Probabilistic Clustering-Driven Training of Probabilistic Neural Networks.

    PubMed

    Dinov, Martin; Leech, Robert

    2017-01-01

    Part of the process of EEG microstate estimation involves clustering EEG channel data at the global field power (GFP) maxima, very commonly using a modified K-means approach. Clustering has also been done deterministically, despite there being uncertainties in multiple stages of the microstate analysis, including the GFP peak definition, the clustering itself and in the post-clustering assignment of microstates back onto the EEG timecourse of interest. We perform a fully probabilistic microstate clustering and labeling, to account for these sources of uncertainty using the closest probabilistic analog to KM called Fuzzy C-means (FCM). We train softmax multi-layer perceptrons (MLPs) using the KM and FCM-inferred cluster assignments as target labels, to then allow for probabilistic labeling of the full EEG data instead of the usual correlation-based deterministic microstate label assignment typically used. We assess the merits of the probabilistic analysis vs. the deterministic approaches in EEG data recorded while participants perform real or imagined motor movements from a publicly available data set of 109 subjects. Though FCM group template maps that are almost topographically identical to KM were found, there is considerable uncertainty in the subsequent assignment of microstate labels. In general, imagined motor movements are less predictable on a time point-by-time point basis, possibly reflecting the more exploratory nature of the brain state during imagined, compared to during real motor movements. We find that some relationships may be more evident using FCM than using KM and propose that future microstate analysis should preferably be performed probabilistically rather than deterministically, especially in situations such as with brain computer interfaces, where both training and applying models of microstates need to account for uncertainty. 
Probabilistic neural network-driven microstate assignment has a number of advantages that we have discussed, which are likely to be further developed and exploited in future studies. In conclusion, probabilistic clustering and a probabilistic neural network-driven approach to microstate analysis is likely to better model and reveal details and the variability hidden in current deterministic and binarized microstate assignment and analyses.
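
    Fuzzy C-means, the probabilistic analog of K-means used in this study, can be sketched directly in NumPy; this is the generic textbook FCM update, not the authors' EEG pipeline, and the two toy clusters are illustrative stand-ins for microstate topographies.

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    """Textbook Fuzzy C-means: returns soft memberships U (n x c),
    whose rows sum to 1, and the cluster centers. Unlike K-means,
    every sample keeps a graded membership in every cluster."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=-1) + 1e-12
        U = 1.0 / d ** (2.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)
    return U, centers

# Two well-separated toy clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
U, centers = fuzzy_cmeans(X, c=2)
```

On ambiguous samples the membership rows spread toward 0.5/0.5 instead of snapping to a hard label, which is exactly the uncertainty the abstract argues deterministic microstate assignment hides.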

  4. Modeling Uncertainties in EEG Microstates: Analysis of Real and Imagined Motor Movements Using Probabilistic Clustering-Driven Training of Probabilistic Neural Networks

    PubMed Central

    Dinov, Martin; Leech, Robert

    2017-01-01

    Part of the process of EEG microstate estimation involves clustering EEG channel data at the global field power (GFP) maxima, very commonly using a modified K-means approach. Clustering has also been done deterministically, despite there being uncertainties in multiple stages of the microstate analysis, including the GFP peak definition, the clustering itself and in the post-clustering assignment of microstates back onto the EEG timecourse of interest. We perform a fully probabilistic microstate clustering and labeling, to account for these sources of uncertainty using the closest probabilistic analog to KM called Fuzzy C-means (FCM). We train softmax multi-layer perceptrons (MLPs) using the KM and FCM-inferred cluster assignments as target labels, to then allow for probabilistic labeling of the full EEG data instead of the usual correlation-based deterministic microstate label assignment typically used. We assess the merits of the probabilistic analysis vs. the deterministic approaches in EEG data recorded while participants perform real or imagined motor movements from a publicly available data set of 109 subjects. Though FCM group template maps that are almost topographically identical to KM were found, there is considerable uncertainty in the subsequent assignment of microstate labels. In general, imagined motor movements are less predictable on a time point-by-time point basis, possibly reflecting the more exploratory nature of the brain state during imagined, compared to during real motor movements. We find that some relationships may be more evident using FCM than using KM and propose that future microstate analysis should preferably be performed probabilistically rather than deterministically, especially in situations such as with brain computer interfaces, where both training and applying models of microstates need to account for uncertainty. 
Probabilistic neural network-driven microstate assignment has a number of advantages that we have discussed, which are likely to be further developed and exploited in future studies. In conclusion, probabilistic clustering and a probabilistic neural network-driven approach to microstate analysis is likely to better model and reveal details and the variability hidden in current deterministic and binarized microstate assignment and analyses. PMID:29163110

  5. Rainfall extremes, weather and climatic characterization over complex terrain: A data-driven approach based on signal enhancement methods and extreme value modeling

    NASA Astrophysics Data System (ADS)

    Pineda, Luis E.; Willems, Patrick

    2017-04-01

    Weather and climatic characterization of rainfall extremes is both of scientific and societal value for hydrometeorological risk management, yet discrimination of local and large-scale forcing remains challenging in data-scarce and complex terrain environments. Here, we present an analysis framework that separates weather (seasonal) regimes and climate (inter-annual) influences using data-driven process identification. The approach is based on signal-to-noise separation methods and extreme value (EV) modeling of multisite rainfall extremes. The EV models use semi-automatic parameter learning [1] for model identification across temporal scales. At the weather scale, the EV models are combined with a state-based hidden Markov model [2] to represent the spatio-temporal structure of rainfall as persistent weather states. At the climatic scale, the EV models are used to decode the drivers leading to the shift of weather patterns. The decoding is performed in a climate-to-weather signal subspace, built via dimension reduction of climate model proxies (e.g. sea surface temperature and atmospheric circulation). We apply the framework to the Western Andean Ridge (WAR) in Ecuador and Peru (0-6°S) using ground data from the second half of the 20th century. We find that the meridional component of winds is what matters for the in-year and inter-annual variability of high rainfall intensities along the northern WAR (0-2.5°S). There, low-level southerly winds act as advection drivers of oceanic moisture during the normal rainy season and weak/moderate El Niño (EN) events; the moisture surplus unique to the strong EN type, however, is advected locally at the lowlands of the central WAR. Moreover, the coastal ridges south of 3°S dampen meridional airflows, leaving local hygrothermal gradients to control the in-year distribution of rainfall extremes and their anomalies. 
Overall, we show that the framework, which does not make any prior assumption on the explanatory power of the weather and climate drivers, allows identification of well-known features of the regional climate in a purely data-driven fashion. Thus, this approach shows potential for characterization of precipitation extremes in data-scarce and orographically complex regions in which model reconstructions are the only climate proxies References [1] Mínguez, R., F.J. Méndez, C. Izaguirre, M. Menéndez, and I.J. Losada (2010), Pseudooptimal parameter selection of non-stationary generalized extreme value models for environmental variables, Environ. Modell. Softw. 25, 1592-1607. [2] Pineda, L., P. Willems (2016), Multisite Downscaling of Seasonal Predictions to Daily Rainfall Characteristics over Pacific-Andean River Basins in Ecuador and Peru using a non-homogenous hidden Markov model, J. Hydrometeor, 17(2), 481-498, doi:10.1175/JHM-D-15-0040.1, http://journals.ametsoc.org/doi/full/10.1175/JHM-D-15-0040.1
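
    As an illustration of the extreme-value ingredient only: the sketch below fits the shape-zero GEV (Gumbel) case by the method of moments and computes a T-year return level. The semi-automatic GEV parameter learning of [1] is more general (non-stationary, full shape parameter), so treat this as a simplified stand-in with synthetic annual maxima.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329

def fit_gumbel(annual_maxima):
    """Method-of-moments fit of a Gumbel distribution (the
    shape-zero GEV case) to a series of annual rainfall maxima."""
    beta = np.std(annual_maxima) * np.sqrt(6.0) / np.pi
    mu = np.mean(annual_maxima) - EULER_GAMMA * beta
    return mu, beta

def return_level(mu, beta, T):
    """Depth exceeded on average once every T years."""
    return mu - beta * np.log(-np.log(1.0 - 1.0 / T))

# Synthetic annual maxima drawn from a known Gumbel(mu=40, beta=8)
# via the inverse CDF; parameter values are arbitrary.
rng = np.random.default_rng(2)
u = rng.uniform(size=20000)
sample = 40.0 - 8.0 * np.log(-np.log(u))
mu_hat, beta_hat = fit_gumbel(sample)
```
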

  6. Numerical modeling of the solar wind flow with observational boundary conditions

    DOE PAGES

    Pogorelov, N. V.; Borovikov, S. N.; Burlaga, L. F.; ...

    2012-11-20

    In this paper we describe our group efforts to develop a self-consistent, data-driven model of the solar wind (SW) interaction with the local interstellar medium. The motion of plasma in this model is described with the MHD approach, while the transport of neutral atoms is addressed by either kinetic or multi-fluid equations. The model and its implementation in the Multi-Scale Fluid-Kinetic Simulation Suite (MS-FLUKSS) are continuously tested and validated by comparing our results with other models and spacecraft measurements. In particular, it was successfully applied to explain an unusual SW behavior discovered by the Voyager 1 spacecraft, i.e., the development of a substantial negative radial velocity component, flow turning in the transverse direction, while the latitudinal velocity component goes to very small values. We explain recent SW velocity measurements at Voyager 1 in the context of our 3-D, MHD modeling. We also present a comparison of different turbulence models in their ability to reproduce the SW temperature profile from Voyager 2 measurements. Lastly, the boundary conditions obtained at 50 solar radii from data-driven numerical simulations are used to model a CME event throughout the heliosphere.

  7. Toward Computational Cumulative Biology by Combining Models of Biological Datasets

    PubMed Central

    Faisal, Ali; Peltonen, Jaakko; Georgii, Elisabeth; Rung, Johan; Kaski, Samuel

    2014-01-01

    A main challenge of data-driven sciences is how to make maximal use of the progressively expanding databases of experimental datasets in order to keep research cumulative. We introduce the idea of a modeling-based dataset retrieval engine designed for relating a researcher's experimental dataset to earlier work in the field. The search is (i) data-driven to enable new findings, going beyond the state of the art of keyword searches in annotations, (ii) modeling-driven, to include both biological knowledge and insights learned from data, and (iii) scalable, as it is accomplished without building one unified grand model of all data. Assuming each dataset has been modeled beforehand, by the researchers or automatically by database managers, we apply a rapidly computable and optimizable combination model to decompose a new dataset into contributions from earlier relevant models. By using the data-driven decomposition, we identify a network of interrelated datasets from a large annotated human gene expression atlas. While tissue type and disease were major driving forces for determining relevant datasets, the found relationships were richer, and the model-based search was more accurate than the keyword search; moreover, it recovered biologically meaningful relationships that are not straightforwardly visible from annotations—for instance, between cells in different developmental stages such as thymocytes and T-cells. Data-driven links and citations matched to a large extent; the data-driven links even uncovered corrections to the publication data, as two of the most linked datasets were not highly cited and turned out to have wrong publication entries in the database. PMID:25427176
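
    The combination model described above, decomposing a new dataset into contributions from earlier models, can be illustrated with a plain least-squares sketch. The real engine is model-based and more elaborate; representing each archived model by a single component vector is an assumption made here purely for illustration.

```python
import numpy as np

def decompose(new_data, archived_components):
    """Least-squares decomposition of a new (vectorized) dataset into
    contributions from archived models, each summarized here by one
    representative component vector; normalized |weights| give a
    simple relevance ranking of the earlier datasets."""
    B = np.stack(archived_components, axis=1)     # one column per model
    w, *_ = np.linalg.lstsq(B, new_data, rcond=None)
    relevance = np.abs(w) / np.abs(w).sum()
    return w, relevance

rng = np.random.default_rng(3)
components = [rng.normal(size=50) for _ in range(4)]
new = 2.0 * components[0] + 0.5 * components[2]   # mixture of two archives
w, relevance = decompose(new, components)
```

Ranking datasets by these weights is the flavor of retrieval the abstract describes: the archives that contribute most to reconstructing the query dataset are returned as most related.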

  8. Toward computational cumulative biology by combining models of biological datasets.

    PubMed

    Faisal, Ali; Peltonen, Jaakko; Georgii, Elisabeth; Rung, Johan; Kaski, Samuel

    2014-01-01

    A main challenge of data-driven sciences is how to make maximal use of the progressively expanding databases of experimental datasets in order to keep research cumulative. We introduce the idea of a modeling-based dataset retrieval engine designed for relating a researcher's experimental dataset to earlier work in the field. The search is (i) data-driven to enable new findings, going beyond the state of the art of keyword searches in annotations, (ii) modeling-driven, to include both biological knowledge and insights learned from data, and (iii) scalable, as it is accomplished without building one unified grand model of all data. Assuming each dataset has been modeled beforehand, by the researchers or automatically by database managers, we apply a rapidly computable and optimizable combination model to decompose a new dataset into contributions from earlier relevant models. By using the data-driven decomposition, we identify a network of interrelated datasets from a large annotated human gene expression atlas. While tissue type and disease were major driving forces for determining relevant datasets, the found relationships were richer, and the model-based search was more accurate than the keyword search; moreover, it recovered biologically meaningful relationships that are not straightforwardly visible from annotations-for instance, between cells in different developmental stages such as thymocytes and T-cells. Data-driven links and citations matched to a large extent; the data-driven links even uncovered corrections to the publication data, as two of the most linked datasets were not highly cited and turned out to have wrong publication entries in the database.

  9. High-fidelity numerical modeling of the Upper Mississippi River under extreme flood condition

    NASA Astrophysics Data System (ADS)

    Khosronejad, Ali; Le, Trung; DeWall, Petra; Bartelt, Nicole; Woldeamlak, Solomon; Yang, Xiaolei; Sotiropoulos, Fotis

    2016-12-01

    We present data-driven numerical simulations of extreme flooding in a large-scale river coupling coherent-structure resolving hydrodynamics with bed morphodynamics under live-bed conditions. The study area is a ∼ 3.2 km long and ∼ 300 m wide reach of the Upper Mississippi River, near Minneapolis, MN, which contains several natural islands and man-made hydraulic structures. We employ the large-eddy simulation (LES) and bed-morphodynamic modules of the Virtual Flow Simulator (VFS-Rivers) model, a recently developed in-house code, to investigate the flow and bed evolution of the river during a 100-year flood event. The coupling of the two modules is carried out via a fluid-structure interaction approach using a nested domain approach to enhance the resolution of bridge scour predictions. We integrate data from airborne Light Detection and Ranging (LiDAR), sub-aqueous sonar apparatus on-board a boat and in-situ laser scanners to construct a digital elevation model of the river bathymetry and surrounding flood plain, including islands and bridge piers. A field campaign under base-flow conditions was also carried out to collect mean flow measurements via Acoustic Doppler Current Profiler (ADCP) to validate the hydrodynamic module of the VFS-Rivers model. Our simulation results for the bed evolution of the river under the 100-year flood reveal complex sediment transport dynamics near the bridge piers, consisting of both scour and refilling events due to the continuous passage of sand dunes. We find that the scour depth near the bridge piers can reach a maximum of ∼ 9 m. The data-driven simulation strategy we present in this work exemplifies a practical simulation-based engineering approach to investigate the resilience of infrastructure to extreme flood events in intricate field-scale riverine systems.

  10. Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier-Stokes simulations: A data-driven, physics-informed Bayesian approach

    NASA Astrophysics Data System (ADS)

    Xiao, H.; Wu, J.-L.; Wang, J.-X.; Sun, R.; Roy, C. J.

    2016-11-01

    Despite their well-known limitations, Reynolds-Averaged Navier-Stokes (RANS) models are still the workhorse tools for turbulent flow simulations in today's engineering analysis, design and optimization. While the predictive capability of RANS models depends on many factors, for many practical flows the turbulence models are by far the largest source of uncertainty. As RANS models are used in the design and safety evaluation of many mission-critical systems such as airplanes and nuclear power plants, quantifying their model-form uncertainties has significant implications in enabling risk-informed decision-making. In this work we develop a data-driven, physics-informed Bayesian framework for quantifying model-form uncertainties in RANS simulations. Uncertainties are introduced directly to the Reynolds stresses and are represented with compact parameterization accounting for empirical prior knowledge and physical constraints (e.g., realizability, smoothness, and symmetry). An iterative ensemble Kalman method is used to assimilate the prior knowledge and observation data in a Bayesian framework, and to propagate them to posterior distributions of velocities and other Quantities of Interest (QoIs). We use two representative cases, the flow over periodic hills and the flow in a square duct, to evaluate the performance of the proposed framework. Both cases are challenging for standard RANS turbulence models. Simulation results suggest that, even with very sparse observations, the obtained posterior mean velocities and other QoIs have significantly better agreement with the benchmark data compared to the baseline results. At most locations the posterior distribution adequately captures the true model error within the developed model form uncertainty bounds. The framework is a major improvement over existing black-box, physics-neutral methods for model-form uncertainty quantification, where prior knowledge and details of the models are not exploited. 
This approach has potential implications in many fields in which the governing equations are well understood but the model uncertainty comes from unresolved physical processes.
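
    The iterative ensemble Kalman method at the core of the framework can be illustrated with a single generic stochastic update step. This is plain NumPy with an identity observation operator standing in for the RANS forward model, so it is a sketch of the assimilation mechanics, not the physics-informed Reynolds-stress parameterization itself.

```python
import numpy as np

def enkf_update(ensemble, observe, y_obs, obs_var, rng):
    """One stochastic ensemble Kalman update: every parameter sample
    is nudged toward perturbed observations via ensemble covariances."""
    Y = np.array([observe(x) for x in ensemble])   # predicted observables
    A = ensemble - ensemble.mean(axis=0)
    B = Y - Y.mean(axis=0)
    n = len(ensemble)
    Cxy = A.T @ B / (n - 1)                        # state-obs covariance
    Cyy = B.T @ B / (n - 1) + obs_var * np.eye(len(y_obs))
    K = Cxy @ np.linalg.inv(Cyy)                   # Kalman gain
    perturbed = y_obs + rng.normal(0.0, np.sqrt(obs_var), size=Y.shape)
    return ensemble + (perturbed - Y) @ K.T

# Identity "forward model" stand-in; 400 prior samples of a 2-D state.
rng = np.random.default_rng(4)
prior = rng.normal(0.0, 1.0, size=(400, 2))
posterior = enkf_update(prior, lambda x: x, np.array([1.0, -1.0]), 0.01, rng)
```

After the update the ensemble mean moves toward the sparse observations and the spread contracts, which mirrors how the posterior velocities in the paper tighten around the benchmark data.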

  11. Cognitive Effects of Mindfulness Training: Results of a Pilot Study Based on a Theory Driven Approach

    PubMed Central

    Wimmer, Lena; Bellingrath, Silja; von Stockhausen, Lisa

    2016-01-01

    The present paper reports a pilot study which tested cognitive effects of mindfulness practice in a theory-driven approach. Thirty-four fifth graders received either a mindfulness training which was based on the mindfulness-based stress reduction approach (experimental group), a concentration training (active control group), or no treatment (passive control group). Based on the operational definition of mindfulness by Bishop et al. (2004), effects on sustained attention, cognitive flexibility, cognitive inhibition, and data-driven as opposed to schema-based information processing were predicted. These abilities were assessed in a pre-post design by means of a vigilance test, a reversible figures test, the Wisconsin Card Sorting Test, a Stroop test, a visual search task, and a recognition task of prototypical faces. Results suggest that the mindfulness training specifically improved cognitive inhibition and data-driven information processing. PMID:27462287

  12. Cognitive Effects of Mindfulness Training: Results of a Pilot Study Based on a Theory Driven Approach.

    PubMed

    Wimmer, Lena; Bellingrath, Silja; von Stockhausen, Lisa

    2016-01-01

    The present paper reports a pilot study which tested cognitive effects of mindfulness practice in a theory-driven approach. Thirty-four fifth graders received either a mindfulness training which was based on the mindfulness-based stress reduction approach (experimental group), a concentration training (active control group), or no treatment (passive control group). Based on the operational definition of mindfulness by Bishop et al. (2004), effects on sustained attention, cognitive flexibility, cognitive inhibition, and data-driven as opposed to schema-based information processing were predicted. These abilities were assessed in a pre-post design by means of a vigilance test, a reversible figures test, the Wisconsin Card Sorting Test, a Stroop test, a visual search task, and a recognition task of prototypical faces. Results suggest that the mindfulness training specifically improved cognitive inhibition and data-driven information processing.

  13. Evaluating mortality rates with a novel integrated framework for nonmonogamous species.

    PubMed

    Tenan, Simone; Iemma, Aaron; Bragalanti, Natalia; Pedrini, Paolo; De Barba, Marta; Randi, Ettore; Groff, Claudio; Genovart, Meritxell

    2016-12-01

    The conservation of wildlife requires management based on quantitative evidence, and especially for large carnivores, unraveling cause-specific mortalities and understanding their impact on population dynamics is crucial. Acquiring this knowledge is challenging because it is difficult to obtain robust long-term data sets on endangered populations and, usually, data are collected through diverse sampling strategies. Integrated population models (IPMs) offer a way to integrate data generated through different processes. However, IPMs are female-based models that cannot account for mate availability, and this feature limits their applicability to monogamous species only. We extended classical IPMs to a two-sex framework that allows investigation of population dynamics and quantification of cause-specific mortality rates in nonmonogamous species. We illustrated our approach by simultaneously modeling different types of data from a reintroduced, unhunted brown bear (Ursus arctos) population living in an area with a dense human population. In a population mainly driven by adult survival, we estimated that on average 11% of cubs and 61% of adults died from human-related causes. Although the population is currently not at risk, adult survival and thus population dynamics are driven by anthropogenic mortality. Given the recent increase of human-bear conflicts in the area, removal of individuals for management purposes and through poaching may increase, reversing the positive population growth rate. Our approach can be generalized to other species affected by cause-specific mortality and will be useful to inform conservation decisions for other nonmonogamous species, such as most large carnivores, for which data are scarce and diverse and thus data integration is highly desirable. © 2016 Society for Conservation Biology.

  14. Towards Current Profile Control in ITER: Potential Approaches and Research Needs

    NASA Astrophysics Data System (ADS)

    Schuster, E.; Barton, J. E.; Wehner, W. P.

    2014-10-01

    Many challenging plasma control problems still need to be addressed in order for the ITER Plasma Control System (PCS) to be able to successfully achieve the ITER project goals. For instance, setting up a suitable toroidal current density profile is key for one possible advanced scenario characterized by noninductive sustainment of the plasma current and steady-state operation. The nonlinearity and high dimensionality exhibited by the plasma demand a model-based current-profile control synthesis procedure that can accommodate this complexity through embedding the known physics within the design. The development of a model capturing the dynamics of the plasma relevant for control design enables not only the design of feedback controllers for regulation or tracking but also the design of optimal feedforward controllers for a systematic model-based approach to scenario planning, the design of state estimators for a reliable real-time reconstruction of the plasma internal profiles based on limited and noisy diagnostics, and the development of a fast predictive simulation code for closed-loop performance evaluation before implementation. Progress towards control-oriented modeling of the current profile evolution and associated control design has been reported following both data-driven and first-principles-driven approaches. An overview of these two approaches will be provided, as well as a discussion on research needs associated with each one of the model applications described above. Supported by the US Department of Energy under DE-SC0001334 and DE-SC0010661.

  15. A Pareto-optimal moving average multigene genetic programming model for daily streamflow prediction

    NASA Astrophysics Data System (ADS)

    Danandeh Mehr, Ali; Kahya, Ercan

    2017-06-01

    Genetic programming (GP) is able to systematically explore alternative model structures of different accuracy and complexity from observed input and output data. The effectiveness of GP in hydrological system identification has been recognized in recent studies. However, selecting a parsimonious (accurate and simple) model from such alternatives still remains a question. This paper proposes a Pareto-optimal moving average multigene genetic programming (MA-MGGP) approach to develop a parsimonious model for single-station streamflow prediction. The three main components of the approach that take us from observed data to a validated model are: (1) data pre-processing, (2) system identification and (3) system simplification. The data pre-processing ingredient uses a simple moving average filter to diminish the lagged prediction effect of stand-alone data-driven models. The multigene ingredient of the model tends to identify the underlying nonlinear system with expressions simpler than classical monolithic GP, and eventually the simplification component exploits a Pareto front plot to select a parsimonious model through an interactive complexity-efficiency trade-off. The approach was tested using the daily streamflow records from a station on Senoz Stream, Turkey. Compared with the efficiency results of stand-alone GP, MGGP, and conventional multiple linear regression prediction models as benchmarks, the proposed Pareto-optimal MA-MGGP model put forward a parsimonious solution of noteworthy practical value. In addition, the approach allows the user to bring human insight into the problem, examine evolved models, and pick out the best-performing programs for further analysis.
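
    Two ingredients of the approach, the moving-average pre-processing filter and the Pareto-front model selection, can be sketched generically as follows; the (error, complexity) pairs in the test below are illustrative, not the Senoz Stream results.

```python
import numpy as np

def moving_average(x, window=3):
    """Simple moving-average filter used as series pre-processing."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def pareto_front(models):
    """Indices of models not dominated in (error, complexity): a model
    is dominated if another is no worse on both axes and better on one."""
    front = []
    for i, (e_i, c_i) in enumerate(models):
        dominated = any(
            e_j <= e_i and c_j <= c_i and (e_j < e_i or c_j < c_i)
            for j, (e_j, c_j) in enumerate(models) if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

The surviving front is exactly the set the analyst inspects in the interactive complexity-efficiency trade-off; any model off the front is strictly worse than some alternative.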

  16. A novel model for simulating the racing effect in capillary-driven underfill process in flip chip

    NASA Astrophysics Data System (ADS)

    Zhu, Wenhui; Wang, Kanglun; Wang, Yan

    2018-04-01

    Underfill is typically applied in flip chips to increase the reliability of the electronic packaging. In this paper, the evolution of the melt-front shape of the capillary-driven underfill flow is studied through 3D numerical analysis. Two different models, the prevailing surface force model and the capillary model based on the wetted wall boundary condition, are introduced to test their applicability, where the level set method is used to track the interface of the two-phase flow. The comparison between the simulation results and experimental data indicates that the surface force model produces better predictions of the melt-front shape, especially in the central area of the flip chip. Nevertheless, the two models above cannot properly simulate the racing-effect phenomenon that appears during underfill encapsulation. A novel ‘dynamic pressure boundary condition’ method is proposed based on the validated surface force model. Utilizing this approach, the racing-effect phenomenon is simulated with high precision. In addition, a linear relationship is derived from this model between the flow front location at the edge of the flip chip and the filling time. Using the proposed approach, the impact of the underfill-dispensing length on the melt-front shape is also studied.

  17. Study of the Influence of Age in 18F-FDG PET Images Using a Data-Driven Approach and Its Evaluation in Alzheimer's Disease.

    PubMed

    Jiang, Jiehui; Sun, Yiwu; Zhou, Hucheng; Li, Shaoping; Huang, Zhemin; Wu, Ping; Shi, Kuangyu; Zuo, Chuantao; Alzheimer's Disease Neuroimaging Initiative

    2018-01-01

    18F-FDG PET is one of the most frequently used neuroimaging scans. However, age has proven to be the greatest interfering factor in many clinical dementia diagnoses based on 18F-FDG PET images, since radiologists have difficulty deciding whether abnormalities in specific regions correlate with normal aging, disease, or both. In the present paper, the authors aimed to define specific brain regions and determine an age-correction mathematical model. A data-driven approach was used based on 255 healthy subjects. The inferior frontal gyrus, the left medial part and the left medial orbital part of the superior frontal gyrus, the right insula, the left anterior cingulate, the left median cingulate and paracingulate gyri, and the bilateral superior temporal gyri were found to have a strong negative correlation with age. For evaluation, the age-correction model was applied to 262 healthy subjects and 50 AD subjects selected from the ADNI database, and partial correlations between mean SUVR and three clinical results were computed before and after age correction. All correlation coefficients improved significantly after age correction. The proposed model was effective in the age correction of both healthy and AD subjects.
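
A minimal sketch of this kind of age correction, assuming a simple linear age effect fitted on healthy controls and then removed from observed uptake values (all numbers below are synthetic, not from the study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic healthy-control data: SUVR declines linearly with age plus noise
age = rng.uniform(40, 85, 255)
suvr = 1.5 - 0.004 * age + rng.normal(0, 0.02, 255)

# Fit the age trend on healthy controls only
slope, intercept = np.polyfit(age, suvr, 1)

def age_correct(suvr_obs, age_obs, ref_age=65.0):
    """Project an observed SUVR to a common reference age using the fitted trend."""
    return suvr_obs - slope * (age_obs - ref_age)

corrected = age_correct(suvr, age)
```

After correction, the residual association between age and SUVR should be near zero, so age no longer confounds group comparisons.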

  18. Three-dimensional Cascaded Lattice Boltzmann Model for Thermal Convective Flows

    NASA Astrophysics Data System (ADS)

    Hajabdollahi, Farzaneh; Premnath, Kannan

    2017-11-01

    Fluid motion driven by thermal effects, such as buoyancy in differentially heated enclosures, arises in several natural and industrial settings, and its understanding can be advanced via numerical simulation. Lattice Boltzmann (LB) methods are efficient kinetic computational approaches for coupled flow physics problems. In this study, we develop three-dimensional (3D) LB models based on central moments and multiple relaxation times for the D3Q7 and D3Q15 lattices to solve the energy transport equations in a double distribution function approach. Their collision operators lead to a cascaded structure involving higher-order terms, resulting in improved stability. This is coupled to a central-moment-based LB flow solver with source terms. The new 3D cascaded LB models for convective flows are first validated for natural convection of air driven thermally on two vertically opposite faces of a cubic cavity at different Rayleigh numbers against prior numerical and experimental data, showing good quantitative agreement. Then, the detailed structure of the 3D flow and thermal fields and the heat transfer rates at different Rayleigh numbers are analyzed and interpreted.

  19. Using subject-specific three-dimensional (3D) anthropometry data in digital human modelling: case study in hand motion simulation.

    PubMed

    Tsao, Liuxing; Ma, Liang

    2016-11-01

    Digital human modelling enables ergonomists and designers to consider ergonomic concerns and design alternatives in a timely and cost-efficient manner in the early stages of design. However, the reliability of the simulation could be limited due to the percentile-based approach used in constructing the digital human model. To enhance the accuracy of the size and shape of the models, we proposed a framework to generate digital human models using three-dimensional (3D) anthropometric data. The 3D scan data from specific subjects' hands were segmented based on the estimated centres of rotation. The segments were then driven in forward kinematics to perform several functional postures. The constructed hand models were then verified, thereby validating the feasibility of the framework. The proposed framework helps generate accurate subject-specific digital human models, which can be utilised to guide product design and workspace arrangement. Practitioner Summary: Subject-specific digital human models can be constructed under the proposed framework based on three-dimensional (3D) anthropometry. This approach enables more reliable digital human simulation to guide product design and workspace arrangement.
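
The forward-kinematics step, driving scanned segments about estimated centres of rotation, can be sketched in 2D as follows; segment lengths and joint angles are made-up values, and the real framework operates on 3D hand meshes:

```python
import numpy as np

def rot(theta):
    """2D rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def finger_fk(lengths, angles):
    """Planar forward kinematics: chain segments from estimated joint centres."""
    pos = np.zeros(2)
    total = 0.0
    joints = [pos.copy()]
    for L, a in zip(lengths, angles):
        total += a                          # accumulate joint rotations
        pos = pos + rot(total) @ np.array([L, 0.0])
        joints.append(pos.copy())
    return np.array(joints)

# Hypothetical phalanx lengths (mm) and joint flexion angles (rad)
joints = finger_fk([40.0, 25.0, 20.0], [0.0, np.pi / 2, 0.0])
```

Chaining rotations this way lets the same scanned geometry be posed into the various functional postures mentioned above.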

  20. The ABC of stereotypes about groups: Agency/socioeconomic success, conservative-progressive beliefs, and communion.

    PubMed

    Koch, Alex; Imhoff, Roland; Dotsch, Ron; Unkelbach, Christian; Alves, Hans

    2016-05-01

    Previous research argued that stereotypes differ primarily on the 2 dimensions of warmth/communion and competence/agency. We identify an empirical gap in support for this notion. The theoretical model constrains stereotypes a priori to these 2 dimensions; without this constraint, participants might spontaneously employ other relevant dimensions. We fill this gap by complementing the existing theory-driven approaches with a data-driven approach that allows an estimation of the spontaneously employed dimensions of stereotyping. Seven studies (total N = 4,451) show that people organize social groups primarily based on their agency/socioeconomic success (A) and, as a second dimension, based on their conservative-progressive beliefs (B). Communion (C) is not found as a dimension on its own, but rather as an emergent quality in the two-dimensional space of A and B, resulting in the 2D ABC model of stereotype content about social groups. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
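
A data-driven estimate of the number of spontaneously employed dimensions can be sketched with a singular value decomposition of a group-by-attribute rating matrix; the matrix below is synthetic, with two planted latent dimensions that should dominate the spectrum:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic rating matrix: 20 social groups rated on 10 attributes,
# generated from two latent dimensions (stand-ins for agency and beliefs)
n_groups, n_attrs = 20, 10
latent = rng.normal(size=(n_groups, 2))
loadings = rng.normal(size=(2, n_attrs))
ratings = latent @ loadings + rng.normal(0, 0.1, (n_groups, n_attrs))

# Centre the columns and inspect the singular value spectrum
X = ratings - ratings.mean(axis=0)
s = np.linalg.svd(X, compute_uv=False)
explained = s**2 / np.sum(s**2)   # variance fraction per dimension
```

If ratings really arise from two dimensions, the first two components absorb nearly all variance, which is the data-driven signature the studies test for.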

  1. Prediction of In-hospital Mortality in Emergency Department Patients With Sepsis: A Local Big Data-Driven, Machine Learning Approach.

    PubMed

    Taylor, R Andrew; Pare, Joseph R; Venkatesh, Arjun K; Mowafi, Hani; Melnick, Edward R; Fleischman, William; Hall, M Kennedy

    2016-03-01

    Predictive analytics in emergency care has mostly been limited to the use of clinical decision rules (CDRs) in the form of simple heuristics and scoring systems. In the development of CDRs, limitations in analytic methods and concerns with usability have generally constrained models to a preselected small set of variables judged to be clinically relevant and to rules that are easily calculated. Furthermore, CDRs frequently suffer from questions of generalizability, take years to develop, and lack the ability to be updated as new information becomes available. Newer analytic and machine learning techniques capable of harnessing the large number of variables that are already available through electronic health records (EHRs) may better predict patient outcomes and facilitate automation and deployment within clinical decision support systems. In this proof-of-concept study, a local, big data-driven, machine learning approach is compared to existing CDRs and traditional analytic methods using the prediction of sepsis in-hospital mortality as the use case. This was a retrospective study of adult ED visits admitted to the hospital meeting criteria for sepsis from October 2013 to October 2014. Sepsis was defined as meeting criteria for systemic inflammatory response syndrome with an infectious admitting diagnosis in the ED. ED visits were randomly partitioned into an 80%/20% split for training and validation. A random forest model (machine learning approach) was constructed using over 500 clinical variables from data available within the EHRs of four hospitals to predict in-hospital mortality. The machine learning prediction model was then compared to a classification and regression tree (CART) model, logistic regression model, and previously developed prediction tools on the validation data set using area under the receiver operating characteristic curve (AUC) and chi-square statistics. There were 5,278 visits among 4,676 unique patients who met criteria for sepsis. 
Of the 4,222 patients in the training group, 210 (5.0%) died during hospitalization, and of the 1,056 patients in the validation group, 50 (4.7%) died during hospitalization. The AUCs with 95% confidence intervals (CIs) for the different models were as follows: random forest model, 0.86 (95% CI = 0.82 to 0.90); CART model, 0.69 (95% CI = 0.62 to 0.77); logistic regression model, 0.76 (95% CI = 0.69 to 0.82); CURB-65, 0.73 (95% CI = 0.67 to 0.80); MEDS, 0.71 (95% CI = 0.63 to 0.77); and mREMS, 0.72 (95% CI = 0.65 to 0.79). The random forest model AUC was statistically different from all other models (p ≤ 0.003 for all comparisons). In this proof-of-concept study, a local big data-driven, machine learning approach outperformed existing CDRs as well as traditional analytic techniques for predicting in-hospital mortality of ED patients with sepsis. Future research should prospectively evaluate the effectiveness of this approach and whether it translates into improved clinical outcomes for high-risk sepsis patients. The methods developed serve as an example of a new model for predictive analytics in emergency care that can be automated, applied to other clinical outcomes of interest, and deployed in EHRs to enable locally relevant clinical predictions. © 2015 by the Society for Academic Emergency Medicine.
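
The modelling pipeline described here (an 80%/20% split, a random forest over many variables, AUC on the validation set) can be sketched with scikit-learn; the cohort below is simulated, and the feature count and class balance are only loosely matched to the study:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the EHR cohort: many weakly informative variables,
# a rare positive class (in-hospital mortality around 5%)
X, y = make_classification(n_samples=5000, n_features=50, n_informative=10,
                           weights=[0.95], random_state=0)

# 80%/20% training/validation split, as in the study design
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_va, rf.predict_proba(X_va)[:, 1])
```

Comparing this AUC against simpler baselines (logistic regression, a single decision tree, or a fixed clinical score) mirrors the study's evaluation design.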

  2. The Prediction of the Gas Utilization Ratio Based on TS Fuzzy Neural Network and Particle Swarm Optimization

    PubMed Central

    Jiang, Haihe; Yin, Yixin; Xiao, Wendong; Zhao, Baoyong

    2018-01-01

    Gas utilization ratio (GUR) is an important indicator used to evaluate the energy consumption of blast furnaces (BFs). Existing methods cannot predict the GUR accurately. In this paper, we present a novel data-driven model for predicting the GUR. The proposed approach combines a TS fuzzy neural network (TS-FNN) with particle swarm optimization (PSO): PSO is applied to optimize the parameters of the TS-FNN in order to reduce the error caused by inaccurate initial parameters. The box-plot method is also applied to eliminate abnormal values from the raw data during preprocessing; it can handle data that do not follow a normal distribution, as arises in complex industrial environments. The prediction results demonstrate that the PSO-optimized TS-FNN achieves higher prediction accuracy than the TS-FNN and SVM models, and the proposed approach can accurately predict the GUR of a blast furnace, providing an effective way to perform on-line blast furnace distribution control. PMID:29461469

  3. The Prediction of the Gas Utilization Ratio based on TS Fuzzy Neural Network and Particle Swarm Optimization.

    PubMed

    Zhang, Sen; Jiang, Haihe; Yin, Yixin; Xiao, Wendong; Zhao, Baoyong

    2018-02-20

    Gas utilization ratio (GUR) is an important indicator used to evaluate the energy consumption of blast furnaces (BFs). Existing methods cannot predict the GUR accurately. In this paper, we present a novel data-driven model for predicting the GUR. The proposed approach combines a TS fuzzy neural network (TS-FNN) with particle swarm optimization (PSO): PSO is applied to optimize the parameters of the TS-FNN in order to reduce the error caused by inaccurate initial parameters. The box-plot method is also applied to eliminate abnormal values from the raw data during preprocessing; it can handle data that do not follow a normal distribution, as arises in complex industrial environments. The prediction results demonstrate that the PSO-optimized TS-FNN achieves higher prediction accuracy than the TS-FNN and SVM models, and the proposed approach can accurately predict the GUR of a blast furnace, providing an effective way to perform on-line blast furnace distribution control.
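
A minimal particle swarm optimizer of the kind used here to tune network parameters might look as follows; it tunes a two-parameter stand-in objective rather than an actual fuzzy neural network, and all hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def pso(objective, dim, n_particles=30, iters=100, w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm optimizer (global-best topology)."""
    pos = rng.uniform(-5, 5, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # inertia + attraction to personal best + attraction to global best
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        val = np.array([objective(p) for p in pos])
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], val[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# Stand-in objective: squared error of a model whose true parameters are (2, -3)
best, err = pso(lambda p: (p[0] - 2.0) ** 2 + (p[1] + 3.0) ** 2, dim=2)
```

In the paper's setting the objective would be the TS-FNN's training error as a function of its initial parameters.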

  4. A model-driven approach to information security compliance

    NASA Astrophysics Data System (ADS)

    Correia, Anacleto; Gonçalves, António; Teodoro, M. Filomena

    2017-06-01

    The availability, integrity and confidentiality of information are fundamental to the long-term survival of any organization. Information security is a complex issue that must be approached holistically, combining the assets that support corporate systems in an extended network of business partners, vendors, customers and other stakeholders. This paper addresses the conception and implementation of information security systems conforming to the ISO/IEC 27000 set of standards, using the model-driven approach. The process begins with the conception of a domain-level model (computation independent model) based on the information security vocabulary present in the ISO/IEC 27001 standard. From this model, after embedding the mandatory rules for attaining ISO/IEC 27001 conformance, a platform independent model is derived. Finally, a platform specific model serves as the base for testing the compliance of information security systems with the ISO/IEC 27000 set of standards.

  5. Nonlinear data-driven identification of polymer electrolyte membrane fuel cells for diagnostic purposes: A Volterra series approach

    NASA Astrophysics Data System (ADS)

    Ritzberger, D.; Jakubek, S.

    2017-09-01

    In this work, a data-driven identification method, based on polynomial nonlinear autoregressive models with exogenous inputs (NARX) and the Volterra series, is proposed to describe the dynamic and nonlinear voltage and current characteristics of polymer electrolyte membrane fuel cells (PEMFCs). The structure selection and parameter estimation of the NARX model are performed on broad-band voltage/current data. By transforming the time-domain NARX model into a Volterra series representation using the harmonic probing algorithm, a frequency-domain description of the linear and nonlinear dynamics is obtained. With the Volterra kernels corresponding to different operating conditions, information from existing frequency-domain diagnostic tools such as electrochemical impedance spectroscopy (EIS) and total harmonic distortion analysis (THDA) is effectively combined. Additionally, the time-domain NARX model can be utilized for fault detection by evaluating the difference between measured and simulated output. To increase the fault detectability, an optimization problem is introduced which maximizes this output residual to obtain proper excitation frequencies. As a possible extension, it is shown that by optimizing the periodic signal shape itself, the fault detectability is further increased.
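
Structure selection aside, the parameter-estimation step of a polynomial NARX model reduces to linear least squares over lagged regressors. A toy sketch, with the system, lags and monomial terms invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate a weakly nonlinear system: y[k] = 0.5 y[k-1] + 0.8 u[k-1] - 0.1 y[k-1]^2
u = rng.uniform(-1, 1, 500)
y = np.zeros(500)
for k in range(1, 500):
    y[k] = 0.5 * y[k - 1] + 0.8 * u[k - 1] - 0.1 * y[k - 1] ** 2

# Polynomial NARX regressors built from lagged input/output terms
Phi = np.column_stack([y[:-1], u[:-1], y[:-1] ** 2])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
```

With the right monomials in the regressor matrix, the least-squares estimate recovers the system coefficients; in practice the candidate terms are what the structure-selection stage has to choose.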

  6. Linking Goal-Oriented Requirements and Model-Driven Development

    NASA Astrophysics Data System (ADS)

    Pastor, Oscar; Giachetti, Giovanni

    In the context of Goal-Oriented Requirement Engineering (GORE), there are interesting modeling approaches for the analysis of complex scenarios, oriented to obtaining and representing the relevant requirements for the development of software products. However, the way to use these GORE models in an automated Model-Driven Development (MDD) process is not clear, and, in general terms, the translation of these models into the final software products is still performed manually. Therefore, in this chapter, we show an approach to automatically link GORE models and MDD processes, elaborated from the experience obtained in linking the i* framework with an industrially applied MDD approach. The proposed linking approach is formulated by means of a generic process based on current modeling standards and technologies, in order to facilitate its application to different MDD and GORE approaches. Special attention is paid to how this process generates appropriate model transformation mechanisms to automatically obtain MDD conceptual models from GORE models, and how it can be used to specify validation mechanisms to assure correct model transformations.

  7. A new approach to hazardous materials transportation risk analysis: decision modeling to identify critical variables.

    PubMed

    Clark, Renee M; Besterfield-Sacre, Mary E

    2009-03-01

    We take a novel approach to analyzing hazardous materials transportation risk in this research. Previous studies analyzed this risk from an operations research (OR) or quantitative risk assessment (QRA) perspective by minimizing or calculating risk along a transport route. Further, even though the majority of incidents occur when containers are unloaded, the research has not focused on transportation-related activities, including container loading and unloading. In this work, we developed a decision model of a hazardous materials release during unloading using actual data and an exploratory data modeling approach. Previous studies have had a theoretical perspective in terms of identifying and advancing the key variables related to this risk, and there has not been a focus on probability and statistics-based approaches for doing this. Our decision model empirically identifies the critical variables using an exploratory methodology for a large, highly categorical database involving latent class analysis (LCA), loglinear modeling, and Bayesian networking. Our model identified the most influential variables and countermeasures for two consequences of a hazmat incident, dollar loss and release quantity, and is one of the first models to do this. The most influential variables were found to be related to the failure of the container. In addition to analyzing hazmat risk, our methodology can be used to develop data-driven models for strategic decision making in other domains involving risk.

  8. Gaussian process regression for forecasting battery state of health

    NASA Astrophysics Data System (ADS)

    Richardson, Robert R.; Osborne, Michael A.; Howey, David A.

    2017-07-01

    Accurately predicting the future capacity and remaining useful life of batteries is necessary to ensure reliable system operation and to minimise maintenance costs. The complex nature of battery degradation has meant that mechanistic modelling of capacity fade has thus far remained intractable; however, with the advent of cloud-connected devices, data from cells in various applications is becoming increasingly available, and the feasibility of data-driven methods for battery prognostics is increasing. Here we propose Gaussian process (GP) regression for forecasting battery state of health, and highlight various advantages of GPs over other data-driven and mechanistic approaches. GPs are a type of Bayesian non-parametric method, and hence can model complex systems whilst handling uncertainty in a principled manner. Prior information can be exploited by GPs in a variety of ways: explicit mean functions can be used if the functional form of the underlying degradation model is available, and multiple-output GPs can effectively exploit correlations between data from different cells. We demonstrate the predictive capability of GPs for short-term and long-term (remaining useful life) forecasting on a selection of capacity vs. cycle datasets from lithium-ion cells.
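
A bare-bones GP regression for capacity-versus-cycle data can be written directly from the standard posterior equations; the kernel choice, hyperparameters and synthetic fade curve below are all assumptions for illustration:

```python
import numpy as np

def rbf(a, b, ls=20.0, var=1.0):
    """Squared-exponential kernel over cycle number."""
    d = np.asarray(a, float)[:, None] - np.asarray(b, float)[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

# Synthetic capacity-fade data: normalised capacity vs charge cycle
rng = np.random.default_rng(4)
cycles = np.arange(0.0, 200.0, 10.0)
cap = 1.0 - 0.001 * cycles + 0.005 * rng.normal(size=cycles.size)

noise = 0.005 ** 2
K = rbf(cycles, cycles) + noise * np.eye(cycles.size)
alpha = np.linalg.solve(K, cap - cap.mean())

def gp_predict(x_star):
    """GP posterior mean and variance at new cycle numbers."""
    Ks = rbf(x_star, cycles)
    mean = cap.mean() + Ks @ alpha
    cov = rbf(x_star, x_star) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

mean, var = gp_predict([100.0, 300.0])
```

Note how the posterior variance grows when extrapolating far beyond the observed cycles; this principled uncertainty growth is one of the advantages the authors highlight over point-prediction methods.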

  9. Live Speech Driven Head-and-Eye Motion Generators.

    PubMed

    Le, Binh H; Ma, Xiaohan; Deng, Zhigang

    2012-11-01

    This paper describes a fully automated framework to generate realistic head motion, eye gaze, and eyelid motion simultaneously based on live (or recorded) speech input. Its central idea is to learn separate yet interrelated statistical models for each component (head motion, gaze, or eyelid motion) from a prerecorded facial motion data set: 1) Gaussian Mixture Models and a gradient descent optimization algorithm are employed to generate head motion from speech features; 2) a Nonlinear Dynamic Canonical Correlation Analysis model is used to synthesize eye gaze from head motion and speech features; and 3) nonnegative linear regression is used to model voluntary eyelid motion, and a log-normal distribution is used to describe involuntary eye blinks. Several user studies were conducted to evaluate the effectiveness of the proposed speech-driven head and eye motion generator using the well-established paired comparison methodology. Our evaluation results clearly show that this approach can significantly outperform state-of-the-art head and eye motion generation algorithms. In addition, a novel mocap+video hybrid data acquisition technique is introduced to record high-fidelity head movement, eye gaze, and eyelid motion simultaneously.

  10. External radioactive markers for PET data-driven respiratory gating in positron emission tomography.

    PubMed

    Büther, Florian; Ernst, Iris; Hamill, James; Eich, Hans T; Schober, Otmar; Schäfers, Michael; Schäfers, Klaus P

    2013-04-01

    Respiratory gating is an established approach to overcoming respiration-induced image artefacts in PET. Of special interest in this respect are raw PET data-driven gating methods, which do not require additional hardware to acquire respiratory signals during the scan. However, these methods rely heavily on the quality of the acquired PET data (statistical properties, data contrast, etc.). We therefore combined external radioactive markers with data-driven respiratory gating in PET/CT. The feasibility and accuracy of this approach were studied for [(18)F]FDG PET/CT imaging in patients with malignant liver and lung lesions. PET data from 30 patients with abdominal or thoracic [(18)F]FDG-positive lesions (primary tumours or metastases) were included in this prospective study. The patients underwent a 10-min list-mode PET scan with a single bed position following a standard clinical whole-body [(18)F]FDG PET/CT scan. During this scan, one to three radioactive point sources (either (22)Na or (18)F, 50-100 kBq) in a dedicated holder were attached to the patient's abdomen. The acquired list-mode data were retrospectively analysed for respiratory signals using established data-driven gating approaches and, additionally, by tracking the motion of the point sources in sinogram space. Gated reconstructions were examined qualitatively, in terms of the amount of respiratory displacement, and with respect to changes in local image intensity in the gated images. The presence of the external markers did not affect whole-body PET/CT image quality. Tracking of the markers led to characteristic respiratory curves in all patients. Applying these curves to gated reconstructions resulted in images in which motion was well resolved. Quantitatively, the performance of the external marker-based approach was similar to that of the best intrinsic data-driven methods.
Overall, the gain in measured tumour uptake from the nongated to the gated images indicating successful removal of respiratory motion was correlated with the magnitude of the respiratory displacement of the respective tumour lesion, but not with lesion size. Respiratory information can be assessed from list-mode PET/CT through PET data-derived tracking of external radioactive markers. This information can be successfully applied to respiratory gating to reduce motion-related image blurring. In contrast to other previously described PET data-driven approaches, the external marker approach is independent of tumour uptake and thereby applicable even in patients with poor uptake and small tumours.
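
Once a marker trace has been extracted from the sinograms, amplitude-based gating amounts to binning time points by signal quantiles. A synthetic sketch, with breathing period, amplitude and noise level all invented:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic marker trace: axial position of an external point source (mm)
t = np.linspace(0.0, 600.0, 6000)                 # 10-min list-mode scan
trace = 5.0 * np.sin(2 * np.pi * t / 4.0) + 0.2 * rng.normal(size=t.size)

def amplitude_gates(signal, n_gates=4):
    """Assign each sample to an amplitude bin with roughly equal counts."""
    inner_edges = np.quantile(signal, np.linspace(0, 1, n_gates + 1)[1:-1])
    return np.digitize(signal, inner_edges)

gates = amplitude_gates(trace)
counts = np.bincount(gates)
```

Each gate then collects list-mode events from a similar respiratory displacement, so reconstructing per gate reduces motion blur at the cost of per-image statistics.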

  11. The Cannon: A data-driven approach to Stellar Label Determination

    NASA Astrophysics Data System (ADS)

    Ness, M.; Hogg, David W.; Rix, H.-W.; Ho, Anna. Y. Q.; Zasowski, G.

    2015-07-01

    New spectroscopic surveys offer the promise of stellar parameters and abundances (“stellar labels”) for hundreds of thousands of stars; this poses a formidable spectral modeling challenge. In many cases, there is a subset of reference objects for which the stellar labels are known with high(er) fidelity. We take advantage of this with The Cannon, a new data-driven approach for determining stellar labels from spectroscopic data. The Cannon learns from the “known” labels of reference stars how the continuum-normalized spectra depend on these labels by fitting a flexible model at each wavelength; then, The Cannon uses this model to derive labels for the remaining survey stars. We illustrate The Cannon by training the model on only 542 stars in 19 clusters as reference objects, with Teff, log g, and [Fe/H] as the labels, and then applying it to the spectra of 55,000 stars from APOGEE DR10. The Cannon is very accurate. Its stellar labels compare well to the stars for which APOGEE pipeline (ASPCAP) labels are provided in DR10, with rms differences that are basically identical to the stated ASPCAP uncertainties. Beyond the reference labels, The Cannon makes no use of stellar models nor any line-list, but needs a set of reference objects that span label-space. The Cannon performs well at lower signal-to-noise, as it delivers comparably good labels even at one-ninth the APOGEE observing time. We discuss the limitations of The Cannon and its future potential, particularly, to bring different spectroscopic surveys onto a consistent scale of stellar labels.
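
The training step, a flexible polynomial model of flux as a function of labels fitted at every wavelength, can be sketched as a single least-squares solve. Star counts, label values and fluxes below are simulated, not APOGEE data:

```python
import numpy as np

rng = np.random.default_rng(6)
n_stars, n_labels, n_pix = 500, 3, 100

def design(lbl):
    """Constant + linear + quadratic label terms, one row per star."""
    n = lbl.shape[1]
    quad = [lbl[:, i] * lbl[:, j] for i in range(n) for j in range(i, n)]
    return np.column_stack([np.ones(len(lbl)), lbl, *quad])

# Simulated reference set: labels (scaled Teff, log g, [Fe/H]) and spectra
labels = rng.normal(size=(n_stars, n_labels))
true_coeffs = rng.normal(0.0, 0.1, (design(labels).shape[1], n_pix))
flux = design(labels) @ true_coeffs + 0.01 * rng.normal(size=(n_stars, n_pix))

# "Training": fit the per-pixel coefficients from the reference stars
coeffs, *_ = np.linalg.lstsq(design(labels), flux, rcond=None)
```

The "test" step then inverts this fitted model, optimizing over the labels of a survey star so that its predicted spectrum matches the observed one.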

  12. Algebraic reasoning for the enhancement of data-driven building reconstructions

    NASA Astrophysics Data System (ADS)

    Meidow, Jochen; Hammer, Horst

    2016-04-01

    Data-driven approaches for the reconstruction of buildings feature the flexibility needed to capture objects of arbitrary shape. To recognize man-made structures, geometric relations such as orthogonality or parallelism have to be detected. These constraints are typically formulated as sets of multivariate polynomials. For the enforcement of the constraints within an adjustment process, a set of independent and consistent geometric constraints has to be determined. Gröbner bases are an ideal tool to identify such sets exactly. A complete workflow for geometric reasoning is presented to obtain boundary representations of solids based on given point clouds. The constraints are formulated in homogeneous coordinates, which results in simple polynomials suitable for the successful derivation of Gröbner bases for algebraic reasoning. Strategies for the reduction of the algebraic complexity are presented. To enforce the constraints, an adjustment model is introduced, which is able to cope with homogeneous coordinates along with their singular covariance matrices. The feasibility and the potential of the approach are demonstrated by the analysis of a real data set.
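
As a toy illustration of such algebraic reasoning, take two edge directions constrained to be unit length and orthogonal; a Gröbner basis (here via SymPy, with invented variable names) certifies that a further constraint is implied by, i.e. dependent on, the chosen set:

```python
from sympy import groebner, symbols

u1, u2, v1, v2 = symbols("u1 u2 v1 v2")

# Geometric constraints as multivariate polynomials (each = 0):
f_unit_u = u1**2 + u2**2 - 1     # first edge direction has unit length
f_unit_v = v1**2 + v2**2 - 1     # second edge direction has unit length
f_orth = u1 * v1 + u2 * v2       # the two directions are orthogonal

G = groebner([f_unit_u, f_unit_v, f_orth], u1, u2, v1, v2, order="lex")

# Unit cross-product magnitude is implied by the set above, so adding it to
# an adjustment would make the constraint set dependent
implied = (u1 * v2 - u2 * v1) ** 2 - 1
```

Ideal-membership tests like this are exactly how a dependent constraint is flagged before the adjustment step.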

  13. Complex networks as a unified framework for descriptive analysis and predictive modeling in climate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steinhaeuser, Karsten J K; Chawla, Nitesh; Ganguly, Auroop R

    The analysis of climate data has relied heavily on hypothesis-driven statistical methods, while projections of future climate are based primarily on physics-based computational models. However, in recent years a wealth of new datasets has become available. Therefore, we take a more data-centric approach and propose a unified framework for studying climate, with an aim towards characterizing observed phenomena as well as discovering new knowledge in the climate domain. Specifically, we posit that complex networks are well-suited for both descriptive analysis and predictive modeling tasks. We show that the structural properties of climate networks have useful interpretation within the domain. Furthermore, we extract clusters from these networks and demonstrate their predictive power as climate indices. Our experimental results establish that the network clusters are statistically significantly better predictors than clusters derived using a more traditional clustering approach. Using complex networks as data representation thus enables the unique opportunity for descriptive and predictive modeling to inform each other.
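
The network-construction step can be sketched by correlating grid-point time series and thresholding the correlation matrix; the series below are synthetic, with two planted "regions" that emerge as densely connected clusters:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic grid-point time series: two regions sharing common signals
n_time = 300
signal_a, signal_b = rng.normal(size=(2, n_time))
nodes = np.vstack(
    [signal_a + 0.3 * rng.normal(size=(10, n_time)),   # region A, 10 nodes
     signal_b + 0.3 * rng.normal(size=(10, n_time))]   # region B, 10 nodes
)

# Build the climate network: link nodes whose series correlate strongly
corr = np.corrcoef(nodes)
adjacency = (np.abs(corr) > 0.5) & ~np.eye(len(nodes), dtype=bool)
degree = adjacency.sum(axis=1)
```

Clusters of such a network (here, the two planted regions) can then serve as candidate climate indices, as the record describes.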

  14. Putting the psychology back into psychological models: mechanistic versus rational approaches.

    PubMed

    Sakamoto, Yasuaki; Jones, Matt; Love, Bradley C

    2008-09-01

    Two basic approaches to explaining the nature of the mind are the rational and the mechanistic approaches. Rational analyses attempt to characterize the environment and the behavioral outcomes that humans seek to optimize, whereas mechanistic models attempt to simulate human behavior using processes and representations analogous to those used by humans. We compared these approaches with regard to their accounts of how humans learn the variability of categories. The mechanistic model departs in subtle ways from rational principles. In particular, the mechanistic model incrementally updates its estimates of category means and variances through error-driven learning, based on discrepancies between new category members and the current representation of each category. The model yields a prediction, which we verify, regarding the effects of order manipulations that the rational approach does not anticipate. Although both rational and mechanistic models can successfully postdict known findings, we suggest that psychological advances are driven primarily by consideration of process and representation and that rational accounts trail these breakthroughs.
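
The error-driven incremental update at the heart of the mechanistic model can be sketched as a pair of delta-rule updates for a category's mean and variance; the learning rate and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(8)

def learn_category(samples, lr=0.05):
    """Error-driven incremental estimates of a category's mean and variance."""
    mean, var = 0.0, 1.0
    for x in samples:
        err = x - mean              # discrepancy between new member and representation
        mean += lr * err            # delta-rule update of the mean
        var += lr * (err**2 - var)  # delta-rule update of the variance
    return mean, var

samples = rng.normal(10.0, 2.0, 2000)
mean, var = learn_category(samples)
```

Because each update weights recent members more heavily, presentation order matters, which is exactly the kind of order effect the mechanistic model predicts and the rational account does not.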

  15. Learning and Information Approaches for Inference in Dynamic Data-Driven Geophysical Applications

    NASA Astrophysics Data System (ADS)

    Ravela, S.

    2015-12-01

    Many geophysical inference problems are characterized by non-linear processes, high-dimensional models and complex uncertainties. A dynamic coupling between models, estimation, and sampling is typically sought to efficiently characterize and reduce uncertainty. This process is, however, fraught with several difficulties, chief among them the ability to deal with model error and the efficacy of uncertainty quantification and data assimilation. In this presentation, we present three key ideas from learning and intelligent systems theory and apply them to two geophysical applications. The first idea is the use of Ensemble Learning to compensate for model error, the second is to develop tractable Information Theoretic Learning to deal with non-Gaussianity in inference, and the third is a Manifold Resampling technique for effective uncertainty quantification. We apply these methods first to the development of a cooperative autonomous observing system using sUAS for studying coherent structures, and second to the problem of quantifying risk from hurricanes and storm surges in a changing climate. Results indicate that learning approaches can provide new effectiveness in cases where standard approaches to model reduction, uncertainty quantification and data assimilation fail.

  16. WebGL Visualisation of 3D Environmental Models Based on Finnish Open Geospatial Data Sets

    NASA Astrophysics Data System (ADS)

    Krooks, A.; Kahkonen, J.; Lehto, L.; Latvala, P.; Karjalainen, M.; Honkavaara, E.

    2014-08-01

    Recent developments in spatial data infrastructures have enabled real-time GIS analysis and visualisation using open input data sources and service interfaces. In this study we present a new concept in which metric point clouds derived from national open airborne laser scanning (ALS) and photogrammetric image data are processed, analysed and finally visualised through open service interfaces to produce user-driven analysis products for targeted areas. The concept is demonstrated in three environmental applications: assessment of forest storm damage, assessment of volumetric changes in an open-pit mine, and 3D city model visualisation. One of the main objectives was to study the usability and requirements of national-level photogrammetric imagery in these applications. The results demonstrated that user-driven 3D geospatial analyses are possible with the proposed approach and current technology; for instance, a landowner could easily assess the number of fallen trees within his property borders after a storm using any web browser. On the other hand, our study indicated that many uncertainties remain, especially due to the insufficient standardization of photogrammetric products and processes and their quality indicators.

  17. A composite computational model of liver glucose homeostasis. I. Building the composite model.

    PubMed

    Hetherington, J; Sumner, T; Seymour, R M; Li, L; Rey, M Varela; Yamaji, S; Saffrey, P; Margoninski, O; Bogle, I D L; Finkelstein, A; Warner, A

    2012-04-07

    A computational model of the glucagon/insulin-driven liver glucohomeostasis function, focusing on the buffering of glucose into glycogen, has been developed. The model exemplifies an 'engineering' approach to modelling in systems biology, and was produced by linking together seven component models of separate aspects of the physiology. The component models use a variety of modelling paradigms and degrees of simplification. Model parameters were determined by an iterative hybrid of fitting to high-scale physiological data, and determination from small-scale in vitro experiments or molecular biological techniques. The component models were not originally designed for inclusion within such a composite model, but were integrated, with modification, using our published modelling software and computational frameworks. This approach facilitates the development of large and complex composite models, although, inevitably, some compromises must be made when composing the individual models. Composite models of this form have not previously been demonstrated.

  18. Recurrence measure of conditional dependence and applications.

    PubMed

    Ramos, Antônio M T; Builes-Jaramillo, Alejandro; Poveda, Germán; Goswami, Bedartha; Macau, Elbert E N; Kurths, Jürgen; Marwan, Norbert

    2017-05-01

    Identifying causal relations from observational data sets has posed great challenges in data-driven causality inference studies. One of the successful approaches to detecting direct coupling in the information theory framework is transfer entropy. However, the core of entropy-based tools lies in the probability estimation of the underlying variables. Here we propose a data-driven approach for causality inference that incorporates recurrence plot features into the framework of information theory. We define it as the recurrence measure of conditional dependence (RMCD), and we present some applications. The RMCD quantifies the causal dependence between two processes based on joint recurrence patterns between the past of the possible driver and the present of the potentially driven process, excluding the contribution of the contemporaneous past of the driven variable. Finally, it can unveil the time scale of the influence of the sea-surface temperature of the Pacific Ocean on the precipitation in Amazonia during recent major droughts.

  19. Recurrence measure of conditional dependence and applications

    NASA Astrophysics Data System (ADS)

    Ramos, Antônio M. T.; Builes-Jaramillo, Alejandro; Poveda, Germán; Goswami, Bedartha; Macau, Elbert E. N.; Kurths, Jürgen; Marwan, Norbert

    2017-05-01

    Identifying causal relations from observational data sets has posed great challenges in data-driven causality inference studies. One of the successful approaches to detecting direct coupling in the information theory framework is transfer entropy. However, the core of entropy-based tools lies in the probability estimation of the underlying variables. Here we propose a data-driven approach for causality inference that incorporates recurrence plot features into the framework of information theory. We define it as the recurrence measure of conditional dependence (RMCD), and we present some applications. The RMCD quantifies the causal dependence between two processes based on joint recurrence patterns between the past of the possible driver and the present of the potentially driven process, excluding the contribution of the contemporaneous past of the driven variable. Finally, it can unveil the time scale of the influence of the sea-surface temperature of the Pacific Ocean on the precipitation in Amazonia during recent major droughts.
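
The core ingredient described above, the recurrence plot, is simple to compute: a recurrence matrix marks time pairs whose states lie within a threshold, and the joint recurrence rate of two series is a crude coupling indicator. The sketch below illustrates only that ingredient, not the full RMCD (which additionally conditions on the driven variable's own past); the series, threshold, and lag are invented for illustration.

```python
import numpy as np

def recurrence_matrix(x, eps):
    """Binary matrix R[i, j] = 1 when |x_i - x_j| <= eps."""
    d = np.abs(x[:, None] - x[None, :])    # pairwise distances
    return (d <= eps).astype(int)

rng = np.random.default_rng(1)
driver = np.sin(np.linspace(0, 8 * np.pi, 200))
coupled = np.roll(driver, 3) + 0.05 * rng.standard_normal(200)   # lagged, noisy copy
independent = rng.standard_normal(200)                           # unrelated series

R_driver = recurrence_matrix(driver, eps=0.1)
jrr_coupled = (R_driver * recurrence_matrix(coupled, 0.1)).mean()
jrr_indep = (R_driver * recurrence_matrix(independent, 0.1)).mean()
print(jrr_coupled, jrr_indep)
```

The joint recurrence rate comes out clearly higher for the coupled pair than for the independent one, which is the signal RMCD refines into a conditional, directed measure.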

  20. A data driven approach using Takagi-Sugeno models for computationally efficient lumped floodplain modelling

    NASA Astrophysics Data System (ADS)

    Wolfs, Vincent; Willems, Patrick

    2013-10-01

    Many applications in support of water management decisions require hydrodynamic models with limited calculation time, including real-time control of river flooding, uncertainty and sensitivity analyses by Monte Carlo simulations, and long-term simulations in support of the statistical analysis of model simulation results (e.g. flood frequency analysis). Several computationally efficient hydrodynamic models exist, but little attention has been given to the modelling of floodplains. This paper presents a methodology that can emulate output from a full hydrodynamic model by predicting one or several levels in a floodplain, together with the flow rate between river and floodplain. The overtopping of the embankment is modelled as overflow at a weir. Adaptive neuro-fuzzy inference systems (ANFIS) are exploited to cope with the varying factors affecting the flow. Different input sets and identification methods are considered in model construction. Because of the dual use of simplified physically based equations and data-driven techniques, the ANFIS consist of very few rules with a low number of input variables. A second calculation scheme can be followed for exceptionally large floods. The obtained nominal emulation model was tested for four floodplains along the river Dender in Belgium. Results show that the obtained models are accurate with low computational cost.
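
A Takagi-Sugeno model of the kind ANFIS identifies blends a few linear consequents by normalized rule firing strengths. The minimal single-input sketch below shows that mechanism; the "river level to floodplain flow" rule parameters are invented for illustration, whereas an ANFIS would fit them to full hydrodynamic model output.

```python
import numpy as np

def ts_predict(x, centers, sigmas, coeffs):
    """Evaluate a single-input Takagi-Sugeno model at scalar x."""
    w = np.exp(-0.5 * ((x - centers) / sigmas) ** 2)   # Gaussian firing strengths
    w = w / w.sum()                                    # normalize across rules
    y_rules = coeffs[:, 0] * x + coeffs[:, 1]          # linear consequents a*x + b
    return float(np.dot(w, y_rules))                   # weighted blend

centers = np.array([0.0, 5.0, 10.0])      # rule centers: low/medium/high level
sigmas = np.array([2.0, 2.0, 2.0])
coeffs = np.array([[0.0, 0.0],            # low level: no overtopping flow
                   [0.5, -1.0],           # medium: moderate linear response
                   [2.0, -10.0]])         # high: steep response (weir overflow)

print(ts_predict(9.0, centers, sigmas, coeffs))
```

Because each rule is a cheap linear map, evaluation is orders of magnitude faster than a hydrodynamic solve, which is what makes this emulation attractive for Monte Carlo and real-time control use.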

  1. Towards Data-Driven Simulations of Wildfire Spread using Ensemble-based Data Assimilation

    NASA Astrophysics Data System (ADS)

    Rochoux, M. C.; Bart, J.; Ricci, S. M.; Cuenot, B.; Trouvé, A.; Duchaine, F.; Morel, T.

    2012-12-01

    Real-time prediction of a propagating wildfire remains a challenging task because the problem involves both multiple physics and multiple scales. The propagation speed of wildfires, also called the rate of spread (ROS), is determined by complex interactions between pyrolysis, combustion, flow dynamics and atmospheric dynamics occurring at vegetation, topographical and meteorological scales. Current operational fire spread models are mainly based on a semi-empirical parameterization of the ROS in terms of vegetation, topographical and meteorological properties. For a fire spread simulation to be predictive and compatible with operational applications, the uncertainty in the ROS model should be reduced. As recent progress in remote sensing technology provides new ways to monitor the fire front position, a promising approach to overcoming the difficulties found in wildfire spread simulations is to integrate fire modeling and fire sensing technologies using data assimilation (DA). For this purpose we have developed a prototype data-driven wildfire spread simulator that provides optimal estimates of poorly known model parameters [*]. The data-driven simulation capability is adapted for more realistic wildfire spread: it considers a regional-scale fire spread model that is informed by observations of the fire front location. An Ensemble Kalman Filter (EnKF) algorithm based on a parallel computing platform (OpenPALM) was implemented in order to perform multi-parameter sequential estimation, where wind magnitude and direction are estimated in addition to vegetation properties (see attached figure). The EnKF algorithm tracks a small-scale grassland fire experiment well and properly accounts for the sensitivity of the simulation outcomes to the control parameters. In conclusion, data assimilation is a promising approach to more accurately forecasting time-varying wildfire spread conditions as new airborne-like observations of the fire front location become available. [*] Rochoux, M.C., Delmotte, B., Cuenot, B., Ricci, S., and Trouvé, A. (2012) "Regional-scale simulations of wildland fire spread informed by real-time flame front observations", Proc. Combust. Inst., 34, in press http://dx.doi.org/10.1016/j.proci.2012.06.090 (Figure: EnKF-based tracking of a small-scale grassland fire experiment, with estimation of wind and fuel parameters.)
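
The EnKF analysis step at the heart of this kind of data-driven simulator can be sketched in a few lines. The linear "fire front" model below is a made-up stand-in for the real rate-of-spread model, and the parameter names and noise levels are assumptions, not values from the paper; only the ensemble update formula is the standard EnKF.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_model(params):
    # Hypothetical observation operator: predicted front position as a
    # linear function of a wind parameter and a fuel parameter.
    wind, fuel = params
    return np.array([2.0 * wind + 0.5 * fuel])

def enkf_update(ensemble, obs, obs_err):
    """One EnKF analysis step: nudge each member toward the observation."""
    preds = np.array([forward_model(m) for m in ensemble])      # (N, d_obs)
    X = ensemble - ensemble.mean(axis=0)                        # parameter anomalies
    Y = preds - preds.mean(axis=0)                              # predicted-obs anomalies
    N = len(ensemble)
    Pxy = X.T @ Y / (N - 1)                                     # cross-covariance
    Pyy = Y.T @ Y / (N - 1) + obs_err**2 * np.eye(preds.shape[1])
    K = Pxy @ np.linalg.inv(Pyy)                                # Kalman gain
    perturbed = obs + obs_err * rng.standard_normal((N, 1))     # perturbed observations
    return ensemble + (perturbed - preds) @ K.T

# Prior ensemble around a poor first guess; the "true" parameters are (3, 1).
ensemble = np.column_stack([rng.normal(1.0, 1.0, 50), rng.normal(0.5, 0.5, 50)])
true_obs = forward_model(np.array([3.0, 1.0]))
for _ in range(5):                      # assimilate successive front observations
    ensemble = enkf_update(ensemble, true_obs, obs_err=0.05)
print(ensemble.mean(axis=0))
```

After a few analysis steps the ensemble mean predicts front positions consistent with the observations, which is the mechanism by which poorly known wind and fuel parameters are corrected online.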

  2. Three Research Strategies of Neuroscience and the Future of Legal Imaging Evidence.

    PubMed

    Jun, Jinkwon; Yoo, Soyoung

    2018-01-01

    Neuroscientific imaging evidence (NIE) has become an integral part of the criminal justice system in the United States. However, in most legal cases, NIE is submitted and used only to mitigate penalties because the court does not recognize it as substantial evidence, considering its lack of reliability. Nevertheless, we here discuss how neuroscience is expected to improve the use of NIE in the legal system. For this purpose, we classified the efforts of neuroscientists into three research strategies: cognitive subtraction, the data-driven approach, and the brain-manipulation approach. Cognitive subtraction is outdated and problematic; consequently, the court deemed it to be an inadequate approach in terms of legal evidence in 2012. In contrast, the data-driven and brain-manipulation approaches, both state-of-the-art, have overcome the limitations of cognitive subtraction. The data-driven approach brings data science into the field and is benefiting immensely from the development of research platforms that allow automatized collection, analysis, and sharing of data. This broadens the scale of imaging evidence. The brain-manipulation approach uses high-functioning tools that facilitate non-invasive and precise human brain manipulation. These two approaches are expected to have synergistic effects. Neuroscience has strived to improve the evidential reliability of NIE, with considerable success. With the support of cutting-edge technologies, and the progress of these approaches, the evidential status of NIE will be improved and NIE will become an increasingly important part of legal practice.

  3. A web-based data-querying tool based on ontology-driven methodology and flowchart-based model.

    PubMed

    Ping, Xiao-Ou; Chung, Yufang; Tseng, Yi-Ju; Liang, Ja-Der; Yang, Pei-Ming; Huang, Guan-Tarn; Lai, Feipei

    2013-10-08

    Because of the increased adoption rate of electronic medical record (EMR) systems, health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving knowledge from such large volumes of clinical data. The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, a clinical data generator was implemented to automatically generate clinical data in the repository, and the generated data were then employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated using the clinical data generator in experiments with varying numbers of patients. In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine successfully retrieved the clinical data based on the query tasks formatted using GLIF3.5 in the experiments with varying numbers of patients. The accuracy of the three queries (i.e., "degree of liver damage," "degree of liver damage when applying a mutually exclusive setting," and "treatments for liver cancer") was 100% for all four experiments (10 patients, 100 patients, 1000 patients, and 10,000 patients). Among the three measured query phases, namely (1) structured query language operations, (2) criteria verification, and (3) other, the first two had the longest execution times. The ontology-driven FBDQM-based approach enriched the capabilities of the data-querying system. The adoption of GLIF3.5 increased the potential for interoperability, shareability, and reusability of the query tasks.

  4. Bridging the divide: a model-data approach to Polar and Alpine microbiology.

    PubMed

    Bradley, James A; Anesio, Alexandre M; Arndt, Sandra

    2016-03-01

    Advances in microbial ecology in the cryosphere continue to be driven by empirical approaches including field sampling and laboratory-based analyses. Although mathematical models are commonly used to investigate the physical dynamics of Polar and Alpine regions, they are rarely applied in microbial studies. Yet integrating modelling approaches with ongoing observational and laboratory-based work is ideally suited to Polar and Alpine microbial ecosystems given their harsh environmental and biogeochemical characteristics, simple trophic structures, distinct seasonality, often difficult accessibility, geographical expansiveness and susceptibility to accelerated climate changes. In this opinion paper, we explain how mathematical modelling ideally complements field and laboratory-based analyses. We thus argue that mathematical modelling is a powerful tool for the investigation of these extreme environments and that fully integrated, interdisciplinary model-data approaches could help the Polar and Alpine microbiology community address some of the great research challenges of the 21st century (e.g. assessing global significance and response to climate change). However, a better integration of field and laboratory work with model design and calibration/validation, as well as a stronger focus on quantitative information is required to advance models that can be used to make predictions and upscale processes and fluxes beyond what can be captured by observations alone. © FEMS 2016.

  5. Bridging the divide: a model-data approach to Polar and Alpine microbiology

    PubMed Central

    Bradley, James A.; Anesio, Alexandre M.; Arndt, Sandra

    2016-01-01

    Advances in microbial ecology in the cryosphere continue to be driven by empirical approaches including field sampling and laboratory-based analyses. Although mathematical models are commonly used to investigate the physical dynamics of Polar and Alpine regions, they are rarely applied in microbial studies. Yet integrating modelling approaches with ongoing observational and laboratory-based work is ideally suited to Polar and Alpine microbial ecosystems given their harsh environmental and biogeochemical characteristics, simple trophic structures, distinct seasonality, often difficult accessibility, geographical expansiveness and susceptibility to accelerated climate changes. In this opinion paper, we explain how mathematical modelling ideally complements field and laboratory-based analyses. We thus argue that mathematical modelling is a powerful tool for the investigation of these extreme environments and that fully integrated, interdisciplinary model-data approaches could help the Polar and Alpine microbiology community address some of the great research challenges of the 21st century (e.g. assessing global significance and response to climate change). However, a better integration of field and laboratory work with model design and calibration/validation, as well as a stronger focus on quantitative information is required to advance models that can be used to make predictions and upscale processes and fluxes beyond what can be captured by observations alone. PMID:26832206

  6. Brain-Behavior Participant Similarity Networks Among Youth and Emerging Adults with Schizophrenia Spectrum, Autism Spectrum, or Bipolar Disorder and Matched Controls.

    PubMed

    Stefanik, Laura; Erdman, Lauren; Ameis, Stephanie H; Foussias, George; Mulsant, Benoit H; Behdinan, Tina; Goldenberg, Anna; O'Donnell, Lauren J; Voineskos, Aristotle N

    2018-04-01

    There is considerable heterogeneity in social cognitive and neurocognitive performance among people with schizophrenia spectrum disorders (SSD), autism spectrum disorders (ASD), bipolar disorder (BD), and healthy individuals. This study used Similarity Network Fusion (SNF), a novel data-driven approach, to identify participant similarity networks based on relationships among demographic, brain imaging, and behavioral data. T1-weighted and diffusion-weighted magnetic resonance images were obtained for 174 adolescents and young adults (aged 16-35 years) with an SSD (n=51), an ASD without intellectual disability (n=38), euthymic BD (n=34), and healthy controls (n=51). A battery of social cognitive and neurocognitive tasks was administered. Data integration, cluster determination, and biological group formation were then obtained using SNF. We identified four new groups of individuals, each with distinct neural circuit-cognitive profiles. The most influential variables driving the formation of the new groups were robustly reliable across embedded resampling techniques. The data-driven groups showed considerably greater differentiation on key social and neurocognitive circuit nodes than groups generated by diagnostic analyses or dimensional social cognitive analyses. The data-driven groups were validated through functional outcome and brain network property measures not included in the SNF model. Cutting across diagnostic boundaries, our approach can effectively identify new groups of people based on a profile of neuroimaging and behavioral data. Our findings bring us closer to disease subtyping that can be leveraged toward the targeting of specific neural circuitry among participant subgroups to ameliorate social cognitive and neurocognitive deficits.
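
The core idea of SNF is to build one participant-similarity matrix per data type and then let the networks exchange information so that similarities supported by several types are reinforced. The toy sketch below shows a simplified cross-diffusion on synthetic data; the full SNF algorithm uses separate local (kNN) and full kernels, which are omitted here, and the two "data types" are invented.

```python
import numpy as np

def affinity(X, sigma=1.0):
    """Row-normalized Gaussian affinity matrix between rows of X."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
    W = np.exp(-d2 / (2 * sigma**2))
    return W / W.sum(axis=1, keepdims=True)

def rownorm(W):
    return W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(2)
# Two data types (say, imaging and behavior) for six participants in two groups.
base = np.repeat([[0.0], [5.0]], 3, axis=0)               # group centers
imaging = base + 0.3 * rng.standard_normal((6, 1))
behavior = base + 0.3 * rng.standard_normal((6, 1))

P1, P2 = affinity(imaging), affinity(behavior)
for _ in range(10):                                       # simplified cross-diffusion
    P1, P2 = rownorm(P1 @ P2 @ P1.T), rownorm(P2 @ P1 @ P2.T)
fused = (P1 + P2) / 2
print(fused.round(3))
```

The fused matrix ends up block-structured: within-group similarities dominate cross-group ones, and clustering the fused network (rather than either data type alone) is what yields the data-driven groups described above.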

  7. A data-driven weighting scheme for multivariate phenotypic endpoints recapitulates zebrafish developmental cascades

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Guozhu, E-mail: gzhang6@ncsu.edu

    Zebrafish have become a key alternative model for studying health effects of environmental stressors, partly due to their genetic similarity to humans, fast generation time, and the efficiency of generating high-dimensional systematic data. Studies aiming to characterize adverse health effects in zebrafish typically include several phenotypic measurements (endpoints). While there is a solid biomedical basis for capturing a comprehensive set of endpoints, making summary judgments regarding health effects requires thoughtful integration across endpoints. Here, we introduce a Bayesian method to quantify the informativeness of 17 distinct zebrafish endpoints as a data-driven weighting scheme for a multi-endpoint summary measure, called weighted Aggregate Entropy (wAggE). We implement wAggE using high-throughput screening (HTS) data from zebrafish exposed to five concentrations of all 1060 ToxCast chemicals. Our results show that our empirical weighting scheme provides better performance in terms of the Receiver Operating Characteristic (ROC) curve for identifying significant morphological effects and improves robustness over traditional curve-fitting approaches. From a biological perspective, our results suggest that developmental cascade effects triggered by chemical exposure can be recapitulated by analyzing the relationships among endpoints. Thus, wAggE offers a powerful approach for analysis of multivariate phenotypes that can reveal underlying etiological processes. - Highlights: • Introduced a data-driven weighting scheme for multiple phenotypic endpoints. • Weighted Aggregate Entropy (wAggE) implies differential importance of endpoints. • Endpoint relationships reveal developmental cascade effects triggered by exposure. • wAggE is generalizable to multi-endpoint data of different shapes and scales.
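
An entropy-based, data-driven weighting of binary endpoints (hit/no-hit per chemical) can be sketched as follows. The weighting rule here, favoring endpoints whose hit rate departs from chance, is a deliberate simplification in the spirit of wAggE, not the paper's Bayesian scheme, and the data are synthetic.

```python
import numpy as np

def shannon(p):
    """Binary Shannon entropy (bits) of hit probability p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

rng = np.random.default_rng(3)
n_chem, n_end = 100, 5
# Endpoint 0 responds only to "toxic" chemicals; the rest are coin flips.
toxic = rng.random(n_chem) < 0.15
hits = rng.random((n_chem, n_end)) < 0.5          # background noise endpoints
hits[:, 0] = toxic                                # one informative endpoint

rates = hits.mean(axis=0)                         # per-endpoint hit rates
weights = 1.0 - shannon(rates)                    # low-entropy endpoints weigh more
weights /= weights.sum()
score = hits.astype(float) @ weights              # weighted per-chemical summary
print(weights.round(3))
```

The informative endpoint receives by far the largest weight, so the weighted summary score tracks the true toxicity signal rather than the noisy endpoints.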

  8. The Role of Guided Induction in Paper-Based Data-Driven Learning

    ERIC Educational Resources Information Center

    Smart, Jonathan

    2014-01-01

    This study examines the role of guided induction as an instructional approach in paper-based data-driven learning (DDL) in the context of an ESL grammar course during an intensive English program at an American public university. Specifically, it examines whether corpus-informed grammar instruction is more effective through inductive, data-driven…

  9. Adapting the Biome-BGC Model to New Zealand Pastoral Agriculture: Climate Change and Land-Use Change

    NASA Astrophysics Data System (ADS)

    Keller, E. D.; Baisden, W. T.; Timar, L.

    2011-12-01

    We have adapted the Biome-BGC model to make climate change and land-use scenario estimates of New Zealand's pasture production in 2020 and 2050, with comparison to a 2005 baseline. We take an integrated modelling approach with the aim of enabling the model's use for policy assessments across broadly related issues such as climate change mitigation and adaptation, land-use change, and greenhouse gas projections. The Biome-BGC model is a biogeochemical model that simulates carbon, water, and nitrogen cycles in terrestrial ecosystems. We introduce two new 'ecosystems', sheep/beef and dairy pasture, within the existing structure of the Biome-BGC model and calibrate its ecophysiological parameters against pasture clipping data from diverse sites around New Zealand to form a baseline estimate of total New Zealand pasture production. Using downscaled AR4 climate projections, we construct mid- and upper-range climate change scenarios in 2020 and 2050. We produce land-use change scenarios in the same years by combining the Biome-BGC model with the Land Use in Rural New Zealand (LURNZ) model. The LURNZ model uses econometric approaches to predict future land-use change driven by changes in net profits under expected pricing, including the introduction of an emissions trading system. We estimate the relative change in national pasture production from our 2005 baseline levels for both sheep/beef and dairy systems under each scenario.

  10. An action potential-driven model of soleus muscle activation dynamics for locomotor-like movements

    NASA Astrophysics Data System (ADS)

    Kim, Hojeong; Sandercock, Thomas G.; Heckman, C. J.

    2015-08-01

    Objective. The goal of this study was to develop a physiologically plausible, computationally robust model for muscle activation dynamics (A(t)) under physiologically relevant excitation and movement. Approach. The interaction of excitation and movement on A(t) was investigated by comparing the force production between a cat soleus muscle and its Hill-type model. For capturing A(t) under excitation and movement variation, a modular modeling framework was proposed comprising three compartments: (1) spikes-to-[Ca2+]; (2) [Ca2+]-to-A; and (3) A-to-force transformation. The individual signal transformations were modeled based on physiological factors so that the parameter values could be determined separately for individual modules directly from experimental data. Main results. A strong dependency of A(t) on excitation frequency and muscle length was found during both isometric and dynamically moving contractions. The identified dependencies of A(t) under the static and dynamic conditions could be incorporated in the modular modeling framework by modulating the model parameters as a function of movement input. The new modeling approach was also applicable to cat soleus muscles producing waveforms independent of those used to set the model parameters. Significance. This study provides a modeling framework for spike-driven muscle responses during movement, suitable not only for insights into molecular mechanisms underlying muscle behaviors but also for large-scale simulations.
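
The three-compartment modular structure above can be sketched as a chain of independent transformations, spikes to calcium, calcium to activation, activation to force, so each module's parameters can in principle be fitted separately. The first-order dynamics and all constants below are invented placeholders, not the paper's fitted soleus parameters, and only the isometric case is shown.

```python
import numpy as np

dt = 0.001                                   # 1 ms time step

def spikes_to_ca(spikes, tau=0.05):
    """Module 1: each spike adds an impulse to a decaying calcium transient."""
    ca = np.zeros(len(spikes))
    for t in range(1, len(spikes)):
        ca[t] = ca[t-1] + dt * (-ca[t-1] / tau) + spikes[t]
    return ca

def ca_to_activation(ca, k=0.5, n=2):
    """Module 2: Hill-type saturating calcium-to-activation curve."""
    return ca**n / (k**n + ca**n)

def activation_to_force(a, fmax=20.0):
    """Module 3: isometric scaling only (no length/velocity dependence here)."""
    return fmax * a

spikes = np.zeros(1000)
spikes[100:600:25] = 0.2                     # 40 Hz burst for 0.5 s
force = activation_to_force(ca_to_activation(spikes_to_ca(spikes)))
print(force.max())
```

Movement dependence would enter, as in the paper, by modulating the module parameters (e.g. tau, k, fmax) as functions of muscle length and velocity.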

  11. NextGEOSS project: A user-driven approach to build an Earth Observations Data Hub

    NASA Astrophysics Data System (ADS)

    Percivall, G.; Voidrot, M. F.; Bye, B. L.; De Lathouwer, B.; Catarino, N.; Concalves, P.; Kraft, C.; Grosso, N.; Meyer-Arnek, J.; Mueller, A.; Goor, E.

    2017-12-01

    Several initiatives and projects contribute to supporting the Group on Earth Observations' (GEO) global priorities, including support for the UN 2030 Agenda for Sustainable Development, the Paris Agreement on climate change, and the Sendai Framework for Disaster Risk Reduction. Running until 2020, the NextGEOSS project evolves the European vision of user-driven GEOSS data exploitation for innovation and business, relying on three main pillars: engaging communities of practice, delivering technological advancements, and advocating the use of GEOSS. These three pillars support the creation and deployment of Earth observation based innovative research activities and commercial services. In this presentation we will emphasise how the NextGEOSS project uses a pilot-driven approach to ramp up and consolidate the system in a pragmatic way, integrating the complexity of the existing global ecosystem, leveraging previous investments, adding new cloud technologies and resources, and engaging the diverse communities to address all types of Sustainable Development Goals (SDGs). A set of 10 initial pilots has been defined by the project partners to address the main challenges and to contribute as soon as possible to SDGs associated with Food Sustainability, Biodiversity, Space and Security, Cold Regions, Air Pollution, Disaster Risk Reduction, Territorial Planning, and Energy. In 2018 and 2019 the project team will work on two new series of Architecture Implementation Pilots (AIP-10 and AIP-11), open worldwide, to increase discoverability, accessibility and usability of data with a strong user-centric approach for innovative GEOSS-powered applications for multiple societal areas. All initiatives with an interest in and need of Earth observations (data, processes, models, ...) are welcome to participate in these pilot initiatives. NextGEOSS is an H2020 Research and Development Project from the European Community under grant agreement 730329.

  12. Data-Driven Approaches to Empirical Discovery

    DTIC Science & Technology

    1988-10-31

    Keywords: empirical discovery, history of science, data-driven heuristics, numeric laws, theoretical terms, scope of laws ... to the normative side. Machine Discovery and the History of Science: The history of science studies the actual path followed by scientists over the

  13. A Meta-Data Driven Approach to Searching for Educational Resources in a Global Context.

    ERIC Educational Resources Information Center

    Wade, Vincent P.; Doherty, Paul

    This paper presents the design of an Internet-enabled search service that supports educational resource discovery within an educational brokerage service. More specifically, it presents the design and implementation of a metadata-driven approach to implementing the distributed search and retrieval of Internet-based educational resources and…

  14. Evaluating Model-Driven Development for large-scale EHRs through the openEHR approach.

    PubMed

    Christensen, Bente; Ellingsen, Gunnar

    2016-05-01

    In healthcare, the openEHR standard is a promising Model-Driven Development (MDD) approach for electronic healthcare records. This paper aims to identify key socio-technical challenges when the openEHR approach is put to use in Norwegian hospitals. More specifically, key fundamental assumptions are investigated empirically. These assumptions promise a clear separation of technical and domain concerns, users being in control of the modelling process, and widespread user commitment. Finally, these assumptions promise an easy way to model and map complex organizations. This longitudinal case study is based on an interpretive approach, whereby data were gathered through 440h of participant observation, 22 semi-structured interviews and extensive document studies over 4 years. The separation of clinical and technical concerns seemed to be aspirational, because both designing the technical system and modelling the domain required technical and clinical competence. Hence developers and clinicians found themselves working together in both arenas. User control and user commitment seemed not to apply in large-scale projects, as modelling the domain turned out to be too complicated and hence to appeal only to especially interested users worldwide, not the local end-users. Modelling proved to be a complex standardization process that shaped both the actual modelling and healthcare practice itself. A broad assemblage of contributors seems to be needed for developing an archetype-based system, in which roles, responsibilities and contributions cannot be clearly defined and delimited. The way MDD occurs has implications for medical practice per se in the form of the need to standardize practices to ensure that medical concepts are uniform across practices. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  15. Quantifying Cancer Risk from Radiation.

    PubMed

    Keil, Alexander P; Richardson, David B

    2017-12-06

    Complex statistical models fitted to data from studies of atomic bomb survivors are used to estimate the human health effects of ionizing radiation exposures. We describe and illustrate an approach to estimate population risks from ionizing radiation exposure that relaxes many assumptions about radiation-related mortality. The approach draws on developments in methods for causal inference. The results offer a different way to quantify radiation's effects and show that conventional estimates of the population burden of excess cancer at high radiation doses are driven strongly by projecting outside the range of current data. Summary results obtained using the proposed approach are similar in magnitude to those obtained using conventional methods, although estimates of radiation-related excess cancers differ for many age, sex, and dose groups. At low doses relevant to typical exposures, the strength of evidence in the data is surprisingly weak. Statements regarding human health effects at low doses rely strongly on the use of modeling assumptions. © 2017 Society for Risk Analysis.

  16. Modeling Quasi-Static and Fatigue-Driven Delamination Migration

    NASA Technical Reports Server (NTRS)

    De Carvalho, N. V.; Ratcliffe, J. G.; Chen, B. Y.; Pinho, S. T.; Baiz, P. M.; Tay, T. E.

    2014-01-01

    An approach was proposed and assessed for the high-fidelity modeling of progressive damage and failure in composite materials. It combines the Floating Node Method (FNM) and the Virtual Crack Closure Technique (VCCT) to represent multiple interacting failure mechanisms in a mesh-independent fashion. Delamination, matrix cracking, and migration were captured using failure and migration criteria based on fracture mechanics. Quasi-static and fatigue loading were modeled within the same overall framework. The proposed methodology was illustrated by simulating the delamination migration test, showing good agreement with the available experimental data.

  17. Closed-form solutions in stress-driven two-phase integral elasticity for bending of functionally graded nano-beams

    NASA Astrophysics Data System (ADS)

    Barretta, Raffaele; Fabbrocino, Francesco; Luciano, Raimondo; Sciarra, Francesco Marotti de

    2018-03-01

    Strain-driven and stress-driven integral elasticity models are formulated for the analysis of the structural behaviour of functionally graded nano-beams. An innovative stress-driven two-phase constitutive mixture, defined by a convex combination of local and nonlocal phases, is presented. The analysis reveals that the Eringen strain-driven fully nonlocal model cannot be used in Structural Mechanics since it is ill-posed, and local-nonlocal mixtures based on the Eringen integral model only partially resolve the ill-posedness: a singular behaviour of continuous nano-structures appears as the local fraction tends to vanish, so the ill-posedness of the Eringen integral model is not eliminated. On the contrary, local-nonlocal mixtures based on the stress-driven theory are mathematically and mechanically appropriate for nanosystems. Exact solutions of inflected functionally graded nano-beams of technical interest are established by adopting the new local-nonlocal mixture stress-driven integral relation. Effectiveness of the new nonlocal approach is tested by comparing the contributed results with those corresponding to the mixture Eringen theory.
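
The convex local-nonlocal combination described above has, in generic form, the following structure for a beam in bending. This is a sketch of a two-phase integral law with a Helmholtz averaging kernel; the mixture parameter α, length scale λ, and kernel choice are illustrative assumptions, not necessarily the paper's exact relation.

```latex
% Elastic curvature as a convex combination of a local phase and a
% nonlocal (integral) phase of the bending field M/K:
\chi(x) \;=\; \alpha \,\frac{M(x)}{K(x)}
  \;+\; (1-\alpha)\int_0^L \phi_\lambda(x-\xi)\,\frac{M(\xi)}{K(\xi)}\,\mathrm{d}\xi ,
  \qquad 0 < \alpha \le 1,
% with, for example, the Helmholtz averaging kernel
\phi_\lambda(x) \;=\; \frac{1}{2\lambda}\, e^{-|x|/\lambda}.
```

Here α = 1 recovers purely local elasticity, and the singular limit discussed in the abstract corresponds to the local fraction α tending to zero.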

  18. Using machine learning to identify air pollution exposure profiles associated with early cognitive skills among U.S. children.

    PubMed

    Stingone, Jeanette A; Pandey, Om P; Claudio, Luz; Pandey, Gaurav

    2017-11-01

    Data-driven machine learning methods present an opportunity to simultaneously assess the impact of multiple air pollutants on health outcomes. The goal of this study was to apply a two-stage, data-driven approach to identify associations between air pollutant exposure profiles and children's cognitive skills. Data from 6900 children enrolled in the Early Childhood Longitudinal Study, Birth Cohort, a national study of children born in 2001 and followed through kindergarten, were linked to estimated concentrations of 104 ambient air toxics in the 2002 National Air Toxics Assessment using ZIP code of residence at age 9 months. In the first stage, 100 regression trees were learned to identify the ambient air pollutant exposure profiles most closely associated with scores on a standardized mathematics test administered to children in kindergarten. In the second stage, the exposure profiles frequently predicting lower math scores were included within linear regression models and adjusted for confounders in order to estimate the magnitude of their effect on math scores. This approach was applied to the full population, and then to the populations living in urban and highly-populated urban areas. Our first-stage results in the full population suggested children with low trichloroethylene exposure had significantly lower math scores. This association was not observed for children living in urban communities, suggesting that confounding related to urbanicity needs to be considered within the first stage. When restricting our analysis to populations living in urban and highly-populated urban areas, high isophorone levels were found to predict lower math scores. Within adjusted regression models of children in highly-populated urban areas, the estimated effect of higher isophorone exposure on math scores was -1.19 points (95% CI -1.94, -0.44). Similar results were observed for the overall population of urban children. 
This data-driven, two-stage approach can be applied to other populations, exposures and outcomes to generate hypotheses within high-dimensional exposure data. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
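The two-stage logic described above can be sketched in miniature: a tree-based stage that flags an exposure profile, followed by a regression stage that estimates its effect size. The sketch below uses synthetic data and a single decision stump in place of the study's 100 regression trees; every name and number is invented for illustration, not taken from the ECLS-B or NATA data.

```python
import random

random.seed(0)

# Synthetic stand-in data: one pollutant exposure in [0, 1) and a math
# score that drops by 5 points for high exposure, plus noise.
n = 200
exposure = [random.random() for _ in range(n)]
score = [50 - 5 * (x > 0.7) + random.gauss(0, 1) for x in exposure]

def mean(v):
    return sum(v) / len(v)

# Stage 1: a single decision stump (a minimal stand-in for an ensemble
# of regression trees) -- find the split that minimises squared error.
def best_split(x, y):
    best_thr, best_sse = None, float("inf")
    for thr in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= thr]
        right = [yi for xi, yi in zip(x, y) if xi > thr]
        ml, mr = mean(left), mean(right)
        sse = sum((yi - ml) ** 2 for yi in left) + \
              sum((yi - mr) ** 2 for yi in right)
        if sse < best_sse:
            best_thr, best_sse = thr, sse
    return best_thr

thr = best_split(exposure, score)

# Stage 2: least-squares estimate of the effect of the flagged
# "high exposure" profile on the score.
high = [1.0 if x > thr else 0.0 for x in exposure]
mh, ms = mean(high), mean(score)
beta = sum((h - mh) * (s - ms) for h, s in zip(high, score)) \
     / sum((h - mh) ** 2 for h in high)
print(round(thr, 2), round(beta, 2))
```

The stump recovers a threshold near the true break at 0.7, and the second-stage slope estimates the score difference attributable to the high-exposure profile; in the study this second stage is where confounders are added.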

  19. Optimizing methods for linking cinematic features to fMRI data.

    PubMed

    Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia

    2015-04-15

    One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than of story-driven films, new methods need to be developed for the analysis of less story-driven contents. To optimize the linkage between our fMRI data collected during viewing of a deliberately non-narrative silent film 'At Land' by Maya Deren (1944) and its annotated content, we combined the method of elastic-net regularization with model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with time-series of a total of 36 binary-valued annotations and one real-valued tactile annotation of film features. Elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of regressors; the results were compared against both the partial least-squares (PLS) regression and the un-regularized full-model regression. A non-parametric permutation testing scheme was applied to evaluate the statistical significance of the regression. We found statistically significant correlation between the annotation model and 9 out of 40 ICs. The regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net based regression more sensitive than PLS and un-regularized regression, since it detected a larger number of significant ICs and ROIs. 
Along with the ISC ranking methods, our regression analysis proved to be a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method lies - in comparison to the hypothesis-driven manual pre-selection and observation of some individual regressors biased by choice - in applying the data-driven approach to all content features simultaneously. We found the combination of regularized regression and ICA especially useful when analyzing fMRI data obtained using a non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
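For reference, the elastic-net estimator combines the lasso (L1) and ridge (L2) penalties in a single least-squares objective; in its standard textbook form (not quoted from the paper) it reads

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\; \|y - X\beta\|_2^2
  \;+\; \lambda \left( \alpha \,\|\beta\|_1 \;+\; \frac{1-\alpha}{2}\,\|\beta\|_2^2 \right)
```

where the overall penalty strength \(\lambda\) is typically chosen by cross-validation, the L1 term induces sparsity among the annotation regressors, and the L2 term stabilises the estimates under the multicollinearity the authors describe.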

  20. The ``Missing Compounds'' affair in functionality-driven material discovery

    NASA Astrophysics Data System (ADS)

    Zunger, Alex

    2014-03-01

    In the paradigm of ``data-driven discovery,'' underlying one of the leading streams of the Materials Genome Initiative (MGI), one attempts to compute, high-throughput style, as many of the properties of as many of the N (about 10**5-10**6) compounds listed in databases of previously known compounds as possible. One then inspects the ensuing Big Data, searching for useful trends. The alternative and complementary paradigm of ``functionality-directed search and optimization'' used here searches instead for the n (much smaller than N) configurations and compositions that have the desired value of the target functionality. Examples include the use of genetic and other search methods that optimize the structure or identity of atoms on lattice sites, using atomistic electronic structure (such as first-principles) approaches in search of a given electronic property. This addresses a few of the bottlenecks that have faced the alternative, data-driven/high-throughput/Big Data philosophy: (i) When the configuration space is theoretically of infinite size, building a complete database as in data-driven discovery is impossible, yet searching for the optimum functionality is still a well-posed problem. (ii) The configuration space that we explore might include artificially grown, kinetically stabilized systems (such as 2D layer stacks; superlattices; colloidal nanostructures; Fullerenes) that are not listed in compound databases (used by data-driven approaches). (iii) A large fraction of chemically plausible compounds have not been experimentally synthesized, so in the data-driven approach these are often skipped. In our approach we search explicitly for such ``Missing Compounds''. It is likely that many interesting material properties will be found in cases (i)-(iii) that elude high-throughput searches based on databases encapsulating existing knowledge. 
I will illustrate (a) functionality-driven discovery of topological insulators and valley-split quantum-computer semiconductors, as well as (b) the use of ``first-principles thermodynamics'' to discern which of the previously ``missing compounds'' should, in fact, exist and in which structure. Synthesis efforts by the Poeppelmeier group at NU realized 20 never-before-made half-Heusler compounds out of the 20 predicted ones, in our predicted space groups. This type of theory-led experimental search for designed materials with target functionalities may shorten the current process of discovery of interesting functional materials. Supported by DOE, Office of Science, Energy Frontier Research Center for Inverse Design.

  1. Phase field model of fluid-driven fracture in elastic media: Immersed-fracture formulation and validation with analytical solutions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Santillán, David; Juanes, Ruben; Cueto-Felgueroso, Luis

    Propagation of fluid-driven fractures plays an important role in natural and engineering processes, including transport of magma in the lithosphere, geologic sequestration of carbon dioxide, and oil and gas recovery from low-permeability formations, among many others. The simulation of fracture propagation poses a computational challenge as a result of the complex physics of fracture and the need to capture disparate length scales. Phase field models represent fractures as a diffuse interface and enjoy the advantage that fracture nucleation, propagation, branching, or twisting can be simulated without ad hoc computational strategies like remeshing or local enrichment of the solution space. Here we propose a new quasi-static phase field formulation for modeling fluid-driven fracturing in elastic media at small strains. The approach fully couples the fluid flow in the fracture (described via the Reynolds lubrication approximation) and the deformation of the surrounding medium. The flow is solved on a lower dimensionality mesh immersed in the elastic medium. This approach leads to accurate coupling of both physics. We assessed the performance of the model extensively by comparing results for the evolution of fracture length, aperture, and fracture fluid pressure against analytical solutions under different fracture propagation regimes. Thus, the excellent performance of the numerical model in all regimes builds confidence in the applicability of phase field approaches to simulate fluid-driven fracture.

  2. Phase field model of fluid-driven fracture in elastic media: Immersed-fracture formulation and validation with analytical solutions

    DOE PAGES

    Santillán, David; Juanes, Ruben; Cueto-Felgueroso, Luis

    2017-04-20

    Propagation of fluid-driven fractures plays an important role in natural and engineering processes, including transport of magma in the lithosphere, geologic sequestration of carbon dioxide, and oil and gas recovery from low-permeability formations, among many others. The simulation of fracture propagation poses a computational challenge as a result of the complex physics of fracture and the need to capture disparate length scales. Phase field models represent fractures as a diffuse interface and enjoy the advantage that fracture nucleation, propagation, branching, or twisting can be simulated without ad hoc computational strategies like remeshing or local enrichment of the solution space. Here we propose a new quasi-static phase field formulation for modeling fluid-driven fracturing in elastic media at small strains. The approach fully couples the fluid flow in the fracture (described via the Reynolds lubrication approximation) and the deformation of the surrounding medium. The flow is solved on a lower dimensionality mesh immersed in the elastic medium. This approach leads to accurate coupling of both physics. We assessed the performance of the model extensively by comparing results for the evolution of fracture length, aperture, and fracture fluid pressure against analytical solutions under different fracture propagation regimes. Thus, the excellent performance of the numerical model in all regimes builds confidence in the applicability of phase field approaches to simulate fluid-driven fracture.

  3. Input variable selection for data-driven models of Coriolis flowmeters for two-phase flow measurement

    NASA Astrophysics Data System (ADS)

    Wang, Lijuan; Yan, Yong; Wang, Xue; Wang, Tao

    2017-03-01

    Input variable selection is an essential step in the development of data-driven models for environmental, biological and industrial applications. Through input variable selection to eliminate the irrelevant or redundant variables, a suitable subset of variables is identified as the input of a model. Meanwhile, through input variable selection the complexity of the model structure is simplified and the computational efficiency is improved. This paper describes the procedures of the input variable selection for the data-driven models for the measurement of liquid mass flowrate and gas volume fraction under two-phase flow conditions using Coriolis flowmeters. Three advanced input variable selection methods, including partial mutual information (PMI), genetic algorithm-artificial neural network (GA-ANN) and tree-based iterative input selection (IIS) are applied in this study. Typical data-driven models incorporating support vector machine (SVM) are established individually based on the input candidates resulting from the selection methods. The validity of the selection outcomes is assessed through an output performance comparison of the SVM based data-driven models and sensitivity analysis. The validation and analysis results suggest that the input variables selected from the PMI algorithm provide more effective information for the models to measure liquid mass flowrate while the IIS algorithm provides a fewer but more effective variables for the models to predict gas volume fraction.
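A minimal, self-contained illustration of mutual-information-based input selection, in the spirit of the PMI step above (though without the "partial" conditioning on already-selected inputs), might look as follows; the variable names and data are invented for the example, not the flowmeter signals used in the paper.

```python
import math
import random

random.seed(1)

# Invented candidate features: "damping" is informative for the target,
# the other two are irrelevant (names are illustrative only).
n = 500
flow = [random.random() for _ in range(n)]            # target, in [0, 1)
damping = [f + random.gauss(0, 0.05) for f in flow]   # informative
temp = [random.random() for _ in range(n)]            # irrelevant
noise = [random.random() for _ in range(n)]           # irrelevant

def mutual_info(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats."""
    def b(v):                                   # clamp value into a bin
        return max(0, min(int(v * bins), bins - 1))
    pxy, px, py = {}, {}, {}
    for xi, yi in zip(x, y):
        i, j = b(xi), b(yi)
        pxy[(i, j)] = pxy.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    # I(X;Y) = sum p(x,y) * log[ p(x,y) / (p(x) p(y)) ]
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

candidates = {"damping": damping, "temp": temp, "noise": noise}
scores = {name: mutual_info(x, flow) for name, x in candidates.items()}
selected = max(scores, key=scores.get)
print(selected)
```

Greedily ranking candidates by mutual information with the target picks out the informative variable; PMI extends this by conditioning each candidate's score on the variables already selected, which is what lets it discard redundant inputs.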

  4. Towards a Job Demands-Resources Health Model: Empirical Testing with Generalizable Indicators of Job Demands, Job Resources, and Comprehensive Health Outcomes.

    PubMed

    Brauchli, Rebecca; Jenny, Gregor J; Füllemann, Désirée; Bauer, Georg F

    2015-01-01

    Studies using the Job Demands-Resources (JD-R) model commonly have a heterogeneous focus concerning the variables they investigate-selective job demands and resources as well as burnout and work engagement. The present study applies the rationale of the JD-R model to expand the relevant outcomes of job demands and job resources by linking the JD-R model to the logic of a generic health development framework predicting more broadly positive and negative health. The resulting JD-R health model was operationalized and tested with a generalizable set of job characteristics and positive and negative health outcomes among a heterogeneous sample of 2,159 employees. Applying a theory-driven and a data-driven approach, measures which were generally relevant for all employees were selected. Results from structural equation modeling indicated that the model fitted the data. Multiple group analyses indicated invariance across six organizations, gender, job positions, and three times of measurement. Initial evidence was found for the validity of an expanded JD-R health model. Thereby this study contributes to the current research on job characteristics and health by combining the core idea of the JD-R model with the broader concepts of salutogenic and pathogenic health development processes as well as both positive and negative health outcomes.

  5. A Team Approach to Data-Driven Decision-Making Literacy Instruction in Preschool Classrooms: Child Assessment and Intervention through Classroom Team Self-Reflection

    ERIC Educational Resources Information Center

    Abbott, Mary; Beecher, Constance; Petersen, Sarah; Greenwood, Charles R.; Atwater, Jane

    2017-01-01

    Many schools around the country are getting positive responses implementing Response to Intervention (RTI) within a Multi-Tiered System of Support (MTSS) framework (e.g., Abbott, 2011; Ball & Trammell, 2011; Buysse & Peisner-Feinberg, 2009). RTI refers to an instructional model that is based on a student's response to instruction. RTI…

  6. A Cohesive Zone Approach for Fatigue-Driven Delamination Analysis in Composite Materials

    NASA Astrophysics Data System (ADS)

    Amiri-Rad, Ahmad; Mashayekhi, Mohammad

    2017-08-01

    A new model for the prediction of fatigue-driven delamination in laminated composites is proposed using cohesive interface elements. The presented model provides a link between the damage evolution rate of the cohesive elements and the crack growth rate of the Paris law. This is beneficial since no additional material parameters are required and the well-known Paris law constants are used. The link between the cohesive zone method and fracture mechanics is achieved without the use of an effective length, which leads to more accurate results. The problem of the unknown failure path in the calculation of the energy release rate is solved by imposing a condition on the damage model which leads to a completely vertical failure path. A global measure of the energy release rate is used for the whole cohesive zone, which is computationally more efficient compared with previous similar models. The performance of the proposed model is investigated by simulating well-known delamination tests and comparing against experimental data from the literature.
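The fracture-mechanics side of that link is the Paris law, which relates the delamination growth rate per cycle to the energy release rate, da/dN = C (G_max / G_c)^m. A toy cycle-by-cycle integration, with invented constants rather than material data from the paper, looks like this:

```python
# Paris-law fatigue crack growth: da/dN = C * (G_max / G_c) ** m.
# All constants are illustrative, not calibrated material data.
C, m = 1e-4, 3.0          # Paris constants (mm/cycle, dimensionless)
G_c = 0.5                 # critical energy release rate (N/mm)

def grow_crack(a0, cycles, load):
    """Integrate crack length a over N cycles, assuming for simplicity
    that the energy release rate scales as G_max ~ load**2 * a."""
    a = a0
    for _ in range(cycles):
        G_max = load ** 2 * a
        if G_max >= G_c:          # static failure threshold reached
            break
        a += C * (G_max / G_c) ** m
    return a

a_final = grow_crack(a0=1.0, cycles=10_000, load=0.5)
print(round(a_final, 3))
```

A cohesive-zone fatigue model like the one above effectively distributes this growth law over the elements in the process zone, which is why tying the element damage rate to the Paris constants removes the need for extra fitted parameters.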

  7. Data-driven medicinal chemistry in the era of big data.

    PubMed

    Lusher, Scott J; McGuire, Ross; van Schaik, René C; Nicholson, C David; de Vlieg, Jacob

    2014-07-01

    Science, and the way we undertake research, is changing. The increasing rate of data generation across all scientific disciplines is providing incredible opportunities for data-driven research, with the potential to transform our current practices. The exploitation of so-called 'big data' will enable us to undertake research projects never previously possible but should also stimulate a re-evaluation of all our data practices. Data-driven medicinal chemistry approaches have the potential to improve decision making in drug discovery projects, providing that all researchers embrace the role of 'data scientist' and uncover the meaningful relationships and patterns in available data. Copyright © 2013 Elsevier Ltd. All rights reserved.

  8. Documentation Driven Development for Complex Real-Time Systems

    DTIC Science & Technology

    2004-12-01

    This paper presents a novel approach for the development of complex real-time systems, called the documentation-driven development (DDD) approach. This... time systems. DDD will also support automated software generation based on a computational model and some relevant techniques. DDD includes two main...stakeholders to be easily involved in development processes and, therefore, significantly improve the agility of software development for complex real

  9. Driven by Data: How Three Districts Are Successfully Using Data, Rather than Gut Feelings, to Align Staff Development with School Needs

    ERIC Educational Resources Information Center

    Gold, Stephanie

    2005-01-01

    The concept of data-driven professional development is both straight-forward and sensible. Implementing this approach is another story, which is why many administrators are turning to sophisticated tools to help manage data collection and analysis. These tools allow educators to assess and correlate student outcomes, instructional methods, and…

  10. Model driven mobile care for patients with type 1 diabetes.

    PubMed

    Skrøvseth, Stein Olav; Arsand, Eirik; Godtliebsen, Fred; Joakimsen, Ragnar M

    2012-01-01

    We gathered a data set from 30 patients with type 1 diabetes by giving the patients a mobile phone application, where they recorded blood glucose measurements, insulin injections, meals, and physical activity. Using these data as a learning data set, we describe a new approach of building a mobile feedback system for these patients based on periodicities, pattern recognition, and scale-space trends. Most patients have important patterns for periodicities and trends, though better resolution of input variables is needed to provide useful feedback using pattern recognition.
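One simple way to surface the periodicities mentioned above is the autocorrelation of a self-monitoring time series; the sketch below recovers a daily cycle from a synthetic glucose-like signal (both the data and the 24-hour pattern are fabricated for illustration, not drawn from the patients' records).

```python
import math

# Synthetic hourly glucose-like series with a 24-hour periodicity.
period = 24                                   # hours
series = [6 + 2 * math.sin(2 * math.pi * t / period) for t in range(240)]

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# Scan candidate lags and keep the one with the strongest correlation.
best_lag = max(range(6, 48), key=lambda k: autocorr(series, k))
print(best_lag)  # -> 24
```

On real self-monitoring data the peak is much less clean, which is why the authors combine periodicity detection with scale-space trend analysis and pattern recognition rather than relying on any single cue.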

  11. Toward a Mixed-Methods Research Approach to Content Analysis in The Digital Age: The Combined Content-Analysis Model and its Applications to Health Care Twitter Feeds.

    PubMed

    Hamad, Eradah O; Savundranayagam, Marie Y; Holmes, Jeffrey D; Kinsella, Elizabeth Anne; Johnson, Andrew M

    2016-03-08

    Twitter's 140-character microblog posts are increasingly used to access information and facilitate discussions among health care professionals and between patients with chronic conditions and their caregivers. Recently, efforts have emerged to investigate the content of health care-related posts on Twitter. This marks a new area for researchers to investigate and apply content analysis (CA). In current infodemiology, infoveillance and digital disease detection research initiatives, quantitative and qualitative Twitter data are often combined, and there are no clear guidelines for researchers to follow when collecting and evaluating Twitter-driven content. The aim of this study was to identify studies on health care and social media that used Twitter feeds as a primary data source and CA as an analysis technique. We evaluated the resulting 18 studies based on a narrative review of previous methodological studies and textbooks to determine the criteria and main features of quantitative and qualitative CA. We then used the key features of CA and mixed-methods research designs to propose the combined content-analysis (CCA) model as a solid research framework for designing, conducting, and evaluating investigations of Twitter-driven content. We conducted a PubMed search to collect studies published between 2010 and 2014 that used CA to analyze health care-related tweets. The PubMed search and reference list checks of selected papers identified 21 papers. We excluded 3 papers and further analyzed 18. Results suggest that the methods used in these studies were not purely quantitative or qualitative, and the mixed-methods design was not explicitly chosen for data collection and analysis. A solid research framework is needed for researchers who intend to analyze Twitter data through the use of CA. 
We propose the CCA model as a useful framework that provides a straightforward approach to guide Twitter-driven studies and that adds rigor to health care social media investigations. We provide suggestions for the use of the CCA model in elder care-related contexts.

  12. Toward a Mixed-Methods Research Approach to Content Analysis in The Digital Age: The Combined Content-Analysis Model and its Applications to Health Care Twitter Feeds

    PubMed Central

    Hamad, Eradah O; Savundranayagam, Marie Y; Holmes, Jeffrey D; Kinsella, Elizabeth Anne

    2016-01-01

    Background Twitter’s 140-character microblog posts are increasingly used to access information and facilitate discussions among health care professionals and between patients with chronic conditions and their caregivers. Recently, efforts have emerged to investigate the content of health care-related posts on Twitter. This marks a new area for researchers to investigate and apply content analysis (CA). In current infodemiology, infoveillance and digital disease detection research initiatives, quantitative and qualitative Twitter data are often combined, and there are no clear guidelines for researchers to follow when collecting and evaluating Twitter-driven content. Objective The aim of this study was to identify studies on health care and social media that used Twitter feeds as a primary data source and CA as an analysis technique. We evaluated the resulting 18 studies based on a narrative review of previous methodological studies and textbooks to determine the criteria and main features of quantitative and qualitative CA. We then used the key features of CA and mixed-methods research designs to propose the combined content-analysis (CCA) model as a solid research framework for designing, conducting, and evaluating investigations of Twitter-driven content. Methods We conducted a PubMed search to collect studies published between 2010 and 2014 that used CA to analyze health care-related tweets. The PubMed search and reference list checks of selected papers identified 21 papers. We excluded 3 papers and further analyzed 18. Results Results suggest that the methods used in these studies were not purely quantitative or qualitative, and the mixed-methods design was not explicitly chosen for data collection and analysis. A solid research framework is needed for researchers who intend to analyze Twitter data through the use of CA. 
Conclusions We propose the CCA model as a useful framework that provides a straightforward approach to guide Twitter-driven studies and that adds rigor to health care social media investigations. We provide suggestions for the use of the CCA model in elder care-related contexts. PMID:26957477

  13. Modeling and executing electronic health records driven phenotyping algorithms using the NQF Quality Data Model and JBoss® Drools Engine.

    PubMed

    Li, Dingcheng; Endle, Cory M; Murthy, Sahana; Stancl, Craig; Suesse, Dale; Sottara, Davide; Huff, Stanley M; Chute, Christopher G; Pathak, Jyotishman

    2012-01-01

    With increasing adoption of electronic health records (EHRs), the need for formal representations for EHR-driven phenotyping algorithms has been recognized for some time. The recently proposed Quality Data Model from the National Quality Forum (NQF) provides an information model and a grammar that is intended to represent data collected during routine clinical care in EHRs as well as the basic logic required to represent the algorithmic criteria for phenotype definitions. The QDM is further aligned with Meaningful Use standards to ensure that the clinical data and algorithmic criteria are represented in a consistent, unambiguous and reproducible manner. However, phenotype definitions represented in QDM, while structured, cannot be executed readily on existing EHRs. Rather, human interpretation, and subsequent implementation is a required step for this process. To address this need, the current study investigates open-source JBoss® Drools rules engine for automatic translation of QDM criteria into rules for execution over EHR data. In particular, using Apache Foundation's Unstructured Information Management Architecture (UIMA) platform, we developed a translator tool for converting QDM defined phenotyping algorithm criteria into executable Drools rules scripts, and demonstrated their execution on real patient data from Mayo Clinic to identify cases for Coronary Artery Disease and Diabetes. To the best of our knowledge, this is the first study illustrating a framework and an approach for executing phenotyping criteria modeled in QDM using the Drools business rules management system.
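The flavour of executing phenotype criteria as rules over patient records can be conveyed with a minimal predicate-based sketch. This is plain Python, not QDM or Drools syntax, and the diagnosis codes, field names, and thresholds are hypothetical:

```python
# Toy patient records (hypothetical fields and values).
patients = [
    {"id": 1, "dx_codes": {"414.01"}, "hba1c": 8.2},
    {"id": 2, "dx_codes": {"250.00"}, "hba1c": 9.1},
    {"id": 3, "dx_codes": set(),      "hba1c": 5.4},
]

# Each "rule" pairs a phenotype label with a when-clause predicate,
# loosely mirroring the when/then structure of production rules.
rules = [
    ("CAD case",      lambda p: "414.01" in p["dx_codes"]),
    ("Diabetes case", lambda p: "250.00" in p["dx_codes"]
                                or p["hba1c"] >= 6.5),
]

# "Fire" every rule against every record and collect the matches.
matches = {name: [p["id"] for p in patients if when(p)]
           for name, when in rules}
print(matches)
```

A rules engine like Drools adds what this sketch lacks, such as efficient pattern matching over large fact bases and declarative rule authoring, which is what makes the automated QDM-to-rules translation worthwhile.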

  14. Modeling and Executing Electronic Health Records Driven Phenotyping Algorithms using the NQF Quality Data Model and JBoss® Drools Engine

    PubMed Central

    Li, Dingcheng; Endle, Cory M; Murthy, Sahana; Stancl, Craig; Suesse, Dale; Sottara, Davide; Huff, Stanley M.; Chute, Christopher G.; Pathak, Jyotishman

    2012-01-01

    With increasing adoption of electronic health records (EHRs), the need for formal representations for EHR-driven phenotyping algorithms has been recognized for some time. The recently proposed Quality Data Model from the National Quality Forum (NQF) provides an information model and a grammar that is intended to represent data collected during routine clinical care in EHRs as well as the basic logic required to represent the algorithmic criteria for phenotype definitions. The QDM is further aligned with Meaningful Use standards to ensure that the clinical data and algorithmic criteria are represented in a consistent, unambiguous and reproducible manner. However, phenotype definitions represented in QDM, while structured, cannot be executed readily on existing EHRs. Rather, human interpretation, and subsequent implementation is a required step for this process. To address this need, the current study investigates open-source JBoss® Drools rules engine for automatic translation of QDM criteria into rules for execution over EHR data. In particular, using Apache Foundation’s Unstructured Information Management Architecture (UIMA) platform, we developed a translator tool for converting QDM defined phenotyping algorithm criteria into executable Drools rules scripts, and demonstrated their execution on real patient data from Mayo Clinic to identify cases for Coronary Artery Disease and Diabetes. To the best of our knowledge, this is the first study illustrating a framework and an approach for executing phenotyping criteria modeled in QDM using the Drools business rules management system. PMID:23304325

  15. Mathematical modeling and computational prediction of cancer drug resistance.

    PubMed

    Sun, Xiaoqiang; Hu, Bin

    2017-06-23

    Diverse forms of resistance to anticancer drugs can lead to the failure of chemotherapy. Drug resistance is one of the most intractable issues for successfully treating cancer in current clinical practice. Effective clinical approaches that could counter drug resistance by restoring the sensitivity of tumors to the targeted agents are urgently needed. As numerous experimental results on resistance mechanisms have been obtained and a mass of high-throughput data has been accumulated, mathematical modeling and computational predictions using systematic and quantitative approaches have become increasingly important, as they can potentially provide deeper insights into resistance mechanisms, generate novel hypotheses or suggest promising treatment strategies for future testing. In this review, we first briefly summarize the current progress of experimentally revealed resistance mechanisms of targeted therapy, including genetic mechanisms, epigenetic mechanisms, posttranslational mechanisms, cellular mechanisms, microenvironmental mechanisms and pharmacokinetic mechanisms. Subsequently, we list several currently available databases and Web-based tools related to drug sensitivity and resistance. Then, we focus primarily on introducing some state-of-the-art computational methods used in drug resistance studies, including mechanism-based mathematical modeling approaches (e.g. molecular dynamics simulation, kinetic model of molecular networks, ordinary differential equation model of cellular dynamics, stochastic model, partial differential equation model, agent-based model, pharmacokinetic-pharmacodynamic model, etc.) and data-driven prediction methods (e.g. omics data-based conventional screening approach for node biomarkers, static network approach for edge biomarkers and module biomarkers, dynamic network approach for dynamic network biomarkers and dynamic module network biomarkers, etc.). 
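To give a flavour of the "ordinary differential equation model of cellular dynamics" category listed above, a minimal two-population sketch (sensitive vs. resistant tumour cells under constant dosing) can be integrated with Euler steps; all parameter values here are invented for illustration.

```python
# dS/dt = (growth_s - kill_s) * S - mutation * S
# dR/dt = (growth_r - kill_r) * R + mutation * S
# Parameters are illustrative, not fitted to any data set.
growth_s, growth_r = 0.05, 0.03   # per-day growth rates
kill_s, kill_r = 0.08, 0.005      # drug-induced death rates
mutation = 1e-4                   # sensitive -> resistant conversion

def simulate(days, dt=0.1):
    S, R = 1e6, 1.0               # initial cell counts
    for _ in range(int(days / dt)):
        dS = (growth_s - kill_s - mutation) * S
        dR = (growth_r - kill_r) * R + mutation * S
        S += dS * dt
        R += dR * dt
    return S, R

S, R = simulate(days=365)
print(f"sensitive: {S:.0f}, resistant: {R:.0f}")
```

Even this caricature reproduces the clinically familiar pattern the review discusses: the drug clears the sensitive population while a small resistant clone, seeded by mutation, eventually dominates the tumour.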
Finally, we discuss several further questions and future directions for the use of computational methods for studying drug resistance, including inferring drug-induced signaling networks, multiscale modeling, drug combinations and precision medicine. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  16. Closed-loop suppression of chaos in nonlinear driven oscillators

    NASA Astrophysics Data System (ADS)

    Aguirre, L. A.; Billings, S. A.

    1995-05-01

    This paper discusses the suppression of chaos in nonlinear driven oscillators via the addition of a periodic perturbation. Given a system originally undergoing chaotic motions, it is desired that such a system be driven to some periodic orbit. This can be achieved by the addition of a weak periodic signal to the oscillator input. This is usually accomplished in open loop, but this procedure presents some difficulties which are discussed in the paper. To ensure that this is attained despite uncertainties and possible disturbances on the system, a procedure is suggested to perform control in closed loop. In addition, it is illustrated how a model, estimated from input/output data, can be used in the design. Numerical examples which use the Duffing-Ueda and modified van der Pol oscillators are included to illustrate some of the properties of the new approach.

  17. Using a Time-Driven Activity-Based Costing Model To Determine the Actual Cost of Services Provided by a Transgenic Core.

    PubMed

    Gerwin, Philip M; Norinsky, Rada M; Tolwani, Ravi J

    2018-03-01

    Laboratory animal programs and core laboratories often set service rates based on cost estimates. However, actual costs may be unknown, and service rates may not reflect the actual cost of services. Accurately evaluating the actual costs of services can be challenging and time-consuming. We used a time-driven activity-based costing (ABC) model to determine the cost of services provided by a resource laboratory at our institution. The time-driven approach is a more efficient approach to calculating costs than using a traditional ABC model. We calculated only 2 parameters: the time required to perform an activity and the unit cost of the activity based on employee cost. This method allowed us to rapidly and accurately calculate the actual cost of services provided, including microinjection of a DNA construct, microinjection of embryonic stem cells, embryo transfer, and in vitro fertilization. We successfully implemented a time-driven ABC model to evaluate the cost of these services and the capacity of labor used to deliver them. We determined how actual costs compared with current service rates. In addition, we determined that the labor supplied to conduct all services (10,645 min/wk) exceeded the practical labor capacity (8400 min/wk), indicating that the laboratory team was highly efficient and that additional labor capacity was needed to prevent overloading of the current team. Importantly, this time-driven ABC approach allowed us to establish a baseline model that can easily be updated to reflect operational changes or changes in labor costs. We demonstrated that a time-driven ABC model is a powerful management tool that can be applied to other core facilities as well as to entire animal programs, providing valuable information that can be used to set rates based on the actual cost of services and to improve operating efficiency.
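    The arithmetic behind a time-driven ABC model is simple enough to sketch: each service costs the sum over its activities of minutes required times the labor cost per minute, where the unit cost is based on practical capacity. The dollar figures and activity times below are hypothetical stand-ins (only the 10,645 and 8,400 min/wk capacity numbers come from the abstract):

```python
# Time-driven activity-based costing: cost of a service is the sum over
# its activities of (minutes required) x (labor cost per minute).

def cost_per_minute(weekly_employee_cost, practical_minutes_per_week):
    """Unit labor cost, based on practical (not theoretical) capacity."""
    return weekly_employee_cost / practical_minutes_per_week

def service_cost(activity_minutes, unit_cost):
    """Total cost of one service from its timed activities."""
    return sum(m * unit_cost for m in activity_minutes)

# Hypothetical team: weekly labor cost and practical capacity (min/wk).
unit = cost_per_minute(weekly_employee_cost=12600.0,
                       practical_minutes_per_week=8400)   # $/min

# Hypothetical embryo-transfer service built from three timed activities.
embryo_transfer = service_cost([30, 45, 15], unit)

# Capacity check: labor committed across all services vs. capacity.
labor_supplied = 10645            # min/wk, from the abstract
overload = labor_supplied - 8400  # positive -> extra capacity needed
print(unit, embryo_transfer, overload)
```

Updating the model for operational changes then amounts to editing the activity times or the employee cost, which is the maintainability advantage the authors highlight.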

  18. Systems Biology-Driven Hypotheses Tested In Vivo: The Need to Advance Molecular Imaging Tools.

    PubMed

    Verma, Garima; Palombo, Alessandro; Grigioni, Mauro; La Monaca, Morena; D'Avenio, Giuseppe

    2018-01-01

    Processing and interpretation of biological images may provide invaluable insights on complex, living systems because images capture the overall dynamics as a "whole." Therefore, "extraction" of key, quantitative morphological parameters could be, at least in principle, helpful in building a reliable systems biology approach to understanding living objects. Molecular imaging tools for systems biology models have attained widespread usage in modern experimental laboratories. Here, we provide an overview of advances in computational technology and different instrumentations focused on molecular image processing and analysis. Quantitative data analysis through various open source software and algorithmic protocols will provide a novel approach for modeling the experimental research program. In addition, we highlight foreseeable future trends regarding methods for automatically analyzing biological data. Such tools will be very useful for understanding the detailed biological and mathematical relationships underlying in-silico systems biology models.

  19. Data-driven mapping of hypoxia-related tumor heterogeneity using DCE-MRI and OE-MRI.

    PubMed

    Featherstone, Adam K; O'Connor, James P B; Little, Ross A; Watson, Yvonne; Cheung, Sue; Babur, Muhammad; Williams, Kaye J; Matthews, Julian C; Parker, Geoff J M

    2018-04-01

    Previous work has shown that combining dynamic contrast-enhanced (DCE)-MRI and oxygen-enhanced (OE)-MRI binary enhancement maps can identify tumor hypoxia. The current work proposes a novel, data-driven method for mapping tissue oxygenation and perfusion heterogeneity, based on clustering DCE/OE-MRI data. DCE-MRI and OE-MRI were performed on nine U87 (glioblastoma) and seven Calu6 (non-small cell lung cancer) murine xenograft tumors. Area under the curve and principal component analysis features were calculated and clustered separately using Gaussian mixture modelling. Evaluation metrics were calculated to determine the optimum feature set and cluster number. Outputs were quantitatively compared with a previous non data-driven approach. The optimum method located six robustly identifiable clusters in the data, yielding tumor region maps with spatially contiguous regions in a rim-core structure, suggesting a biological basis. Mean within-cluster enhancement curves showed physiologically distinct, intuitive kinetics of enhancement. Regions of DCE/OE-MRI enhancement mismatch were located, and voxel categorization agreed well with the previous non data-driven approach (Cohen's kappa = 0.61, proportional agreement = 0.75). The proposed method locates similar regions to the previous published method of binarization of DCE/OE-MRI enhancement, but renders a finer segmentation of intra-tumoral oxygenation and perfusion. This could aid in understanding the tumor microenvironment and its heterogeneity. Magn Reson Med 79:2236-2245, 2018. © 2017 The Authors Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. 
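    The clustering pipeline described above can be sketched as follows. This is a toy reading of the method on synthetic "voxel" curves, not the authors' implementation: area under the curve and principal-component features are clustered with Gaussian mixture models, and the cluster number is chosen by BIC, which is one common criterion (the paper's exact evaluation metrics may differ).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Pretend each row is one voxel's enhancement time course (60 time points),
# drawn from three synthetic tissue classes with different enhancement.
curves = np.vstack([rng.normal(m, 0.3, size=(200, 60))
                    for m in (0.0, 1.0, 2.5)])

# Features: area under the curve plus leading principal components.
auc = curves.sum(axis=1, keepdims=True)
pcs = PCA(n_components=3).fit_transform(curves)
features = np.hstack([auc, pcs])

# Fit GMMs over a range of cluster counts; keep the lowest-BIC model.
models = {k: GaussianMixture(k, random_state=0).fit(features)
          for k in range(1, 7)}
best_k = min(models, key=lambda k: models[k].bic(features))
labels = models[best_k].predict(features)
print(best_k, len(set(labels.tolist())))
```

Mapping `labels` back onto voxel coordinates would then give the kind of tumor region map the paper describes.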

  20. Enabling Data-Driven Methodologies Across the Data Lifecycle and Ecosystem

    NASA Astrophysics Data System (ADS)

    Doyle, R. J.; Crichton, D.

    2017-12-01

    NASA has unlocked unprecedented scientific knowledge through exploration of the Earth, our solar system, and the larger universe. NASA is generating enormous amounts of data that are challenging traditional approaches to capturing, managing, analyzing and ultimately gaining scientific understanding from science data. New architectures, capabilities and methodologies are needed to span the entire observing system, from spacecraft to archive, while integrating data-driven discovery and analytic capabilities. NASA data have a definable lifecycle, from remote collection point to validated accessibility in multiple archives. Data challenges must be addressed across this lifecycle, to capture opportunities and avoid decisions that may limit or compromise what is achievable once data arrives at the archive. Data triage may be necessary when the collection capacity of the sensor or instrument overwhelms data transport or storage capacity. By migrating computational and analytic capability to the point of data collection, informed decisions can be made about which data to keep; in some cases, to close observational decision loops onboard, to enable attending to unexpected or transient phenomena. Along a different dimension than the data lifecycle, scientists and other end-users must work across an increasingly complex data ecosystem, where the range of relevant data is rarely owned by a single institution. To operate effectively, scalable data architectures and community-owned information models become essential. NASA's Planetary Data System is having success with this approach. Finally, there is the difficult challenge of reproducibility and trust. While data provenance techniques will be part of the solution, future interactive analytics environments must support an ability to provide a basis for a result: relevant data source and algorithms, uncertainty tracking, etc., to assure scientific integrity and to enable confident decision making. 
Advances in data science offer opportunities to gain new insights from space missions and their vast data collections. We are working to innovate new architectures, exploit emerging technologies, develop new data-driven methodologies, and transfer them across disciplines, while working across the dual dimensions of the data lifecycle and the data ecosystem.

  1. Performance Analysis of a Ring Current Model Driven by Global MHD

    NASA Astrophysics Data System (ADS)

    Falasca, A.; Keller, K. A.; Fok, M.; Hesse, M.; Gombosi, T.

    2003-12-01

    Effectively modeling the high-energy particles in Earth's inner magnetosphere has the potential to improve safety for both manned and unmanned spacecraft. One model of this environment is the Fok Ring Current Model. This model can utilize as inputs both solar wind data and empirical ionospheric electric field and magnetic field models. Alternatively, we have a procedure which allows the model to be driven by outputs from the BATS-R-US global MHD model. By using in-situ satellite data we will compare the predictive capability of this model in its original stand-alone form to that of the model when driven by the BATS-R-US Global Magnetosphere Model. As a basis for comparison we use the April 2002 and May 2003 storms, where suitable LANL geosynchronous data are available.

  2. Relational machine learning for electronic health record-driven phenotyping.

    PubMed

    Peissig, Peggy L; Santos Costa, Vitor; Caldwell, Michael D; Rottscheit, Carla; Berg, Richard L; Mendonca, Eneida A; Page, David

    2014-12-01

    Electronic health records (EHR) offer medical and pharmacogenomics research unprecedented opportunities to identify and classify patients at risk. EHRs are collections of highly inter-dependent records that include biological, anatomical, physiological, and behavioral observations. They comprise a patient's clinical phenome, where each patient has thousands of date-stamped records distributed across many relational tables. Development of EHR computer-based phenotyping algorithms requires time and medical insight from clinical experts, who most often can only review a small patient subset representative of the total EHR records, to identify phenotype features. In this research, we evaluate whether relational machine learning (ML) using inductive logic programming (ILP) can contribute to addressing these issues as a viable approach for EHR-based phenotyping. Two relational learning ILP approaches and three well-known WEKA (Waikato Environment for Knowledge Analysis) implementations of non-relational approaches (PART, J48, and JRIP) were used to develop models for nine phenotypes. International Classification of Diseases, Ninth Revision (ICD-9) coded EHR data were used to select training cohorts for the development of each phenotypic model. Accuracy, precision, recall, F-measure, and area under the receiver operating characteristic curve (AUROC) statistics were measured for each phenotypic model based on independent manually verified test cohorts. A two-sided binomial distribution test (sign test) compared the five ML approaches across phenotypes for statistical significance. We developed an approach to automatically label training examples using ICD-9 diagnosis codes for the ML approaches being evaluated. Nine phenotypic models for each ML approach were evaluated, resulting in better overall model performance in AUROC using ILP when compared to PART (p=0.039), J48 (p=0.003) and JRIP (p=0.003).
ILP has the potential to improve phenotyping by independently delivering clinically expert interpretable rules for phenotype definitions, or intuitive phenotypes to assist experts. Relational learning using ILP offers a viable approach to EHR-driven phenotyping. Copyright © 2014 Elsevier Inc. All rights reserved.
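    The two-sided sign test used here to compare approaches across the nine phenotypes can be sketched directly. The AUROC values below are invented for illustration, not the paper's numbers; note that 8 wins out of 9 happens to reproduce a two-sided p-value of 0.039, the figure reported for the ILP-vs-PART comparison.

```python
# Sign-test comparison of two classifiers over nine phenotypes: count how
# often method A's AUROC beats method B's, then apply a two-sided
# binomial test under the null of a 50/50 chance of winning each time.
from scipy.stats import binomtest

auroc_ilp = [0.91, 0.88, 0.93, 0.90, 0.85, 0.89, 0.92, 0.87, 0.94]
auroc_b   = [0.84, 0.86, 0.88, 0.85, 0.86, 0.83, 0.87, 0.82, 0.88]

wins = sum(a > b for a, b in zip(auroc_ilp, auroc_b))
n = len(auroc_ilp)
p = binomtest(wins, n, p=0.5, alternative='two-sided').pvalue
print(wins, n, round(p, 3))
```

The sign test only uses win/loss directions, which is why it is a natural choice when only a handful of paired performance numbers are available.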

  3. Mining residential water and electricity demand data in Southern California to inform demand management strategies

    NASA Astrophysics Data System (ADS)

    Cominola, A.; Spang, E. S.; Giuliani, M.; Castelletti, A.; Loge, F. J.; Lund, J. R.

    2016-12-01

    Demand side management strategies are key to meet future water and energy demands in urban contexts, promote water and energy efficiency in the residential sector, provide customized services and communications to consumers, and reduce utilities' costs. Smart metering technologies allow gathering high temporal and spatial resolution water and energy consumption data and support the development of data-driven models of consumers' behavior. Modelling and predicting resource consumption behavior is essential to inform demand management. Yet, analyzing big, smart-metered databases requires proper data mining and modelling techniques, in order to extract useful information supporting decision makers to spot end uses towards which water and energy efficiency or conservation efforts should be prioritized. In this study, we consider the following research questions: (i) how is it possible to extract representative consumers' personalities out of big smart metered water and energy data? (ii) are residential water and energy consumption profiles interconnected? (iii) can we design customized water and energy demand management strategies based on the knowledge of water-energy demand profiles and other user-specific psychographic information? To address the above research questions, we contribute a data-driven approach to identify and model routines in water and energy consumers' behavior. We propose a novel customer segmentation procedure based on data-mining techniques. Our procedure consists of three steps: (i) extraction of typical water-energy consumption profiles for each household, (ii) profiles clustering based on their similarity, and (iii) evaluation of the influence of candidate explanatory variables on the identified clusters. The approach is tested on a dataset of smart metered water and energy consumption data from over 1000 households in Southern California. 
Our methodology allows identifying heterogeneous groups of consumers from the studied sample, as well as characterizing them with respect to consumption profile features and socio-demographic information. Results show how such better understanding of the considered users' community allows spotting potentially interesting areas for water and energy demand management interventions.
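    The three-step segmentation procedure can be sketched on synthetic data (the California dataset is not public here, and the specific clustering algorithm is an assumption; k-means stands in for whatever similarity-based clustering the authors used):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_households, n_days = 100, 30
hours = np.arange(24)
# Two synthetic behavior types: morning-peaking vs. evening-peaking use.
base_morning = np.exp(-0.5 * ((hours - 7) / 2.0) ** 2)
base_evening = np.exp(-0.5 * ((hours - 19) / 2.0) ** 2)

# Step (i): average each household's hourly readings into one typical
# 24-h profile.
group = rng.integers(0, 2, n_households)        # hidden behavior type
readings = np.array([
    np.where(g, base_evening, base_morning)
    + rng.normal(0, 0.05, (n_days, 24))
    for g in group
])
profiles = readings.mean(axis=1)                # (households, 24)

# Step (ii): cluster the typical profiles by similarity.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)

# Step (iii): evaluate a candidate explanatory variable against the
# clusters; here the hidden type plays that role, so agreement is high.
agree = max((km.labels_ == group).mean(), (km.labels_ != group).mean())
print(round(agree, 2))
```

In the real procedure, step (iii) would cross-tabulate clusters against psychographic and socio-demographic variables rather than a known label.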

  4. Design and Data in Balance: Using Design-Driven Decision Making to Enable Student Success

    ERIC Educational Resources Information Center

    Fairchild, Susan; Farrell, Timothy; Gunton, Brad; Mackinnon, Anne; McNamara, Christina; Trachtman, Roberta

    2014-01-01

    Data-driven approaches to school decision making have come into widespread use in the past decade, nationally and in New York City. New Visions has been at the forefront of those developments: in New Visions schools, teacher teams and school teams regularly examine student performance data to understand patterns and drive classroom- and…

  5. Data-based adjoint and H2 optimal control of the Ginzburg-Landau equation

    NASA Astrophysics Data System (ADS)

    Banks, Michael; Bodony, Daniel

    2017-11-01

    Equation-free, reduced-order methods of control are desirable when the governing system of interest is of very high dimension or the control is to be applied to a physical experiment. Two-phase flow optimal control problems, our target application, fit these criteria. Dynamic Mode Decomposition (DMD) is a data-driven method for model reduction that can be used to resolve the dynamics of very high dimensional systems and project the dynamics onto a smaller, more manageable basis. We evaluate the effectiveness of DMD-based forward and adjoint operator estimation when applied to H2 optimal control approaches for the linear and nonlinear Ginzburg-Landau equation. Perspectives on applying the data-driven adjoint to two-phase flow control will be given. Supported by the Office of Naval Research (ONR) as part of the Multidisciplinary University Research Initiatives (MURI) Program, under Grant Number N00014-16-1-2617.
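    The core DMD step the abstract relies on can be shown on a toy system. This is the standard exact-DMD recipe (not the authors' code): from snapshot pairs satisfying x_{k+1} ≈ A x_k, a truncated SVD of the first snapshot matrix yields a reduced operator whose eigenvalues approximate those of A.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy linear system: two oscillating, slowly decaying modes observed in
# a 10-dimensional state.
eigs_true = np.array([0.95 * np.exp(1j * 0.3), 0.95 * np.exp(-1j * 0.3)])
modes = rng.normal(size=(10, 2)) + 1j * rng.normal(size=(10, 2))
x0 = np.array([1.0 + 0j, 1.0 - 0j])
snapshots = np.array([modes @ (eigs_true ** k * x0)
                      for k in range(50)]).T          # (10, 50)

# Exact DMD: split snapshots, truncate the SVD, form the reduced operator
# Atilde = U* Y V S^{-1}, and take its eigenvalues.
X, Y = snapshots[:, :-1], snapshots[:, 1:]
U, s, Vh = np.linalg.svd(X, full_matrices=False)
r = 2                                                  # truncation rank
U, s, Vh = U[:, :r], s[:r], Vh[:r, :]
Atilde = U.conj().T @ Y @ Vh.conj().T @ np.diag(1.0 / s)

eigs_dmd = np.linalg.eigvals(Atilde)
eigs_dmd = eigs_dmd[np.argsort(eigs_dmd.imag)]
eigs_ref = eigs_true[np.argsort(eigs_true.imag)]
print(np.round(eigs_dmd, 4), np.round(eigs_ref, 4))
```

For noise-free linear data the reduced operator recovers the true eigenvalues to machine precision; the adjoint estimate in the paper works with the (conjugate) transpose of the same reduced operator.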

  6. Metadata-Driven SOA-Based Application for Facilitation of Real-Time Data Warehousing

    NASA Astrophysics Data System (ADS)

    Pintar, Damir; Vranić, Mihaela; Skočir, Zoran

    Service-oriented architecture (SOA) has already been widely recognized as an effective paradigm for achieving integration of diverse information systems. SOA-based applications can cross boundaries of platforms, operating systems and proprietary data standards, commonly through the usage of Web Services technology. On the other hand, metadata is also commonly referred to as a potential integration tool, given the fact that standardized metadata objects can provide useful information about the specifics of unknown information systems with which one has an interest in communicating, using an approach commonly called "model-based integration". This paper presents the result of research regarding possible synergy between those two integration facilitators. This is accomplished with a vertical example of a metadata-driven SOA-based business process that provides ETL (Extraction, Transformation and Loading) and metadata services to a data warehousing system in need of real-time ETL support.

  7. Modelling heterogeneity variances in multiple treatment comparison meta-analysis--are informative priors the better solution?

    PubMed

    Thorlund, Kristian; Thabane, Lehana; Mills, Edward J

    2013-01-11

    Multiple treatment comparison (MTC) meta-analyses are commonly modeled in a Bayesian framework, and weakly informative priors are typically preferred to mirror familiar data-driven frequentist approaches. Random-effects MTCs have commonly modeled heterogeneity under the assumption that the between-trial variances for all involved treatment comparisons are equal (i.e., the 'common variance' assumption). This approach 'borrows strength' for heterogeneity estimation across treatment comparisons, and thus adds valuable precision when data are sparse. The homogeneous variance assumption, however, is unrealistic and can severely bias variance estimates. Consequently, 95% credible intervals may not retain nominal coverage, and treatment rank probabilities may become distorted. Relaxing the homogeneous variance assumption may be equally problematic due to reduced precision. To regain good precision, moderately informative variance priors or additional mathematical assumptions may be necessary. In this paper we describe four novel approaches to modeling heterogeneity variance - two novel model structures, and two approaches for use of moderately informative variance priors. We examine the relative performance of all approaches in two illustrative MTC data sets. We particularly compare between-study heterogeneity estimates and model fits, treatment effect estimates and 95% credible intervals, and treatment rank probabilities. In both data sets, use of moderately informative variance priors constructed from the pairwise meta-analysis data yielded the best model fit and narrower credible intervals. Imposing consistency equations on variance estimates, assuming variances to be exchangeable, or using empirically informed variance priors also yielded good model fits and narrow credible intervals. The homogeneous variance model yielded high precision at all times, but overall inadequate estimates of between-trial variances. 
Lastly, treatment rankings were similar among the novel approaches, but considerably different when compared with the homogeneous variance approach. MTC models using a homogeneous variance structure appear to perform sub-optimally when between-trial variances vary between comparisons. Using informative variance priors, assuming exchangeability, or imposing consistency between heterogeneity variances can all ensure sufficiently reliable and realistic heterogeneity estimation, and thus more reliable MTC inferences. All four approaches should be viable candidates for replacing or supplementing the conventional homogeneous variance MTC model, which is currently the most widely used in practice.

  8. Predicting the trajectories and intensities of hurricanes by applying machine learning techniques

    NASA Astrophysics Data System (ADS)

    Sujithkumar, A.; King, A. W.; Kovilakam, M.; Graves, D.

    2017-12-01

    The world has witnessed an escalation of devastating hurricanes and tropical cyclones over the last three decades. Hurricanes and tropical cyclones of very high magnitude will likely be even more frequent in a warmer world. Thus, precise forecasting of the track and intensity of hurricanes/tropical cyclones remains one of the meteorological community's top priorities. However, comprehensive prediction of hurricanes/tropical cyclones is a difficult problem due to the many complexities of the underlying physical processes, with many variables and complex relations. The availability of global meteorological and hurricane/tropical storm climatological data opens new opportunities for data-driven approaches to hurricane/tropical cyclone modeling. Here we report initial results from two data-driven machine learning techniques, specifically, random forest (RF) and Bayesian learning (BL), to predict the trajectory and intensity of hurricanes and tropical cyclones. We used International Best Track Archive for Climate Stewardship (IBTrACS) data along with weather data from NOAA in a 50 km buffer surrounding each of the reported hurricane and tropical cyclone tracks to train the model. Initial results reveal that both RF and BL are skillful in predicting storm intensity. We will also present results for the more complicated trajectory prediction.
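    A random-forest intensity model of the kind described can be sketched on synthetic data (the predictors, their ranges, and the toy intensity relation below are invented for illustration; they are not the IBTrACS/NOAA features the authors used):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
sst = rng.uniform(24, 31, n)       # sea-surface temperature, degC
shear = rng.uniform(0, 25, n)      # vertical wind shear, m/s
rh = rng.uniform(40, 90, n)        # mid-level relative humidity, %
# Toy "intensity": warmer water and low shear favor stronger storms.
intensity = (3.5 * (sst - 24) - 1.2 * shear + 0.2 * rh
             + rng.normal(0, 2, n))

X = np.column_stack([sst, shear, rh])
X_tr, X_te, y_tr, y_te = train_test_split(X, intensity, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
r2 = rf.score(X_te, y_te)          # R^2 on held-out storms
print(round(r2, 2))
```

Trajectory prediction is harder because it is a sequential, two-dimensional target; one common extension is to predict latitude/longitude displacements per time step and roll them forward.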

  9. Value Driven Information Processing and Fusion

    DTIC Science & Technology

    2016-03-01

    The objective of the project is to develop a general framework for value driven decentralized information processing, including: optimal data reduction in a network setting for decentralized inference with quantization constraint; and interactive fusion that allows queries. A consensus approach allows a decentralized approach to achieve the optimal error exponent of the centralized counterpart, a conclusion that is significant.

  10. Data-driven heterogeneity in mathematical learning disabilities based on the triple code model.

    PubMed

    Peake, Christian; Jiménez, Juan E; Rodríguez, Cristina

    2017-12-01

    Many classifications of heterogeneity in mathematical learning disabilities (MLD) have been proposed over the past four decades; however, no empirical research was conducted until recently, and none of the classifications is derived from Triple Code Model (TCM) postulates. The TCM proposes MLD as a heterogeneous disorder, with two distinguishable profiles: a representational subtype and a verbal subtype. A sample of elementary school 3rd to 6th graders was divided into two age cohorts (3rd-4th grades and 5th-6th grades). Using data-driven strategies, based on the cognitive classification variables predicted by the TCM, our sample of children with MLD clustered as expected: a group with representational deficits and a group with number-fact retrieval deficits. In the younger group, a spatial subtype also emerged, while in both cohorts a non-specific cluster was produced whose profile could not be explained by this theoretical approach. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. A co-clinical approach identifies mechanisms and potential therapies for androgen deprivation resistance in prostate cancer.

    PubMed

    Lunardi, Andrea; Ala, Ugo; Epping, Mirjam T; Salmena, Leonardo; Clohessy, John G; Webster, Kaitlyn A; Wang, Guocan; Mazzucchelli, Roberta; Bianconi, Maristella; Stack, Edward C; Lis, Rosina; Patnaik, Akash; Cantley, Lewis C; Bubley, Glenn; Cordon-Cardo, Carlos; Gerald, William L; Montironi, Rodolfo; Signoretti, Sabina; Loda, Massimo; Nardella, Caterina; Pandolfi, Pier Paolo

    2013-07-01

    Here we report an integrated analysis that leverages data from treatment of genetic mouse models of prostate cancer along with clinical data from patients to elucidate new mechanisms of castration resistance. We show that castration counteracts tumor progression in a Pten loss-driven mouse model of prostate cancer through the induction of apoptosis and proliferation block. Conversely, this response is bypassed with deletion of either Trp53 or Zbtb7a together with Pten, leading to the development of castration-resistant prostate cancer (CRPC). Mechanistically, the integrated acquisition of data from mouse models and patients identifies the expression patterns of XAF1, XIAP and SRD5A1 as a predictive and actionable signature for CRPC. Notably, we show that combined inhibition of XIAP, SRD5A1 and AR pathways overcomes castration resistance. Thus, our co-clinical approach facilitates the stratification of patients and the development of tailored and innovative therapeutic treatments.

  12. Comparison between collective coordinate models for domain wall motion in PMA nanostrips in the presence of the Dzyaloshinskii-Moriya interaction

    NASA Astrophysics Data System (ADS)

    Vandermeulen, J.; Nasseri, S. A.; Van de Wiele, B.; Durin, G.; Van Waeyenberge, B.; Dupré, L.

    2018-03-01

    Lagrangian-based collective coordinate models for magnetic domain wall (DW) motion rely on an ansatz for the DW profile and a Lagrangian approach to describe the DW motion in terms of a set of time-dependent collective coordinates: the DW position, the DW magnetization angle, the DW width and the DW tilting angle. Another approach was recently used to derive similar equations of motion by averaging the Landau-Lifshitz-Gilbert equation without any ansatz, and identifying the relevant collective coordinates afterwards. In this paper, we use an updated version of the semi-analytical equations to compare the Lagrangian-based collective coordinate models with micromagnetic simulations for field- and STT-driven (spin-transfer torque-driven) DW motion in Pt/CoFe/MgO and Pt/Co/AlOx nanostrips. Through this comparison, we assess the accuracy of the different models, and provide insight into the deviations of the models from simulations. It is found that the lack of terms related to DW asymmetry in the Lagrangian-based collective coordinate models significantly contributes to the discrepancy between the predictions of the most accurate Lagrangian-based model and the micromagnetic simulations in the field-driven case. This is in contrast to the STT-driven case where the DW remains symmetric.

  13. Non-linear Interactions between Niño region 3 and the Southern Amazon

    NASA Astrophysics Data System (ADS)

    Ramos, A. M. D. T.; Builes-Jaramillo, L. A.; Poveda, G.; Goswami, B.; Macau, E. E. N.; Kurths, J.; Marwan, N.

    2016-12-01

    Identifying causal relations from observational data poses great challenges in data-driven inference studies. However, the complex systems framework offers promising approaches to tackle such problems. Here we propose a new data-driven causality inference method using the framework of recurrence plots. We present the Recurrence Measure of Conditional Dependence (RMCD) and its applications. The RMCD incorporates recurrence behavior into transfer entropy theory: it quantifies the causal dependence between two processes based on joint recurrence patterns between the past of the potential driver and the present of the potentially driven process, excluding any contribution already contained in the past of the driven process. We apply this methodology to some paradigmatic models and investigate the possible influence of Pacific Ocean temperatures on the South West Amazon during the 2005 and 2010 droughts. The results reveal no significant signal of dependence from the Pacific Ocean for the 2005 drought, and a signal of dependence of around 200 days for 2010. These outcomes are confirmed by the traditional climatological analyses of these episodes available in the literature and show the accuracy of RMCD in inferring causal relations in climate systems.
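    The recurrence-plot machinery RMCD builds on can be illustrated in a few lines. This is a toy sketch (scalar series, arbitrary threshold, no embedding or conditioning), not the authors' RMCD implementation: a recurrence matrix marks time pairs whose states are closer than a threshold, and the joint recurrence of a driver with a lag-coupled copy is much higher than with an unrelated series.

```python
import numpy as np

def recurrence_matrix(x, eps):
    """R[i, j] = 1 when |x_i - x_j| < eps, for a scalar time series."""
    d = np.abs(x[:, None] - x[None, :])
    return (d < eps).astype(int)

rng = np.random.default_rng(0)
t = np.arange(200)
driver = np.sin(0.2 * t)
driven = np.sin(0.2 * (t - 5)) + 0.05 * rng.normal(size=t.size)  # lagged
indep = rng.normal(size=t.size)                                  # unrelated

R_driver = recurrence_matrix(driver, eps=0.2)
R_driven = recurrence_matrix(driven, eps=0.2)
R_indep = recurrence_matrix(indep, eps=0.2)

# Joint recurrence rate: fraction of time pairs recurring in both systems.
jrr_coupled = (R_driver * R_driven).mean()
jrr_uncoupled = (R_driver * R_indep).mean()
print(round(jrr_coupled, 3), round(jrr_uncoupled, 3))
```

RMCD goes further by conditioning these joint recurrence patterns on the past of the driven process, in the spirit of transfer entropy.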

  14. The influence of data-driven versus conceptually-driven processing on the development of PTSD-like symptoms.

    PubMed

    Kindt, Merel; van den Hout, Marcel; Arntz, Arnoud; Drost, Jolijn

    2008-12-01

    Ehlers and Clark [(2000). A cognitive model of posttraumatic stress disorder. Behaviour Research and Therapy, 38, 319-345] propose that a predominance of data-driven processing during the trauma predicts subsequent PTSD. We wondered whether, apart from data-driven encoding, sustained data-driven processing after the trauma is also crucial for the development of PTSD. Both hypotheses were tested in two analogue experiments. Experiment 1 demonstrated that, relative to conceptually-driven processing (n=20), data-driven processing after the film (n=14) resulted in more intrusions. Experiment 2 demonstrated that, relative to the neutral condition (n=24) and the data-driven encoding condition (n=24), conceptual encoding (n=25) reduced suppression of intrusions, and a trend emerged for memory fragmentation. The difference between the two encoding styles was due to the beneficial effect of induced conceptual encoding and not to the detrimental effect of data-driven encoding. The data support the viability of the distinction between data-driven and conceptually-driven processing for understanding the development of PTSD.

  15. Data-driven local-scale modeling of ionospheric responses to auroral forcing using incoherent scatter radar and ground-based imaging measurements

    NASA Astrophysics Data System (ADS)

    Grubbs, G. A., II; Zettergren, M. D.; Samara, M.; Michell, R.; Hampton, D. L.; Lynch, K. A.; Varney, R. H.; Reimer, A.; Burleigh, M.

    2017-12-01

    The aurora encapsulates a wide range of spatial and temporal scale sizes, particularly during active events such as those that exist during substorm expansion. Of interest to the present work are ionospheric responses to magnetospheric forcing at relatively small scales (0.5-20 km), including formation of structured auroral arc current systems, ion frictional heating, upflow, and density cavity formation, among other processes. Even for carefully arranged experiments, it is often difficult to fully assess physical details (time evolution, causality, unobservable parameters) associated with these types of responses, thus highlighting the general need for high-resolution modeling efforts to support the observations. In this work, we develop and test a local-scale model to describe effects of precipitating electrons and electric fields on the ionospheric plasma responses using available remote sensing data (e.g. from ISRs and filtered cameras). Our model is based on a 3D multi-fluid/electrostatic ionospheric model, GEMINI (Zettergren et al., 2015), coupled to a two-stream electron transport code, GLobal airglOW (GLOW; Solomon, 2017), which produces auroral intensities, impact ionization, and thermal electron heating. GEMINI-GLOW thus describes both thermal and suprathermal effects on the ionosphere and is driven by boundary conditions consisting of topside ionospheric field-aligned currents and suprathermal electrons. These boundary conditions are constrained using time- and space-dependent electric field and precipitation estimates from recent sounding rocket campaigns, ISINGLASS (02 March 2017) and GREECE (03 March 2014), derived from the Poker Flat incoherent scatter radar (PFISR) drifts and filtered EMCCD cameras, respectively. Results from these data-driven case studies are compared to plasma parameter responses (i.e. density and temperature) independently estimated by PFISR and from the sounding rockets. 
These studies are intended as a first step towards a local-scale assimilative modeling approach where data-derived information will be fed back into the model to update the system state.

  16. Forecasting wetting and drying of post-wildfire soils in response to precipitation: A time series optimization approach

    NASA Astrophysics Data System (ADS)

    Basak, A.; Kulkarni, C.; Schmidt, K. M.; Mengshoel, O. J.

    2015-12-01

    Volumetric water content (VWC) in soils is critical for forecasting thresholds for runoff-driven erosion caused by rainfall. Even though theoretical relations (e.g., the Richards equation) have been developed to quantify VWC in unsaturated granular soils, site-specific field conditions and hysteresis of suction and VWC in soil preclude their direct use. Although attempts have previously been made to forecast VWC using various time-series models (e.g., autoregressive integrated moving average, or ARIMA), these approaches lack hydrologic foundations and perform poorly when used to forecast VWC over time periods longer than 24 hours. In this work, we extend an existing Antecedent Water Index (AWI) based model to express VWC as a function of time and rainfall. AWI models typically overfit data and cannot be used to forecast VWC over long time periods. We developed a new model to overcome this limitation, which accumulates rainfall over a time window and fits a diverse range of wetting and drying curves. Hydraulic redistribution parameters in this model bear resemblance to hydrologic processes driven by gravity and suction. This model reasonably forecasts VWC using only initial VWC values and rainfall forecasts. Experimental VWC data were collected from steep-gradient post-wildfire sites in southern California. Rapid landscape change was observed in response to small to moderate rain storms. We formulated a mean-squared error minimization problem over the model parameters and optimized it using genetic algorithms. We found that our model fits VWC data for 3 distinct soil textures, each occurring at 3 different depths below the ground surface (5 cm, 15 cm, and 30 cm). Our model successfully forecasts VWC trends, such as drying and wetting rates. To a certain extent, our model achieves spatial and seasonal generalizability.
Our accumulative rainfall model is also applicable to continuous predictions, where VWC values are repeatedly used to predict future ones within a 12-hr time frame.
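    The optimization step described in this record can be sketched as follows. This is a toy illustration only: the model form `simulate_vwc`, the parameters `a` (wetting gain) and `b` (drying rate), and all numbers are invented stand-ins for the paper's actual accumulative rainfall model, fitted here with a minimal genetic algorithm minimizing mean-squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_vwc(theta0, rain, a, b, theta_r=0.05, window=6):
    """Toy accumulative-rainfall VWC model (hypothetical form):
    wetting scales with rain accumulated over `window` steps,
    drying relaxes toward a residual moisture theta_r."""
    theta = np.empty(len(rain) + 1)
    theta[0] = theta0
    for t in range(len(rain)):
        acc = rain[max(0, t - window + 1):t + 1].sum()
        theta[t + 1] = theta[t] + a * acc - b * (theta[t] - theta_r)
    return theta

# Synthetic "observed" series generated with known parameters.
rain = rng.random(200) * (rng.random(200) < 0.2)
obs = simulate_vwc(0.10, rain, a=0.02, b=0.05)

def mse(params):
    a, b = params
    return np.mean((simulate_vwc(0.10, rain, a, b) - obs) ** 2)

# Minimal genetic algorithm: truncation selection, blend crossover,
# Gaussian mutation.
pop = rng.uniform([0.0, 0.0], [0.1, 0.2], size=(40, 2))
for gen in range(60):
    fitness = np.array([mse(p) for p in pop])
    parents = pop[np.argsort(fitness)[:10]]           # keep the best 10
    i = rng.integers(0, 10, size=(40, 2))
    w = rng.random((40, 1))
    children = w * parents[i[:, 0]] + (1 - w) * parents[i[:, 1]]
    children += rng.normal(0, 0.005, children.shape)  # mutation
    pop = np.clip(children, 0, None)

best = pop[np.argmin([mse(p) for p in pop])]
```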

  17. Framework for a clinical information system.

    PubMed

    Van de Velde, R

    2000-01-01

    The current status of our work towards the design and implementation of a reference architecture for a Clinical Information System is presented. This architecture has been developed and implemented as a set of components underpinned by a strong conceptual and technological model. It uses Common Object Request Broker Architecture (CORBA) and n-tier technology, with centralised and departmental clinical information systems as the back-end store for all clinical data. Servers located in the 'middle' tier apply the clinical (business) model and application rules to communicate with so-called 'thin client' workstations. The main characteristics are the focus on modelling and reuse of both data and business logic, reflecting a shift away from data and functional modelling towards object modelling. Scalability, as well as adaptability to constantly changing requirements via component-driven computing, are the main reasons for this approach.

  18. Comparison of empirical and data driven hydrometeorological hazard models on coastal cities of São Paulo, Brazil

    NASA Astrophysics Data System (ADS)

    Koga-Vicente, A.; Friedel, M. J.

    2010-12-01

    Every year thousands of people are affected by flood and landslide hazards caused by rainstorms. The problem is more serious in tropical developing countries because of high susceptibility, driven by the large amount of energy available to form storms, and high vulnerability due to poor economic and social conditions. Predictive hazard models are important tools for managing this kind of risk. In this study, two different modeling approaches for predicting hydrometeorological hazards were compared for 12 cities on the coast of São Paulo, Brazil, from 1994 to 2003. The first approach used an empirical multiple linear regression (MLR) model; the second used a type of unsupervised nonlinear artificial neural network called a self-organizing map (SOM). Binary hazard responses were obtained from twenty-three independent variables describing susceptibility (precipitation, soil type, slope, elevation, and regional atmospheric system scale) and vulnerability (population distribution and total population, income and educational characteristics, poverty intensity, and human development index). Model performance assessed by cross-validation indicated that the MLR and SOM models were about 67% and 80% accurate, respectively. Prediction accuracy can be improved by the addition of information, but the SOM approach is preferred because of sparse data and highly nonlinear relations among the independent variables.
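    A self-organizing map of the kind used in the second approach can be sketched in a few lines. The grid size, training schedule, and the synthetic two-cluster "susceptibility" data below are illustrative assumptions, not the study's twenty-three real predictors.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_som(X, grid=(6, 6), epochs=30, lr0=0.5, sigma0=2.0):
    """Minimal self-organizing map on a 2-D neuron grid (sketch only)."""
    n_units = grid[0] * grid[1]
    # Grid coordinates of each neuron, used for the neighbourhood kernel.
    gy, gx = np.divmod(np.arange(n_units), grid[1])
    coords = np.column_stack([gy, gx]).astype(float)
    W = rng.random((n_units, X.shape[1]))
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)            # decaying learning rate
        sigma = sigma0 * (1 - epoch / epochs) + 0.5
        for x in X[rng.permutation(len(X))]:
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))  # best matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2 * sigma ** 2))           # neighbourhood
            W += lr * h[:, None] * (x - W)
    return W

# Two synthetic clusters standing in for hazard / no-hazard conditions
# (feature values are placeholders, not the study's real predictors).
X = np.vstack([rng.normal(0.2, 0.05, (100, 4)),
               rng.normal(0.8, 0.05, (100, 4))])
W = train_som(X)
bmus = np.array([np.argmin(((W - x) ** 2).sum(axis=1)) for x in X])
```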

  19. A Novel Online Data-Driven Algorithm for Detecting UAV Navigation Sensor Faults.

    PubMed

    Sun, Rui; Cheng, Qi; Wang, Guanyu; Ochieng, Washington Yotto

    2017-09-29

    The use of Unmanned Aerial Vehicles (UAVs) has increased significantly in recent years. On-board integrated navigation sensors are a key component of UAVs' flight control systems and are essential for flight safety, so timely and effective navigation sensor fault detection capability is required. In this paper, a novel data-driven Adaptive Neuro-Fuzzy Inference System (ANFIS)-based approach is presented for the detection of on-board navigation sensor faults in UAVs. Unlike classic UAV sensor fault detection algorithms, which are based on predefined or modelled faults, the proposed algorithm combines an online data training mechanism with an ANFIS-based decision system. The main advantage of this algorithm is that it couples real-time, model-free residual analysis of Kalman Filter (KF) estimates with the ANFIS to build a reliable fault detection system. In addition, it allows fast and accurate detection of faults, which makes it suitable for real-time applications. Experimental results demonstrate the effectiveness of the proposed fault detection method in terms of accuracy and misdetection rate.
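    The residual-analysis half of such a scheme can be illustrated with a plain Kalman filter innovation test. This sketch replaces the paper's ANFIS decision stage with a simple threshold on the normalized innovation; the constant-velocity sensor model and the fault magnitude are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated altitude sensor: constant-velocity truth, Gaussian noise,
# and an injected bias fault from sample 120 onward.
n = 200
truth = 0.5 * np.arange(n)
z = truth + rng.normal(0, 1.0, n)
z[120:] += 8.0                       # sensor fault: sudden bias

# Constant-velocity Kalman filter (state: position, velocity).
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = np.diag([0.01, 0.01])
R = np.array([[1.0]])
x = np.array([0.0, 0.0])
P = np.eye(2) * 10.0

flags = []
for k in range(n):
    x = F @ x                        # predict
    P = F @ P @ F.T + Q
    S = H @ P @ H.T + R              # innovation covariance
    nu = z[k] - (H @ x)[0]           # innovation (residual)
    flags.append(abs(nu) / np.sqrt(S[0, 0]) > 4.0)  # normalized residual test
    K = (P @ H.T) / S[0, 0]          # update
    x = x + K[:, 0] * nu
    P = (np.eye(2) - K @ H) @ P
flags = np.array(flags)
```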

  20. Qualitatively modelling and analysing genetic regulatory networks: a Petri net approach.

    PubMed

    Steggles, L Jason; Banks, Richard; Shaw, Oliver; Wipat, Anil

    2007-02-01

    New developments in post-genomic technology now provide researchers with the data necessary to study regulatory processes in a holistic fashion at multiple levels of biological organization. One of the major challenges for the biologist is to integrate and interpret these vast data resources to gain a greater understanding of the structure and function of the molecular processes that mediate adaptive and cell-cycle-driven changes in gene expression. To achieve this, biologists require new tools and techniques that allow pathway-related data to be modelled and analysed as network structures, providing valuable insights which can then be validated and investigated in the laboratory. We propose a new technique for constructing and analysing qualitative models of genetic regulatory networks based on the Petri net formalism. We take as our starting point the Boolean network approach of treating genes as binary switches and develop a new Petri net model which uses logic minimization to automate the construction of compact qualitative models. Our approach addresses the shortcomings of Boolean networks by providing access to the wide range of existing Petri net analysis techniques and by using non-determinism to cope with incomplete and inconsistent data. The ideas we present are illustrated by a case study in which the genetic regulatory network controlling sporulation in the bacterium Bacillus subtilis is modelled and analysed. The Petri net model construction tool and the data files for the B. subtilis sporulation case study are available at http://bioinf.ncl.ac.uk/gnapn.
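    The Boolean-network starting point, and the non-determinism that a Petri-net encoding can explore exhaustively, can be sketched on a toy three-gene network (the rules below are hypothetical, not the B. subtilis sporulation model):

```python
from itertools import product

# Toy 3-gene Boolean network (hypothetical rules). Each gene is 0/1 and
# its next state is a Boolean function of the current state.
rules = {
    "a": lambda s: s["c"],                 # a activated by c
    "b": lambda s: s["a"] and not s["c"],  # b needs a, repressed by c
    "c": lambda s: not s["b"],             # c repressed by b
}

def sync_step(state):
    """Synchronous update: every gene switches at once."""
    return {g: int(f(state)) for g, f in rules.items()}

def async_successors(state):
    """Asynchronous update: one gene switches at a time. The successor
    set captures the non-determinism a Petri net encodes explicitly."""
    succ = []
    for g, f in rules.items():
        new = dict(state, **{g: int(f(state))})
        if new != state:
            succ.append(new)
    return succ

# Enumerate the full state space and find synchronous fixed points.
states = [dict(zip("abc", bits)) for bits in product([0, 1], repeat=3)]
fixed_points = [s for s in states if sync_step(s) == s]
```

    A fixed point has no asynchronous successors either, which is one consistency check between the two semantics.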

  1. Data-driven analysis of functional brain interactions during free listening to music and speech.

    PubMed

    Fang, Jun; Hu, Xintao; Han, Junwei; Jiang, Xi; Zhu, Dajiang; Guo, Lei; Liu, Tianming

    2015-06-01

    Natural stimulus functional magnetic resonance imaging (N-fMRI), such as fMRI acquired while participants were watching video streams or listening to audio streams, has been increasingly used to investigate functional mechanisms of the human brain in recent years. One of the fundamental challenges in functional brain mapping based on N-fMRI is to model the brain's functional responses to continuous, naturalistic and dynamic stimuli. To address this challenge, in this paper we present a data-driven approach to exploring functional interactions in the human brain during free listening to music and speech streams. Specifically, we model the brain responses from N-fMRI by measuring the functional interactions on large-scale brain networks with intrinsically established structural correspondence, and perform music and speech classification tasks to guide the systematic identification of consistent and discriminative functional interactions while multiple subjects were listening to music and speech in multiple categories. The underlying premise is that the functional interactions derived from N-fMRI data of multiple subjects should exhibit both consistency and discriminability. Our experimental results show that a variety of brain systems, including attention, memory, auditory/language, emotion, and action networks, are among the most relevant brain systems involved in differentiating classical music, pop music and speech. Our study provides an alternative approach to investigating the human brain's mechanisms in the comprehension of complex natural music and speech.

  2. A data driven method for estimation of B(avail) and appK(D) using a single injection protocol with [¹¹C]raclopride in the mouse.

    PubMed

    Wimberley, Catriona J; Fischer, Kristina; Reilhac, Anthonin; Pichler, Bernd J; Gregoire, Marie Claude

    2014-10-01

    The partial saturation approach (PSA) is a simple, single-injection experimental protocol that estimates both B(avail) and appK(D) without the use of blood sampling. This makes it ideal for use in longitudinal studies of neurodegenerative diseases in the rodent. The aim of this study was to increase the range and applicability of the PSA by developing a data-driven strategy for determining reliable regional estimates of receptor density (B(avail)) and in vivo affinity (1/appK(D)), and to validate the strategy using a simulation model. The data-driven method uses a time window guided by the dynamic equilibrium state of the system, as opposed to a static time window. To test the method, simulations of partial saturation experiments were generated and validated against experimental data. The experimental conditions simulated included a range of receptor occupancy levels and three different B(avail) and appK(D) values to mimic disease states. The effect of using a reference region, and of typical PET noise, on the stability and accuracy of the estimates was also investigated. The investigations showed that the parameter estimates in a simulated healthy mouse using the data-driven method were within 10±30% of the simulated input for the range of occupancy levels simulated. Throughout all experimental conditions simulated, the accuracy and robustness of the estimates using the data-driven method were much improved over the typical method of using a static time window, especially at low receptor occupancy levels. Introducing a reference region caused a bias of approximately 10% over the range of occupancy levels. Based on extensive simulated experimental conditions, it was shown that the data-driven method provides accurate and precise estimates of B(avail) and appK(D) for a broader range of conditions compared with the original method. Copyright © 2014 Elsevier Inc. All rights reserved.

  3. A Novel Online Sequential Extreme Learning Machine for Gas Utilization Ratio Prediction in Blast Furnaces.

    PubMed

    Li, Yanjiao; Zhang, Sen; Yin, Yixin; Xiao, Wendong; Zhang, Jie

    2017-08-10

    Gas utilization ratio (GUR) is an important indicator used to measure the operating status and energy consumption of blast furnaces (BFs). In this paper, we present a soft-sensor approach, i.e., a novel online sequential extreme learning machine (OS-ELM) named DU-OS-ELM, to establish a data-driven model for GUR prediction. In DU-OS-ELM, old data are gradually discarded and newly acquired data are given more attention through a novel dynamic forgetting factor (DFF) that depends on the estimation errors, enhancing the model's dynamic tracking ability. Furthermore, we develop an updated selection strategy (USS) to judge whether the model needs to be updated with the newly arriving data, so that the proposed approach is more in line with the actual production situation. A convergence analysis of the proposed DU-OS-ELM is then presented to ensure that the estimated output weights converge to the true values as new data arrive. The proposed DU-OS-ELM is applied to build a soft-sensor model to predict GUR. Experimental results on real production data from a BF demonstrate that the proposed DU-OS-ELM obtains better generalization performance and higher prediction accuracy than a number of existing related approaches, and the resulting GUR prediction model can provide effective guidance for further optimization of operation.
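    The recursive OS-ELM update at the heart of such a model can be sketched as follows. This sketch uses a fixed forgetting factor `lam` in place of the paper's dynamic forgetting factor and omits the update selection strategy; the toy regression target standing in for GUR is invented.

```python
import numpy as np

rng = np.random.default_rng(3)

def hidden(X, Win, bias):
    """Random-feature hidden layer of an ELM (sigmoid activation)."""
    return 1.0 / (1.0 + np.exp(-(X @ Win + bias)))

d, L, lam = 3, 20, 0.98              # inputs, hidden units, forgetting factor
Win = rng.normal(size=(d, L))        # random, fixed input weights
bias = rng.normal(size=L)

# Initial batch: ridge-regularized least squares for the output weights.
X0 = rng.normal(size=(50, d))
y0 = X0.sum(axis=1, keepdims=True)   # toy target standing in for GUR
H0 = hidden(X0, Win, bias)
P = np.linalg.inv(H0.T @ H0 + 1e-3 * np.eye(L))
beta = P @ H0.T @ y0

# Online phase: recursive update with forgetting, one sample at a time.
for _ in range(200):
    x = rng.normal(size=(1, d))
    y = x.sum(axis=1, keepdims=True)
    H = hidden(x, Win, bias)
    G = P @ H.T @ np.linalg.inv(lam * np.eye(1) + H @ P @ H.T)
    beta = beta + G @ (y - H @ beta)
    P = (P - G @ H @ P) / lam

Xt = rng.normal(size=(100, d))
pred = hidden(Xt, Win, bias) @ beta
rmse = float(np.sqrt(np.mean((pred - Xt.sum(axis=1, keepdims=True)) ** 2)))
```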

  4. A Novel Online Sequential Extreme Learning Machine for Gas Utilization Ratio Prediction in Blast Furnaces

    PubMed Central

    Li, Yanjiao; Yin, Yixin; Xiao, Wendong; Zhang, Jie

    2017-01-01

    Gas utilization ratio (GUR) is an important indicator used to measure the operating status and energy consumption of blast furnaces (BFs). In this paper, we present a soft-sensor approach, i.e., a novel online sequential extreme learning machine (OS-ELM) named DU-OS-ELM, to establish a data-driven model for GUR prediction. In DU-OS-ELM, old data are gradually discarded and newly acquired data are given more attention through a novel dynamic forgetting factor (DFF) that depends on the estimation errors, enhancing the model's dynamic tracking ability. Furthermore, we develop an updated selection strategy (USS) to judge whether the model needs to be updated with the newly arriving data, so that the proposed approach is more in line with the actual production situation. A convergence analysis of the proposed DU-OS-ELM is then presented to ensure that the estimated output weights converge to the true values as new data arrive. The proposed DU-OS-ELM is applied to build a soft-sensor model to predict GUR. Experimental results on real production data from a BF demonstrate that the proposed DU-OS-ELM obtains better generalization performance and higher prediction accuracy than a number of existing related approaches, and the resulting GUR prediction model can provide effective guidance for further optimization of operation. PMID:28796187

  5. Data Analysis and Data Mining: Current Issues in Biomedical Informatics

    PubMed Central

    Bellazzi, Riccardo; Diomidous, Marianna; Sarkar, Indra Neil; Takabayashi, Katsuhiko; Ziegler, Andreas; McCray, Alexa T.

    2011-01-01

    Summary Background Medicine and biomedical sciences have become data-intensive fields, which, at the same time, enable the application of data-driven approaches and require sophisticated data analysis and data mining methods. Biomedical informatics provides a proper interdisciplinary context to integrate data and knowledge when processing available information, with the aim of giving effective decision-making support in clinics and translational research. Objectives To reflect on different perspectives related to the role of data analysis and data mining in biomedical informatics. Methods On the occasion of the 50th year of Methods of Information in Medicine, a symposium was organized that reflected on the opportunities, challenges and priorities of organizing, representing and analysing data, information and knowledge in biomedicine and health care. The contributions of experts with a variety of backgrounds in the area of biomedical data analysis were collected as one outcome of this symposium, in order to provide a broad, though coherent, overview of some of the most interesting aspects of the field. Results The paper presents sections on data accumulation and data-driven approaches in medical informatics, data and knowledge integration, statistical issues for the evaluation of data mining models, translational bioinformatics and bioinformatics aspects of genetic epidemiology. Conclusions Biomedical informatics represents a natural framework to properly and effectively apply data analysis and data mining methods in a decision-making context. In the future, it will be necessary to preserve the inclusive nature of the field and to foster increasing sharing of data and methods between researchers. PMID:22146916

  6. HIGH-FIDELITY SIMULATION-DRIVEN MODEL DEVELOPMENT FOR COARSE-GRAINED COMPUTATIONAL FLUID DYNAMICS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hanna, Botros N.; Dinh, Nam T.; Bolotnov, Igor A.

    Nuclear reactor safety analysis requires identifying various credible accident scenarios and determining their consequences. For a full-scale nuclear power plant system, it is impossible to obtain sufficient experimental data for a broad range of risk-significant accident scenarios. In single-phase flow convective problems, Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) can provide high-fidelity results when physical data are unavailable. However, these methods are computationally expensive and cannot be afforded for simulation of long transient scenarios in nuclear accidents, despite extraordinary advances in high-performance scientific computing over the past decades. The major issue is the inability to parallelize the transient computation, which makes the number of time steps required by high-fidelity methods unaffordable for long transients. In this work, we propose to apply a high-fidelity simulation-driven approach to model sub-grid scale (SGS) effects in Coarse-Grained Computational Fluid Dynamics (CG-CFD). This approach aims to develop a statistical surrogate model instead of a deterministic SGS model. We chose to start with a turbulent natural convection case with volumetric heating in a horizontal fluid layer with a rigid, insulated lower boundary and an isothermal (cold) upper boundary. This scenario of unstable stratification is relevant to turbulent natural convection in a molten corium pool during a severe nuclear reactor accident, as well as to containment mixing and passive cooling. The presented approach demonstrates how to create a correction for the CG-CFD solution by modifying the energy balance equation. A global correction for the temperature equation proves to achieve a significant improvement in the prediction of the steady-state temperature distribution through the fluid layer.

  7. Energy landscape analysis of neuroimaging data

    NASA Astrophysics Data System (ADS)

    Ezaki, Takahiro; Watanabe, Takamitsu; Ohzeki, Masayuki; Masuda, Naoki

    2017-05-01

    Computational neuroscience models have been used for understanding neural dynamics in the brain and how they may be altered when physiological or other conditions change. We review and develop a data-driven approach to neuroimaging data called the energy landscape analysis. The methods are rooted in statistical physics theory, in particular the Ising model, also known as the (pairwise) maximum entropy model and Boltzmann machine. The methods have been applied to fitting electrophysiological data in neuroscience for a decade, but their use in neuroimaging data is still in its infancy. We first review the methods and discuss some algorithms and technical aspects. Then, we apply the methods to functional magnetic resonance imaging data recorded from healthy individuals to inspect the relationship between the accuracy of fitting, the size of the brain system to be analysed and the data length. This article is part of the themed issue 'Mathematical methods in medicine: neuroscience, cardiology and pathology'.
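    The core objects of an energy landscape analysis, the pairwise maximum entropy (Ising) energy and its local minima under single-spin flips, can be sketched as follows. The parameters h and J below are random placeholders; in practice they are fitted to binarized neuroimaging data.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)

# Pairwise maximum entropy (Ising) model for a small set of brain
# regions: E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j, s_i in {-1,+1}.
n = 5
h = rng.normal(0, 0.5, n)
J = rng.normal(0, 0.5, (n, n))
J = np.triu(J + J.T, k=1)            # symmetric couplings, kept for i < j

def energy(s):
    s = np.asarray(s)
    return float(-h @ s - s @ J @ s)

states = [np.array(bits) for bits in product([-1, 1], repeat=n)]

def is_local_minimum(s):
    """A state is a local minimum of the landscape if every
    single-spin flip raises the energy."""
    e = energy(s)
    for i in range(n):
        t = s.copy()
        t[i] *= -1
        if energy(t) < e:
            return False
    return True

minima = [s for s in states if is_local_minimum(s)]
```

    The local minima (attractor states) and the barriers between them are what the landscape analysis then characterizes.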

  8. A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification

    DOE PAGES

    Wang, Jing; Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; ...

    2013-01-01

    Background. The availability of large complex data sets generated by high throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how to best incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using the combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework ISIC was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification.

  9. A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification

    PubMed Central

    Wang, Jing; Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Varnum, Susan M.; Brown, Joseph N.; Riensche, Roderick M.; Adkins, Joshua N.; Jacobs, Jon M.; Hoidal, John R.; Scholand, Mary Beth; Pounds, Joel G.; Blackburn, Michael R.; Rodland, Karin D.; McDermott, Jason E.

    2013-01-01

    Background. The availability of large complex data sets generated by high throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how to best incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using the combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework ISIC was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification. PMID:24223463

  10. A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Jing; Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.

    2013-10-01

    Background. The availability of large complex data sets generated by high throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how to best incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using the combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework ISIC was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification.

  11. Joint independent component analysis for simultaneous EEG-fMRI: principle and simulation.

    PubMed

    Moosmann, Matthias; Eichele, Tom; Nordby, Helge; Hugdahl, Kenneth; Calhoun, Vince D

    2008-03-01

    An optimized scheme for the fusion of electroencephalography and event related potentials with functional magnetic resonance imaging (BOLD-fMRI) data should simultaneously assess all available electrophysiologic and hemodynamic information in a common data space. In doing so, it should be possible to identify features of latent neural sources whose trial-to-trial dynamics are jointly reflected in both modalities. We present a joint independent component analysis (jICA) model for analysis of simultaneous single trial EEG-fMRI measurements from multiple subjects. We outline the general idea underlying the jICA approach and present results from simulated data under realistic noise conditions. Our results indicate that this approach is a feasible and physiologically plausible data-driven way to achieve spatiotemporal mapping of event related responses in the human brain.
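    A minimal sketch of the joint ICA idea: per-trial feature vectors from the two modalities are concatenated so that a single unmixing explains both. The toy data, feature dimensions, and the hand-rolled symmetric FastICA below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy "simultaneous" data: two latent sources whose trial-to-trial
# amplitudes modulate both an ERP feature vector and a BOLD feature
# vector. Joint ICA concatenates the modalities per trial so one
# unmixing matrix explains both.
trials = 300
A = rng.normal(size=(trials, 2))              # latent trial amplitudes
A = np.sign(A) * A ** 2                       # make sources non-Gaussian
erp_maps = rng.normal(size=(2, 40))           # per-source ERP signature
bold_maps = rng.normal(size=(2, 60))          # per-source BOLD signature
X = np.hstack([A @ erp_maps, A @ bold_maps])  # trials x (40 + 60)

# Z-score each feature, then whiten across trials (data are rank 2).
X = (X - X.mean(0)) / (X.std(0) + 1e-12)
U, S, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
Z = U[:, :2] * np.sqrt(trials)                # whitened trial scores

# Symmetric FastICA with tanh nonlinearity on the whitened scores.
W = rng.normal(size=(2, 2))
for _ in range(200):
    WZ = Z @ W.T
    G, Gp = np.tanh(WZ), 1 - np.tanh(WZ) ** 2
    W_new = (G.T @ Z) / trials - np.diag(Gp.mean(0)) @ W
    # Symmetric decorrelation: W <- (W W^T)^(-1/2) W.
    s, V = np.linalg.eigh(W_new @ W_new.T)
    W = V @ np.diag(1 / np.sqrt(s)) @ V.T @ W_new
sources = Z @ W.T                             # joint trial-wise sources
```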

  12. Role of intraglomerular circuits in shaping temporally structured responses to naturalistic inhalation-driven sensory input to the olfactory bulb

    PubMed Central

    Carey, Ryan M.; Sherwood, William Erik; Shipley, Michael T.; Borisyuk, Alla

    2015-01-01

    Olfaction in mammals is a dynamic process driven by the inhalation of air through the nasal cavity. Inhalation determines the temporal structure of sensory neuron responses and shapes the neural dynamics underlying central olfactory processing. Inhalation-linked bursts of activity among olfactory bulb (OB) output neurons [mitral/tufted cells (MCs)] are temporally transformed relative to those of sensory neurons. We investigated how OB circuits shape inhalation-driven dynamics in MCs using a modeling approach that was highly constrained by experimental results. First, we constructed models of canonical OB circuits that included mono- and disynaptic feedforward excitation, recurrent inhibition and feedforward inhibition of the MC. We then used experimental data to drive inputs to the models and to tune parameters; inputs were derived from sensory neuron responses during natural odorant sampling (sniffing) in awake rats, and model output was compared with recordings of MC responses to odorants sampled with the same sniff waveforms. This approach allowed us to identify OB circuit features underlying the temporal transformation of sensory inputs into inhalation-linked patterns of MC spike output. We found that realistic input-output transformations can be achieved independently by multiple circuits, including feedforward inhibition with slow onset and decay kinetics and parallel feedforward MC excitation mediated by external tufted cells. We also found that recurrent and feedforward inhibition had differential impacts on MC firing rates and on inhalation-linked response dynamics. These results highlight the importance of investigating neural circuits in a naturalistic context and provide a framework for further explorations of signal processing by OB networks. PMID:25717156

  13. Data-driven RANS for simulations of large wind farms

    NASA Astrophysics Data System (ADS)

    Iungo, G. V.; Viola, F.; Ciri, U.; Rotea, M. A.; Leonardi, S.

    2015-06-01

    In the wind energy industry there is a growing need for real-time predictions of wind turbine wake flows in order to optimize power plant control and inhibit detrimental wake interactions. To this end, a data-driven RANS approach is proposed that achieves very low computational costs and adequate accuracy through a data assimilation procedure. The RANS simulations are implemented with a classical Boussinesq hypothesis and a mixing length turbulence closure model, which is calibrated against the available data. High-fidelity LES simulations of a utility-scale wind turbine operating at different tip speed ratios are used as the database. It is shown that the mixing length model for the RANS simulations can be calibrated accurately through the Reynolds stress of the axial and radial velocity components and the gradient of the axial velocity in the radial direction. It is found that the mixing length is roughly invariant in the very near wake, then increases linearly with downstream distance in the diffusive region. The variation rate of the mixing length in the downstream direction is proposed as a criterion to detect the transition between the near wake and the transition region of a wind turbine wake. Finally, RANS simulations were performed with the calibrated mixing length model, and good agreement with the LES simulations was observed.
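    The calibration step can be sketched by inverting the Prandtl mixing-length closure, -<u'_x u'_r> = lm^2 |dU/dr| dU/dr, on profiles extracted from the high-fidelity data. The Gaussian wake deficit below is a synthetic stand-in for the LES statistics.

```python
import numpy as np

# Synthetic stand-ins for LES wake statistics at one downstream station:
# radial positions, mean axial velocity U(r), and Reynolds shear stress
# <u'_x u'_r>(r). (Real calibration would read these from the LES data.)
r = np.linspace(0.05, 1.5, 60)
U = 8.0 - 3.0 * np.exp(-(r / 0.5) ** 2)        # Gaussian velocity deficit
dUdr = np.gradient(U, r)
lm_true = 0.07
uxur = -lm_true ** 2 * np.abs(dUdr) * dUdr     # Prandtl closure, by design

# Mixing-length calibration: invert  -<u'_x u'_r> = lm^2 |dU/dr| dU/dr,
# masking radii where the shear is too small for a stable estimate.
mask = np.abs(dUdr) > 0.5
lm_est = np.sqrt(-uxur[mask] / (np.abs(dUdr[mask]) * dUdr[mask]))
```

    Repeating this at successive downstream stations gives the mixing-length trend (invariant near wake, then linear growth) described in the record.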

  14. Forecasting Chikungunya spread in the Americas via data-driven empirical approaches.

    PubMed

    Escobar, Luis E; Qiao, Huijie; Peterson, A Townsend

    2016-02-29

    Chikungunya virus (CHIKV) is endemic to Africa and Asia, but the Asian genotype invaded the Americas in 2013. The fast increase of human infections in the American epidemic emphasized the urgency of developing detailed predictions of case numbers and the potential geographic spread of this disease. We developed a simple model incorporating cases generated locally and cases imported from other countries, and forecasted transmission hotspots at the level of countries and at finer scales, in terms of ecological features. By late January 2015, >1.2 M CHIKV cases were reported from the Americas, with country-level prevalences between nil and more than 20 %. In the early stages of the epidemic, exponential growth in case numbers was common; later, however, poor and uneven reporting became more common, in a phenomenon we term "surveillance fatigue." Economic activity of countries was not associated with prevalence, but diverse social factors may be linked to surveillance effort and reporting. Our model predictions were initially quite inaccurate, but improved markedly as more data accumulated within the Americas. The data-driven methodology explored in this study provides an opportunity to generate descriptive and predictive information on spread of emerging diseases in the short-term under simple models based on open-access tools and data that can inform early-warning systems and public health intelligence.
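    The paper's simple model structure, local transmission plus importation, can be sketched in a few lines; R_local and the import series below are made-up illustrative numbers, not fitted values.

```python
# Minimal discrete-time sketch: next step's cases in a country come from
# local transmission plus cases imported from other countries.
R_local = 1.3          # average secondary cases per local case, per step
imports = [5, 8, 12, 20, 15, 10, 6, 4, 2, 1]   # imported cases per step

cases = [0.0]
for m in imports:
    cases.append(R_local * cases[-1] + m)      # local growth + importation

cumulative = sum(cases)
```

    With R_local above 1, local transmission dominates the totals once seeding has occurred, matching the early exponential growth noted in the record.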

  15. Conceptual achievement of 1GBq activity in a Plasma Focus driven system.

    PubMed

    Tabbakh, Farshid; Sadat Kiai, Seyed Mahmood; Pashaei, Mohammad

    2017-11-01

    This paper evaluates radioisotope production by means of typical dense plasma focus (DPF) devices. The production rates of the appropriate positron emitters F-18, N-13 and O-15 were studied. The beam-target mechanism was simulated with the GEANT4 Monte Carlo toolkit, using the QGSP_BIC and QGSP_INCLXX physics models for comparison. The simulated results for the positron emitters were evaluated against reported experimental data, and the agreement found between simulations and experimental reports supports using this code as a reliable tool for optimizing DPF-driven systems to achieve 1 GBq of produced radioisotope activity. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Neural network versus classical time series forecasting models

    NASA Astrophysics Data System (ADS)

    Nor, Maria Elena; Safuan, Hamizah Mohd; Shab, Noorzehan Fazahiyah Md; Asrul, Mohd; Abdullah, Affendi; Mohamad, Nurul Asmaa Izzati; Lee, Muhammad Hisyam

    2017-05-01

    Artificial neural networks (ANNs) have an advantage in time series forecasting, as they have the potential to solve complex forecasting problems. This is because an ANN is a data-driven approach that can be trained to map past values of a time series. In this study, the forecast performance of a neural network was compared with that of a classical time series forecasting method, namely the seasonal autoregressive integrated moving average (SARIMA) model, using gold price data. Moreover, the effect of different data preprocessing on the forecast performance of the neural network was examined. Forecast accuracy was evaluated using mean absolute deviation, root mean square error and mean absolute percentage error. It was found that the ANN produced the most accurate forecasts when the Box-Cox transformation was used as data preprocessing.
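    The evaluation pieces named in this record are easy to state precisely. A minimal sketch of the three accuracy measures and the Box-Cox preprocessing transform (the example numbers are invented, not the study's gold price data):

```python
import numpy as np

def mad(actual, pred):
    """Mean absolute deviation."""
    return float(np.mean(np.abs(np.asarray(actual, float) - np.asarray(pred, float))))

def rmse(actual, pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((np.asarray(actual, float) - np.asarray(pred, float)) ** 2)))

def mape(actual, pred):
    """Mean absolute percentage error (actual values must be nonzero)."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    return float(np.mean(np.abs((a - p) / a)) * 100)

def boxcox(y, lam):
    """Box-Cox transform, the preprocessing highlighted in the record."""
    y = np.asarray(y, float)
    return np.log(y) if lam == 0 else (y ** lam - 1) / lam

# Example: a gold-price-like series (made-up) and a naive forecast
# (previous observation carried forward).
actual = [1800, 1820, 1795, 1840]
naive = [1790, 1800, 1820, 1795]
```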

  17. Orthogonal-blendshape-based editing system for facial motion capture data.

    PubMed

    Li, Qing; Deng, Zhigang

    2008-01-01

    The authors present a novel data-driven 3D facial motion capture data editing system using automated construction of an orthogonal blendshape face model and constrained weight propagation, aiming to bridge the popular facial motion capture technique and the blendshape approach. In this work, a 3D facial-motion-capture-editing problem is transformed into a blendshape-animation-editing problem. Given a collected facial motion capture data set, we construct a truncated PCA space spanned by the greatest retained eigenvectors and a corresponding blendshape face model for each anatomical region of the human face. As such, modifying blendshape weights (PCA coefficients) is equivalent to editing their corresponding motion capture sequence. In addition, a constrained weight propagation technique allows animators to balance automation and flexible control.
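
The core idea that editing truncated-PCA coefficients is equivalent to editing the motion data can be sketched on toy data (the paper builds one such model per anatomical face region; the random data and the choice of three retained eigenvectors here are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 9))      # 50 frames x 9 motion channels (toy data)
mean = frames.mean(axis=0)
U, S, Vt = np.linalg.svd(frames - mean, full_matrices=False)
k = 3                                  # retained eigenvectors ("blendshapes")
basis = Vt[:k]                         # each row spans one blendshape direction
weights = (frames - mean) @ basis.T    # per-frame blendshape weights

# Editing a weight and reconstructing yields an edited motion sequence.
weights[0, 0] += 2.0
edited = mean + weights @ basis
```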

  18. Dynamic modeling and motion simulation for a winged hybrid-driven underwater glider

    NASA Astrophysics Data System (ADS)

    Wang, Shu-Xin; Sun, Xiu-Jun; Wang, Yan-Hui; Wu, Jian-Guo; Wang, Xiao-Ming

    2011-03-01

    PETREL, a winged hybrid-driven underwater glider, is a novel and practical marine survey platform which combines the features of a legacy underwater glider and a conventional AUV (autonomous underwater vehicle). It can be treated as a multi-rigid-body system with a floating base and a particular hydrodynamic profile. In this paper, theorems on linear and angular momentum are used to establish the dynamic equations of motion of each rigid body, and the effect of the translational and rotational motion of internal masses on attitude control is taken into consideration. In addition, due to the unique external shape with fixed wings and deflectable rudders and the dual-drive operation in thrust and glide modes, the approaches used to build dynamic models of conventional AUVs and hydrodynamic models of submarines are introduced, and the tailored dynamic equations of the hybrid glider are formulated. Moreover, the behaviors of motion in glide and thrust operation are analyzed based on simulation, and the feasibility of the dynamic model is validated with data from lake field trials.

  19. Event-driven simulation in SELMON: An overview of EDSE

    NASA Technical Reports Server (NTRS)

    Rouquette, Nicolas F.; Chien, Steve A.; Charest, Leonard, Jr.

    1992-01-01

    EDSE (event-driven simulation engine), a model-based event-driven simulator implemented for SELMON, a tool for sensor selection and anomaly detection in real-time monitoring, is described. The simulator is used in conjunction with a causal model to predict the future behavior of the model from observed data. The behavior of the causal model is interpreted as equivalent to the behavior of the physical system being modeled. An overview of the functionality of the simulator and the model-based event-driven simulation paradigm on which it is based is provided, including high-level descriptions of the following key properties: event consumption and event creation, iterative simulation, and synchronization and filtering of monitoring data from the physical system. Finally, how EDSE stands with respect to the relevant open issues of discrete-event and model-based simulation is discussed.

  20. The Future of Risk Analysis: Operationalizing Living Vulnerability Assessments from the Cloud to the Street (and Back)

    NASA Astrophysics Data System (ADS)

    Tellman, B.; Schwarz, B.; Kuhn, C.; Pandey, B.; Schank, C.; Sullivan, J.; Mahtta, R.; Hammet, L.

    2016-12-01

    21 million people are exposed to flooding every year, and that number is expected to more than double by 2030 due to climate, land use, and demographic change. Cloud to Street, a mission-driven science organization, is working to make big and real-time data more meaningful for understanding both biophysical and social vulnerability to flooding in this changing world. This talk will showcase the science and practice of integrated social and biophysical flood vulnerability assessment that we have built through our work in Uttarakhand, India and in Senegal, in conjunction with nonprofits and development banks. We will show developments of our global historical flood database, detected from MODIS and Landsat satellites, used to power machine-learning flood exposure models in Google Earth Engine's API. Demonstrating the approach, we will also showcase new approaches in social vulnerability science, from developing data-driven social vulnerability indices in India to deriving predictive models that explain the social conditions that lead to disproportionate flood damage and fatality in the US. While this talk will draw on examples of completed vulnerability assessments, we will also discuss the possible future of place-based "living" flood vulnerability assessments that are updated each time satellites circle the earth or people add crowd-sourced observations about flood events and social conditions.

  1. Doing Science that Matters to Address India's Water Crisis.

    NASA Astrophysics Data System (ADS)

    Srinivasan, V.

    2017-12-01

    Addressing water security in developing regions involves predicting water availability under unprecedented rates of population and economic growth. India is one of the most water-stressed countries in the world. Despite appreciable increases in funding for water research, high-quality science that is usable by stakeholders remains elusive. The absence of usable research has been driven by notions of what is publishable in the developed world. This can be attributed to the absence of problem-driven research on questions that actually matter to stakeholders, an unwillingness to transcend disciplinary boundaries, and the demise of a field-work research culture in favour of computer simulation. Yet the combination of rapid change, inadequate data and human modifications to watersheds poses a challenge, as researchers face a poorly constrained water resources modelling problem. Instead, what India and indeed all developing regions need is to approach the problem from first principles, identifying the most critical knowledge gaps, then prioritizing data collection using novel sensing and modelling approaches to address them. This might also necessitate consideration of the underlying social and governance drivers of hydrologic change. Using examples from research in the Cauvery Basin, a highly contentious inter-state river basin, I offer some insights into framing a "use-inspired" research agenda and show how the research generates not just new scientific insights but may be translated into practice.

  2. Central focused convolutional neural networks: Developing a data-driven model for lung nodule segmentation.

    PubMed

    Wang, Shuo; Zhou, Mu; Liu, Zaiyi; Liu, Zhenyu; Gu, Dongsheng; Zang, Yali; Dong, Di; Gevaert, Olivier; Tian, Jie

    2017-08-01

    Accurate lung nodule segmentation from computed tomography (CT) images is of great importance for image-driven lung cancer analysis. However, the heterogeneity of lung nodules and the presence of similar visual characteristics between nodules and their surroundings make it difficult for robust nodule segmentation. In this study, we propose a data-driven model, termed the Central Focused Convolutional Neural Networks (CF-CNN), to segment lung nodules from heterogeneous CT images. Our approach combines two key insights: 1) the proposed model captures a diverse set of nodule-sensitive features from both 3-D and 2-D CT images simultaneously; 2) when classifying an image voxel, the effects of its neighbor voxels can vary according to their spatial locations. We describe this phenomenon by proposing a novel central pooling layer retaining much information on voxel patch center, followed by a multi-scale patch learning strategy. Moreover, we design a weighted sampling to facilitate the model training, where training samples are selected according to their degree of segmentation difficulty. The proposed method has been extensively evaluated on the public LIDC dataset including 893 nodules and an independent dataset with 74 nodules from Guangdong General Hospital (GDGH). We showed that CF-CNN achieved superior segmentation performance with average dice scores of 82.15% and 80.02% for the two datasets respectively. Moreover, we compared our results with the inter-radiologists consistency on LIDC dataset, showing a difference in average dice score of only 1.98%. Copyright © 2017. Published by Elsevier B.V.
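
The dice score used for evaluation above has the standard overlap definition; a short sketch (not the authors' evaluation code):

```python
import numpy as np

def dice(seg, truth):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    seg, truth = np.asarray(seg, bool), np.asarray(truth, bool)
    inter = np.logical_and(seg, truth).sum()
    return 2.0 * inter / (seg.sum() + truth.sum())

score = dice([1, 1, 0, 0], [1, 0, 0, 0])   # one overlapping voxel out of 2 + 1
```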

  3. Predicting Seagrass Occurrence in a Changing Climate Using Random Forests

    NASA Astrophysics Data System (ADS)

    Aydin, O.; Butler, K. A.

    2017-12-01

    Seagrasses are marine plants that can quickly sequester vast amounts of carbon (up to 100 times more and 12 times faster than tropical forests). In this work, we present an integrated GIS and machine learning approach to build a data-driven model of seagrass presence-absence. We outline a random forest approach that avoids the prevalence bias in many ecological presence-absence models. One of our goals is to predict global seagrass occurrence from a spatially limited training sample. In addition, we conduct a sensitivity study which investigates the vulnerability of seagrass to changing climate conditions. We integrate multiple data sources, including fine-scale seagrass data from MarineCadastre.gov and the recently released, globally extensive and publicly available Ecological Marine Units (EMU) dataset. These data are used to train a model for seagrass occurrence along the U.S. coast. In situ ocean data are interpolated using Empirical Bayesian Kriging (EBK) to produce globally extensive prediction variables. A neural network is used to estimate probable future values of prediction variables such as ocean temperature to assess the impact of a warming climate on seagrass occurrence. The proposed workflow can be generalized to many presence-absence models.
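
One simple way to counter the prevalence bias mentioned above is to balance the presence and absence classes before training; the downsampling scheme below is an illustrative assumption, not necessarily the authors' method:

```python
import numpy as np

def balance_presence_absence(X, y, seed=0):
    """Downsample the majority class so presences and absences are equal."""
    rng = np.random.default_rng(seed)
    pres, abs_ = np.where(y == 1)[0], np.where(y == 0)[0]
    n = min(len(pres), len(abs_))
    keep = np.concatenate([rng.choice(pres, n, replace=False),
                           rng.choice(abs_, n, replace=False)])
    return X[keep], y[keep]

X = np.arange(20).reshape(10, 2)
y = np.array([1, 0, 0, 0, 0, 0, 0, 0, 1, 0])   # 2 presences, 8 absences
Xb, yb = balance_presence_absence(X, y)
```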

  4. Data-Driven Leadership: Determining Your Indicators and Building Your Dashboards

    ERIC Educational Resources Information Center

    Copeland, Mo

    2016-01-01

    For years, schools have tended to approach budgets with some basic assumptions and aspirations and general wish lists, but with scant data to drive the budget conversation. What if there were a better way? What if the conversation started with a review of the last five to ten years of data on three key mission- and strategy-driven indicators:…

  5. Subsampling for dataset optimisation

    NASA Astrophysics Data System (ADS)

    Ließ, Mareike

    2017-04-01

    Soil-landscapes have formed by the interaction of soil-forming factors and pedogenic processes. In modelling these landscapes in their pedodiversity and the underlying processes, a representative unbiased dataset is required. This concerns model input as well as output data. However, very often big datasets are available which are highly heterogeneous and were gathered for various purposes, but not to model a particular process or data space. As a first step, the overall data space and/or landscape section to be modelled needs to be identified, including considerations regarding scale and resolution. Then the available dataset needs to be optimised via subsampling to well represent this n-dimensional data space. A couple of well-known sampling designs may be adapted to suit this purpose. The overall approach follows three main strategies: (1) the data space may be condensed and de-correlated by a factor analysis to facilitate the subsampling process. (2) Different methods of pattern recognition serve to structure the n-dimensional data space to be modelled into units which then form the basis for the optimisation of an existing dataset through a sensible selection of samples. Along the way, data units for which there is currently insufficient soil data available may be identified. And (3) random samples from the n-dimensional data space may be replaced by similar samples from the available dataset. While this is a precondition for developing data-driven statistical models, the approach may also help to develop universal process models and to identify limitations in existing models.
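
Strategy (3) above, replacing random draws from the data space by similar available samples, can be sketched as a nearest-neighbour lookup (toy data; the function name and the uniform sampling of target points are assumptions for illustration):

```python
import numpy as np

def subsample_by_proxy(data, n_target, seed=0):
    """Draw random points in the data space and keep, for each, the
    nearest available sample (duplicates collapsed)."""
    rng = np.random.default_rng(seed)
    lo, hi = data.min(axis=0), data.max(axis=0)
    targets = rng.uniform(lo, hi, size=(n_target, data.shape[1]))
    idx = np.array([np.argmin(((data - t) ** 2).sum(axis=1)) for t in targets])
    return data[np.unique(idx)]

data = np.random.default_rng(1).normal(size=(200, 3))
subset = subsample_by_proxy(data, n_target=20)
```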

  6. The Knowledge-Based Software Assistant: Beyond CASE

    NASA Technical Reports Server (NTRS)

    Carozzoni, Joseph A.

    1993-01-01

    This paper will outline the similarities and differences between two paradigms of software development. Both support the whole software life cycle and provide automation for most of the software development process, but have different approaches. The CASE approach is based on a set of tools linked by a central data repository. This tool-based approach is data driven and views software development as a series of sequential steps, each resulting in a product. The Knowledge-Based Software Assistant (KBSA) approach, a radical departure from existing software development practices, is knowledge driven and centers around a formalized software development process. KBSA views software development as an incremental, iterative, and evolutionary process with development occurring at the specification level.

  7. Connecting Brain Proteomics with Behavioural Neuroscience in Translational Animal Models of Neuropsychiatric Disorders.

    PubMed

    Sarnyai, Zoltán; Guest, Paul C

    2017-01-01

    Modelling psychiatric disorders in animals has been hindered by several challenges related to our poor understanding of the disease causes. This chapter describes recent advances in translational research which may lead to animal models and relevant proteomic biomarkers that can be informative about disease mechanisms and potential new therapeutic targets. The review focuses on the behavioural and molecular correlates in models of schizophrenia and major depressive disorder, as guided by recently established Research Domain Criteria (RDoC). This approach is based on providing proteomic data for aetiologically driven, behaviourally well-characterised animal models to link discovered biomarker candidates with the human disease.

  8. Data-adaptive harmonic spectra and multilayer Stuart-Landau models

    NASA Astrophysics Data System (ADS)

    Chekroun, Mickaël D.; Kondrashov, Dmitri

    2017-09-01

    Harmonic decompositions of multivariate time series are considered, for which we adopt an integral operator approach with periodic semigroup kernels. Spectral decomposition theorems are derived that cover the important cases of two-time statistics drawn from a mixing invariant measure. The corresponding eigenvalues can be grouped per Fourier frequency and are actually given, at each frequency, as the singular values of a cross-spectral matrix depending on the data. These eigenvalues obey, furthermore, a variational principle that allows us to define naturally a multidimensional power spectrum. The eigenmodes, for their part, exhibit a data-adaptive character manifested in their phase, which allows us in turn to define a multidimensional phase spectrum. The resulting data-adaptive harmonic (DAH) modes allow for reducing the data-driven modeling effort to elemental models stacked per frequency, only coupled at different frequencies by the same noise realization. In particular, the DAH decomposition extracts time-dependent coefficients stacked by Fourier frequency which can be efficiently modeled, provided the decay of temporal correlations is sufficiently well-resolved, within a class of multilayer stochastic models (MSMs) tailored here on stochastic Stuart-Landau oscillators. Applications to the Lorenz 96 model and to a stochastic heat equation driven by a space-time white noise are considered. In both cases, the DAH decomposition allows for an extraction of spatio-temporal modes revealing key features of the dynamics in the embedded phase space. The multilayer Stuart-Landau models (MSLMs) are shown to successfully model the typical patterns of the corresponding time-evolving fields, as well as their statistics of occurrence.
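
The per-frequency construction described above can be sketched with NumPy: channel-wise FFTs, a cross-spectral matrix at each Fourier frequency, and its singular values (a toy single-realization estimate, not the paper's two-time-statistics formulation):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 256))                 # 4 channels, 256 time steps
X = np.fft.rfft(x, axis=1)                    # spectrum per channel
spectra = []
for f in range(X.shape[1]):
    C = np.outer(X[:, f], np.conj(X[:, f]))   # cross-spectral matrix at frequency f
    spectra.append(np.linalg.svd(C, compute_uv=False))
spectra = np.array(spectra)                   # (n_freq, n_channels) singular values
```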

  9. Physical and JIT Model Based Hybrid Modeling Approach for Building Thermal Load Prediction

    NASA Astrophysics Data System (ADS)

    Iino, Yutaka; Murai, Masahiko; Murayama, Dai; Motoyama, Ichiro

    Energy conservation in buildings is a key issue from an environmental point of view, as it is in the industrial, transportation and residential sectors. HVAC (heating, ventilating and air conditioning) systems account for half of the total energy consumption in a building. In order to realize energy conservation in HVAC systems, a thermal load prediction model for the building is required. This paper proposes a hybrid modeling approach with a physical and a Just-in-Time (JIT) model for building thermal load prediction. The proposed method has the following features and benefits: (1) it is applicable to cases in which past operation data for learning the load prediction model are scarce; (2) it has a self-checking function that continuously supervises whether the data-driven load prediction and the physically based one are consistent, so it can detect when something is wrong in the load prediction procedure; (3) it can adjust the load prediction in real time against sudden changes in model parameters and environmental conditions. The proposed method is evaluated with real operation data from an existing building, and the improvement in load prediction performance is illustrated.
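
The self-checking idea amounts to comparing the physical and data-driven (JIT) predictions against a tolerance; a minimal sketch with hypothetical numbers, names, and fallback policy:

```python
def hybrid_predict(physical_pred, jit_pred, tolerance):
    """Combine a physical and a just-in-time prediction, flagging inconsistency."""
    consistent = abs(physical_pred - jit_pred) <= tolerance
    # When the two predictions agree, average them; otherwise fall back
    # on the physical model and report the inconsistency.
    if consistent:
        return 0.5 * (physical_pred + jit_pred), True
    return physical_pred, False

load, ok = hybrid_predict(physical_pred=120.0, jit_pred=118.0, tolerance=5.0)
```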

  10. Identifying Seizure Onset Zone From the Causal Connectivity Inferred Using Directed Information

    NASA Astrophysics Data System (ADS)

    Malladi, Rakesh; Kalamangalam, Giridhar; Tandon, Nitin; Aazhang, Behnaam

    2016-10-01

    In this paper, we developed a model-based and a data-driven estimator for directed information (DI) to infer the causal connectivity graph between electrocorticographic (ECoG) signals recorded from the brain and to identify the seizure onset zone (SOZ) in epileptic patients. Directed information, an information theoretic quantity, is a general metric to infer causal connectivity between time series and is not restricted to a particular class of models, unlike the popular metrics based on Granger causality or transfer entropy. The proposed estimators are shown to be almost surely convergent. Causal connectivity between ECoG electrodes in five epileptic patients is inferred using the proposed DI estimators, after validating their performance on simulated data. We then proposed a model-based and a data-driven SOZ identification algorithm to identify the SOZ from the causal connectivity inferred using the model-based and data-driven DI estimators, respectively. The data-driven SOZ identification outperforms the model-based SOZ identification algorithm when benchmarked against visual analysis by a neurologist, the current clinical gold standard. The causal connectivity analysis presented here is a first step towards developing novel non-surgical treatments for epilepsy.
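
For intuition, a crude order-1 plug-in proxy for directed influence X → Y is H(Y_t | Y_{t-1}) − H(Y_t | Y_{t-1}, X_{t-1}); the sketch below is a transfer-entropy-style approximation on binary toy data, not the authors' provably convergent DI estimator:

```python
import numpy as np
from collections import Counter

def directed_influence(x, y):
    """Plug-in estimate of H(Y_t | Y_{t-1}) - H(Y_t | Y_{t-1}, X_{t-1})."""
    def cond_entropy(pairs):
        # pairs: (condition tuple, outcome); returns H(outcome | condition) in bits
        joint, cond, n = Counter(pairs), Counter(c for c, _ in pairs), len(pairs)
        return -sum(k / n * np.log2((k / n) / (cond[c] / n))
                    for (c, _), k in joint.items())
    h_y = cond_entropy([((y[t - 1],), y[t]) for t in range(1, len(y))])
    h_yx = cond_entropy([((y[t - 1], x[t - 1]), y[t]) for t in range(1, len(y))])
    return h_y - h_yx

# Toy data: y copies x with a one-step delay, so x strongly "drives" y.
x = list(np.random.default_rng(0).integers(0, 2, 200))
y = [x[0]] + x[:-1]
influence = directed_influence(x, y)   # close to 1 bit for this toy example
```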

  11. A Finite Element Model for Mixed Porohyperelasticity with Transport, Swelling, and Growth.

    PubMed

    Armstrong, Michelle Hine; Buganza Tepole, Adrián; Kuhl, Ellen; Simon, Bruce R; Vande Geest, Jonathan P

    2016-01-01

    The purpose of this manuscript is to establish a unified theory of porohyperelasticity with transport and growth and to demonstrate the capability of this theory using a finite element model developed in MATLAB. We combine the theories of volumetric growth and mixed porohyperelasticity with transport and swelling (MPHETS) to derive a new method that models growth of biological soft tissues. The conservation equations and constitutive equations are developed for both solid-only growth and solid/fluid growth. An axisymmetric finite element framework is introduced for the new theory of growing MPHETS (GMPHETS). To illustrate the capabilities of this model, several example finite element test problems are considered using model geometry and material parameters based on experimental data from a porcine coronary artery. Multiple growth laws are considered, including time-driven, concentration-driven, and stress-driven growth. Time-driven growth is compared against an exact analytical solution to validate the model. For concentration-dependent growth, changing the diffusivity (representing a change in drug) fundamentally changes growth behavior. We further demonstrate that for stress-dependent, solid-only growth of an artery, growth of an MPHETS model results in a more uniform hoop stress than growth in a hyperelastic model for the same amount of growth time using the same growth law. This may have implications in the context of developing residual stresses in soft tissues under intraluminal pressure. To our knowledge, this manuscript provides the first full description of an MPHETS model with growth. The developed computational framework can be used in concert with novel in-vitro and in-vivo experimental approaches to identify the governing growth laws for various soft tissues.

  13. A Model-Driven Architecture Approach for Modeling, Specifying and Deploying Policies in Autonomous and Autonomic Systems

    NASA Technical Reports Server (NTRS)

    Pena, Joaquin; Hinchey, Michael G.; Sterritt, Roy; Ruiz-Cortes, Antonio; Resinas, Manuel

    2006-01-01

    Autonomic Computing (AC), self-management based on high level guidance from humans, is increasingly gaining momentum as the way forward in designing reliable systems that hide complexity and conquer IT management costs. Effectively, AC may be viewed as Policy-Based Self-Management. The Model Driven Architecture (MDA) approach focuses on building models that can be transformed into code in an automatic manner. In this paper, we look at ways to implement Policy-Based Self-Management by means of models that can be converted to code using transformations that follow the MDA philosophy. We propose a set of UML-based models to specify autonomic and autonomous features along with the necessary procedures, based on modification and composition of models, to deploy a policy as an executing system.

  14. Data-driven, Interpretable Photometric Redshifts Trained on Heterogeneous and Unrepresentative Data

    NASA Astrophysics Data System (ADS)

    Leistedt, Boris; Hogg, David W.

    2017-03-01

    We present a new method for inferring photometric redshifts in deep galaxy and quasar surveys, based on a data-driven model of latent spectral energy distributions (SEDs) and a physical model of photometric fluxes as a function of redshift. This conceptually novel approach combines the advantages of both machine learning methods and template fitting methods by building template SEDs directly from the spectroscopic training data. This is made computationally tractable with Gaussian processes operating in flux-redshift space, encoding the physics of redshifts and the projection of galaxy SEDs onto photometric bandpasses. This method alleviates the need to acquire representative training data or to construct detailed galaxy SED models; it requires only that the photometric bandpasses and calibrations be known or have parameterized unknowns. The training data can consist of a combination of spectroscopic and deep many-band photometric data with reliable redshifts, which do not need to entirely spatially overlap with the target survey of interest or even involve the same photometric bands. We showcase the method on the I-magnitude-selected, spectroscopically confirmed galaxies in the COSMOS field. The model is trained on the deepest bands (from SUBARU and HST) and photometric redshifts are derived using the shallower SDSS optical bands only. We demonstrate that we obtain accurate redshift point estimates and probability distributions despite the training and target sets having very different redshift distributions, noise properties, and even photometric bands. Our model can also be used to predict missing photometric fluxes or to simulate populations of galaxies with realistic fluxes and redshifts, for example.

  15. Dynamic simulation of storm-driven barrier island morphology under future sea level rise

    NASA Astrophysics Data System (ADS)

    Passeri, D. L.; Long, J.; Plant, N. G.; Bilskie, M. V.; Hagen, S. C.

    2016-12-01

    The impacts of short-term processes such as tropical and extratropical storms have the potential to alter barrier island morphology. On the event scale, the effects of storm-driven morphology may result in damage or loss of property, infrastructure and habitat. On the decadal scale, the combination of storms and sea level rise (SLR) will evolve barrier islands. The effects of SLR on hydrodynamics and coastal morphology are dynamic and inter-related; nonlinearities in SLR can cause larger peak surges, lengthier inundation times and additional inundated land, which may result in increased erosion, overwash or breaching along barrier islands. This study uses a two-dimensional morphodynamic model (XBeach) to examine the response of Dauphin Island, AL to storm surge under future SLR. The model is forced with water levels and waves provided by a large-domain hydrodynamic model. A historic validation of hurricanes Ivan and Katrina indicates the model is capable of predicting morphologic response with high skill (0.5). The validated model is used to simulate storm surge driven by Ivan and Katrina under four future SLR scenarios, ranging from 20 cm to 2 m. Each SLR scenario is implemented using a static or "bathtub" approach (in which water levels are increased linearly by the amount of SLR) versus a dynamic approach (in which SLR is applied at the open ocean boundary of the hydrodynamic model and allowed to propagate through the domain as guided by the governing equations). Results illustrate that higher amounts of SLR result in additional shoreline change, dune erosion, overwash and breaching. Compared to the dynamic approach, the static approach over-predicts inundation, dune erosion, overwash and breaching of the island. Overall, results provide a better understanding of the effects of SLR on storm-driven barrier island morphology and support a paradigm shift away from the "bathtub" approach, towards considering the integrated, dynamic effects of SLR.
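
The static "bathtub" approach takes one line: raise the water level uniformly and flag every cell that falls below it. A sketch (hypothetical elevations; the dynamic approach, by contrast, propagates SLR through the hydrodynamic model and cannot be captured this way):

```python
import numpy as np

def bathtub_inundation(elevation, water_level, slr):
    """Static SLR treatment: a cell floods when it lies below the
    uniformly raised water level."""
    return elevation < (water_level + slr)

dem = np.array([[0.1, 0.4], [0.9, 1.6]])     # elevations in metres (toy grid)
flooded = bathtub_inundation(dem, water_level=0.3, slr=0.2)
```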

  16. Lazy evaluation of FP programs: A data-flow approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wei, Y.H.; Gaudiot, J.L.

    1988-12-31

    This paper presents a lazy evaluation system for the list-based functional language, Backus' FP, in a data-driven environment. A superset language of FP, called DFP (Demand-driven FP), is introduced. FP eager programs are transformed into DFP lazy programs, which contain the notion of demands. The data-driven execution of DFP programs has the same effect as lazy evaluation. DFP lazy programs have the property of always evaluating a sufficient and necessary result. The infinite sequence generator is used to demonstrate the eager-to-lazy program transformation and the execution of the lazy programs.
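
Python generators give a convenient analogy for the demand-driven evaluation described above: the infinite sequence is produced only as far as the consumer demands it (an analogy only, not the DFP transformation itself):

```python
from itertools import islice

def naturals(start=0):
    """Conceptually infinite sequence; each value is created on demand."""
    n = start
    while True:
        yield n
        n += 1

# Demanding five elements forces exactly five evaluations.
first_five = list(islice(naturals(), 5))
```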

  17. On Mixed Data and Event Driven Design for Adaptive-Critic-Based Nonlinear $H_{\\infty}$ Control.

    PubMed

    Wang, Ding; Mu, Chaoxu; Liu, Derong; Ma, Hongwen

    2018-04-01

    In this paper, based on the adaptive critic learning technique, the control of a class of unknown nonlinear dynamic systems is investigated by adopting a mixed data- and event-driven design approach. The nonlinear control problem is formulated as a two-player zero-sum differential game, and the adaptive critic method is employed to cope with the data-based optimization. The novelty lies in combining the data-driven learning identifier with the event-driven design formulation in order to develop the adaptive critic controller, thereby accomplishing the nonlinear control. The event-driven optimal control law and the time-driven worst-case disturbance law are approximated by constructing and tuning a critic neural network. Applying the event-driven feedback control, the closed-loop system is built with stability analysis. Simulation studies are conducted to verify the theoretical results and illustrate the control performance. It is significant to observe that the present research provides a new avenue for integrating data-based control and event-triggering mechanisms into establishing advanced adaptive critic systems.
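
The event-driven idea, recomputing the control only when an event condition fires, can be sketched with a simple threshold rule (the rule, threshold, and names here are assumptions for illustration, not the paper's triggering condition):

```python
def event_triggered_updates(states, threshold):
    """Return the time indices at which the control would be recomputed:
    only when the state has drifted far enough from the last update."""
    last, updates = states[0], [0]           # always update at the first sample
    for t, s in enumerate(states[1:], start=1):
        if abs(s - last) > threshold:        # event condition fires
            last = s
            updates.append(t)
    return updates

times = event_triggered_updates([0.0, 0.2, 0.5, 0.6, 1.2], threshold=0.4)
```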

  18. A Model-Driven Approach to Teaching Concurrency

    ERIC Educational Resources Information Center

    Carro, Manuel; Herranz, Angel; Marino, Julio

    2013-01-01

    We present an undergraduate course on concurrent programming where formal models are used in different stages of the learning process. The main practical difference with other approaches lies in the fact that the ability to develop correct concurrent software relies on a systematic transformation of formal models of inter-process interaction (so…

  19. Creativity in tuberculosis research and discovery.

    PubMed

    Younga, Douglas; Verreck, Frank A W

    2012-03-01

    The remarkable advances in TB vaccinology over the last decade have been driven by a pragmatic approach to moving candidates along the development pipeline to clinical trials, fuelled by encouraging data on protection in animal models. Efficacy data from Phase IIb trials of the first generation of new candidates are anticipated over the next 1-2 years. As outlined in the TB Vaccines Strategic Blueprint, to exploit this information and to inspire design of next generation candidates, it is important that this empirical approach is complemented by progress in understanding of fundamental immune mechanisms and improved translational modalities. Current trends towards improved experimental and computational approaches for studying biological complexity will be an important element in the developing science of TB vaccinology. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. First-principles data-driven discovery of transition metal oxides for artificial photosynthesis

    NASA Astrophysics Data System (ADS)

    Yan, Qimin

    We develop a first-principles data-driven approach for rapid identification of transition metal oxide (TMO) light absorbers and photocatalysts for artificial photosynthesis using the Materials Project. Initially focusing on Cr, V, and Mn-based ternary TMOs in the database, we design a broadly-applicable multiple-layer screening workflow automating density functional theory (DFT) and hybrid functional calculations of bulk and surface electronic and magnetic structures. We further assess the electrochemical stability of TMOs in aqueous environments from computed Pourbaix diagrams. Several promising earth-abundant low band-gap TMO compounds with desirable band edge energies and electrochemical stability are identified by our computational efforts and then synergistically evaluated using high-throughput synthesis and photoelectrochemical screening techniques by our experimental collaborators at Caltech. Our joint theory-experiment effort has successfully identified new earth-abundant copper and manganese vanadate complex oxides that meet highly demanding requirements for photoanodes, substantially expanding the known space of such materials. By integrating theory and experiment, we validate our approach and develop important new insights into structure-property relationships for TMOs for oxygen evolution photocatalysts, paving the way for use of first-principles data-driven techniques in future applications. This work is supported by the Materials Project Predictive Modeling Center and the Joint Center for Artificial Photosynthesis through the U.S. Department of Energy, Office of Basic Energy Sciences, Materials Sciences and Engineering Division, under Contract No. DE-AC02-05CH11231. Computational resources were also provided by the Department of Energy through the National Energy Research Scientific Computing Center.

  1. Wave Resource Characterization Using an Unstructured Grid Modeling Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Wei-Cheng; Yang, Zhaoqing; Wang, Taiping

    This paper presents a modeling study conducted on the central Oregon coast for wave resource characterization using the unstructured-grid SWAN model coupled with a nested-grid WWIII model. The flexibility of models of various spatial resolutions and the effects of open-boundary conditions simulated by a nested-grid WWIII model with different physics packages were evaluated. The model results demonstrate the advantage of the unstructured-grid modeling approach for flexible model resolution and good model skills in simulating the six wave resource parameters recommended by the International Electrotechnical Commission in comparison to the observed data in 2009 at National Data Buoy Center Buoy 46050. Notably, spectral analysis indicates that the ST4 physics package improves upon the model skill of the ST2 physics package for predicting wave power density for large waves, which is important for wave resource assessment, device load calculation, and risk management. In addition, bivariate distributions show the simulated sea state of maximum occurrence with the ST4 physics package matched the observed data better than that with the ST2 physics package. This study demonstrated that the unstructured-grid wave modeling approach, driven by the nested-grid regional WWIII outputs with the ST4 physics package, can efficiently provide accurate wave hindcasts to support wave resource characterization. Our study also suggests that wind effects need to be considered if the dimension of the model domain is greater than approximately 100 km, or O(10^2 km).
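
    For reference, one of the IEC-recommended wave resource parameters mentioned above, omnidirectional wave power density, follows a simple deep-water formula, J = ρ g² Hm0² Te / (64π). A minimal sketch (the sea state here is illustrative, not Buoy 46050 data):

```python
import math

def wave_power_density(hm0, te, rho=1025.0, g=9.81):
    """Deep-water wave power density J [W/m] from significant wave
    height hm0 [m] and energy period te [s]."""
    return rho * g**2 / (64.0 * math.pi) * hm0**2 * te

# an illustrative energetic sea state: Hm0 = 2.5 m, Te = 10 s
J = wave_power_density(hm0=2.5, te=10.0)   # roughly 31 kW per metre of crest
```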

  2. Thoracic respiratory motion estimation from MRI using a statistical model and a 2-D image navigator.

    PubMed

    King, A P; Buerger, C; Tsoumpas, C; Marsden, P K; Schaeffter, T

    2012-01-01

    Respiratory motion models have potential application for estimating and correcting the effects of motion in a wide range of applications, for example in PET-MR imaging. Given that motion cycles caused by breathing are only approximately repeatable, an important quality of such models is their ability to capture and estimate the intra- and inter-cycle variability of the motion. In this paper we propose and describe a technique for free-form nonrigid respiratory motion correction in the thorax. Our model is based on a principal component analysis of the motion states encountered during different breathing patterns, and is formed from motion estimates made from dynamic 3-D MRI data. We apply our model using a data-driven technique based on a 2-D MRI image navigator. Unlike most previously reported work in the literature, our approach is able to capture both intra- and inter-cycle motion variability. In addition, the 2-D image navigator can be used to estimate how applicable the current motion model is, and hence report when more imaging data is required to update the model. We also use the motion model to decide on the best positioning for the image navigator. We validate our approach using MRI data acquired from 10 volunteers and demonstrate improvements of up to 40.5% over other reported motion modelling approaches, which corresponds to 61% of the overall respiratory motion present. Finally we demonstrate one potential application of our technique: MRI-based motion correction of real-time PET data for simultaneous PET-MRI acquisition. Copyright © 2011 Elsevier B.V. All rights reserved.
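
    The surrogate-driven PCA idea above can be sketched on synthetic data: principal components are extracted from observed motion states, and a navigator-derived signal drives the component weights. The 1-D "motion fields" and scalar navigator below are stand-ins for the paper's 3-D MRI registrations and 2-D image navigator:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 100)
surrogate = np.sin(t)                              # navigator-derived signal
fields = np.outer(surrogate, rng.normal(size=50))  # motion of 50 points per frame
fields += 0.01 * rng.normal(size=fields.shape)     # measurement noise

mean = fields.mean(axis=0)
X = fields - mean
# principal components of the observed motion states
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pc1 = Vt[0]
weights = X @ pc1                                  # per-frame PC weights
# regress the dominant PC weight on the navigator signal
slope = np.dot(surrogate, weights) / np.dot(surrogate, surrogate)

def estimate_motion(nav_value):
    """Estimate the full motion field from a single navigator value."""
    return mean + slope * nav_value * pc1

err = np.linalg.norm(estimate_motion(surrogate[10]) - fields[10])
```

    Because the synthetic motion is rank-1 in the navigator, the reconstruction error is at the noise level; real data would need more components and a model-applicability check, as the abstract notes.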

  3. Inferring Muscle-Tendon Unit Power from Ankle Joint Power during the Push-Off Phase of Human Walking: Insights from a Multiarticular EMG-Driven Model

    PubMed Central

    2016-01-01

    Introduction: Inverse dynamics joint kinetics are often used to infer contributions from underlying groups of muscle-tendon units (MTUs). However, such interpretations are confounded by multiarticular (multi-joint) musculature, which can cause inverse dynamics to over- or under-estimate net MTU power. Misestimation of MTU power could lead to incorrect scientific conclusions, or to empirical estimates that misguide musculoskeletal simulations, assistive device designs, or clinical interventions. The objective of this study was to investigate the degree to which ankle joint power overestimates net plantarflexor MTU power during the push-off phase of walking, due to the behavior of the flexor digitorum and hallucis longus (FDHL), multiarticular MTUs crossing the ankle and metatarsophalangeal (toe) joints. Methods: We performed a gait analysis study on six healthy participants, recording ground reaction forces, kinematics, and electromyography (EMG). Empirical data were input into an EMG-driven musculoskeletal model to estimate ankle power. This model enabled us to parse contributions from mono- and multi-articular MTUs, and required only one scaling and one time delay factor for each subject and speed, which were solved for based on empirical data. Net plantarflexing MTU power was computed by the model and quantitatively compared to inverse dynamics ankle power. Results: The EMG-driven model was able to reproduce inverse dynamics ankle power across a range of gait speeds (R2 ≥ 0.97), while also providing MTU-specific power estimates. We found that FDHL dynamics caused ankle power to slightly overestimate net plantarflexor MTU power, but only by ~2-7%. Conclusions: During push-off, FDHL MTU dynamics do not substantially confound the inference of net plantarflexor MTU power from inverse dynamics ankle power. However, other methodological limitations may cause inverse dynamics to overestimate net MTU power; for instance, due to rigid-body foot assumptions. Moving forward, the EMG-driven modeling approach presented could be applied to understand other tasks or larger multiarticular MTUs. PMID:27764110
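
    The fitting step described above, one scaling factor and one time delay per subject and speed, can be sketched with synthetic signals: grid-search the delay, solve the scale by least squares at each candidate delay, and keep the best fit:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(500)
emg = np.exp(-((t - 250) / 60.0) ** 2)       # smoothed EMG envelope (synthetic)
true_delay, true_scale = 20, 3.0
# synthetic "measured" joint power: delayed, scaled envelope plus noise
power = true_scale * np.roll(emg, true_delay) + 0.01 * rng.normal(size=t.size)

best = None
for delay in range(0, 50):                   # grid-search the time delay
    shifted = np.roll(emg, delay)
    # least-squares scale for this candidate delay
    scale = np.dot(shifted, power) / np.dot(shifted, shifted)
    resid = np.linalg.norm(power - scale * shifted)
    if best is None or resid < best[0]:
        best = (resid, delay, scale)

_, fit_delay, fit_scale = best
```

    With low noise the two parameters are recovered essentially exactly; the sketch is illustrative and ignores the musculoskeletal geometry the actual model includes.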

  4. Inferring Muscle-Tendon Unit Power from Ankle Joint Power during the Push-Off Phase of Human Walking: Insights from a Multiarticular EMG-Driven Model.

    PubMed

    Honert, Eric C; Zelik, Karl E

    2016-01-01

    Inverse dynamics joint kinetics are often used to infer contributions from underlying groups of muscle-tendon units (MTUs). However, such interpretations are confounded by multiarticular (multi-joint) musculature, which can cause inverse dynamics to over- or under-estimate net MTU power. Misestimation of MTU power could lead to incorrect scientific conclusions, or to empirical estimates that misguide musculoskeletal simulations, assistive device designs, or clinical interventions. The objective of this study was to investigate the degree to which ankle joint power overestimates net plantarflexor MTU power during the Push-off phase of walking, due to the behavior of the flexor digitorum and hallucis longus (FDHL)-multiarticular MTUs crossing the ankle and metatarsophalangeal (toe) joints. We performed a gait analysis study on six healthy participants, recording ground reaction forces, kinematics, and electromyography (EMG). Empirical data were input into an EMG-driven musculoskeletal model to estimate ankle power. This model enabled us to parse contributions from mono- and multi-articular MTUs, and required only one scaling and one time delay factor for each subject and speed, which were solved for based on empirical data. Net plantarflexing MTU power was computed by the model and quantitatively compared to inverse dynamics ankle power. The EMG-driven model was able to reproduce inverse dynamics ankle power across a range of gait speeds (R2 ≥ 0.97), while also providing MTU-specific power estimates. We found that FDHL dynamics caused ankle power to slightly overestimate net plantarflexor MTU power, but only by ~2-7%. During Push-off, FDHL MTU dynamics do not substantially confound the inference of net plantarflexor MTU power from inverse dynamics ankle power. However, other methodological limitations may cause inverse dynamics to overestimate net MTU power; for instance, due to rigid-body foot assumptions. 
Moving forward, the EMG-driven modeling approach presented could be applied to understand other tasks or larger multiarticular MTUs.

  5. An Approach To Using All Location Tagged Numerical Data Sets As Continuous Fields With User-Assigned Continuity As A Basis For User-Driven Data Assimilation

    NASA Astrophysics Data System (ADS)

    Vernon, F.; Arrott, M.; Orcutt, J. A.; Mueller, C.; Case, J.; De Wardener, G.; Kerfoot, J.; Schofield, O.

    2013-12-01

    An approach sophisticated enough to handle a variety of data sources and scales, yet easy enough to promote wide use and mainstream adoption, is required to address the following mappings: - From the authored domain of observation to the requested domain of interest; - From the authored spatiotemporal resolution to the requested resolution; and - From the representation of data placed on a wide variety of discrete mesh types to the use of that data as a continuous field with a selectable continuity. The Open Geospatial Consortium's (OGC) Reference Model[1], with its direct association with the ISO 19000 series standards, provides a comprehensive foundation to represent all data on any type of mesh structure, aka "Discrete Coverages". The Reference Model also provides the specification for the core operations required to utilize any Discrete Coverage. The FEniCS Project[2] provides a comprehensive model for how to represent the Basis Functions on mesh structures as "Degrees of Freedom" to present discrete data as continuous fields with variable continuity. In this talk, we will present the research and development that the OOI Cyberinfrastructure Project is pursuing to integrate these approaches into a comprehensive Application Programming Interface (API) to author, acquire and operate on the broad range of data formulations, from time series, trajectories and tables through to time-variant finite difference grids and finite element meshes.
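
    The third mapping above, discrete mesh data exposed as a continuous field with user-selectable continuity, can be sketched in one dimension. The `field` helper and the continuity labels are illustrative conventions, not the OGC or FEniCS APIs:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 0.0, 1.0])            # a discrete coverage: samples on a mesh

def field(xq, continuity="C0"):
    """Evaluate the coverage as a field with the requested continuity class."""
    if continuity == "C-1":                   # piecewise-constant (nearest sample)
        idx = np.clip(np.searchsorted(x, xq), 1, len(x) - 1)
        nearest = np.where(np.abs(xq - x[idx - 1]) <= np.abs(x[idx] - xq),
                           idx - 1, idx)
        return y[nearest]
    if continuity == "C0":                    # piecewise-linear basis functions
        return np.interp(xq, x, y)
    raise ValueError("unsupported continuity class")

v0 = field(np.array([0.5]), "C-1")[0]   # nearest sample (tie broken to the left)
v1 = field(np.array([0.5]), "C0")[0]    # linear basis functions
```

    Higher continuity (C1 and up) would use spline or finite-element basis functions in the same pattern; the point is that one data set serves every continuity class.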

  6. A time domain frequency-selective multivariate Granger causality approach.

    PubMed

    Leistritz, Lutz; Witte, Herbert

    2016-08-01

    The investigation of effective connectivity is one of the major topics in computational neuroscience for understanding the interaction between spatially distributed neuronal units of the brain. Thus, a wide variety of methods has been developed during the last decades to investigate functional and effective connectivity in multivariate systems. Their spectrum ranges from model-based to model-free approaches, with a clear separation into time-domain and frequency-domain methods. We present in this simulation study a novel time domain approach based on Granger's principle of predictability, which allows frequency-selective considerations of directed interactions. It is based on a comparison of prediction errors of multivariate autoregressive models fitted to systematically modified time series. These modifications are based on signal decompositions, which enable a targeted cancellation of signal components with specific spectral properties. Depending on the embedded signal decomposition method, a frequency-selective or data-driven signal-adaptive Granger Causality Index may be derived.
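
    Granger's principle of predictability, as invoked above, can be sketched in its simplest bivariate, order-1 form: compare the prediction error of a model for y with and without the past of x. This is a deliberate simplification of the paper's multivariate, decomposition-based index:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
x = rng.normal(size=n)                 # driving signal
y = np.zeros(n)
for t in range(1, n):
    # y is driven by its own past AND the past of x
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

def residual_var(target, regressors):
    """Variance of the least-squares prediction residual."""
    beta, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return np.var(target - regressors @ beta)

full = np.column_stack([y[:-1], x[:-1]])   # past of y and past of x
restricted = y[:-1, None]                  # past of y only
# Granger causality index: log ratio of restricted to full residual variance
gci_x_to_y = np.log(residual_var(y[1:], restricted) /
                    residual_var(y[1:], full))
```

    A strongly positive index means the past of x improves the prediction of y. The frequency-selective variant described in the abstract would first cancel specific spectral components of x before the same comparison.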

  7. Model of mobile agents for sexual interactions networks

    NASA Astrophysics Data System (ADS)

    González, M. C.; Lind, P. G.; Herrmann, H. J.

    2006-02-01

    We present a novel model to simulate real social networks of complex interactions, based on a system of colliding particles (agents). The network is built by keeping track of the collisions and evolves in time with correlations which emerge due to the mobility of the agents. Therefore, statistical features are a consequence only of local collisions among its individual agents. Agent dynamics is realized by an event-driven algorithm of collisions where energy is gained, as opposed to physical systems, which have dissipation. The model reproduces empirical data from networks of sexual interactions, not previously obtained with other approaches.
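
    A toy version of the colliding-agents construction: agents move on a ring, and an edge is recorded whenever two of them pass within a collision radius. All parameters are illustrative, not those of the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n_agents, steps, radius = 30, 200, 0.02
pos = rng.random(n_agents)                        # positions on the unit ring
vel = rng.normal(scale=0.01, size=n_agents)       # per-agent drift speeds
edges = set()
for _ in range(steps):
    pos = (pos + vel) % 1.0                       # move agents, periodic boundary
    for i in range(n_agents):
        for j in range(i + 1, n_agents):
            d = abs(pos[i] - pos[j])
            if min(d, 1.0 - d) < radius:          # periodic (ring) distance
                edges.add((i, j))                 # record the collision as an edge

degree = np.zeros(n_agents, dtype=int)
for i, j in edges:
    degree[i] += 1
    degree[j] += 1
```

    The network statistics here come entirely from local collisions, which is the structural point of the model; the paper's version adds the energy-gain rule that shapes the degree distribution.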

  8. Onboard Science Insights and Vehicle Dynamics from Scale-Model Trials of the Titan Mare Explorer (TiME) Capsule at Laguna Negra, Chile.

    PubMed

    Lorenz, Ralph D; Cabrol, Nathalie A

    2018-05-01

    A scale model of the proposed Titan Mare Explorer capsule was deployed at the Planetary Lake Lander field site at Laguna Negra, Chile. The tests served to calibrate models of wind-driven drift of the capsule and to understand its attitude motion in the wave field, as well as to identify dynamic and acoustic signatures of shoreline approach. This information enables formulation of onboard trigger criteria for near-shore science data acquisition. Key Words: Titan; Vehicle dynamics; Science autonomy; Lake. Astrobiology 18, 607-618.

  9. Application of statistical mining in healthcare data management for allergic diseases

    NASA Astrophysics Data System (ADS)

    Wawrzyniak, Zbigniew M.; Martínez Santolaya, Sara

    2014-11-01

    The paper discusses data mining techniques based on statistical tools for medical data management in the case of long-term diseases. The data collected from a population survey is the source for reasoning: identifying the disease processes responsible for a patient's illness and its symptoms, and prescribing decisions on a course of action to correct the patient's condition. The case considered as a sample of this constructive approach to data management is the dependence of allergic diseases of a chronic nature on symptoms and environmental conditions. The knowledge summarized in a systematic way as accumulated experience constitutes an experiential, simplified model of the diseases, with a feature space constructed from a small set of indicators. We present a disease-symptom-opinion model with knowledge discovery for data management in healthcare. Notably, the model is purely data-driven: it evaluates knowledge of the disease processes and the probability dependence of future disease events on symptoms and other attributes. The example drawn from the outcomes of a survey of long-term (chronic) disease shows that a small set of core indicators, of 4 or more symptoms and opinions, can be very helpful in reflecting health status change over disease causes. Furthermore, the data-driven understanding of the mechanisms of diseases gives physicians a basis for treatment choices, which outlines the need for data governance in this research domain of knowledge discovered from surveys.
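
    The data-driven dependence of disease on symptoms can be sketched as an empirical conditional probability computed directly from survey counts. The records below are synthetic stand-ins for the population survey, and the indicator names are invented for illustration:

```python
import numpy as np

records = [
    # (sneezing, itchy_eyes, pet_at_home, allergic) -- synthetic survey rows
    (1, 1, 1, 1), (1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1),
    (1, 1, 1, 1), (0, 0, 1, 0), (0, 0, 0, 0), (1, 0, 0, 0),
    (0, 1, 0, 0), (0, 0, 0, 0), (1, 1, 1, 1), (0, 0, 1, 0),
]
data = np.array(records)
X, y = data[:, :3], data[:, 3]

def p_disease_given(pattern):
    """Empirical P(allergic = 1 | the three indicators equal `pattern`)."""
    mask = (X == pattern).all(axis=1)
    if not mask.any():
        return None                # no matching survey records
    return y[mask].mean()

p_all = p_disease_given((1, 1, 1))   # all indicators present
p_none = p_disease_given((0, 0, 0))  # no indicators present
```

    With a small set of core indicators, such count-based conditionals are exactly the kind of experiential model the abstract describes, though a real survey would need smoothing for sparse patterns.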

  10. Data-assisted reduced-order modeling of extreme events in complex dynamical systems

    PubMed Central

    Koumoutsakos, Petros

    2018-01-01

    The prediction of extreme events, from avalanches and droughts to tsunamis and epidemics, depends on the formulation and analysis of relevant, complex dynamical systems. Such dynamical systems are characterized by high intrinsic dimensionality, with extreme events having the form of rare transitions that are several standard deviations away from the mean. Such systems are not amenable to classical order-reduction methods through projection of the governing equations, due to the large intrinsic dimensionality of the underlying attractor as well as the complexity of the transient events. Alternatively, data-driven techniques aim to quantify the dynamics of specific, critical modes by utilizing data-streams and by expanding the dimensionality of the reduced-order model using delayed coordinates. In turn, these methods have major limitations in regions of the phase space with sparse data, which is the case for extreme events. In this work, we develop a novel hybrid framework that complements an imperfect reduced-order model with data-streams that are integrated through a recurrent neural network (RNN) architecture. The reduced-order model has the form of equations projected into a low-dimensional subspace that still contains important dynamical information about the system, and it is expanded by a long short-term memory (LSTM) regularization. The LSTM-RNN is trained by analyzing the mismatch between the imperfect model and the data-streams, projected to the reduced-order space. The data-driven model assists the imperfect model in regions where data is available, while for locations where data is sparse the imperfect model still provides a baseline for the prediction of the system state. We assess the developed framework on two challenging prototype systems exhibiting extreme events. We show that the blended approach has improved performance compared with methods that use either data streams or the imperfect model alone. Notably, the improvement is more significant in regions associated with extreme events, where data is sparse. PMID:29795631
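
    A compact sketch of the hybrid idea: train a data-driven correction on the mismatch between an imperfect model and the data stream. For brevity, ridge regression on the current state replaces the paper's LSTM-RNN, and a scalar linear map replaces the dynamical system; all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
a_true, a_model = 0.95, 0.5          # true vs. imperfect one-step dynamics
x = np.zeros(500)
x[0] = 1.0
for t in range(499):
    x[t + 1] = a_true * x[t] + 0.01 * rng.normal()   # observed data stream

features = x[:-1]
mismatch = x[1:] - a_model * x[:-1]  # what the imperfect model misses
lam = 1e-3                           # ridge regularization
w = (features @ mismatch) / (features @ features + lam)

# one-step prediction errors: imperfect model alone vs. hybrid (model + correction)
err_model = np.mean(np.abs(a_model * x[:-1] - x[1:]))
err_hybrid = np.mean(np.abs((a_model + w) * x[:-1] - x[1:]))
```

    The learned correction `w` recovers roughly the model defect (here 0.45), and the hybrid's one-step error drops to the noise level; the LSTM plays the same role for genuinely nonlinear, history-dependent mismatch.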

  11. Data-assisted reduced-order modeling of extreme events in complex dynamical systems.

    PubMed

    Wan, Zhong Yi; Vlachas, Pantelis; Koumoutsakos, Petros; Sapsis, Themistoklis

    2018-01-01

    The prediction of extreme events, from avalanches and droughts to tsunamis and epidemics, depends on the formulation and analysis of relevant, complex dynamical systems. Such dynamical systems are characterized by high intrinsic dimensionality, with extreme events having the form of rare transitions that are several standard deviations away from the mean. Such systems are not amenable to classical order-reduction methods through projection of the governing equations, due to the large intrinsic dimensionality of the underlying attractor as well as the complexity of the transient events. Alternatively, data-driven techniques aim to quantify the dynamics of specific, critical modes by utilizing data-streams and by expanding the dimensionality of the reduced-order model using delayed coordinates. In turn, these methods have major limitations in regions of the phase space with sparse data, which is the case for extreme events. In this work, we develop a novel hybrid framework that complements an imperfect reduced-order model with data-streams that are integrated through a recurrent neural network (RNN) architecture. The reduced-order model has the form of equations projected into a low-dimensional subspace that still contains important dynamical information about the system, and it is expanded by a long short-term memory (LSTM) regularization. The LSTM-RNN is trained by analyzing the mismatch between the imperfect model and the data-streams, projected to the reduced-order space. The data-driven model assists the imperfect model in regions where data is available, while for locations where data is sparse the imperfect model still provides a baseline for the prediction of the system state. We assess the developed framework on two challenging prototype systems exhibiting extreme events. We show that the blended approach has improved performance compared with methods that use either data streams or the imperfect model alone. Notably, the improvement is more significant in regions associated with extreme events, where data is sparse.

  12. LinkIT: a ludic elicitation game for eliciting risk perceptions.

    PubMed

    Cao, Yan; McGill, William L

    2013-06-01

    The mental models approach, a leading strategy to develop risk communications, involves a time- and labor-intensive interview process and a lengthy questionnaire to elicit group-level risk perceptions. We propose that a similarity ratings approach for structural knowledge elicitation can be adopted to assist the risk mental models approach. The LinkIT game, inspired by games with a purpose (GWAP) technology, is a ludic elicitation tool designed to elicit group understanding of the relations between risk factors in a more enjoyable and productive manner when compared to traditional approaches. That is, consistent with the idea of ludic elicitation, LinkIT was designed to make the elicitation process fun and enjoyable in the hopes of increasing participation and data quality in risk studies. Like the mental models approach, the group mental model obtained via the LinkIT game can hence be generated and represented in a form of influence diagrams. In order to examine the external validity of LinkIT, we conducted a study to compare its performance with respect to a more conventional questionnaire-driven approach. Data analysis results conclude that the two group mental models elicited from the two approaches are similar to an extent. Yet, LinkIT was more productive and enjoyable than the questionnaire. However, participants commented that the current game has some usability concerns. This presentation summarizes the design and evaluation of the LinkIT game and suggests areas for future work. © 2012 Society for Risk Analysis.

  13. Multilevel functional genomics data integration as a tool for understanding physiology: a network biology perspective.

    PubMed

    Davidsen, Peter K; Turan, Nil; Egginton, Stuart; Falciani, Francesco

    2016-02-01

    The overall aim of physiological research is to understand how living systems function in an integrative manner. Consequently, the discipline of physiology has since its infancy attempted to link multiple levels of biological organization. Increasingly this has involved mathematical and computational approaches, typically to model a small number of components spanning several levels of biological organization. With the advent of "omics" technologies, which can characterize the molecular state of a cell or tissue (intended as the level of expression and/or activity of its molecular components), the number of molecular components we can quantify has increased exponentially. Paradoxically, the unprecedented amount of experimental data has made it more difficult to derive conceptual models underlying essential mechanisms regulating mammalian physiology. We present an overview of state-of-the-art methods currently used to identify biological networks underlying genome-wide responses. These are based on a data-driven approach that relies on advanced computational methods designed to "learn" biology from observational data. In this review, we illustrate an application of these computational methodologies using a case study integrating an in vivo model representing the transcriptional state of hypoxic skeletal muscle with a clinical study representing muscle wasting in chronic obstructive pulmonary disease patients. The broader application of these approaches to modeling multiple levels of biological data in the context of modern physiology is discussed. Copyright © 2016 the American Physiological Society.

  14. A method for using real world data in breast cancer modeling.

    PubMed

    Pobiruchin, Monika; Bochum, Sylvia; Martens, Uwe M; Kieser, Meinhard; Schramm, Wendelin

    2016-04-01

    Today, hospitals and other health care-related institutions are accumulating a growing bulk of real world clinical data. Such data offer new possibilities for the generation of disease models for health economic evaluation. In this article, we propose a new approach to leverage cancer registry data for the development of Markov models. Records of breast cancer patients from a clinical cancer registry were used to construct a disease model driven by real-world data. We describe a model generation process which maps database structures to disease state definitions based on medical expert knowledge. Software was programmed in Java to automatically derive a model structure and transition probabilities. We illustrate our method with the reconstruction of a published breast cancer reference model derived primarily from clinical study data. In doing so, we exported longitudinal patient data from a clinical cancer registry covering eight years. The patient cohort (n=892) comprised HER2-positive and HER2-negative women treated with or without Trastuzumab. The models generated with this method for the respective patient cohorts were comparable to the reference model in their structure and treatment effects. However, our computed disease models reflect a more detailed picture of the transition probabilities, especially for disease free survival and recurrence. Our work presents an approach to extract Markov models semi-automatically using real world data from a clinical cancer registry. Health care decision makers may benefit from more realistic disease models to improve health care-related planning and actions based on their own data. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
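
    The count-and-normalize step behind deriving Markov transition probabilities from longitudinal registry records can be sketched as follows. The states and trajectories are synthetic, not the HER2 cohort:

```python
import numpy as np

states = ["disease-free", "recurrence", "death"]
idx = {s: i for i, s in enumerate(states)}
# synthetic longitudinal records: one observed state per cycle per patient
trajectories = [
    ["disease-free", "disease-free", "recurrence", "death"],
    ["disease-free", "disease-free", "disease-free", "disease-free"],
    ["disease-free", "recurrence", "recurrence", "death"],
]

counts = np.zeros((3, 3))
for traj in trajectories:
    for a, b in zip(traj, traj[1:]):       # consecutive cycle pairs
        counts[idx[a], idx[b]] += 1

# row-normalize counts into transition probabilities (empty rows stay zero)
row_sums = counts.sum(axis=1, keepdims=True)
P = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
```

    Mapping database fields to the state definitions (the expert-knowledge step in the abstract) is what turns raw registry rows into trajectories like these.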

  15. Crowd-Sourcing Management Activity Data to Drive GHG Emission Inventories in the Land Use Sector

    NASA Astrophysics Data System (ADS)

    Paustian, K.; Herrick, J.

    2015-12-01

    Greenhouse gas (GHG) emissions from the land use sector constitute the largest source category for many countries in Africa. Enhancing C sequestration and reducing GHG emissions on managed lands in Africa has the potential to attract C financing to support adoption of more sustainable land management practices that, in addition to GHG mitigation, can provide co-benefits of more productive and climate-resilient agroecosystems. However, the lack of robust systems to measure and monitor C sequestration/GHG reductions is currently a significant barrier to attracting more C financing to land use-related mitigation efforts. Anthropogenic GHG emissions are driven by a variety of environmental factors, including climate and soil attributes, as well as human activities in the form of land use and management practices. GHG emission inventories typically use empirical or process-based models of emission rates that are driven by environmental and management variables. While a lack of field-based flux and C stock measurements is a limiting factor for GHG estimation, we argue that an even greater limitation may be the availability of data on the management activities that influence flux rates, particularly in developing countries in Africa. In most developed countries there is a well-developed infrastructure of agricultural statistics and practice surveys that can be used to drive model-based GHG emission estimations. However, this infrastructure is largely lacking in developing countries in Africa. While some activity data (e.g. land cover change) can be derived from remote sensing, many key data (e.g., N fertilizer practices, residue management, manuring) require input from the farmers themselves. The explosive growth in cellular technology, even in many of the poorest parts of Africa, suggests the potential for a new crowd-sourcing approach and direct engagement with farmers to 'leap-frog' the land resource information model of developed countries. 
Among the many benefits of this approach would be high resolution management data to support GHG inventories at multiple scales. We present an overall conceptual model for this approach and examples from on-going projects in Africa employing direct farmer engagement, cellular technology and apps to develop this information resource.

  16. SaLEM (v1.0) - the Soil and Landscape Evolution Model (SaLEM) for simulation of regolith depth in periglacial environments

    NASA Astrophysics Data System (ADS)

    Bock, Michael; Conrad, Olaf; Günther, Andreas; Gehrt, Ernst; Baritz, Rainer; Böhner, Jürgen

    2018-04-01

    We propose the implementation of the Soil and Landscape Evolution Model (SaLEM) for the spatiotemporal investigation of soil parent material evolution following a lithologically differentiated approach. Relevant parts of the established Geomorphic/Orogenic Landscape Evolution Model (GOLEM) have been adapted for an operational Geographical Information System (GIS) tool within the open-source software framework System for Automated Geoscientific Analyses (SAGA), thus taking advantage of SAGA's capabilities for geomorphometric analyses. The model is driven by palaeoclimatic data (temperature, precipitation) representative of periglacial areas in northern Germany over the last 50 000 years. The initial conditions have been determined for a test site by a digital terrain model and a geological model. Weathering, erosion and transport functions are calibrated using extrinsic (climatic) and intrinsic (lithologic) parameter data. First results indicate that our differentiated SaLEM approach shows some evidence for the spatiotemporal prediction of important soil parent material properties (particularly its depth). Future research will focus on the validation of the results against field data, and the influence of discrete events (mass movements, floods) on soil parent material formation has to be evaluated.
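
    The weathering component of such a model is commonly an exponential soil-production function, in which bedrock weathers more slowly under thicker regolith. A sketch with illustrative rates (not SaLEM's calibrated values):

```python
import numpy as np

def evolve_regolith(h0=0.0, p0=5e-5, h_star=0.5, erosion=1e-5,
                    dt=10.0, steps=20000):
    """Integrate dh/dt = p0 * exp(-h/h_star) - erosion.
    h in metres, t in years; all rates are illustrative."""
    h = h0
    for _ in range(steps):
        production = p0 * np.exp(-h / h_star)   # soil production from bedrock
        h = max(0.0, h + dt * (production - erosion))
    return h

h_final = evolve_regolith()                     # depth after 200 000 model years
# analytic steady state where production balances erosion:
h_eq = 0.5 * np.log(5e-5 / 1e-5)                # = h_star * ln(p0/erosion)
```

    Depth approaches the production-erosion balance from below; in SaLEM the rates themselves vary with the palaeoclimatic forcing and lithology rather than being constants.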

  17. Inferring Ice Thickness from a Glacier Dynamics Model and Multiple Surface Datasets.

    NASA Astrophysics Data System (ADS)

    Guan, Y.; Haran, M.; Pollard, D.

    2017-12-01

    The future behavior of the West Antarctic Ice Sheet (WAIS) may have a major impact on future climate. For instance, ice sheet melt may contribute significantly to global sea level rise. Understanding the current state of WAIS is therefore of great interest. WAIS is drained by fast-flowing glaciers which are major contributors to ice loss. Hence, understanding the stability and dynamics of glaciers is critical for predicting the future of the ice sheet. Glacier dynamics are driven by the interplay between the topography, temperature and basal conditions beneath the ice. A glacier dynamics model describes the interactions between these processes. We develop a hierarchical Bayesian model that integrates multiple ice sheet surface data sets with a glacier dynamics model. Our approach allows us to (1) infer important parameters describing the glacier dynamics, (2) learn about ice sheet thickness, and (3) account for errors in the observations and the model. Because we have relatively dense and accurate ice thickness data from the Thwaites Glacier in West Antarctica, we use these data to validate the proposed approach. The long-term goal of this work is to have a general model that may be used to study multiple glaciers in the Antarctic.

  18. On the Conditioning of Machine-Learning-Assisted Turbulence Modeling

    NASA Astrophysics Data System (ADS)

    Wu, Jinlong; Sun, Rui; Wang, Qiqi; Xiao, Heng

    2017-11-01

    Recently, several researchers have demonstrated that machine learning techniques can improve the RANS-modeled Reynolds stress by training on available databases of high-fidelity simulations. However, obtaining an improved mean velocity field remains an unsolved challenge, restricting the predictive capability of current machine-learning-assisted turbulence modeling approaches. In this work we define a condition number to evaluate the model conditioning of data-driven turbulence modeling approaches, and propose a stability-oriented machine learning framework to model Reynolds stress. Two canonical flows, the flow in a square duct and the flow over periodic hills, are investigated to demonstrate the predictive capability of the proposed framework. The satisfactory prediction of the mean velocity field for both flows demonstrates the capability of the proposed framework for machine-learning-assisted turbulence modeling. By improving the prediction of the mean flow field, the proposed stability-oriented machine learning framework bridges the gap between existing machine-learning-assisted turbulence modeling approaches and the predictive capability demanded of turbulence models in real applications.
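    The abstract does not give the authors' definition of the condition number, so the sketch below only illustrates the general idea: measure how strongly a small perturbation of a modeled Reynolds-stress forcing term is amplified in the mean-velocity solution of a 1-D model problem. The model problem, grid size, and norms are assumptions, not the paper's setup.

```python
import numpy as np

def mean_flow_response(f, n=64):
    # Solve a 1-D model problem -u'' = f with u(0)=u(1)=0 by finite differences;
    # f stands in for the divergence of the modeled Reynolds stress.
    h = 1.0 / (n + 1)
    A = (np.diag(2 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    return np.linalg.solve(A, f)

def conditioning(f, df, n=64):
    # Amplification of a relative perturbation of the forcing into a
    # relative change of the mean-velocity solution.
    u = mean_flow_response(f, n)
    u_pert = mean_flow_response(f + df, n)
    rel_out = np.linalg.norm(u_pert - u) / np.linalg.norm(u)
    rel_in = np.linalg.norm(df) / np.linalg.norm(f)
    return rel_out / rel_in

x = np.linspace(0, 1, 66)[1:-1]                 # interior grid points
f = np.sin(np.pi * x)                           # smooth baseline forcing
df = 1e-3 * np.random.default_rng(0).standard_normal(x.size)
kappa = conditioning(f, df)
```

A large value of this ratio would mean that small errors in the learned Reynolds stress are strongly amplified in the mean flow, which is exactly the ill-conditioning the paper warns about.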

  19. A metadata-driven approach to data repository design.

    PubMed

    Harvey, Matthew J; McLean, Andrew; Rzepa, Henry S

    2017-01-01

    The design and use of a metadata-driven data repository for research data management is described. Metadata is collected automatically during the submission process whenever possible and is registered with DataCite in accordance with their current metadata schema, in exchange for a persistent digital object identifier. Two examples of data preview are illustrated, including the demonstration of a method for integration with commercial software that confers rich domain-specific data analytics without introducing customisation into the repository itself.

  20. Systems metabolic engineering: genome-scale models and beyond.

    PubMed

    Blazeck, John; Alper, Hal

    2010-07-01

    The advent of high throughput genome-scale bioinformatics has led to an exponential increase in available cellular system data. Systems metabolic engineering attempts to use data-driven approaches--based on the data collected with high throughput technologies--to identify gene targets and optimize phenotypical properties on a systems level. Current systems metabolic engineering tools are limited for predicting and defining complex phenotypes such as chemical tolerances and other global, multigenic traits. The most pragmatic systems-based tool for metabolic engineering to arise is the in silico genome-scale metabolic reconstruction. This tool has seen wide adoption for modeling cell growth and predicting beneficial gene knockouts, and we examine here how this approach can be expanded for novel organisms. This review will highlight advances of the systems metabolic engineering approach with a focus on de novo development and use of genome-scale metabolic reconstructions for metabolic engineering applications. We will then discuss the challenges and prospects for this emerging field to enable model-based metabolic engineering. Specifically, we argue that current state-of-the-art systems metabolic engineering techniques represent a viable first step for improving product yield that still must be followed by combinatorial techniques or random strain mutagenesis to achieve optimal cellular systems.

  1. Robust and efficient estimation with weighted composite quantile regression

    NASA Astrophysics Data System (ADS)

    Jiang, Xuejun; Li, Jingzhi; Xia, Tian; Yan, Wanfeng

    2016-09-01

    In this paper we introduce a weighted composite quantile regression (CQR) estimation approach and study its application in nonlinear models such as exponential models and ARCH-type models. The weighted CQR is augmented by using a data-driven weighting scheme. With the error distribution unspecified, the proposed estimators share robustness from quantile regression and achieve nearly the same efficiency as the oracle maximum likelihood estimator (MLE) for a variety of error distributions including the normal, mixed-normal, Student's t, Cauchy distributions, etc. We also suggest an algorithm for the fast implementation of the proposed methodology. Simulations are carried out to compare the performance of different estimators, and the proposed approach is used to analyze the daily S&P 500 Composite index, which verifies the effectiveness and efficiency of our theoretical results.
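    As a hedged illustration of the composite objective (not the authors' algorithm, which would typically use linear programming rather than a generic optimizer), the sketch below minimizes a weighted sum of check losses over several quantile levels, with one intercept per level and a shared slope. The quantile levels, weights, and Nelder-Mead optimizer are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(r, tau):
    # Quantile (check) loss rho_tau(r) = r * (tau - 1{r < 0})
    return r * (tau - (r < 0))

def weighted_cqr_slope(x, y, taus, weights):
    # Composite objective: weighted sum of mean check losses across
    # quantile levels; intercepts differ per level, the slope is shared.
    k = len(taus)
    def obj(theta):
        b, intercepts = theta[0], theta[1:]
        return sum(w * check_loss(y - a - b * x, tau).mean()
                   for tau, w, a in zip(taus, weights, intercepts))
    theta0 = np.zeros(k + 1)
    return minimize(obj, theta0, method="Nelder-Mead").x[0]

rng = np.random.default_rng(1)
x = rng.standard_normal(300)
y = 2.0 * x + rng.standard_t(df=3, size=300)    # heavy-tailed errors
b_hat = weighted_cqr_slope(x, y, taus=[0.25, 0.5, 0.75], weights=[1, 1, 1])
```

With heavy-tailed Student's t errors the composite quantile fit stays close to the true slope, which is the robustness property the abstract emphasizes.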

  2. On the derivation of a simple dynamic model of anaerobic digestion including the evolution of hydrogen.

    PubMed

    Giovannini, Giannina; Sbarciog, Mihaela; Steyer, Jean-Philippe; Chamy, Rolando; Vande Wouwer, Alain

    2018-05-01

    Hydrogen has been found to be an important intermediate during anaerobic digestion (AD) and a key variable for process monitoring as it gives valuable information about the stability of the reactor. However, simple dynamic models describing the evolution of hydrogen are not commonplace. In this work, such a dynamic model is derived using a systematic data-driven approach, which consists of a principal component analysis to deduce the dimension of the minimal reaction subspace explaining the data, followed by an identification of the kinetic parameters in the least-squares sense. The procedure requires the availability of informative data sets. When the available data does not fulfill this condition, the model can still be built from simulated data, obtained using a detailed model such as ADM1. This dynamic model could be exploited in monitoring and control applications after a re-identification of the parameters using actual process data. As an example, the model is used in the framework of a control strategy, and is also fitted to experimental data from raw industrial wine processing wastewater. Copyright © 2018 Elsevier Ltd. All rights reserved.
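    The first step of the procedure, using principal component analysis to deduce the dimension of the minimal reaction subspace, can be sketched in a few lines. The data below are synthetic (not AD measurements): five measured species generated by two independent reactions, a fast overall conversion and a rise-and-fall intermediate, plus small measurement noise.

```python
import numpy as np

def reaction_subspace_dim(C, threshold=0.99):
    # C: (n_times, n_species) concentration data. The number of principal
    # components needed to explain the variance approximates the dimension
    # of the minimal reaction subspace.
    X = C - C.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)
    frac = s**2 / (s**2).sum()
    return int(np.searchsorted(np.cumsum(frac), threshold) + 1)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)
extents = np.column_stack([
    1.0 - np.exp(-0.8 * t),              # extent of a fast conversion
    np.exp(-0.2 * t) - np.exp(-0.8 * t), # extent with an intermediate peak
])
stoich = rng.standard_normal((2, 5))     # stoichiometric mixing into 5 species
C = extents @ stoich + 1e-6 * rng.standard_normal((50, 5))
dim = reaction_subspace_dim(C)
```

In this synthetic case the method recovers the two underlying reactions; the least-squares identification of kinetic parameters would then be carried out within that subspace.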

  3. Data-driven modeling of sleep EEG and EOG reveals characteristics indicative of pre-Parkinson's and Parkinson's disease.

    PubMed

    Christensen, Julie A E; Zoetmulder, Marielle; Koch, Henriette; Frandsen, Rune; Arvastson, Lars; Christensen, Søren R; Jennum, Poul; Sorensen, Helge B D

    2014-09-30

    Manual scoring of sleep relies on identifying certain characteristics in polysomnograph (PSG) signals. However, these characteristics are disrupted in patients with neurodegenerative diseases. This study evaluates sleep using a topic modeling and unsupervised learning approach to identify sleep topics directly from electroencephalography (EEG) and electrooculography (EOG). PSG data from control subjects were used to develop an EOG and an EEG topic model. The models were applied to PSG data from 23 control subjects, 25 patients with periodic leg movements (PLMs), 31 patients with idiopathic REM sleep behavior disorder (iRBD) and 36 patients with Parkinson's disease (PD). The data were divided into training and validation datasets and features reflecting EEG and EOG characteristics based on topics were computed. The most discriminative feature subset for separating iRBD/PD and PLM/controls was estimated using a Lasso-regularized regression model. The features with highest discriminability were the number and stability of EEG topics linked to REM and N3, respectively. Validation of the model indicated a sensitivity of 91.4% and a specificity of 68.8% when classifying iRBD/PD patients. The topics showed visual accordance with the manually scored sleep stages, and the features revealed sleep characteristics containing information indicative of neurodegeneration. This study suggests that the amount of N3 and the ability to maintain NREM and REM sleep have potential as early PD biomarkers. Data-driven analysis of sleep may contribute to the evaluation of neurodegenerative patients. Copyright © 2014 Elsevier B.V. All rights reserved.
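    The feature-selection step, estimating the most discriminative subset with a Lasso-regularized model, can be caricatured on simulated data. The sketch below uses a homemade proximal-gradient (ISTA) Lasso on an invented six-feature design where only the first two features carry signal; the feature counts, penalty, and data are assumptions, not the study's.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=2000):
    # Proximal gradient (ISTA) for 0.5/n * ||y - Xw||^2 + lam * ||w||_1;
    # coefficients driven exactly to zero mark dropped features.
    n = X.shape[0]
    L = np.linalg.norm(X, 2)**2 / n          # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = X.T @ (X @ w - y) / n            # gradient of the smooth part
        w = w - g / L
        w = np.sign(w) * np.maximum(np.abs(w) - lam / L, 0.0)  # soft threshold
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))            # six candidate sleep features
y = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.standard_normal(200)
w = lasso_ista(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(w) > 1e-3)  # surviving feature indices
```

The L1 penalty zeroes out the four uninformative features while retaining the two informative ones, mirroring how the study isolates the EEG-topic features with the highest discriminability.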

  4. Modelling of the mercury loss in fluorescent lamps under the influence of metal oxide coatings

    NASA Astrophysics Data System (ADS)

    Santos Abreu, A.; Mayer, J.; Lenk, D.; Horn, S.; Konrad, A.; Tidecks, R.

    2016-11-01

    The mercury transport and loss mechanisms in the metal oxide coatings of mercury low pressure discharge fluorescent lamps have been investigated. An existing model based on a ballistic process is discussed in the context of experimental mercury loss data. Two different approaches to the modeling of the mercury loss have been developed. The first one is based on mercury transition rates between the plasma, the coating, and the glass without specifying the underlying physical processes. The second one is based on a transport process driven by diffusion and a binding process of mercury reacting to mercury oxide inside the layers. Moreover, we extended the diffusion based model to handle multi-component coatings. All approaches are applied to describe mercury loss experiments under the influence of an Al2O3 coating.

  5. Towards a Job Demands-Resources Health Model: Empirical Testing with Generalizable Indicators of Job Demands, Job Resources, and Comprehensive Health Outcomes

    PubMed Central

    Brauchli, Rebecca; Jenny, Gregor J.; Füllemann, Désirée; Bauer, Georg F.

    2015-01-01

    Studies using the Job Demands-Resources (JD-R) model commonly have a heterogeneous focus concerning the variables they investigate—selective job demands and resources as well as burnout and work engagement. The present study applies the rationale of the JD-R model to expand the relevant outcomes of job demands and job resources by linking the JD-R model to the logic of a generic health development framework predicting more broadly positive and negative health. The resulting JD-R health model was operationalized and tested with a generalizable set of job characteristics and positive and negative health outcomes among a heterogeneous sample of 2,159 employees. Applying a theory-driven and a data-driven approach, measures which were generally relevant for all employees were selected. Results from structural equation modeling indicated that the model fitted the data. Multiple group analyses indicated invariance across six organizations, gender, job positions, and three times of measurement. Initial evidence was found for the validity of an expanded JD-R health model. Thereby this study contributes to the current research on job characteristics and health by combining the core idea of the JD-R model with the broader concepts of salutogenic and pathogenic health development processes as well as both positive and negative health outcomes. PMID:26557718

  6. Expanding the role of reactive transport models in critical zone processes

    USGS Publications Warehouse

    Li, Li; Maher, Kate; Navarre-Sitchler, Alexis; Druhan, Jennifer; Meile, Christof; Lawrence, Corey; Moore, Joel; Perdrial, Julia; Sullivan, Pamela; Thompson, Aaron; Jin, Lixin; Bolton, Edward W.; Brantley, Susan L.; Dietrich, William E.; Mayer, K. Ulrich; Steefel, Carl; Valocchi, Albert J.; Zachara, John M.; Kocar, Benjamin D.; McIntosh, Jennifer; Tutolo, Benjamin M.; Kumar, Mukesh; Sonnenthal, Eric; Bao, Chen; Beisman, Joe

    2017-01-01

    Models test our understanding of processes and can reach beyond the spatial and temporal scales of measurements. Multi-component Reactive Transport Models (RTMs), initially developed more than three decades ago, have been used extensively to explore the interactions of geothermal, hydrologic, geochemical, and geobiological processes in subsurface systems. Driven by extensive data sets now available from intensive measurement efforts, there is a pressing need to couple RTMs with other community models to explore non-linear interactions among the atmosphere, hydrosphere, biosphere, and geosphere. Here we briefly review the history of RTM development, summarize the current state of RTM approaches, and identify new research directions, opportunities, and infrastructure needs to broaden the use of RTMs. In particular, we envision the expanded use of RTMs in advancing process understanding in the Critical Zone, the veneer of the Earth that extends from the top of vegetation to the bottom of groundwater. We argue that, although parsimonious models are essential at larger scales, process-based models offer tools to explore the highly nonlinear coupling that characterizes natural systems. We present seven testable hypotheses that emphasize the unique capabilities of process-based RTMs for (1) elucidating chemical weathering and its physical and biogeochemical drivers; (2) understanding the interactions among roots, micro-organisms, carbon, water, and minerals in the rhizosphere; (3) assessing the effects of heterogeneity across spatial and temporal scales; and (4) integrating the vast quantity of novel data, including “omics” data (genomics, transcriptomics, proteomics, metabolomics), elemental concentration and speciation data, and isotope data into our understanding of complex earth surface systems. 
With strong support from data-driven sciences, we are now in an exciting era where integration of RTM framework into other community models will facilitate process understanding across disciplines and across scales.
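    The core balance a reactive transport model solves can be shown in a deliberately minimal form: advection of a solute along a 1-D flow path with a first-order reaction. The grid, velocity, and rate constant below are illustrative only; real RTMs couple many components, nonlinear kinetics, and multidimensional flow.

```python
import numpy as np

def advect_react(c0, v, k, dx, dt, steps):
    # Minimal 1-D reactive transport: explicit upwind advection of a solute
    # with first-order reaction (decay). Stable for v*dt/dx <= 1, k*dt < 1.
    c = c0.copy()
    for _ in range(steps):
        flux = v * dt / dx * (c - np.roll(c, 1))   # upwind differences
        flux[0] = 0.0                              # fixed inlet, no wraparound
        c = c - flux - k * dt * c                  # transport + reaction
    return c

c0 = np.zeros(50)
c0[0] = 1.0                                        # solute released at the inlet
c = advect_react(c0, v=1.0, k=0.05, dx=1.0, dt=0.5, steps=40)
```

After 40 steps the plume has moved roughly 20 cells downstream while decaying, the simplest instance of the coupled transport-reaction behavior the review describes.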

  7. Alaska/Yukon Geoid Improvement by a Data-Driven Stokes's Kernel Modification Approach

    NASA Astrophysics Data System (ADS)

    Li, Xiaopeng; Roman, Daniel R.

    2015-04-01

    Geoid modeling over Alaska (USA) and Yukon (Canada), a trans-national problem, faces a great challenge, primarily due to the inhomogeneous surface gravity data (Saleh et al., 2013), the dynamic geology (Freymueller et al., 2008), and the complex geological rheology of the region. A previous study (Roman and Li, 2014) used updated satellite models (Bruinsma et al., 2013) and newly acquired aerogravity data from the GRAV-D project (Smith, 2007) to capture gravity field changes in the target areas, primarily at middle-to-long wavelengths. In CONUS, the geoid model was largely improved. However, the precision of the resulting geoid model in Alaska was still at the decimeter level: 19 cm at the 32 tide bench marks and 24 cm at the 202 GPS/leveling bench marks, giving a total of 23.8 cm at all of these calibrated surface control points after the datum bias was removed. Conventional kernel modification methods in this area (Li and Wang, 2011) had limited effect on improving the precision of the geoid models. To compensate for the geoid misfits, a new Stokes's kernel modification method based on a data-driven technique is presented in this study. First, the method was tested on simulated data sets (Fig. 1), where the geoid errors were reduced by two orders of magnitude (Fig. 2). For the real data sets, some iteration steps are required to overcome the rank deficiency problem caused by the limited control data, which are irregularly distributed in the target area. For instance, after 3 iterations, the standard deviation dropped by about 2.7 cm (Fig. 3). Modification at other critical degrees can further minimize the geoid model misfits caused either by the gravity error or by the remaining datum error in the control points.
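    The abstract does not specify the data-driven modification itself, so the snippet below only sketches the generic machinery: fit degree-wise correction coefficients to control-point misfits by damped least squares, with the damping standing in for the iteration the authors use against rank deficiency. The design matrix, coefficient count, and damping value are assumptions.

```python
import numpy as np

def fit_kernel_coefficients(A, misfits, damping=1e-2):
    # Damped (ridge) least squares: stabilizes the rank-deficient normal
    # equations arising from few, irregularly distributed control points.
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + damping * np.eye(n), A.T @ misfits)

# Hypothetical setup: 20 control points, 5 degree-band coefficients.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))                 # sensitivity of misfit to each band
x_true = np.array([0.03, -0.02, 0.01, 0.0, 0.02])  # metres
d = A @ x_true + 0.001 * rng.standard_normal(20)   # observed geoid misfits
x_hat = fit_kernel_coefficients(A, d)
residual = np.linalg.norm(d - A @ x_hat)
```

The fitted coefficients reduce the control-point residuals substantially, analogous to the drop in standard deviation the study reports after a few iterations.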

  8. Re-assessing Present Day Global Mass Transport and Glacial Isostatic Adjustment From a Data Driven Approach

    NASA Astrophysics Data System (ADS)

    Wu, X.; Jiang, Y.; Simonsen, S.; van den Broeke, M. R.; Ligtenberg, S.; Kuipers Munneke, P.; van der Wal, W.; Vermeersen, B. L. A.

    2017-12-01

    Determining present-day mass transport (PDMT) is complicated by the fact that most observations contain signals from both present-day ice melting and Glacial Isostatic Adjustment (GIA). Despite decades of progress in geodynamic modeling and new observations, significant uncertainties remain in both. The key to separating present-day ice mass change from GIA signals is to include data with different physical characteristics. We designed an approach to separate PDMT and GIA signatures by estimating them simultaneously using globally distributed interdisciplinary data with distinct physical information and a dynamically constructed a priori GIA model. We conducted a high-resolution global reappraisal of present-day ice mass balance, with a focus on Earth's polar regions and their contribution to global sea-level rise, using a combination of ICESat altimetry, GRACE gravity, surface geodetic velocity data, and an ocean bottom pressure model. Adding ice altimetry supplies critically needed dual data types over the interiors of ice-covered regions, enhancing the separation of PDMT and GIA signatures and promising accuracies half an order of magnitude higher for GIA and, consequently, for ice mass balance estimates. The global data-based approach can adequately address PDMT- and GIA-induced geocenter motion and long-wavelength signatures important for large areas such as Antarctica and for global mean sea level. In conjunction with the dense altimetry data, we solved for PDMT coefficients up to degree and order 180 by using a higher-resolution GRACE data set and a high-resolution a priori PDMT model that includes detailed geographic boundaries. The high-resolution approach solves the problem of multiple resolutions in the various data types, greatly reduces errors aliased from a low-degree truncation, and, at the same time, enhances the separation of signatures from adjacent regions such as Greenland and the Canadian Arctic territories.

  9. A Web-Based Data-Querying Tool Based on Ontology-Driven Methodology and Flowchart-Based Model

    PubMed Central

    Ping, Xiao-Ou; Chung, Yufang; Liang, Ja-Der; Yang, Pei-Ming; Huang, Guan-Tarn; Lai, Feipei

    2013-01-01

    Background Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving the knowledge from such large volumes of clinical data. Objective The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. Methods The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, the clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data, thereby, were employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in the experiments with varying numbers of patients. Results In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. 
The accuracy of the three queries (ie, “degree of liver damage,” “degree of liver damage when applying a mutually exclusive setting,” and “treatments for liver cancer”) was 100% for all four experiments (10 patients, 100 patients, 1000 patients, and 10,000 patients). Among the three measured query phases, (1) structured query language operations, (2) criteria verification, and (3) other, the first two had the longest execution time. Conclusions The ontology-driven FBDQM-based approach enriched the capabilities of the data-querying system. The adoption of the GLIF3.5 increased the potential for interoperability, shareability, and reusability of the query tasks. PMID:25600078

  10. A data-driven approach for evaluating multi-modal therapy in traumatic brain injury

    PubMed Central

    Haefeli, Jenny; Ferguson, Adam R.; Bingham, Deborah; Orr, Adrienne; Won, Seok Joon; Lam, Tina I.; Shi, Jian; Hawley, Sarah; Liu, Jialing; Swanson, Raymond A.; Massa, Stephen M.

    2017-01-01

    Combination therapies targeting multiple recovery mechanisms have the potential for additive or synergistic effects, but experimental design and analyses of multimodal therapeutic trials are challenging. To address this problem, we developed a data-driven approach to integrate and analyze raw source data from separate pre-clinical studies and evaluated interactions between four treatments following traumatic brain injury. Histologic and behavioral outcomes were measured in 202 rats treated with combinations of an anti-inflammatory agent (minocycline), a neurotrophic agent (LM11A-31), and physical therapy consisting of assisted exercise with or without botulinum toxin-induced limb constraint. Data was curated and analyzed in a linked workflow involving non-linear principal component analysis followed by hypothesis testing with a linear mixed model. Results revealed significant benefits of the neurotrophic agent LM11A-31 on learning and memory outcomes after traumatic brain injury. In addition, modulations of LM11A-31 effects by co-administration of minocycline and by the type of physical therapy applied reached statistical significance. These results suggest a combinatorial effect of drug and physical therapy interventions that was not evident by univariate analysis. The study designs and analytic techniques applied here form a structured, unbiased, internally validated workflow that may be applied to other combinatorial studies, both in animals and humans. PMID:28205533
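    The analysis pipeline, dimension reduction over correlated outcomes followed by hypothesis testing, can be caricatured in a few lines. Linear PCA and a two-sample t-test stand in here for the authors' non-linear PCA and linear mixed model, and the data are simulated, not the study's.

```python
import numpy as np
from scipy import stats

def composite_score(outcomes):
    # First principal component of standardized outcomes serves as a single
    # composite recovery score integrating the multivariate measurements.
    Z = (outcomes - outcomes.mean(0)) / outcomes.std(0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[0]

rng = np.random.default_rng(2)
treated = rng.normal(0.5, 1.0, size=(40, 4))   # hypothetical outcomes shifted by treatment
control = rng.normal(0.0, 1.0, size=(40, 4))
scores = composite_score(np.vstack([treated, control]))
t, p = stats.ttest_ind(scores[:40], scores[40:])
```

A treatment effect spread thinly across several outcomes can reach significance on the composite score even when no single outcome is convincing on its own, which is the motivation for multivariate analysis over univariate tests.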

  11. A data-driven approach for evaluating multi-modal therapy in traumatic brain injury.

    PubMed

    Haefeli, Jenny; Ferguson, Adam R; Bingham, Deborah; Orr, Adrienne; Won, Seok Joon; Lam, Tina I; Shi, Jian; Hawley, Sarah; Liu, Jialing; Swanson, Raymond A; Massa, Stephen M

    2017-02-16

    Combination therapies targeting multiple recovery mechanisms have the potential for additive or synergistic effects, but experimental design and analyses of multimodal therapeutic trials are challenging. To address this problem, we developed a data-driven approach to integrate and analyze raw source data from separate pre-clinical studies and evaluated interactions between four treatments following traumatic brain injury. Histologic and behavioral outcomes were measured in 202 rats treated with combinations of an anti-inflammatory agent (minocycline), a neurotrophic agent (LM11A-31), and physical therapy consisting of assisted exercise with or without botulinum toxin-induced limb constraint. Data was curated and analyzed in a linked workflow involving non-linear principal component analysis followed by hypothesis testing with a linear mixed model. Results revealed significant benefits of the neurotrophic agent LM11A-31 on learning and memory outcomes after traumatic brain injury. In addition, modulations of LM11A-31 effects by co-administration of minocycline and by the type of physical therapy applied reached statistical significance. These results suggest a combinatorial effect of drug and physical therapy interventions that was not evident by univariate analysis. The study designs and analytic techniques applied here form a structured, unbiased, internally validated workflow that may be applied to other combinatorial studies, both in animals and humans.

  12. Data-Driven Residential Load Modeling and Validation in GridLAB-D

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gotseff, Peter; Lundstrom, Blake

    Accurately characterizing the impacts of high penetrations of distributed energy resources (DER) on the electric distribution system has driven modeling methods from traditional static snapshots, often representing a critical point in time (e.g., summer peak load), to quasi-static time series (QSTS) simulations capturing all the effects of variable DER, associated controls and, hence, impacts on the distribution system over a given time period. Unfortunately, the high-time-resolution DER source and load data required for model inputs is often scarce or non-existent. This paper presents work performed within the GridLAB-D model environment to synthesize, calibrate, and validate 1-second residential load models based on measured transformer loads and physics-based models suitable for QSTS electric distribution system modeling. The modeling and validation approach taken was to create a typical GridLAB-D model home that, when replicated to represent multiple diverse houses on a single transformer, creates a statistically similar load to a measured load for a given weather input. The model homes are constructed to represent the range of actual homes on an instrumented transformer: square footage, thermal integrity, heating and cooling system definition, as well as realistic occupancy schedules. House model calibration and validation were performed using the distribution transformer load data and corresponding weather. The modeled loads were found to be similar to the measured loads for four evaluation metrics: 1) daily average energy, 2) daily average and standard deviation of power, 3) power spectral density, and 4) load shape.
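    Three of the four evaluation metrics are easy to sketch for a load time series: daily average energy, the mean and standard deviation of power, and the power spectral density. The 1-second trace below is invented (base load plus a daily cycle, in kW), and `scipy.signal.welch` is one common PSD estimator, not necessarily the one used in the paper.

```python
import numpy as np
from scipy.signal import welch

def load_metrics(p, dt_hours):
    # Evaluation metrics in the spirit of the paper's validation:
    # daily average energy (kWh), mean/std of power (kW), and PSD.
    fs = 1.0 / dt_hours                      # samples per hour
    f, psd = welch(p, fs=fs, nperseg=min(len(p), 1024))
    return {"daily_energy_kwh": p.mean() * 24.0,
            "mean_kw": p.mean(),
            "std_kw": p.std(),
            "psd": (f, psd)}

t = np.arange(86400)                         # one day at 1-second resolution
p = 1.2 + 0.8 * np.sin(2 * np.pi * t / 86400) ** 2
m = load_metrics(p, dt_hours=1 / 3600)
```

Comparing these quantities between measured and modeled traces gives a compact summary of whether the synthesized homes reproduce both the energy and the temporal structure of the real transformer load.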

  13. Simple Kinematic Pathway Approach (KPA) to Catchment-scale Travel Time and Water Age Distributions

    NASA Astrophysics Data System (ADS)

    Soltani, S. S.; Cvetkovic, V.; Destouni, G.

    2017-12-01

    The distribution of catchment-scale water travel times is strongly influenced by morphological dispersion and is partitioned between hillslope and larger, regional scales. We explore whether hillslope travel times are predictable using a simple semi-analytical "kinematic pathway approach" (KPA) that accounts for dispersion at two levels: morphological dispersion and macro-dispersion. The study gives new insights into shallow (hillslope) and deep (regional) groundwater travel times by comparing numerical simulations of travel time distributions, referred to as the "dynamic model", with corresponding KPA computations for three real catchment case studies in Sweden. KPA uses basic structural and hydrological data to compute transient water travel time (forward mode) and age (backward mode) distributions at the catchment outlet. Longitudinal and morphological dispersion components are reflected in KPA computations by assuming an effective Peclet number and topographically driven pathway length distributions, respectively. Numerical simulations of advective travel times are obtained by means of particle tracking using the fully integrated flow model MIKE SHE. The comparison of computed cumulative distribution functions of travel times shows a significant influence of morphological dispersion and groundwater recharge rate on the compatibility of the "kinematic pathway" and "dynamic" models. Zones of high recharge rate in the "dynamic" models are associated with topographically driven groundwater flow paths to adjacent discharge zones, e.g. rivers and lakes, through relatively shallow pathway compartments. These zones exhibit more compatible behavior between the "dynamic" and "kinematic pathway" models than zones of low recharge rate. Interestingly, the travel time distributions of hillslope compartments remain almost unchanged with increasing recharge rates in the "dynamic" models. 
This robust "dynamic" model behavior suggests that flow path lengths and travel times in shallow hillslope compartments are controlled by topography, and therefore application and further development of the simple "kinematic pathway" approach is promising for their modeling.

  14. Effects of a Data-Driven District-Level Reform Model

    ERIC Educational Resources Information Center

    Slavin, Robert E.; Holmes, GwenCarol; Madden, Nancy A.; Chamberlain, Anne; Cheung, Alan

    2010-01-01

    Despite a quarter-century of reform, US schools serving students in poverty continue to lag far behind other schools. There are proven programs, but they are not widely used. This large-scale experiment evaluated a district-level reform model created by the Center for Data-Driven Reform in Education (CDDRE). The CDDRE model provided consultation…

  15. Determination of the Parameter Sets for the Best Performance of IPS-driven ENLIL Model

    NASA Astrophysics Data System (ADS)

    Yun, Jongyeon; Choi, Kyu-Cheol; Yi, Jonghyuk; Kim, Jaehun; Odstrcil, Dusan

    2016-12-01

    The interplanetary scintillation-driven (IPS-driven) ENLIL model was jointly developed by the University of California, San Diego (UCSD) and the National Aeronautics and Space Administration/Goddard Space Flight Center (NASA/GSFC). The model has been operated by the Korean Space Weather Center (KSWC) since 2014. The IPS-driven ENLIL model takes a variety of ambient solar wind parameters, and its results depend on the combination of these parameters. We have conducted research to determine the combination of parameters that gives the best performance of the IPS-driven ENLIL model. The model results for 1,440 combinations of input parameters were compared with Advanced Composition Explorer (ACE) observation data. In this way, the top 10 parameter sets showing the best performance were determined. Finally, the characteristics of these parameter sets were analyzed and the application of the results to the IPS-driven ENLIL model was discussed.
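    The search over 1,440 parameter combinations amounts to a grid search scored against observations. The sketch below ranks parameter sets by RMSE for an invented two-parameter stand-in for the solar wind model; the parameter names, grids, and "truth" are made up for illustration.

```python
import numpy as np
from itertools import product

def rank_parameter_sets(model, observed, grids, top_n=10):
    # Score every combination of ambient-solar-wind parameters by RMSE
    # against observations and keep the best-performing sets.
    combos = list(product(*grids.values()))
    rmse = [np.sqrt(np.mean((model(dict(zip(grids, c))) - observed) ** 2))
            for c in combos]
    order = np.argsort(rmse)[:top_n]
    return [(combos[i], rmse[i]) for i in order]

# Toy stand-in: solar-wind speed as a two-parameter function of time.
t = np.linspace(0.0, 1.0, 100)
obs = 400.0 + 50.0 * np.sin(2 * np.pi * t)       # "observed" speed (km/s)
model = lambda p: p["v0"] + p["amp"] * np.sin(2 * np.pi * t)
grids = {"v0": [350.0, 400.0, 450.0], "amp": [25.0, 50.0, 75.0]}
best = rank_parameter_sets(model, obs, grids, top_n=3)
```

In the real study the "model" call is a full IPS-driven ENLIL run and the observations are ACE solar-wind data, but the ranking logic is the same.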

  16. Data-driven diagnostics of terrestrial carbon dynamics over North America

    Treesearch

    Jingfeng Xiao; Scott V. Ollinger; Steve Frolking; George C. Hurtt; David Y. Hollinger; Kenneth J. Davis; Yude Pan; Xiaoyang Zhang; Feng Deng; Jiquan Chen; Dennis D. Baldocchi; Bevery E. Law; M. Altaf Arain; Ankur R. Desai; Andrew D. Richardson; Ge Sun; Brian Amiro; Hank Margolis; Lianhong Gu; Russell L. Scott; Peter D. Blanken; Andrew E. Suyker

    2014-01-01

    The exchange of carbon dioxide is a key measure of ecosystem metabolism and a critical intersection between the terrestrial biosphere and the Earth's climate. Despite the general agreement that the terrestrial ecosystems in North America provide a sizeable carbon sink, the size and distribution of the sink remain uncertain. We use a data-driven approach to upscale...

  17. Lexical Awareness and Development through Data Driven Learning: Attitudes and Beliefs of EFL Learners

    ERIC Educational Resources Information Center

    Asik, Asuman; Vural, Arzu Sarlanoglu; Akpinar, Kadriye Dilek

    2016-01-01

    Data-driven learning (DDL) has become an innovative approach developed from corpus linguistics. It plays a significant role in the progression of foreign language pedagogy, since it offers learners plentiful authentic corpora examples that make them analyze language rules with the help of online corpora and concordancers. The present study…

  18. Petri net model for analysis of concurrently processed complex algorithms

    NASA Technical Reports Server (NTRS)

    Stoughton, John W.; Mielke, Roland R.

    1986-01-01

    This paper presents a Petri-net model suitable for analyzing the concurrent processing of computationally complex algorithms. The decomposed operations are to be processed in a multiple-processor, data-driven architecture. Of particular interest is the application of the model both to the description of the data/control flow of a particular algorithm and to the general specification of the data-driven architecture. A candidate architecture is also presented.

  19. User-driven Cloud Implementation of environmental models and data for all

    NASA Astrophysics Data System (ADS)

    Gurney, R. J.; Percy, B. J.; Elkhatib, Y.; Blair, G. S.

    2014-12-01

    Environmental data and models come from disparate sources over a variety of geographical and temporal scales, with different resolutions and data standards, often including terabytes of data and model simulations. Unfortunately, these data and models tend to remain solely within the custody of the private and public organisations that create the data, and of the scientists who build the models and generate results. Although many models and datasets are theoretically available to others, the lack of ease of access tends to keep them out of reach of many. We have developed an intuitive web-based tool that utilises environmental models and datasets located in a cloud to produce results that are appropriate to the user.

    Storyboards showing the interfaces and visualisations have been created for each of several exemplars, and a library of virtual machine images has been prepared to serve them, each image tailored to run the computer models appropriate to the end user. Two approaches have been used: first, RESTful web services conforming to the Open Geospatial Consortium (OGC) Web Processing Service (WPS) interface standard, using the Python-based PyWPS; second, a MySQL database interrogated using PHP code. In all cases, the web client sends the server an HTTP GET request to execute the process with a number of parameter values and, once execution terminates, an XML or JSON response is sent back and parsed at the client side to extract the results. All web services are stateless, i.e. application state is not maintained by the server, which reduces its operational overheads and simplifies infrastructure management tasks such as load balancing and failure recovery. A hybrid cloud solution has been used, with models and data sited on both private and public clouds.

    The storyboards have been transformed into intuitive web interfaces at the client side using HTML, CSS and JavaScript, utilising plug-ins such as jQuery and Flot (for graphics) and the Google Maps APIs. We have demonstrated that a cloud infrastructure can be used to assemble a virtual research environment that, coupled with a user-driven development approach, is able to cater to the needs of a wide range of user groups, from domain experts to concerned members of the general public.
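
    The stateless request/response pattern the abstract describes can be sketched as follows. The endpoint URL, process identifier and parameter names below are hypothetical placeholders, not taken from the project; the point is only that every piece of state travels in a single GET request and the reply is parsed client-side:

    ```python
    # Sketch of a stateless GET-based client (hypothetical endpoint and
    # parameter names): all state is encoded in one request URL, and the
    # server's JSON reply is parsed to extract the results.

    import json
    from urllib.parse import urlencode

    def build_request_url(base_url, process, params):
        """Encode the process identifier and every parameter into one GET
        URL, so the server needs no session state between calls."""
        query = urlencode({"request": "Execute", "identifier": process, **params})
        return f"{base_url}?{query}"

    def parse_response(body):
        """Extract the result values from a JSON response body."""
        payload = json.loads(body)
        return payload.get("outputs", {})

    url = build_request_url("https://example.org/wps", "run_model",
                            {"region": "thames", "year": 2014})
    # A real client would now fetch `url` (e.g. with urllib.request.urlopen)
    # and hand the response text to parse_response; here we parse a canned
    # reply to keep the sketch self-contained:
    outputs = parse_response('{"status": "ok", "outputs": {"flow": 12.5}}')
    ```

    Because the server keeps no per-client state, any instance behind a load balancer can answer any request, which is exactly the operational benefit the abstract cites.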

  20. Elucidating the electron transport in semiconductors via Monte Carlo simulations: an inquiry-driven learning path for engineering undergraduates

    NASA Astrophysics Data System (ADS)

    Persano Adorno, Dominique; Pizzolato, Nicola; Fazio, Claudio

    2015-09-01

    Within the context of higher education for science or engineering undergraduates, we present an inquiry-driven learning path aimed at developing a more meaningful conceptual understanding of the electron dynamics in semiconductors in the presence of applied electric fields. The electron transport in a nondegenerate n-type indium phosphide bulk semiconductor is modelled using a multivalley Monte Carlo approach. The main characteristics of the electron dynamics are explored under different values of the driving electric field, lattice temperature and impurity density. Simulation results are presented by following a question-driven path of exploration, starting from the validation of the model and moving up to reasoned inquiries about the observed characteristics of electron dynamics. Our inquiry-driven learning path, based on numerical simulations, represents a viable example of how to integrate a traditional lecture-based teaching approach with effective learning strategies, providing science or engineering undergraduates with practical opportunities to enhance their comprehension of the physics governing the electron dynamics in semiconductors. Finally, we present a general discussion about the advantages and disadvantages of using an inquiry-based teaching approach within a learning environment based on semiconductor simulations.
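
    The free-flight/scattering loop at the heart of such simulations can be illustrated in miniature. This is a drastically simplified, single-valley sketch with placeholder parameter values, not the multivalley InP model used in the study: flight durations are drawn from an exponential distribution set by a constant total scattering rate, the field accelerates the electron during each flight, and scattering resets the momentum.

    ```python
    # Drastically simplified single-valley Monte Carlo sketch of electron
    # drift under a constant electric field (illustrative only; the paper
    # uses a full multivalley model of InP with realistic scattering rates).
    # All parameter values below are placeholders, not material constants.

    import math
    import random

    Q = 1.602e-19                # electron charge [C]
    M_EFF = 0.08 * 9.109e-31     # placeholder effective mass [kg]
    GAMMA = 1.0e13               # placeholder total scattering rate [1/s]

    def simulate_drift_velocity(field, n_flights=20000, seed=1):
        """Time-average the field-direction velocity over many free flights.
        Each flight lasts an exponentially distributed time dt; scattering
        is taken to randomise the momentum, so each flight starts from
        zero drift velocity and the flight-averaged velocity is a*dt/2."""
        rng = random.Random(seed)
        total_v_dt, total_t = 0.0, 0.0
        accel = Q * field / M_EFF
        for _ in range(n_flights):
            dt = -math.log(rng.random()) / GAMMA   # free-flight duration
            total_v_dt += 0.5 * accel * dt * dt    # integral of v over dt
            total_t += dt
        return total_v_dt / total_t                # drift velocity [m/s]

    v1 = simulate_drift_velocity(1.0e5)   # 1 kV/cm
    v2 = simulate_drift_velocity(2.0e5)   # 2 kV/cm
    ```

    With a field-independent scattering rate this toy model is purely ohmic (drift velocity proportional to field); the pedagogical value of the full multivalley simulation is precisely that it departs from this, e.g. through intervalley transfer and velocity saturation.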

Top