Deconvolution of mixing time series on a graph
Blocker, Alexander W.; Airoldi, Edoardo M.
2013-01-01
In many applications we are interested in making inference on latent time series from indirect measurements, which are often low-dimensional projections resulting from mixing or aggregation. Positron emission tomography, super-resolution, and network traffic monitoring are some examples. Inference in such settings requires solving a sequence of ill-posed inverse problems, y_t = A x_t, where the projection mechanism provides information on A. We consider problems in which A specifies mixing on a graph of time series that are bursty and sparse. We develop a multilevel state-space model for mixing time series and an efficient approach to inference. A simple model is used to calibrate regularization parameters that lead to efficient inference in the multilevel state-space model. We apply this method to the problem of estimating point-to-point traffic flows on a network from aggregate measurements. Our solution outperforms existing methods for this problem, and our two-stage approach suggests an efficient inference strategy for multilevel models of multivariate time series. PMID:25309135
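The sequence of ill-posed inverse problems y_t = A x_t can be made concrete with a toy network-tomography example. The sketch below is a minimal stand-in for the paper's multilevel state-space approach: it recovers origin-destination flows from aggregate link counts by ridge-regularized least squares. The routing matrix, flows, and regularization weight are all illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: ridge-regularized recovery of a flow vector x from
# aggregate measurements y = A x (dimensions and parameters are illustrative).

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def ridge_gradient_descent(A, y, lam=0.1, lr=0.05, steps=2000):
    """Minimize ||A x - y||^2 + lam * ||x||^2 by gradient descent."""
    n = len(A[0])
    x = [0.0] * n
    for _ in range(steps):
        r = [yh - yi for yh, yi in zip(matvec(A, x), y)]   # residual A x - y
        # gradient: 2 A^T r + 2 lam x
        grad = [2 * sum(A[i][j] * r[i] for i in range(len(A))) + 2 * lam * x[j]
                for j in range(n)]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x

# Toy routing matrix: two links observe sums of three origin-destination flows.
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
x_true = [2.0, 0.0, 3.0]          # bursty, sparse flows
y = matvec(A, x_true)             # aggregate link counts [2.0, 3.0]
x_hat = ridge_gradient_descent(A, y)
```

With only two aggregate measurements for three flows the system is underdetermined; the ridge penalty selects a unique small-norm solution, which is the role the calibrated regularization parameters play in the paper's first stage.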
A New Modified Histogram Matching Normalization for Time Series Microarray Analysis.
Astola, Laura; Molenaar, Jaap
2014-07-01
Microarray data is often utilized in inferring regulatory networks. Quantile normalization (QN) is a popular method to reduce array-to-array variation. We show that in the context of time series measurements QN may not be the best choice for this task, especially when the inference is based on a continuous-time ODE model. We propose an alternative normalization method that is better suited for network inference from time series data.
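For context, quantile normalization itself is a small algorithm: every array is forced onto a common reference distribution built from the cross-array means of sorted values. A minimal sketch (toy data; ties ignored):

```python
# Hedged sketch of standard quantile normalization (QN), the method the
# abstract argues may distort time-series structure; values are illustrative.

def quantile_normalize(arrays):
    """Force every array (e.g. one microarray per time point) onto the same
    empirical distribution: the mean of sorted values across arrays."""
    n = len(arrays[0])
    sorted_cols = [sorted(a) for a in arrays]
    reference = [sum(col[i] for col in sorted_cols) / len(arrays)
                 for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # rank of each value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
norm = quantile_normalize(arrays)
```

Because every time point is forced onto the same distribution, genuine global shifts in expression over time are erased, which is the kind of distortion the abstract warns about for time-series input.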
Inference of scale-free networks from gene expression time series.
Tominaga, Daisuke; Horton, Paul
2006-04-01
Quantitative time-series observation of gene expression is becoming possible, for example by cell array technology. However, there are no practical methods with which to infer network structures using only observed time-series data. As most computational models of biological networks for continuous time-series data have a high degree of freedom, it is almost impossible to infer the correct structures. On the other hand, it has been reported that some kinds of biological networks, such as gene networks and metabolic pathways, may have scale-free properties. We hypothesize that the architecture of inferred biological network models can be restricted to scale-free networks. We developed an inference algorithm for biological networks using only time-series data by introducing such a restriction. We adopt the S-system as the network model, and a distributed genetic algorithm to optimize models so that their simulated results fit the observed time-series data. We have tested our algorithm on a case study (simulated data). We compared optimization under no restriction, which allows for a fully connected network, and under the restriction that the total number of links must equal that expected from a scale-free network. The restriction reduced both false positive and false negative estimation of the links and also the differences between model simulation and the given time-series data.
Wang, Yi Kan; Hurley, Daniel G.; Schnell, Santiago; Print, Cristin G.; Crampin, Edmund J.
2013-01-01
We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data. PMID:23967277
Unraveling multiple changes in complex climate time series using Bayesian inference
NASA Astrophysics Data System (ADS)
Berner, Nadine; Trauth, Martin H.; Holschneider, Matthias
2016-04-01
Change points in time series are perceived as heterogeneities in the statistical or dynamical characteristics of observations. Unraveling such transitions yields essential information for the understanding of the observed system. The precise detection and basic characterization of underlying changes is therefore of particular importance in environmental sciences. We present a kernel-based Bayesian inference approach to investigate direct as well as indirect climate observations for multiple generic transition events. In order to develop a diagnostic approach designed to capture a variety of natural processes, the basic statistical features of central tendency and dispersion are used to locally approximate a complex time series by a generic transition model. A Bayesian inversion approach is developed to robustly infer the location and the generic patterns of such a transition. To systematically investigate time series for multiple changes occurring at different temporal scales, the Bayesian inversion is extended to a kernel-based inference approach. By introducing basic kernel measures, the kernel inference results are combined into a proxy for the posterior distribution of multiple transitions. Thus, based on a generic transition model, a probability expression is derived that is capable of indicating multiple changes within a complex time series. We discuss the method's performance by investigating direct and indirect climate observations. The approach is applied to environmental time series (about 100 years) from the weather station in Tuscaloosa, Alabama, and confirms documented instrumentation changes. Moreover, the approach is used to investigate a set of complex terrigenous dust records from the ODP sites 659, 721/722 and 967, interpreted as climate indicators of the African region of the Plio-Pleistocene period (about 5 Ma).
The detailed inference unravels multiple transitions underlying the indirect climate observations coinciding with established global climate events.
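The generic-transition idea can be illustrated with a deliberately minimal version: a Bayesian posterior over the location of a single shift in central tendency under i.i.d. Gaussian noise. This is a plug-in simplification of the paper's kernel-based inversion, and all parameters below are illustrative:

```python
import math, random

# Hedged sketch: posterior over the index k of a single mean shift, assuming
# i.i.d. Gaussian noise, a flat prior on k, and plug-in segment means
# (a much-simplified version of the generic transition model above).

def changepoint_posterior(x, sigma=1.0):
    n = len(x)
    logp = []
    for k in range(1, n):                      # change between k-1 and k
        left, right = x[:k], x[k:]
        m1 = sum(left) / len(left)
        m2 = sum(right) / len(right)
        ll = -sum((v - m1) ** 2 for v in left) / (2 * sigma ** 2) \
             - sum((v - m2) ** 2 for v in right) / (2 * sigma ** 2)
        logp.append(ll)
    mx = max(logp)
    w = [math.exp(l - mx) for l in logp]
    s = sum(w)
    return [wi / s for wi in w]                # posterior over k = 1..n-1

random.seed(0)
x = [random.gauss(0.0, 0.5) for _ in range(30)] + \
    [random.gauss(3.0, 0.5) for _ in range(30)]
post = changepoint_posterior(x, sigma=0.5)
k_hat = post.index(max(post)) + 1              # most probable change location
```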
A Review of Some Aspects of Robust Inference for Time Series.
1984-09-01
A Review of Some Aspects of Robust Inference for Time Series, by R. D. Martin. Technical Report, September 1984, Department of Statistics, University of Washington, Seattle. One cannot hope to have a good method for dealing with outliers in time series by using only an instantaneous nonlinear transformation of the data.
Inference of Gene Regulatory Networks Using Time-Series Data: A Survey
Sima, Chao; Hua, Jianping; Jung, Sungwon
2009-01-01
The advent of high-throughput technology like microarrays has provided the platform for studying how different cellular components work together, thus creating enormous interest in mathematically modeling biological networks, particularly gene regulatory networks (GRNs). Of particular interest is the modeling and inference on time-series data, which capture a more thorough picture of the system than non-temporal data do. We have given an extensive review of methodologies that have been used on time-series data. In realizing that validation is an integral part of the inference paradigm, we have also presented a discussion on the principles and challenges in performance evaluation of different methods. This survey gives a panoramic view of these topics, with the anticipation that readers will be inspired to improve and/or expand the GRN inference and validation tool repository. PMID:20190956
Genetic network inference as a series of discrimination tasks.
Kimura, Shuhei; Nakayama, Satoshi; Hatakeyama, Mariko
2009-04-01
Genetic network inference methods based on sets of differential equations generally require a great deal of time, as the equations must be solved many times. To reduce the computational cost, researchers have proposed other methods for inferring genetic networks by solving sets of differential equations only a few times, or even without solving them at all. When we try to obtain reasonable network models using these methods, however, we must estimate the time derivatives of the gene expression levels with great precision. In this study, we propose a new method to overcome the drawbacks of inference methods based on sets of differential equations. Our method infers genetic networks by obtaining classifiers capable of predicting the signs of the derivatives of the gene expression levels. For this purpose, we defined a genetic network inference problem as a series of discrimination tasks, then solved this series of discrimination tasks with a linear programming machine. Our experimental results demonstrated that the proposed method is capable of correctly inferring genetic networks, and doing so more than 500 times faster than the other inference methods based on sets of differential equations. Next, we applied our method to actual expression data of the bacterial SOS DNA repair system. Finally, we demonstrated that our approach relates to the inference method based on the S-system model. Though our method provides no estimation of the kinetic parameters, it should be useful for researchers interested only in the network structure of a target system. Supplementary data are available at Bioinformatics online.
Boolean network inference from time series data incorporating prior biological knowledge.
Haider, Saad; Pal, Ranadip
2012-01-01
Numerous approaches exist for modeling of genetic regulatory networks (GRNs), but the low sampling rates often employed in biological studies prevent the inference of detailed models from experimental data. In this paper, we analyze the issues involved in estimating a model of a GRN from single cell line time series data with limited time points. We present an inference approach for a Boolean Network (BN) model of a GRN from limited transcriptomic or proteomic time series data based on prior biological knowledge of connectivity, constraints on attractor structure and robust design. We applied our inference approach to six-time-point transcriptomic data from a human mammary epithelial cell line (HMEC) after application of epidermal growth factor (EGF) and generated a BN with a plausible biological structure satisfying the data. We further defined and applied a similarity measure to compare synthetic BNs with BNs that the proposed approach generated from transitions along various paths of the synthetic BNs. We have also compared the performance of our algorithm with two existing BN inference algorithms. Through theoretical analysis and simulations, we showed how rarely a BN with plausible biological structure arises from limited time series data when connectivity is random or structure is absent from the data. When applied to experimental data and to data generated from synthetic BNs, the framework was able to estimate BNs with high similarity scores. Comparison with existing BN inference algorithms showed that our proposed algorithm performs better for limited time series data. The proposed framework can also be applied to optimize the connectivity of a GRN from experimental data when the prior biological knowledge on regulators is limited or not unique.
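A Boolean network itself is a small object: binary gene states updated by logic rules, with dynamics that settle into attractors, which is what the attractor-structure constraints above refer to. A self-contained sketch with made-up update rules:

```python
# Hedged sketch: a three-gene Boolean network and its attractor. The update
# rules here are invented for illustration, not taken from the paper.

def step(state, rules):
    return tuple(int(bool(rule(state))) for rule in rules)

def attractor(start, rules):
    """Iterate until a state repeats; return the cycle (the attractor)."""
    seen, state = {}, start
    while state not in seen:
        seen[state] = len(seen)
        state = step(state, rules)
    cycle_start = seen[state]
    ordered = sorted(seen, key=seen.get)
    return ordered[cycle_start:]

rules = [
    lambda s: s[1] and not s[2],   # gene 0 activated by gene 1, repressed by 2
    lambda s: s[0],                # gene 1 follows gene 0
    lambda s: s[0] or s[1],        # gene 2 needs either gene 0 or gene 1
]
cyc = attractor((1, 0, 0), rules)  # trajectory settles into a fixed point
```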
Exploratory Causal Analysis in Bivariate Time Series Data
NASA Astrophysics Data System (ADS)
McCracken, James M.
Many scientific disciplines rely on observational data of systems for which it is difficult (or impossible) to implement controlled experiments, so data analysis techniques are required for identifying causal information and relationships directly from observational data. This need has led to the development of many different time series causality approaches and tools, including transfer entropy, convergent cross-mapping (CCM), and Granger causality statistics. In this thesis, the existing time series causality method of CCM is extended by introducing a new method called pairwise asymmetric inference (PAI). It is found that CCM may provide counter-intuitive causal inferences for simple dynamics with strong intuitive notions of causality, and the CCM causal inference can be a function of physical parameters that are seemingly unrelated to the existence of a driving relationship in the system. For example, a CCM causal inference might alternate between ''voltage drives current'' and ''current drives voltage'' as the frequency of the voltage signal is changed in a series circuit with a single resistor and inductor. PAI is introduced to address both of these limitations. Many of the current approaches in the time series causality literature are not computationally straightforward to apply, do not follow directly from assumptions of probabilistic causality, depend on assumed models for the time series generating process, or rely on embedding procedures. A new approach, called causal leaning, is introduced in this work to avoid these issues. The leaning is found to provide causal inferences that agree with intuition for both simple systems and more complicated empirical examples, including space weather data sets. The leaning may provide a clearer interpretation of the results than those from existing time series causality tools.
A practicing analyst can explore the literature to find many proposals for identifying drivers and causal connections in time series data sets, but little research exists on how these tools compare to each other in practice. This work introduces and defines exploratory causal analysis (ECA) to address this issue, along with the concept of data causality in the taxonomy of causal studies introduced in this work. The motivation is to provide a framework for exploring potential causal structures in time series data sets. ECA is used on several synthetic and empirical data sets, and it is found that all of the tested time series causality tools agree with each other (and intuitive notions of causality) for many simple systems but can provide conflicting causal inferences for more complicated systems. It is proposed that such disagreements between different time series causality tools during ECA might provide deeper insight into the data than could be found otherwise.
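Of the tools surveyed, CCM admits a compact sketch: to ask whether X drives Y, reconstruct a delay-embedded "shadow manifold" from Y and test how well X can be read off it with nearest-neighbor weights. Everything below (embedding dimension, weighting scheme, the weakly coupled logistic maps) is an illustrative simplification, not the thesis's exact setup:

```python
import math

# Hedged sketch of convergent cross-mapping (CCM): if X drives Y, then Y's
# shadow manifold carries information about X, so cross-mapping X from that
# manifold should predict X well. All choices below are illustrative.

def embed(series, dim=2, tau=1):
    start = (dim - 1) * tau
    return [tuple(series[t - i * tau] for i in range(dim))
            for t in range(start, len(series))]

def cross_map_skill(source, target, dim=2, tau=1):
    """Correlation between `source` and its prediction from the shadow
    manifold of `target` (high skill suggests source drives target)."""
    pts = embed(target, dim, tau)
    offset = (dim - 1) * tau
    truth, preds = [], []
    for i, p in enumerate(pts):
        d = sorted((math.dist(p, q), j) for j, q in enumerate(pts) if j != i)
        nbrs = d[:dim + 1]                       # dim+1 nearest neighbors
        d1 = nbrs[0][0] + 1e-12
        wsum = sum(math.exp(-dist / d1) for dist, _ in nbrs)
        pred = sum(math.exp(-dist / d1) * source[offset + j]
                   for dist, j in nbrs) / wsum
        truth.append(source[offset + i])
        preds.append(pred)
    mt = sum(truth) / len(truth); mp = sum(preds) / len(preds)
    cov = sum((a - mt) * (b - mp) for a, b in zip(truth, preds))
    denom = math.sqrt(sum((a - mt) ** 2 for a in truth) *
                      sum((b - mp) ** 2 for b in preds))
    return cov / denom

# Unidirectional coupling: x drives y (weakly coupled logistic maps).
x, y = [0.4], [0.2]
for _ in range(199):
    x.append(3.8 * x[-1] * (1 - x[-1]))                       # logistic map
    y.append(3.5 * y[-1] * (1 - y[-1]) * (1 - 0.1 * x[-1]))   # driven by x
skill_x_drives_y = cross_map_skill(x, y)   # read x off Y's manifold
skill_y_drives_x = cross_map_skill(y, x)
```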
Approximate inference on planar graphs using loop calculus and belief propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chertkov, Michael; Gomez, Vicenc; Kappen, Hilbert
We introduce novel results for approximate inference on planar graphical models using the loop calculus framework. The loop calculus (Chertkov and Chernyak, 2006b) allows one to express the exact partition function Z of a graphical model as a finite sum of terms that can be evaluated once the belief propagation (BP) solution is known. In general, full summation over all correction terms is intractable. We develop an algorithm for the approach presented in Chertkov et al. (2008) which represents an efficient truncation scheme on planar graphs and a new representation of the series in terms of Pfaffians of matrices. We analyze in detail both the loop series and the Pfaffian series for models with binary variables and pairwise interactions, and show that the first term of the Pfaffian series can provide very accurate approximations. The algorithm outperforms previous truncation schemes of the loop series and is competitive with other state-of-the-art methods for approximate inference.
Sequential Monte Carlo for inference of latent ARMA time-series with innovations correlated in time
NASA Astrophysics Data System (ADS)
Urteaga, Iñigo; Bugallo, Mónica F.; Djurić, Petar M.
2017-12-01
We consider the problem of sequential inference of latent time-series with innovations correlated in time and observed via nonlinear functions. We accommodate time-varying phenomena with diverse properties by means of a flexible mathematical representation of the data. We characterize such time-series statistically via a Bayesian analysis of their densities. The density that describes the transition of the state from time t to the next time instant t+1 is used for implementation of novel sequential Monte Carlo (SMC) methods. We present a set of SMC methods for inference of latent ARMA time-series with innovations correlated in time under different assumptions about knowledge of the parameters. The methods operate in a unified and consistent manner for data with diverse memory properties. We show the validity of the proposed approach by comprehensive simulations of the challenging stochastic volatility model.
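The stochastic volatility model mentioned at the end is a standard SMC test case, and a bootstrap particle filter for its basic form fits in a few lines. This sketch handles only uncorrelated innovations; the paper's methods additionally cover latent ARMA states with time-correlated innovations, and all parameter values here are illustrative:

```python
import math, random

# Hedged sketch: bootstrap particle filter for the basic stochastic volatility
# model x_t = a*x_{t-1} + v_t, y_t = b*exp(x_t/2)*e_t (illustrative parameters).

def particle_filter(y, a=0.95, sv=0.3, b=0.6, n_part=500, seed=1):
    rng = random.Random(seed)
    parts = [rng.gauss(0.0, 1.0) for _ in range(n_part)]
    means = []
    for obs in y:
        # propagate particles through the state transition density
        parts = [a * p + rng.gauss(0.0, sv) for p in parts]
        # weight by the observation likelihood y_t ~ N(0, b^2 exp(x_t))
        w = [math.exp(-obs * obs / (2 * b * b * math.exp(p))) /
             math.sqrt(b * b * math.exp(p)) for p in parts]
        s = sum(w)
        w = [wi / s for wi in w]
        means.append(sum(wi * p for wi, p in zip(w, parts)))
        # multinomial resampling
        parts = rng.choices(parts, weights=w, k=n_part)
    return means

# Simulate data from the model, then filter the latent log-volatility.
rng = random.Random(0)
x, y = [0.0], []
for _ in range(100):
    x.append(0.95 * x[-1] + rng.gauss(0.0, 0.3))
    y.append(0.6 * math.exp(x[-1] / 2) * rng.gauss(0.0, 1.0))
x_filtered = particle_filter(y)
```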
Data-driven reconstruction of directed networks
NASA Astrophysics Data System (ADS)
Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran
2013-06-01
We investigate the properties of a recently introduced asymmetric association measure, called inner composition alignment (IOTA), aimed at inferring regulatory links (couplings). We show that the measure can be used to determine the direction of coupling, detect superfluous links, and to account for autoregulation. In addition, the measure can be extended to infer the type of regulation (positive or negative). The capabilities of IOTA to correctly infer couplings together with their directionality are compared against Kendall's rank correlation for time series of different lengths, particularly focussing on biological examples. We demonstrate that an extended version of the measure, bidirectional inner composition alignment (biIOTA), increases the accuracy of the network reconstruction for short time series. Finally, we discuss the applicability of the measure to infer couplings in chaotic systems.
Efficient Bayesian inference for natural time series using ARFIMA processes
NASA Astrophysics Data System (ADS)
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-11-01
Many geophysical quantities, such as atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long memory (LM). LM implies that these quantities experience non-trivial temporal memory, which potentially not only enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LM. In this paper we present a modern and systematic approach to the inference of LM. We use the flexible autoregressive fractional integrated moving average (ARFIMA) model, which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LM, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g., short-memory effects) can be integrated over in order to focus on long-memory parameters and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data and the central England temperature (CET) time series, with favorable comparison to the standard estimators. For CET we also extend our method to seasonal long memory.
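The long-memory behavior ARFIMA captures comes from the fractional-differencing operator (1-B)^d, whose expansion coefficients follow a simple recursion and decay hyperbolically rather than geometrically:

```python
# Hedged sketch: the fractional-differencing filter at the heart of ARFIMA.
# (1-B)^d expands into weights pi_k that decay slowly for memory parameter
# d in (0, 0.5), which is what produces long memory. d = 0.4 is illustrative.

def fracdiff_weights(d, n):
    """First n coefficients of (1-B)^d: pi_0 = 1, pi_k = pi_{k-1}*(k-1-d)/k."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (k - 1 - d) / k)
    return w

w = fracdiff_weights(0.4, 5)   # w[1] = -d; magnitudes shrink hyperbolically
```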
On the Inference of Functional Circadian Networks Using Granger Causality
Pourzanjani, Arya; Herzog, Erik D.; Petzold, Linda R.
2015-01-01
Being able to infer one-way direct connections in an oscillatory network such as the suprachiasmatic nucleus (SCN) of the mammalian brain using time series data is difficult but crucial to understanding network dynamics. Although techniques have been developed for inferring networks from time series data, there have been no attempts to adapt these techniques to infer directional connections in oscillatory time series while accurately distinguishing between direct and indirect connections. In this paper an adaptation of Granger causality, called Adaptive Frequency Granger Causality (AFGC), is proposed that allows for inference of circadian networks and oscillatory networks in general. Additionally, an extension of this method, called LASSO AFGC, is proposed to infer networks with large numbers of cells. The method was validated using simulated data from several different networks. For the smaller networks the method was able to identify all one-way direct connections without identifying connections that were not present. For larger networks of up to twenty cells the method shows excellent performance in identifying true and false connections; this is quantified by an area under the curve (AUC) of 96.88%. We note that this method, like other Granger causality-based methods, is based on the detection of high frequency signals propagating between cell traces. Thus it requires a relatively high sampling rate and a network that can propagate high frequency signals. PMID:26413748
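The classic Granger test that AFGC builds on can be sketched directly: x Granger-causes y if past x reduces the error of predicting y beyond what past y alone achieves. The lag-1, variance-ratio version below is only the textbook baseline, not AFGC itself, and the simulated coupling is illustrative:

```python
import random

# Hedged sketch of lag-1 Granger causality: compare residual variance of an
# AR model of y alone against one that also uses lagged x.

def granger_variance_ratio(x, y):
    """rss(restricted)/rss(unrestricted); values clearly above 1 suggest
    that past x helps predict y."""
    yt, y1, x1 = y[1:], y[:-1], x[:-1]
    # restricted model: y_t = a * y_{t-1}
    a_r = sum(u * v for u, v in zip(yt, y1)) / sum(v * v for v in y1)
    rss_r = sum((u - a_r * v) ** 2 for u, v in zip(yt, y1))
    # unrestricted: y_t = a*y_{t-1} + b*x_{t-1}, via 2x2 normal equations
    s11 = sum(v * v for v in y1)
    s12 = sum(v * w for v, w in zip(y1, x1))
    s22 = sum(w * w for w in x1)
    t1 = sum(u * v for u, v in zip(yt, y1))
    t2 = sum(u * w for u, w in zip(yt, x1))
    det = s11 * s22 - s12 * s12
    a_u = (t1 * s22 - t2 * s12) / det
    b_u = (t2 * s11 - t1 * s12) / det
    rss_u = sum((u - a_u * v - b_u * w) ** 2
                for u, v, w in zip(yt, y1, x1))
    return rss_r / rss_u

rng = random.Random(3)
x = [rng.gauss(0, 1) for _ in range(300)]       # exogenous driver
y = [0.0]
for t in range(1, 300):
    y.append(0.5 * y[-1] + 0.8 * x[t - 1] + rng.gauss(0, 0.1))
ratio = granger_variance_ratio(x, y)            # x drives y: ratio is large
```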
Causal strength induction from time series data.
Soo, Kevin W; Rottman, Benjamin M
2018-04-01
One challenge when inferring the strength of cause-effect relations from time series data is that the cause and/or effect can exhibit temporal trends. If temporal trends are not accounted for, a learner could infer that a causal relation exists when it does not, or even infer that there is a positive causal relation when the relation is negative, or vice versa. We propose that learners use a simple heuristic to control for temporal trends: they focus not on the states of the cause and effect at a given instant, but on how the cause and effect change from one observation to the next, which we call transitions. Six experiments were conducted to understand how people infer causal strength from time series data. We found that participants indeed use transitions in addition to states, which helps them to reach more accurate causal judgments (Experiments 1A and 1B). Participants use transitions more when the stimuli are presented in a naturalistic visual format than a numerical format (Experiment 2), and the effect of transitions is not driven by primacy or recency effects (Experiment 3). Finally, we found that participants primarily use the direction in which variables change rather than the magnitude of the change for estimating causal strength (Experiments 4 and 5). Collectively, these studies provide evidence that people often use a simple yet effective heuristic for inferring causal strength from time series data.
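The transition heuristic is easy to state in code: difference each series and compare the directions of change rather than the states themselves. In the made-up example below, both series trend upward (so a state-based view suggests a positive relation) while every transition moves in the opposite direction:

```python
# Hedged sketch of the transition heuristic described above: judge causal
# strength from step-to-step changes (first differences), which detrends the
# series. The toy series are invented for illustration.

def transitions(series):
    return [b - a for a, b in zip(series, series[1:])]

def sign_agreement(cause, effect):
    """Fraction of transitions in which cause and effect move in the same
    direction (1.0 = always together, 0.0 = always opposite)."""
    dc, de = transitions(cause), transitions(effect)
    same = sum(1 for a, b in zip(dc, de) if (a > 0) == (b > 0))
    return same / len(dc)

cause  = [1, 3, 2, 4, 3, 5, 4, 6]   # rises overall
effect = [5, 4, 6, 5, 7, 6, 8, 7]   # also rises, but moves against each step
score = sign_agreement(cause, effect)
```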
Evaluation of artificial time series microarray data for dynamic gene regulatory network inference.
Xenitidis, P; Seimenis, I; Kakolyris, S; Adamopoulos, A
2017-08-07
High-throughput technology like microarrays is widely used in the inference of gene regulatory networks (GRNs). We focused on time series data since we are interested in the dynamics of GRNs and the identification of dynamic networks. We evaluated the amount of information that exists in artificial time series microarray data and the ability of an inference process to produce accurate models based on them. We used dynamic artificial gene regulatory networks in order to create artificial microarray data. Key features that characterize microarray data such as the time separation of directly triggered genes, the percentage of directly triggered genes and the triggering function type were altered in order to reveal the limits that are imposed by the nature of microarray data on the inference process. We examined the effect of various factors on the inference performance such as the network size, the presence of noise in microarray data, and the network sparseness. We used a system theory approach and examined the relationship between the pole placement of the inferred system and the inference performance. We examined the relationship between the inference performance in the time domain and the true system parameter identification. Simulation results indicated that time separation and the percentage of directly triggered genes are crucial factors. Also, network sparseness, the triggering function type and noise in input data affect the inference performance. When two factors were simultaneously varied, it was found that variation of one parameter significantly affects the dynamic response of the other. Crucial factors were also examined using a real GRN and acquired results confirmed simulation findings with artificial data. Different initial conditions were also used as an alternative triggering approach. Relevant results confirmed that the number of datasets constitutes the most significant parameter with regard to the inference performance. 
Prefrontal Cortex: Role in Acquisition of Overlapping Associations and Transitive Inference
ERIC Educational Resources Information Center
DeVito, Loren M.; Lykken, Christine; Kanter, Benjamin R.; Eichenbaum, Howard
2010-01-01
"Transitive inference" refers to the ability to judge from memory the relationships between indirectly related items that compose a hierarchically organized series, and this capacity is considered a fundamental feature of relational memory. Here we explored the role of the prefrontal cortex in transitive inference by examining the performance of…
Inference of gene regulatory networks from time series by Tsallis entropy
2011-01-01
Background The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of systems biology. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in the face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRN inference methods based on entropy (mutual information), a new criterion function is here proposed. Results In this paper we introduce the use of the generalized entropy proposed by Tsallis for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transfer function is obtained by random drawing from the set of possible Boolean functions, thus creating its dynamics. On the other hand, the DREAM time series data present variation of network size and their topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. Conclusions A remarkable improvement in accuracy was observed in the experimental results: the non-Shannon entropy reduced the number of false connections in the inferred topology.
The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRNs inference methods based on information theory and for investigation of the nonextensivity of such networks. The inference algorithm and criterion function proposed here were implemented and included in the DimReduction software, which is freely available at http://sourceforge.net/projects/dimreduction and http://code.google.com/p/dimreduction/. PMID:21545720
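For reference, the Tsallis entropy used as the criterion function has a one-line form, S_q(p) = (1 - Σ p_i^q)/(q - 1), which recovers Shannon entropy as q → 1. A minimal sketch with illustrative distributions:

```python
import math

# Hedged sketch: Tsallis (non-extensive) entropy, the criterion the paper
# builds on; the abstract reports best results for q roughly in [2.5, 3.5].
# The example distributions are illustrative.

def tsallis_entropy(p, q):
    if abs(q - 1.0) < 1e-9:                      # Shannon limit as q -> 1
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)

uniform = [0.25] * 4                 # maximal uncertainty
peaked = [0.97, 0.01, 0.01, 0.01]    # nearly deterministic
```

As with Shannon entropy, the uniform distribution maximizes the Tsallis entropy for any q > 0, so lower values indicate more predictable (more informative) conditional patterns.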
Efficient Bayesian inference for natural time series using ARFIMA processes
NASA Astrophysics Data System (ADS)
Graves, Timothy; Gramacy, Robert; Franzke, Christian; Watkins, Nicholas
2016-04-01
Many geophysical quantities, such as atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long memory (LM). LM implies that these quantities experience non-trivial temporal memory, which potentially not only enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LM. We present a modern and systematic approach to the inference of LM. We use the flexible autoregressive fractional integrated moving average (ARFIMA) model, which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LM, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g., short-memory effects) can be integrated over in order to focus on long-memory parameters and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data and the central England temperature (CET) time series, with favorable comparison to the standard estimators [1]. In addition we show how the method can be used to perform joint inference of the stability exponent and the memory parameter when ARFIMA is extended to allow for alpha-stable innovations. Such models can be used to study systems where heavy tails and long range memory coexist. [1] Graves et al, Nonlin. Processes Geophys., 22, 679-700, 2015; doi:10.5194/npg-22-679-2015.
Discovering time-lagged rules from microarray data using gene profile classifiers
2011-01-01
Background Gene regulatory networks have an essential role in every process of life. In this regard, the amount of genome-wide time series data is becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. Results This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. To this end, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experiments demonstrated the soundness and scalability of the new method, which inferred highly-related statistically-significant gene associations. Conclusions A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation. PMID:21524308
Evaluating data-driven causal inference techniques in noisy physical and ecological systems
NASA Astrophysics Data System (ADS)
Tennant, C.; Larsen, L.
2016-12-01
Causal inference from observational time series challenges traditional approaches for understanding processes and offers exciting opportunities to gain new understanding of complex systems where nonlinearity, delayed forcing, and emergent behavior are common. We present a formal evaluation of the performance of convergent cross-mapping (CCM) and transfer entropy (TE) for data-driven causal inference under real-world conditions. CCM is based on nonlinear state-space reconstruction, and causality is determined by the convergence of prediction skill with an increasing number of observations of the system. TE measures the reduction in uncertainty about one variable afforded by the transition probabilities of a pair of time-lagged variables; causal inference is then based on asymmetry in information flow between the variables. Observational data and numerical simulations from several classical physical and ecological systems, including atmospheric convection (the Lorenz system), species competition (patch tournaments), and long-term climate change (the Vostok ice core), were used to evaluate the ability of CCM and TE to infer causal relationships as data series become increasingly corrupted by observational (instrument-driven) or process (model- or stochastic-driven) noise. While both techniques show promise for causal inference, TE appears to be applicable to a wider range of systems, especially when the data series are of sufficient length to reliably estimate the transition probabilities of system components. Both techniques also show a clear effect of observational noise on causal inference. For example, CCM exhibits a negative logarithmic decline in prediction skill as the noise level of the system increases. Changes in TE depend strongly on the noise type and on which variable the noise was added to. The ability of CCM and TE to detect driving influences suggests that their application to physical and ecological systems could be transformative for understanding driving mechanisms as Earth systems undergo change.
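The asymmetry that TE exploits can be shown with a minimal calculation. The sketch below is purely illustrative (binary discretization, lag 1, and a simulated one-way coupling are my assumptions, not the authors' setup): it estimates lag-1 transfer entropy from transition counts of binned series and recovers the direction of influence.

```python
import numpy as np

def transfer_entropy(x, y, bins=2):
    """Estimate lag-1 TE from x to y (in nats) using transition counts
    of quantile-discretized series."""
    edges = np.linspace(0, 1, bins + 1)[1:-1]
    x = np.digitize(x, np.quantile(x, edges))
    y = np.digitize(y, np.quantile(y, edges))
    yf, yp, xp = y[1:], y[:-1], x[:-1]  # future y, past y, past x
    n = len(yf)
    te = 0.0
    for a in range(bins):
        for b in range(bins):
            for c in range(bins):
                m = (yf == a) & (yp == b) & (xp == c)
                p_abc = m.sum() / n
                if p_abc == 0:
                    continue
                p_a_given_bc = m.sum() / ((yp == b) & (xp == c)).sum()
                p_a_given_b = ((yf == a) & (yp == b)).sum() / (yp == b).sum()
                te += p_abc * np.log(p_a_given_bc / p_a_given_b)
    return te

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = np.roll(x, 1) + 0.1 * rng.normal(size=2000)  # y is driven by lagged x
print(transfer_entropy(x, y), transfer_entropy(y, x))  # x -> y is the larger value
```

With the coupled pair, information flows from x to y, so TE(x, y) is substantially positive while TE(y, x) stays near zero.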
Construction of regulatory networks using expression time-series data of a genotyped population.
Yeung, Ka Yee; Dombek, Kenneth M; Lo, Kenneth; Mittler, John E; Zhu, Jun; Schadt, Eric E; Bumgarner, Roger E; Raftery, Adrian E
2011-11-29
The inference of regulatory and biochemical networks from large-scale genomics data is a basic problem in molecular biology. The goal is to generate testable hypotheses of gene-to-gene influences and subsequently to design bench experiments to confirm these network predictions. Coexpression of genes in large-scale gene-expression data implies coregulation and potential gene-gene interactions, but provides little information about the direction of influences. Here, we use both time-series data and genetics data to infer directionality of edges in regulatory networks: time-series data contain information about the chronological order of regulatory events, and genetics data allow us to map DNA variations to variations at the RNA level. We generate microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. We develop a Bayesian model averaging regression algorithm that incorporates external information from diverse data types to infer regulatory networks from the time-series and genetics data. Our algorithm is capable of generating feedback loops. We show that our inferred network recovers existing and novel regulatory relationships. Following network construction, we generate independent microarray data on selected deletion mutants to prospectively test network predictions. We demonstrate the potential of our network to discover de novo transcription-factor binding sites. Applying our construction method to previously published data demonstrates that our method is competitive with leading network construction algorithms in the literature.
Identification and Inference for Econometric Models
NASA Astrophysics Data System (ADS)
Andrews, Donald W. K.; Stock, James H.
2005-07-01
This volume contains the papers presented in honor of the lifelong achievements of Thomas J. Rothenberg on the occasion of his retirement. The authors of the chapters include many of the leading econometricians of our day, and the chapters address topics of current research significance in econometric theory. The chapters cover four themes: identification and efficient estimation in econometrics, asymptotic approximations to the distributions of econometric estimators and tests, inference involving potentially nonstationary time series, such as processes that might have a unit autoregressive root, and nonparametric and semiparametric inference. Several of the chapters provide overviews and treatments of basic conceptual issues, while others advance our understanding of the properties of existing econometric procedures and/or propose new ones. Specific topics include identification in nonlinear models, inference with weak instruments, tests for nonstationary in time series and panel data, generalized empirical likelihood estimation, and the bootstrap.
Quasi-Experimental Designs for Causal Inference
ERIC Educational Resources Information Center
Kim, Yongnam; Steiner, Peter
2016-01-01
When randomized experiments are infeasible, quasi-experimental designs can be exploited to evaluate causal treatment effects. The strongest quasi-experimental designs for causal inference are regression discontinuity designs, instrumental variable designs, matching and propensity score designs, and comparative interrupted time series designs. This…
Presence-only modeling using MAXENT: when can we trust the inferences?
Yackulic, Charles B.; Chandler, Richard; Zipkin, Elise F.; Royle, J. Andrew; Nichols, James D.; Grant, Evan H. Campbell; Veran, Sophie
2013-01-01
5. We conclude with a series of recommendations, foremost that researchers analyse data in a presence–absence framework whenever possible, because fewer assumptions are required and inferences can be made about clearly defined parameters such as occurrence probability.
ERIC Educational Resources Information Center
St. Clair, Travis; Hallberg, Kelly; Cook, Thomas D.
2014-01-01
Researchers are increasingly using comparative interrupted time series (CITS) designs to estimate the effects of programs and policies when randomized controlled trials are not feasible. In a simple interrupted time series design, researchers compare the pre-treatment values of a treatment group time series to post-treatment values in order to…
Processing Conversational Implicatures: Alternatives and Counterfactual Reasoning.
van Tiel, Bob; Schaeken, Walter
2017-05-01
In a series of experiments, Bott and Noveck (2004) found that the computation of scalar inferences, a variety of conversational implicature, caused a delay in response times. In order to determine what aspect of the inferential process that underlies scalar inferences caused this delay, we extended their paradigm to three other kinds of inferences: free choice inferences, conditional perfection, and exhaustivity in "it"-clefts. In contrast to scalar inferences, the computation of these three kinds of inferences facilitated response times. Following a suggestion made by Chemla and Bott (2014), we propose that the time it takes to compute a conversational implicature depends on the structural characteristics of the required alternatives. Copyright © 2016 Cognitive Science Society, Inc.
Characterizing and estimating noise in InSAR and InSAR time series with MODIS
Barnhart, William D.; Lohman, Rowena B.
2013-01-01
InSAR time series analysis is increasingly used to image subcentimeter displacement rates of the ground surface. The precision of InSAR observations is often affected by several noise sources, including spatially correlated noise from the turbulent atmosphere. Under ideal scenarios, InSAR time series techniques can substantially mitigate these effects; however, in practice the temporal distribution of InSAR acquisitions over much of the world exhibits seasonal biases, long temporal gaps, and insufficient acquisitions to confidently obtain the precisions desired for tectonic research. Here, we introduce a technique for constraining the magnitude of errors expected from atmospheric phase delays on the ground displacement rates inferred from an InSAR time series, using independent observations of precipitable water vapor from MODIS. We implement a Monte Carlo error estimation technique based on multiple (100+) MODIS-based time series that sample date ranges close to the acquisition times of the available SAR imagery. This stochastic approach allows evaluation of the significance of signals present in the final time series product, in particular their correlation with topography and seasonality. We find that topographically correlated noise in individual interferograms is not spatially stationary, even over short spatial scales (<10 km). Overall, MODIS-inferred displacements and velocities exhibit errors of similar magnitude to the variability within an InSAR time series. We examine the MODIS-based confidence bounds in regions with a range of inferred displacement rates, and find that we can resolve velocities as low as 1.5 mm/yr, with uncertainties increasing to ∼6 mm/yr in regions with higher topographic relief.
Abduallah, Yasser; Turki, Turki; Byron, Kevin; Du, Zongxuan; Cervantes-Cervantes, Miguel; Wang, Jason T L
2017-01-01
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, sifting through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising, as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool, while achieving slightly better prediction accuracy.
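The information-theoretic core of such GRN inference can be sketched without any cluster machinery. The following is a minimal, hypothetical example (plain NumPy on simulated profiles, not the authors' MapReduce code) that scores candidate regulatory links by histogram-based mutual information between expression profiles.

```python
import numpy as np

def mutual_info(a, b, bins=4):
    """Histogram estimate of mutual information (in nats) between two profiles."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0  # skip empty cells to avoid log(0)
    return float((p[nz] * np.log(p[nz] / np.outer(px, py)[nz])).sum())

rng = np.random.default_rng(1)
g1 = rng.normal(size=500)
g2 = g1 + 0.2 * rng.normal(size=500)  # strongly co-regulated with g1
g3 = rng.normal(size=500)             # unrelated gene
print(mutual_info(g1, g2), mutual_info(g1, g3))  # high score only for the linked pair
```

In a MapReduce setting, each such pairwise score is an independent task, which is what makes the computation embarrassingly parallel.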
On statistical inference in time series analysis of the evolution of road safety.
Commandeur, Jacques J F; Bijleveld, Frits D; Bergel-Hayat, Ruth; Antoniou, Constantinos; Yannis, George; Papadimitriou, Eleonora
2013-11-01
Data collected for building a road safety observatory usually include observations made sequentially through time. Examples of such data, called time series data, include the annual (or monthly) number of road traffic accidents, traffic fatalities, or vehicle kilometers driven in a country, as well as the corresponding values of safety performance indicators (e.g., data on speeding, seat belt use, alcohol use, etc.). Some commonly used statistical techniques imply assumptions that are often violated by the special properties of time series data, namely serial dependency among disturbances associated with the observations. The first objective of this paper is to demonstrate the impact of such violations on the applicability of standard methods of statistical inference, which can lead to under- or overestimation of standard errors and consequently to erroneous inferences. Moreover, having established the adverse consequences of ignoring serial dependency issues, the paper aims to describe rigorous statistical techniques used to overcome them. In particular, appropriate time series analysis techniques of varying complexity are employed to describe the development over time, relating the accident occurrences to explanatory factors such as exposure measures or safety performance indicators, and forecasting the development into the near future. Traditional regression models (whether linear, generalized linear, or nonlinear) are shown not to naturally capture the inherent dependencies in time series data. Dedicated time series analysis techniques, such as the ARMA-type and DRAG approaches, are discussed next, followed by structural time series models, which are a subclass of state space methods. The paper concludes with general recommendations and practice guidelines for the use of time series models in road safety research. Copyright © 2012 Elsevier Ltd. All rights reserved.
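The central point about serial dependency can be demonstrated numerically. The hedged sketch below (simulated AR(1) data with assumed parameters, not road safety figures) shows that the naive i.i.d. standard error of a sample mean badly understates the true sampling variability of a positively autocorrelated series.

```python
import numpy as np

rng = np.random.default_rng(2)

def ar1(n, phi):
    """Simulate a zero-mean AR(1) series with unit innovation variance."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

n, phi, reps = 200, 0.7, 2000
# Empirical spread of the sample mean across many independent replicates...
means = np.array([ar1(n, phi).mean() for _ in range(reps)])
# ...versus the naive i.i.d. formula sd / sqrt(n) computed from one series.
naive_se = ar1(n, phi).std(ddof=1) / np.sqrt(n)
print(means.std(), naive_se)  # the empirical spread is far larger than the naive SE
```

For an AR(1) process the variance of the mean is inflated by roughly (1 + phi) / (1 - phi) relative to the i.i.d. case, which is why ignoring serial dependency produces overconfident inferences.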
Working memory supports inference learning just like classification learning.
Craig, Stewart; Lewandowsky, Stephan
2013-08-01
Recent research has found a positive relationship between people's working memory capacity (WMC) and their speed of category learning. To date, only classification-learning tasks have been considered, in which people learn to assign category labels to objects. It is unknown whether learning to make inferences about category features might also be related to WMC. We report data from a study in which 119 participants undertook classification learning and inference learning, and completed a series of WMC tasks. Working memory capacity was positively related to people's classification and inference learning performance.
ERIC Educational Resources Information Center
Ryan, Jennifer D.; Moses, Sandra N.; Villate, Christina
2009-01-01
The ability to perform relational proposition-based reasoning was assessed in younger and older adults using the transitive inference task, in which subjects learned a series of premise pairs (A > B, B > C, C > D, D > E, E > F) and were asked to make inference judgments (B?D, B?E, C?E).…
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for ...
Estimating mountain basin-mean precipitation from streamflow using Bayesian inference
NASA Astrophysics Data System (ADS)
Henn, Brian; Clark, Martyn P.; Kavetski, Dmitri; Lundquist, Jessica D.
2015-10-01
Estimating basin-mean precipitation in complex terrain is difficult due to uncertainty in the topographical representativeness of precipitation gauges relative to the basin. To address this issue, we use Bayesian methodology coupled with a multimodel framework to infer basin-mean precipitation from streamflow observations, and we apply this approach to snow-dominated basins in the Sierra Nevada of California. Using streamflow observations, forcing data from lower-elevation stations, the Bayesian Total Error Analysis (BATEA) methodology and the Framework for Understanding Structural Errors (FUSE), we infer basin-mean precipitation, and compare it to basin-mean precipitation estimated using topographically informed interpolation from gauges (PRISM, the Parameter-elevation Regression on Independent Slopes Model). The BATEA-inferred spatial patterns of precipitation show agreement with PRISM in terms of the rank of basins from wet to dry but differ in absolute values. In some of the basins, these differences may reflect biases in PRISM, because some implied PRISM runoff ratios may be inconsistent with the regional climate. We also infer annual time series of basin precipitation using a two-step calibration approach. Assessment of the precision and robustness of the BATEA approach suggests that uncertainty in the BATEA-inferred precipitation is primarily related to uncertainties in hydrologic model structure. Despite these limitations, time series of inferred annual precipitation under different model and parameter assumptions are strongly correlated with one another, suggesting that this approach is capable of resolving year-to-year variability in basin-mean precipitation.
Kernel canonical-correlation Granger causality for multiple time series
NASA Astrophysics Data System (ADS)
Wu, Guorong; Duan, Xujun; Liao, Wei; Gao, Qing; Chen, Huafu
2011-04-01
Canonical-correlation analysis as a multivariate statistical technique has been applied to multivariate Granger causality analysis to infer information flow in complex systems. It shows unique appeal and great superiority over the traditional vector autoregressive method, due to the simplified procedure that detects causal interaction between multiple time series, and the avoidance of potential model estimation problems. However, it is limited to the linear case. Here, we extend the framework of canonical correlation to include the estimation of multivariate nonlinear Granger causality for drawing inference about directed interaction. Its feasibility and effectiveness are verified on simulated data.
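As context for the kernel extension, the baseline linear Granger test it generalizes fits in a few lines. This illustrative sketch (lag 1, simulated data; a hypothetical baseline, not the authors' kernel canonical-correlation method) compares the residual variances of a restricted and a full autoregression.

```python
import numpy as np

def granger_ratio(x, y):
    """Lag-1 linear Granger statistic for x -> y: residual variance of the
    restricted model y_t ~ y_{t-1} divided by that of the full model
    y_t ~ y_{t-1} + x_{t-1}. Values well above 1 suggest Granger causality."""
    yt, yp, xp = y[1:], y[:-1], x[:-1]
    A = np.column_stack([np.ones_like(yp), yp])       # restricted design
    B = np.column_stack([np.ones_like(yp), yp, xp])   # full design
    r_res = yt - A @ np.linalg.lstsq(A, yt, rcond=None)[0]
    r_full = yt - B @ np.linalg.lstsq(B, yt, rcond=None)[0]
    return r_res.var() / r_full.var()

rng = np.random.default_rng(3)
x = rng.normal(size=1000)
y = 0.8 * np.roll(x, 1) + rng.normal(size=1000)  # x drives y with a one-step lag
print(granger_ratio(x, y), granger_ratio(y, x))  # large ratio only for x -> y
```

The canonical-correlation formulation replaces these scalar regressions with correlations between whole blocks of past and future variables, which is what makes the multivariate and kernelized extensions natural.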
Developing Young Children's Emergent Inferential Practices in Statistics
ERIC Educational Resources Information Center
Makar, Katie
2016-01-01
Informal statistical inference has now been researched at all levels of schooling and initial tertiary study. Work in informal statistical inference is least understood in the early years, where children have had little if any exposure to data handling. A qualitative study in Australia was carried out through a series of teaching experiments with…
Boolean dynamics of genetic regulatory networks inferred from microarray time series data
Martin, Shawn; Zhang, Zhaoduo; Martino, Anthony; ...
2007-01-31
Methods available for the inference of genetic regulatory networks strive to produce a single network, usually by optimizing some quantity to fit the experimental observations. In this paper we investigate the possibility that multiple networks can be inferred, all resulting in similar dynamics. This idea is motivated by theoretical work which suggests that biological networks are robust and adaptable to change, and that the overall behavior of a genetic regulatory network might be captured in terms of dynamical basins of attraction. We have developed and implemented a method for inferring genetic regulatory networks from time series microarray data. Our method first clusters and discretizes the gene expression data using k-means and support vector regression. We then enumerate Boolean activation–inhibition networks to match the discretized data. Finally, the dynamics of the Boolean networks are examined. We have tested our method on two immunology microarray datasets: an IL-2-stimulated T cell response dataset and a LPS-stimulated macrophage response dataset. In both cases, we discovered that many networks matched the data, and that most of these networks had similar dynamics.
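A toy version of the enumeration step shows how multiple Boolean rules can be scored against the same discretized data. Everything below is hypothetical (a made-up three-gene binary series and a simple "any activator on and no inhibitor on" update rule assumed for illustration, not the paper's SVR-based pipeline): it scores every activator/inhibitor role assignment for one target gene.

```python
import itertools
import numpy as np

# Discretized expression of three genes over eight time points (columns g0, g1, g2).
X = np.array([[1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 1, 1],
              [0, 0, 1], [0, 0, 0], [1, 0, 0], [1, 1, 0]])

def score(target, roles):
    """Count transitions where `target`'s next state matches the rule's prediction:
    on iff at least one activator is on and no inhibitor is on."""
    ok = 0
    for t in range(len(X) - 1):
        act = [X[t, g] for g, r in roles.items() if r == +1]
        inh = [X[t, g] for g, r in roles.items() if r == -1]
        pred = int(any(act) and not any(inh)) if act else 0
        ok += int(pred == X[t + 1, target])
    return ok

target = 1
regulators = [g for g in range(3) if g != target]
# Enumerate each regulator's role: 0 = absent, +1 = activator, -1 = inhibitor.
results = sorted(
    ((score(target, dict(zip(regulators, combo))), dict(zip(regulators, combo)))
     for combo in itertools.product([0, +1, -1], repeat=len(regulators))),
    key=lambda s: -s[0])
print(results[:3])  # several rules fit the seven transitions almost equally well
```

In this toy data the best rule ("g0 activates g1, g2 plays no role") fits all seven transitions, while a close competitor fits six, illustrating why enumeration typically yields a family of near-equivalent networks rather than a unique one.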
Dynamic modelling of microRNA regulation during mesenchymal stem cell differentiation.
Weber, Michael; Sotoca, Ana M; Kupfer, Peter; Guthke, Reinhard; van Zoelen, Everardus J
2013-11-12
Network inference from gene expression data is a typical approach to reconstruct gene regulatory networks. During chondrogenic differentiation of human mesenchymal stem cells (hMSCs), a complex transcriptional network is active and regulates the temporal differentiation progress. As modulators of transcriptional regulation, microRNAs (miRNAs) play a critical role in stem cell differentiation. Integrated network inference aims at determining interrelations between miRNAs and mRNAs on the basis of expression data as well as miRNA target predictions. We applied the NetGenerator tool in order to infer an integrated gene regulatory network. Time series experiments were performed to measure mRNA and miRNA abundances of TGF-beta1+BMP2 stimulated hMSCs. Network nodes were identified by analysing temporal expression changes, miRNA target gene predictions, time series correlation and literature knowledge. Network inference was performed using NetGenerator to reconstruct a dynamical regulatory model based on the measured data and prior knowledge. The resulting model is robust against noise and shows an optimal trade-off between fitting precision and inclusion of prior knowledge. It predicts the influence of miRNAs on the expression of chondrogenic marker genes and therefore proposes novel regulatory relations in differentiation control. By analysing the inferred network, we identified a previously unknown regulatory effect of miR-524-5p on the expression of the transcription factor SOX9 and the chondrogenic marker genes COL2A1, ACAN and COL10A1. Genome-wide exploration of miRNA-mRNA regulatory relationships is a reasonable approach to identify miRNAs which have so far not been associated with the investigated differentiation process. The NetGenerator tool is able to identify valid gene regulatory networks on the basis of miRNA and mRNA time series data.
Cognitive neuropsychology and its vicissitudes: The fate of Caramazza's axioms.
Shallice, Tim
2015-01-01
Cognitive neuropsychology is characterized as the discipline in which one draws conclusions about the organization of the normal cognitive systems from the behaviour of brain-damaged individuals. In a series of papers, Caramazza, later in collaboration with McCloskey, put forward four assumptions as the bridge principles for making such inferences. Four potential pitfalls, one for each axiom, are discussed with respect to the use of single-case methods. Two of the pitfalls also apply to case series and group study procedures, and the other two are held to be indirectly testable or avoidable. Moreover, four other pitfalls are held to apply to case series or group study methods. It is held that inferences from single-case procedures may profitably be supported or rejected using case series/group study methods, but also that analogous support needs to be given in the other direction for functionally based case series or group studies. It is argued that at least six types of neuropsychological method are valuable for extrapolation to theories of the normal cognitive system but that the single- or multiple-case study remains a critical part of cognitive neuropsychology's methods.
Satellite Emission Range Inferred Earth Survey (SERIES) project
NASA Technical Reports Server (NTRS)
Buennagel, L. A.; Macdoran, P. F.; Neilan, R. E.; Spitzmesser, D. J.; Young, L. E.
1984-01-01
The Global Positioning System (GPS) was developed by the Department of Defense primarily for navigation use by the United States Armed Forces. The system will consist of a constellation of 18 operational Navigation Satellite Timing and Ranging (NAVSTAR) satellites by the late 1980's. During the last four years, the Satellite Emission Range Inferred Earth Surveying (SERIES) team at the Jet Propulsion Laboratory (JPL) has developed a novel receiver which is the heart of the SERIES geodetic system designed to use signals broadcast from the GPS. This receiver does not require knowledge of the exact code sequence being transmitted. In addition, when two SERIES receivers are used differentially to determine a baseline, few cm accuracies can be obtained. The initial engineering test phase has been completed for the SERIES Project. Baseline lengths, ranging from 150 meters to 171 kilometers, have been measured with 0.3 cm to 7 cm accuracies. This technology, which is sponsored by the NASA Geodynamics Program, has been developed at JPL to meet the challenge for high precision, cost-effective geodesy, and to complement the mobile Very Long Baseline Interferometry (VLBI) system for Earth surveying.
dynGENIE3: dynamical GENIE3 for the inference of gene networks from time series expression data.
Huynh-Thu, Vân Anh; Geurts, Pierre
2018-02-21
The elucidation of gene regulatory networks is one of the major challenges of systems biology. Measurements about genes that are exploited by network inference methods are typically available either in the form of steady-state expression vectors or time series expression data. In our previous work, we proposed the GENIE3 method, which exploits variable importance scores derived from Random Forests to identify the regulators of each target gene. This method provided state-of-the-art performance on several benchmark datasets; however, it could not be applied directly to time series expression data. We propose here an adaptation of the GENIE3 method, called dynamical GENIE3 (dynGENIE3), for handling both time series and steady-state expression data. The proposed method is evaluated extensively on the artificial DREAM4 benchmarks and on three real time series expression datasets. Although dynGENIE3 does not systematically yield the best performance on each and every network, it is competitive with diverse methods from the literature, while preserving the main advantages of GENIE3 in terms of scalability.
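The underlying idea of tree-based network inference from time series can be sketched in a few lines. The example below (scikit-learn's RandomForestRegressor on simulated data; a rough analogue only, not the dynGENIE3 implementation, which additionally models mRNA decay) regresses each gene's next value on all genes' current values and reads candidate regulators off the feature importances.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
T, G = 300, 4
X = rng.normal(size=(T, G))                                # G genes over T steps
X[1:, 1] = 0.9 * X[:-1, 0] + 0.1 * rng.normal(size=T - 1)  # planted edge: g0 -> g1

target = 1
inputs, outputs = X[:-1], X[1:, target]    # predict x_target(t+1) from all x(t)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(inputs, outputs)
ranking = np.argsort(rf.feature_importances_)[::-1]
print(ranking)  # gene 0 comes out as the top-ranked regulator of gene 1
```

Repeating this regression once per target gene and aggregating the importance scores into a ranked edge list is the GENIE3 recipe; the dynamical variant applies it to time-shifted samples as above.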
Causal learning with local computations.
Fernbach, Philip M; Sloman, Steven A
2009-05-01
The authors proposed and tested a psychological theory of causal structure learning based on local computations. Local computations simplify complex learning problems via cues available on individual trials to update a single causal structure hypothesis. Structural inferences from local computations make minimal demands on memory, require relatively small amounts of data, and need not respect normative prescriptions as inferences that are principled locally may violate those principles when combined. Over a series of 3 experiments, the authors found (a) systematic inferences from small amounts of data; (b) systematic inference of extraneous causal links; (c) influence of data presentation order on inferences; and (d) error reduction through pretraining. Without pretraining, a model based on local computations fitted data better than a Bayesian structural inference model. The data suggest that local computations serve as a heuristic for learning causal structure. Copyright 2009 APA, all rights reserved.
USDA-ARS?s Scientific Manuscript database
Solanum series Conicibaccata is the second largest series in sect. Petota, containing 40 species widely distributed from southern Mexico to central Bolivia. It contains diploids (2n = 2x = 24), tetraploids (2n = 4x = 48) and hexaploids (2n = 6x = 72), and a limited number of examined species have be...
Quasi-experimental study designs series-paper 7: assessing the assumptions.
Bärnighausen, Till; Oldenburg, Catherine; Tugwell, Peter; Bommer, Christian; Ebert, Cara; Barreto, Mauricio; Djimeu, Eric; Haber, Noah; Waddington, Hugh; Rockers, Peter; Sianesi, Barbara; Bor, Jacob; Fink, Günther; Valentine, Jeffrey; Tanner, Jeffrey; Stanley, Tom; Sierra, Eduardo; Tchetgen, Eric Tchetgen; Atun, Rifat; Vollmer, Sebastian
2017-09-01
Quasi-experimental designs are gaining popularity in epidemiology and health systems research-in particular for the evaluation of health care practice, programs, and policy-because they allow strong causal inferences without randomized controlled experiments. We describe the concepts underlying five important quasi-experimental designs: Instrumental Variables, Regression Discontinuity, Interrupted Time Series, Fixed Effects, and Difference-in-Differences designs. We illustrate each of the designs with an example from health research. We then describe the assumptions required for each of the designs to ensure valid causal inference and discuss the tests available to examine the assumptions. Copyright © 2017 Elsevier Inc. All rights reserved.
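Among these designs, the Difference-in-Differences logic is simple enough to show numerically. The sketch below uses simulated outcomes with an assumed parallel trend and a planted effect of 1.5 (all numbers are illustrative, not from any study): differencing out the shared time trend recovers the effect that a naive post-period comparison misses.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
# Control group: a common time trend of +2 between periods.
pre_c = 10.0 + rng.normal(size=n)
post_c = 12.0 + rng.normal(size=n)
# Treated group: same trend, a baseline gap of +1, plus a true effect of 1.5.
pre_t = 11.0 + rng.normal(size=n)
post_t = 13.0 + 1.5 + rng.normal(size=n)

did = (post_t.mean() - pre_t.mean()) - (post_c.mean() - pre_c.mean())
naive = post_t.mean() - post_c.mean()  # confounded by the baseline gap
print(did, naive)
```

The DiD estimate is close to the planted 1.5, while the naive contrast mixes the effect with the pre-existing difference between groups; the validity of the former rests entirely on the parallel-trends assumption the paper discusses.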
Analyzing Single-Molecule Time Series via Nonparametric Bayesian Inference
Hines, Keegan E.; Bankston, John R.; Aldrich, Richard W.
2015-01-01
The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data. PMID:25650922
ERIC Educational Resources Information Center
Johnson, Clay Stephen
2013-01-01
Synthetic control methods are an innovative matching technique first introduced within the economics and political science literature that have begun to find application in educational research as well. Synthetic controls create an aggregate-level, time-series comparison for a single treated unit of interest for causal inference with observational…
Quantifying Transmission Heterogeneity Using Both Pathogen Phylogenies and Incidence Time Series
Li, Lucy M.; Grassly, Nicholas C.; Fraser, Christophe
2017-01-01
Heterogeneity in individual-level transmissibility can be quantified by the dispersion parameter k of the offspring distribution. Quantifying heterogeneity is important as it affects other parameter estimates, it modulates the degree of unpredictability of an epidemic, and it needs to be accounted for in models of infection control. Aggregated data such as incidence time series are often not sufficiently informative to estimate k. Incorporating phylogenetic analysis can help to estimate k concurrently with other epidemiological parameters. We have developed an inference framework that uses particle Markov Chain Monte Carlo to estimate k and other epidemiological parameters using both incidence time series and the pathogen phylogeny. Using the framework to fit a modified compartmental transmission model that includes the parameter k to simulated data, we found that more accurate and less biased estimates of the reproductive number were obtained by combining epidemiological and phylogenetic analyses. However, k was most accurately estimated using pathogen phylogeny alone. Accurately estimating k was necessary for unbiased estimates of the reproductive number, but it did not affect the accuracy of reporting probability and epidemic start date estimates. We further demonstrated that inference was possible in the presence of phylogenetic uncertainty by sampling from the posterior distribution of phylogenies. Finally, we used the inference framework to estimate transmission parameters from epidemiological and genetic data collected during a poliovirus outbreak. Despite the large degree of phylogenetic uncertainty, we demonstrated that incorporating phylogenetic data in parameter inference improved the accuracy and precision of estimates. PMID:28981709
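The role of the dispersion parameter k can be illustrated with a quick moment-based check. The sketch below is a simplification of the paper's setup (simulated offspring counts and a method-of-moments estimate rather than particle MCMC): it draws from a negative binomial offspring distribution with mean R0 and dispersion k, then recovers k from the mean-variance relation Var = m + m^2/k.

```python
import numpy as np

rng = np.random.default_rng(6)
R0, k = 2.0, 0.2   # reproductive number and dispersion (k << 1: superspreading)
# NumPy's negative binomial has mean n(1-p)/p, so p = k/(k+R0) gives mean R0.
offspring = rng.negative_binomial(n=k, p=k / (k + R0), size=20000)

m, v = offspring.mean(), offspring.var()
k_hat = m**2 / (v - m)  # method-of-moments estimate from Var = m + m^2/k
print(m, k_hat)
```

Small k means the variance greatly exceeds the mean, i.e. a few individuals generate most secondary cases, which is exactly the heterogeneity that aggregate incidence data alone struggle to pin down.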
He, Feng; Zeng, An-Ping
2006-01-01
Background The increasing availability of time-series expression data opens up new possibilities to study functional linkages of genes. Present methods used to infer functional linkages between genes from expression data are mainly based on a point-to-point comparison. Change trends between consecutive time points in time-series data have been so far not well explored. Results In this work we present a new method based on extracting main features of the change trend and level of gene expression between consecutive time points. The method, termed as trend correlation (TC), includes two major steps: 1, calculating a maximal local alignment of change trend score by dynamic programming and a change trend correlation coefficient between the maximal matched change levels of each gene pair; 2, inferring relationships of gene pairs based on two statistical extraction procedures. The new method considers time shifts and inverted relationships in a similar way as the local clustering (LC) method but the latter is merely based on a point-to-point comparison. The TC method is demonstrated with data from yeast cell cycle and compared with the LC method and the widely used Pearson correlation coefficient (PCC) based clustering method. The biological significance of the gene pairs is examined with several large-scale yeast databases. Although the TC method predicts an overall lower number of gene pairs than the other two methods at a same p-value threshold, the additional number of gene pairs inferred by the TC method is considerable: e.g. 20.5% compared with the LC method and 49.6% with the PCC method for a p-value threshold of 2.7E-3. Moreover, the percentage of the inferred gene pairs consistent with databases by our method is generally higher than the LC method and similar to the PCC method. 
A significant number of the gene pairs inferred only by the TC method are process-identity or function-similarity pairs or have well-documented biological interactions, including 443 known protein interactions and some known cell-cycle-related regulatory interactions. It should be emphasized that the overlap of gene pairs detected by the three methods is normally not very high, indicating the necessity of combining the different methods when searching for functional associations of genes in time-series data. For a p-value threshold of 1E-5, the percentages of process-identity and function-similarity gene pairs among the shared part of the three methods reach 60.2% and 55.6%, respectively, providing a good basis for further experimental and functional study. Furthermore, the combined use of methods is important for inferring more complete regulatory circuits and networks, as exemplified in this study. Conclusion The TC method can significantly augment the current major methods for inferring functional linkages and biological networks, and is well suited to exploring temporal relationships of gene expression in time-series data. PMID:16478547
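As an illustration of the change-trend idea, the sketch below compares the signs of consecutive changes of two series (all function and variable names are ours). It deliberately omits the paper's dynamic-programming local alignment and statistical extraction steps, so it handles neither time shifts nor significance testing:

```python
import numpy as np

def change_trends(x):
    """Sign (+1/0/-1) of the change between consecutive time points."""
    return np.sign(np.diff(x))

def trend_agreement(x, y):
    """Fraction of consecutive intervals on which two series move the same way.
    (The published TC method additionally uses a dynamic-programming local
    alignment to allow time shifts; this sketch compares aligned intervals only.)"""
    tx, ty = change_trends(x), change_trends(y)
    return float(np.mean(tx == ty))

t = np.linspace(0, 4 * np.pi, 50)
a = np.sin(t)
b = np.sin(t + 0.1)        # slightly shifted copy -> mostly the same trends
c = np.sin(t + np.pi)      # inverted copy -> mostly opposite trends
print(trend_agreement(a, b), trend_agreement(a, c))
```

Because only the direction of change enters the score, shifted and inverted profiles such as `b` and `c` are separated cleanly even though their point-to-point values differ.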
Identifying hidden common causes from bivariate time series: a method using recurrence plots.
Hirata, Yoshito; Aihara, Kazuyuki
2010-01-01
We propose a method for inferring the existence of hidden common causes from observations of bivariate time series. We detect related time series by excessive simultaneous recurrences in the corresponding recurrence plots. We also use a noncoverage property of a recurrence plot by the other to deny the existence of a directional coupling. We apply the proposed method to real wind data.
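A minimal sketch of the joint-recurrence criterion above, with our own choice of thresholds and a shared random walk standing in for the hidden common cause (the paper's noncoverage test for ruling out directional coupling is not reproduced):

```python
import numpy as np

def recurrence_plot(x, eps):
    """Binary recurrence matrix: R[i, j] = 1 if |x[i] - x[j]| <= eps."""
    d = np.abs(x[:, None] - x[None, :])
    return (d <= eps).astype(int)

def joint_recurrence_rate(x, y, eps_x, eps_y):
    """Fraction of time-index pairs that recur simultaneously in both series.
    An excess over the product of the individual recurrence rates hints at a
    shared (possibly hidden) driver."""
    rx = recurrence_plot(x, eps_x)
    ry = recurrence_plot(y, eps_y)
    joint = (rx & ry).mean()
    expected = rx.mean() * ry.mean()  # what independence would predict
    return joint, expected

rng = np.random.default_rng(0)
z = np.cumsum(rng.normal(size=500))          # hidden common cause
x = z + rng.normal(scale=0.3, size=500)      # two noisy observations of z
y = z + rng.normal(scale=0.3, size=500)
joint, expected = joint_recurrence_rate(x, y, 1.0, 1.0)
print(joint > expected)  # excessive simultaneous recurrences
```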
Automated Bayesian model development for frequency detection in biological time series.
Granqvist, Emma; Oldroyd, Giles E D; Morris, Richard J
2011-06-24
A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domain, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends, this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlight the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data.
Biological time series often deviate significantly from the requirements of optimality for Fourier transformation. In this paper we present an alternative approach based on Bayesian inference. We show the value of placing spectral analysis in the framework of Bayesian inference and demonstrate how model comparison can automate this procedure.
PMID:21702910
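The frequency-inference viewpoint described above can be sketched with a grid search: for each trial frequency, fit a sinusoid by least squares and score it by its log-likelihood under an assumed noise level. This is a simplification of Bayesian Spectrum Analysis, which marginalizes amplitudes and noise analytically; note that nothing here requires uniformly spaced samples:

```python
import numpy as np

def frequency_posterior(t, d, freqs, sigma=1.0):
    """Unnormalized log-posterior over trial frequencies for a single-sinusoid
    model d(t) = a*cos(2*pi*f*t) + b*sin(2*pi*f*t) + noise, with (a, b)
    profiled out by least squares and a flat prior on f. A minimal sketch;
    full Bayesian Spectrum Analysis marginalizes amplitudes and noise level
    analytically instead of fixing sigma."""
    logp = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        X = np.column_stack([np.cos(2 * np.pi * f * t), np.sin(2 * np.pi * f * t)])
        coef, *_ = np.linalg.lstsq(X, d, rcond=None)
        ssr = np.sum((d - X @ coef) ** 2)
        logp[i] = -ssr / (2 * sigma ** 2)
    return logp

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 10, 120))          # non-uniform sampling is fine
d = np.sin(2 * np.pi * 1.3 * t) + 0.3 * rng.normal(size=len(t))
freqs = np.linspace(0.5, 3.0, 251)
f_map = freqs[np.argmax(frequency_posterior(t, d, freqs))]
print(round(f_map, 2))  # close to the true 1.3 Hz
```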
Time series association learning
Papcun, George J.
1995-01-01
An acoustic input is recognized from inferred articulatory movements output by a learned relationship between training acoustic waveforms and articulatory movements. The inferred movements are compared with template patterns prepared from training movements when the relationship was learned to regenerate an acoustic recognition. In a preferred embodiment, the acoustic articulatory relationships are learned by a neural network. Subsequent input acoustic patterns then generate the inferred articulatory movements for use with the templates. Articulatory movement data may be supplemented with characteristic acoustic information, e.g. relative power and high frequency data, to improve template recognition.
Modrák, Martin; Vohradský, Jiří
2018-04-13
Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq, combined with static binding experiments (e.g., ChIP-seq) or literature mining, may be used for inference of sigma factor regulatory networks. We introduce Genexpi: a tool to identify regulons of sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape, called CyDataseries, has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. Genexpi is a useful part of the gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.
Jewett, Ethan M.; Steinrücken, Matthias; Song, Yun S.
2016-01-01
Many approaches have been developed for inferring selection coefficients from time series data while accounting for genetic drift. These approaches have been motivated by the intuition that properly accounting for the population size history can significantly improve estimates of selective strengths. However, the improvement in inference accuracy that can be attained by modeling drift has not been characterized. Here, by comparing maximum likelihood estimates of selection coefficients that account for the true population size history with estimates that ignore drift by assuming allele frequencies evolve deterministically in a population of infinite size, we address the following questions: how much can modeling the population size history improve estimates of selection coefficients? How much can mis-inferred population sizes hurt inferences of selection coefficients? We conduct our analysis under the discrete Wright–Fisher model by deriving the exact probability of an allele frequency trajectory in a population of time-varying size and we replicate our results under the diffusion model. For both models, we find that ignoring drift leads to estimates of selection coefficients that are nearly as accurate as estimates that account for the true population history, even when population sizes are small and drift is high. This result is of interest because inference methods that ignore drift are widely used in evolutionary studies and can be many orders of magnitude faster than methods that account for population sizes. PMID:27550904
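The drift-free model the authors compare against is easy to state: under selection coefficient s, the favoured allele's frequency follows the deterministic recursion p' = p(1+s)/(1+ps). A hypothetical least-squares grid fit (a stand-in for the paper's likelihood-based estimators; names and noise model are ours) then recovers s from a noisy trajectory:

```python
import numpy as np

def deterministic_trajectory(p0, s, T):
    """Allele-frequency trajectory ignoring drift: each generation the
    favoured allele's frequency follows p' = p(1+s) / (1 + p*s)."""
    p = np.empty(T)
    p[0] = p0
    for g in range(1, T):
        p[g] = p[g - 1] * (1 + s) / (1 + p[g - 1] * s)
    return p

def estimate_s(obs, p0, grid):
    """Least-squares fit of s over a grid, a crude stand-in for the
    binomial-sampling likelihood an inference method would use."""
    errs = [np.sum((deterministic_trajectory(p0, s, len(obs)) - obs) ** 2)
            for s in grid]
    return grid[int(np.argmin(errs))]

rng = np.random.default_rng(2)
true = deterministic_trajectory(0.1, 0.05, 60)
obs = np.clip(true + rng.normal(scale=0.02, size=60), 0.0, 1.0)  # noisy samples
s_hat = estimate_s(obs, 0.1, np.linspace(0.0, 0.2, 201))
print(s_hat)
```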
Evaluation of recent GRACE monthly solution series with an ice sheet perspective
NASA Astrophysics Data System (ADS)
Horwath, Martin; Groh, Andreas
2016-04-01
GRACE monthly global gravity field solutions have undergone a remarkable evolution, leading to the latest (Release 5) series by CSR, GFZ, and JPL, to new series by other processing centers, such as ITSG and AIUB, as well as to efforts to derive combined solutions, particularly by the EGSIEM (European Gravity Service for Improved Emergency Management) project. For applications, such as GRACE inferences on ice sheet mass balance, the obvious question is on what GRACE solution series to base the assessment. Here we evaluate different GRACE solution series (including the ones listed above) in a unified framework. We concentrate on solutions expanded up to degree 90 or higher, since this is most appropriate for polar applications. We empirically assess the error levels in the spectral as well as in the spatial domain based on the month-to-month scatter in the high spherical harmonic degrees. We include empirical assessment of error correlations. We then apply all series to infer Antarctic and Greenland mass change time series and compare the results in terms of apparent signal content and noise level. We find that the ITSG solutions show lowest noise level in the high degrees (above 60). A preliminary combined solution from the EGSIEM project shows lowest noise in the degrees below 60. This virtue maps into the derived ice mass time series, where the EGSIEM-based results show the lowest noise in most cases. Meanwhile, there is no indication that any of the considered series systematically dampens actual geophysical signals.
Analogical and category-based inference: a theoretical integration with Bayesian causal models.
Holyoak, Keith J; Lee, Hee Seung; Lu, Hongjing
2010-11-01
A fundamental issue for theories of human induction is to specify constraints on potential inferences. For inferences based on shared category membership, an analogy, and/or a relational schema, it appears that the basic goal of induction is to make accurate and goal-relevant inferences that are sensitive to uncertainty. People can use source information at various levels of abstraction (including both specific instances and more general categories), coupled with prior causal knowledge, to build a causal model for a target situation, which in turn constrains inferences about the target. We propose a computational theory in the framework of Bayesian inference and test its predictions (parameter-free for the cases we consider) in a series of experiments in which people were asked to assess the probabilities of various causal predictions and attributions about a target on the basis of source knowledge about generative and preventive causes. The theory proved successful in accounting for systematic patterns of judgments about interrelated types of causal inferences, including evidence that analogical inferences are partially dissociable from overall mapping quality.
Narimani, Zahra; Beigy, Hamid; Ahmad, Ashar; Masoudi-Nejad, Ali; Fröhlich, Holger
2017-01-01
Inferring the structure of molecular networks from time-series protein or gene expression data provides valuable information about the complex biological processes of the cell. Causal network structure inference has been approached using different methods in the past. Most causal network inference techniques, such as Dynamic Bayesian Networks and ordinary differential equations, are limited by their computational complexity, which makes large-scale inference infeasible. This is especially true if a Bayesian framework is applied in order to deal with the unavoidable uncertainty about the correct model. We devise a novel Bayesian network reverse engineering approach using ordinary differential equations with the ability to include non-linearity. Besides modeling arbitrary, possibly combinatorial and time-dependent perturbations with unknown targets, one of our main contributions is the use of Expectation Propagation, an algorithm for approximate Bayesian inference over large-scale network structures in short computation time. We further explore the possibility of integrating prior knowledge into network inference. We evaluate the proposed model on DREAM4 and DREAM8 data and find it competitive with several state-of-the-art network inference methods.
Stein, Richard R; Bucci, Vanni; Toussaint, Nora C; Buffie, Charlie G; Rätsch, Gunnar; Pamer, Eric G; Sander, Chris; Xavier, João B
2013-01-01
The intestinal microbiota is a microbial ecosystem of crucial importance to human health. Understanding how the microbiota confers resistance against enteric pathogens and how antibiotics disrupt that resistance is key to the prevention and cure of intestinal infections. We present a novel method to infer microbial community ecology directly from time-resolved metagenomics. This method extends generalized Lotka-Volterra dynamics to account for external perturbations. Data from recent experiments on antibiotic-mediated Clostridium difficile infection is analyzed to quantify microbial interactions, commensal-pathogen interactions, and the effect of the antibiotic on the community. Stability analysis reveals that the microbiota is intrinsically stable, explaining how antibiotic perturbations and C. difficile inoculation can produce catastrophic shifts that persist even after removal of the perturbations. Importantly, the analysis suggests a subnetwork of bacterial groups implicated in protection against C. difficile. Due to its generality, our method can be applied to any high-resolution ecological time-series data to infer community structure and response to external stimuli.
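The extended generalized Lotka-Volterra model is linear in its parameters once growth is written in log space, which is what makes direct inference from time-resolved data tractable. A sketch under simplifying assumptions of our own (ordinary least squares instead of the regularized regression a real analysis would use, a noiseless two-species simulation, no perturbation active):

```python
import numpy as np

def infer_glv(x, u, dt):
    """Regression-based inference for the extended generalized Lotka-Volterra
    model  d(log x_i)/dt = mu_i + sum_j M_ij x_j + eps_i u(t).
    x: (T, n) abundances, u: (T,) external perturbation signal.
    Returns (mu, M, eps) estimated by ordinary least squares."""
    y = np.diff(np.log(x), axis=0) / dt            # (T-1, n) log-growth rates
    X = np.column_stack([np.ones(len(y)), x[:-1], u[:-1]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # (n+2, n) coefficients
    mu, M, eps = coef[0], coef[1:-1].T, coef[-1]
    return mu, M, eps

# simulate a 2-species community and recover its interaction signs
n, T, dt = 2, 400, 0.01
mu_true = np.array([1.0, 0.8])
M_true = np.array([[-1.0, -0.4], [-0.3, -1.2]])   # competitive interactions
x = np.empty((T, n)); x[0] = [0.5, 0.4]
u = np.zeros(T)                                   # no perturbation in this toy run
for t in range(1, T):
    growth = mu_true + x[t - 1] @ M_true.T
    x[t] = x[t - 1] * np.exp(growth * dt)
mu_hat, M_hat, _ = infer_glv(x, u, dt)
print(np.sign(M_hat) == np.sign(M_true))
```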
PMID:24348232
Modular evolution of the Cetacean vertebral column.
Buchholtz, Emily A
2007-01-01
Modular theory predicts that hierarchical developmental processes generate hierarchical phenotypic units that are capable of independent modification. The vertebral column is an overtly modular structure, and its rapid phenotypic transformation in cetacean evolution provides a case study for modularity. Terrestrial mammals have five morphologically discrete vertebral series that are now known to be coincident with Hox gene expression patterns. Here, I present the hypothesis that in living Carnivora and Artiodactyla, and by inference in the terrestrial ancestors of whales, the series are themselves components of larger precaudal and caudal modular units. Column morphology in a series of fossil and living whales is used to predict the type and sequence of developmental changes responsible for modification of that ancestral pattern. Developmental innovations inferred include independent meristic additions to the precaudal column in basal archaeocetes and basilosaurids, stepwise homeotic reduction of the sacral series in protocetids, and dissociation of the caudal series into anterior tail and fluke subunits in basilosaurids. The most dramatic change was the novel association of lumbar and anterior caudal vertebrae in a module that crosses the precaudal/caudal boundary. This large unit is defined by shared patterns of vertebral morphology, count, and size in all living whales (Neoceti).
Terry, Ellen L; France, Christopher R; Bartley, Emily J; Delventura, Jennifer L; Kerr, Kara L; Vincent, Ashley L; Rhudy, Jamie L
2011-09-01
Temporal summation of pain (TS-pain) is the progressive increase in pain ratings during a series of noxious stimulations. TS-pain has been used to make inferences about sensitization of spinal nociceptive processes; however, pain report can be biased, thereby leading to problems with this inference. Temporal summation of the nociceptive flexion reflex (TS-NFR, a physiological measure of spinal nociception) can potentially overcome report bias, but there have been few attempts (generally with small Ns) to standardize TS-NFR procedures. In this study, 50 healthy participants received 25 series of noxious electric stimulations to evoke TS-NFR and TS-pain. The goals were to: 1) determine the stimulation frequency that best elicits TS-NFR and reduces electromyogram (EMG) contamination from muscle tension, 2) determine the minimum number of stimulations per series before NFR summation asymptotes, 3) compare NFR definition intervals (90-150 ms vs. 70-150 ms post-stimulation), and 4) compare TS-pain and TS-NFR when different stimulation frequencies are used. Results indicated TS-NFR should be elicited by a series of three stimuli delivered at 2.0 Hz, and TS-NFR should be defined from a 70-150 ms post-stimulation scoring interval. Unfortunately, EMG contamination from muscle tension was greatest during 2.0 Hz series. Discrepancies were noted between TS-NFR and TS-pain, which raise concerns about using pain ratings to infer changes in spinal nociceptive processes. Finally, some individuals did not have reliable NFRs when the stimulation intensity was set at NFR threshold during TS-NFR testing; therefore, a higher intensity is needed. Implications of findings are discussed.
Kim, Jongrae; Bates, Declan G; Postlethwaite, Ian; Heslop-Harrison, Pat; Cho, Kwang-Hyun
2008-05-15
Inherent non-linearities in biomolecular interactions make the identification of network interactions difficult. One of the principal problems is that all methods based on the use of linear time-invariant models will have fundamental limitations in their capability to infer certain non-linear network interactions. Another difficulty is the multiplicity of possible solutions, since, for a given dataset, there may be many different possible networks which generate the same time-series expression profiles. A novel algorithm for the inference of biomolecular interaction networks from temporal expression data is presented. Linear time-varying models, which can represent a much wider class of time-series data than linear time-invariant models, are employed in the algorithm. From time-series expression profiles, the model parameters are identified by solving a non-linear optimization problem. In order to systematically reduce the set of possible solutions for the optimization problem, a filtering process is performed using a phase-portrait analysis with random numerical perturbations. The proposed approach has the advantages of not requiring the system to be in a stable steady state, of using time-series profiles which have been generated by a single experiment, and of allowing non-linear network interactions to be identified. The ability of the proposed algorithm to correctly infer network interactions is illustrated by its application to three examples: a non-linear model for cAMP oscillations in Dictyostelium discoideum, the cell-cycle data for Saccharomyces cerevisiae and a large-scale non-linear model of a group of synchronized Dictyostelium cells. The software used in this article is available from http://sbie.kaist.ac.kr/software
Causal discovery and inference: concepts and recent methodological advances.
Spirtes, Peter; Zhang, Kun
This paper aims to give a broad coverage of central concepts and principles involved in automated causal inference and emerging approaches to causal discovery from i.i.d. data and from time series. After reviewing concepts including manipulations, causal models, sample predictive modeling, causal predictive modeling, and structural equation models, we present the constraint-based approach to causal discovery, which relies on the conditional independence relationships in the data, and discuss the assumptions underlying its validity. We then focus on causal discovery based on structural equation models, in which a key issue is the identifiability of the causal structure implied by appropriately defined structural equation models: in the two-variable case, under what conditions (and why) is the causal direction between the two variables identifiable? We show that the independence between the error term and causes, together with appropriate structural constraints on the structural equation, makes it possible. Next, we report some recent advances in causal discovery from time series. Assuming that the causal relations are linear with non-Gaussian noise, we mention two problems which are traditionally difficult to solve, namely causal discovery from subsampled data and causal discovery in the presence of confounding time series. Finally, we list a number of open questions in the field of causal discovery and inference.
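The identifiability claim for linear structural equations with non-Gaussian noise can be demonstrated concretely: only in the true causal direction is the regression residual independent of the regressor. A sketch with a crude moment-based dependence score (kernel independence tests such as HSIC would be used in practice; all names are ours):

```python
import numpy as np

def dependence_score(a, b):
    """Crude nonlinear dependence measure between two series: correlations of
    each variable with a nonlinear transform (cube) of the other. Real
    implementations use kernel independence tests instead."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return abs(np.corrcoef(a ** 3, b)[0, 1]) + abs(np.corrcoef(a, b ** 3)[0, 1])

def causal_direction(x, y):
    """For a linear model with non-Gaussian noise, the regression residual is
    independent of the regressor only in the true causal direction."""
    r_xy = y - (np.dot(x, y) / np.dot(x, x)) * x   # residual of y ~ x
    r_yx = x - (np.dot(x, y) / np.dot(y, y)) * y   # residual of x ~ y
    return "x->y" if dependence_score(x, r_xy) < dependence_score(y, r_yx) else "y->x"

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, 5000)                       # non-Gaussian cause
y = 0.8 * x + 0.4 * rng.uniform(-1, 1, 5000)       # linear effect, uniform noise
print(causal_direction(x, y))
```

With Gaussian noise both directions yield independent-looking residuals, which is exactly why non-Gaussianity is what buys identifiability here.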
NASA Astrophysics Data System (ADS)
Schindler, Maike; Hußmann, Stephan; Nilsson, Per; Bakker, Arthur
2017-12-01
Negative numbers are among the first formalizations students encounter in their mathematics learning that clearly differ from out-of-school experiences. What has not sufficiently been addressed in previous research is the question of how students draw on their prior experiences when reasoning on negative numbers and how they infer from these experiences. This article presents results from an empirical study investigating sixth-grade students' reasoning and inferring from school-based and out-of-school experiences. In particular, it addresses the order relation, which deals with students' very first encounters with negative numbers. Here, students can reason in different ways, depending on the experiences they draw on. We study how students reason before a lesson series and how their reasoning is influenced through this lesson series where the number line and the context debts-and-assets are predominant. For grasping the reasoning's inferential and social nature and conducting in-depth analyses of two students' reasoning, we use an epistemological framework that is based on the philosophical theory of inferentialism. The results illustrate how the students infer their reasoning from out-of-school and from school-based experiences both before and after the lesson series. They reveal interesting phenomena not previously analyzed in the research on the order relation for integers.
Permutation entropy of finite-length white-noise time series.
Little, Douglas J; Kane, Deb M
2016-08-01
Permutation entropy (PE) is commonly used to discriminate complex structure from white noise in a time series. While the PE of white noise is well understood in the long time-series limit, analysis in the general case is currently lacking. Here the expectation value and variance of white-noise PE are derived as functions of the number of ordinal pattern trials, N, and the embedding dimension, D. It is demonstrated that the probability distribution of the white-noise PE converges to a χ^{2} distribution with D!-1 degrees of freedom as N becomes large. It is further demonstrated that the PE variance for an arbitrary time series can be estimated as the variance of a related metric, the Kullback-Leibler entropy (KLE), allowing the qualitative N≫D! condition to be recast as a quantitative estimate of the N required to achieve a desired PE calculation precision. Application of this theory to statistical inference is demonstrated in the case of an experimentally obtained noise series, where the probability of obtaining the observed PE value was calculated assuming a white-noise time series. Standard statistical inference can be used to draw conclusions whether the white-noise null hypothesis can be accepted or rejected. This methodology can be applied to other null hypotheses, such as discriminating whether two time series are generated from different complex system states.
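A plug-in computation of PE makes the limiting behaviour discussed above concrete: for white noise, PE approaches log(D!) as the number of ordinal-pattern trials N grows, while a fully ordered series gives zero. A minimal sketch (the χ² approximation and the KLE-based variance estimate are not reproduced):

```python
import math
from itertools import permutations
import numpy as np

def permutation_entropy(x, D):
    """Permutation entropy (in nats) from the relative frequencies of the
    ordinal patterns of D consecutive values; N = len(x) - D + 1 trials."""
    counts = {p: 0 for p in permutations(range(D))}
    for i in range(len(x) - D + 1):
        counts[tuple(np.argsort(x[i:i + D]).tolist())] += 1
    n = len(x) - D + 1
    probs = [c / n for c in counts.values() if c > 0]
    return -sum(p * math.log(p) for p in probs)

rng = np.random.default_rng(5)
white = rng.normal(size=20000)
trend = np.arange(20000.0)                  # fully ordered series
pe_white = permutation_entropy(white, 3)
pe_trend = permutation_entropy(trend, 3)
# white noise approaches log(D!) = log 6; a monotone series gives 0
print(pe_white, math.log(math.factorial(3)), pe_trend)
```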
A novel gene network inference algorithm using predictive minimum description length approach.
Chaitankar, Vijender; Ghosh, Preetam; Perkins, Edward J; Gong, Ping; Deng, Youping; Zhang, Chaoyang
2010-05-28
Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold which defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine tuning parameter is used as control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we proposed a new inference algorithm which incorporated mutual information (MI), conditional mutual information (CMI) and predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need of a user-specified fine tuning parameter. The performance of the proposed algorithm was evaluated using both synthetic time series data sets and a biological time series data set for the yeast Saccharomyces cerevisiae. The benchmark quantities precision and recall were used as performance measures. The results show that the proposed algorithm produced less false edges and significantly improved the precision, as compared to the existing algorithm. For further analysis the performance of the algorithms was observed over different sizes of data. We have proposed a new algorithm that implements the PMDL principle for inferring gene regulatory networks from time series DNA microarray data that eliminates the need of a fine tuning parameter. 
The evaluation results obtained from both synthetic and actual biological data sets show that the PMDL principle is effective in determining the MI threshold and the developed algorithm improves precision of gene regulatory network inference. Based on the sensitivity analysis of all tested cases, an optimal CMI threshold value has been identified. Finally it was observed that the performance of the algorithms saturates at a certain threshold of data size.
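The information-theoretic quantities involved are straightforward to estimate by binning; the sketch below scores a simulated regulator-target pair against an unrelated gene. Note that in the PMDL algorithm the MI threshold is chosen by description length rather than fixed by hand as it is here (data and names are our own illustration):

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in mutual information estimate (nats) from a 2-D histogram of two
    expression series. An edge would be declared when MI exceeds a threshold;
    PMDL determines that threshold automatically."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(6)
regulator = rng.normal(size=3000)
target = np.tanh(2 * regulator) + 0.2 * rng.normal(size=3000)  # driven gene
unrelated = rng.normal(size=3000)
print(mutual_information(regulator, target) > mutual_information(regulator, unrelated))
```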
NASA Astrophysics Data System (ADS)
Rehfeld, Kira; Goswami, Bedartha; Marwan, Norbert; Breitenbach, Sebastian; Kurths, Jürgen
2013-04-01
Statistical analysis of dependencies amongst paleoclimate data helps to draw inferences about the climatic processes they reflect. Three key challenges have to be addressed, however: the datasets are heterogeneous in time (i) and space (ii), and furthermore time itself is a variable that needs to be reconstructed, which (iii) introduces additional uncertainties. To address these issues in a flexible way we developed the paleoclimate network framework, inspired by the increasing application of complex networks in climate research. Nodes in the paleoclimate network represent a paleoclimate archive and an associated time series. Links between these nodes are assigned if the time series are significantly similar. The base of the paleoclimate network is therefore formed by linear and nonlinear estimators for Pearson correlation, mutual information and event synchronization, which quantify similarity from irregularly sampled time series. Age uncertainties are propagated into the final network analysis using time series ensembles which reflect the uncertainty. We discuss how spatial heterogeneity influences the results obtained from network measures, and demonstrate the power of the approach by inferring teleconnection variability of the Asian summer monsoon for the past 1000 years.
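For irregularly sampled archives, a Pearson-type similarity can be estimated by letting every pair of observations contribute with a Gaussian weight in the sampling-time difference, which is the idea behind the framework's similarity estimators. A sketch under our own toy setup (the bandwidth, sampling scheme and names are assumptions, and the mutual-information and event-synchronization estimators are not reproduced):

```python
import numpy as np

def gaussian_kernel_correlation(tx, x, ty, y, h):
    """Pearson-type correlation between two irregularly sampled series: every
    observation pair contributes, weighted by a Gaussian kernel of its
    sampling-time difference (bandwidth h)."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    dt = tx[:, None] - ty[None, :]
    w = np.exp(-0.5 * (dt / h) ** 2)
    return float(np.sum(w * np.outer(x, y)) / np.sum(w))

rng = np.random.default_rng(7)
t_true = np.linspace(0, 20, 2000)
tx = np.sort(rng.choice(t_true, 300, replace=False))   # two different
ty = np.sort(rng.choice(t_true, 250, replace=False))   # irregular samplings
x = np.sin(tx) + 0.1 * rng.normal(size=300)            # same underlying signal
y = np.sin(ty) + 0.1 * rng.normal(size=250)
r_hat = gaussian_kernel_correlation(tx, x, ty, y, h=0.25)
print(r_hat)  # high, despite the two series never sharing a time stamp
```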
Cardiovascular oscillations: in search of a nonlinear parametric model
NASA Astrophysics Data System (ADS)
Bandrivskyy, Andriy; Luchinsky, Dmitry; McClintock, Peter V.; Smelyanskiy, Vadim; Stefanovska, Aneta; Timucin, Dogan
2003-05-01
We suggest a fresh approach to the modeling of the human cardiovascular system. Taking advantage of a new Bayesian inference technique, able to deal with stochastic nonlinear systems, we show that one can estimate parameters for models of the cardiovascular system directly from measured time series. We present preliminary results of inference of parameters of a model of coupled oscillators from measured cardiovascular data addressing cardiorespiratory interaction. We argue that the inference technique offers a very promising modeling tool, one that can contribute significantly towards the solution of a long-standing challenge: the development of new diagnostic techniques based on noninvasive measurements.
Statistical Inference on Memory Structure of Processes and Its Applications to Information Theory
2016-05-12
...proved. Second, a statistical method is developed to estimate the memory depth of discrete-time and continuously-valued time series from a sample. (A practical algorithm to compute the estimator is a work in progress.) Third, finitely-valued spatial processes... Sponsor: U.S. Army Research Office, P.O. Box 12211, Research Triangle Park, NC 27709-2211. Keywords: mathematical statistics; time series; Markov chains; random...
Jewett, Ethan M; Steinrücken, Matthias; Song, Yun S
2016-11-01
Many approaches have been developed for inferring selection coefficients from time series data while accounting for genetic drift. These approaches have been motivated by the intuition that properly accounting for the population size history can significantly improve estimates of selective strengths. However, the improvement in inference accuracy that can be attained by modeling drift has not been characterized. Here, by comparing maximum likelihood estimates of selection coefficients that account for the true population size history with estimates that ignore drift by assuming allele frequencies evolve deterministically in a population of infinite size, we address the following questions: how much can modeling the population size history improve estimates of selection coefficients? How much can mis-inferred population sizes hurt inferences of selection coefficients? We conduct our analysis under the discrete Wright-Fisher model by deriving the exact probability of an allele frequency trajectory in a population of time-varying size and we replicate our results under the diffusion model. For both models, we find that ignoring drift leads to estimates of selection coefficients that are nearly as accurate as estimates that account for the true population history, even when population sizes are small and drift is high. This result is of interest because inference methods that ignore drift are widely used in evolutionary studies and can be many orders of magnitude faster than methods that account for population sizes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
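The deterministic, drift-free estimator discussed above can be illustrated with a haploid selection model. A minimal sketch, assuming the recursion p' = p(1+s)/(1+sp) and a simple least-squares grid search over s; it is not the authors' maximum likelihood machinery, and all names are illustrative.

```python
import numpy as np

def deterministic_trajectory(p0, s, T):
    """Haploid selection in an infinite population (no drift):
    p' = p(1+s) / (1 + s*p)."""
    p = np.empty(T); p[0] = p0
    for t in range(T - 1):
        p[t + 1] = p[t] * (1 + s) / (1 + s * p[t])
    return p

def estimate_s(traj, grid):
    """Least-squares fit of s, treating allele frequencies as deterministic."""
    errs = [np.sum((deterministic_trajectory(traj[0], s, len(traj)) - traj) ** 2)
            for s in grid]
    return grid[int(np.argmin(errs))]

true_s = 0.1
obs = deterministic_trajectory(0.05, true_s, 50)     # noiseless trajectory
s_hat = estimate_s(obs, np.linspace(0.0, 0.3, 301))  # grid step 0.001
```

On a noiseless trajectory the grid search recovers s up to the grid resolution; the paper's point is that this remains accurate even when the data were generated with drift.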
An Approximate Markov Model for the Wright-Fisher Diffusion and Its Application to Time Series Data.
Ferrer-Admetlla, Anna; Leuenberger, Christoph; Jensen, Jeffrey D; Wegmann, Daniel
2016-06-01
The joint and accurate inference of selection and demography from genetic data is considered a particularly challenging question in population genetics, since both processes may lead to very similar patterns of genetic diversity. However, additional information for disentangling these effects may be obtained by observing changes in allele frequencies over multiple time points. Such data are common in experimental evolution studies, as well as in the comparison of ancient and contemporary samples. Leveraging this information, however, has been computationally challenging, particularly when considering multilocus data sets. To overcome these issues, we introduce a novel, discrete approximation for diffusion processes, termed mean transition time approximation, which preserves the long-term behavior of the underlying continuous diffusion process. We then derive this approximation for the particular case of inferring selection and demography from time series data under the classic Wright-Fisher model and demonstrate that our approximation is well suited to describing allele trajectories through time, even when only a few states are used. We then develop a Bayesian inference approach to jointly infer the population size and locus-specific selection coefficients with high accuracy and further extend this model to also infer the rates of sequencing errors and mutations. We finally apply our approach to recent experimental data on the evolution of drug resistance in influenza virus, identifying likely targets of selection and finding evidence for much larger viral population sizes than previously reported. Copyright © 2016 by the Genetics Society of America.
Bayesian Inference for Functional Dynamics Exploring in fMRI Data.
Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing
2016-01-01
This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more refined Bayesian inference models will emerge and play increasingly important roles in modeling brain functions in the years to come.
Working Memory and Processing Efficiency in Children's Reasoning.
ERIC Educational Resources Information Center
Halford, Graeme S.; And Others
A series of studies was conducted to determine whether children's reasoning is capacity-limited and whether any such capacity, if it exists, is based on the working memory system. An N-term series (transitive inference) was used as the primary task in an interference paradigm. A concurrent short-term memory load was employed as the secondary task.…
Empirical intrinsic geometry for nonlinear modeling and time series filtering.
Talmon, Ronen; Coifman, Ronald R
2013-07-30
In this paper, we present a method for time series analysis based on empirical intrinsic geometry (EIG). EIG enables one to reveal the low-dimensional parametric manifold as well as to infer the underlying dynamics of high-dimensional time series. By incorporating concepts of information geometry, this method extends existing geometric analysis tools to support stochastic settings and parametrizes the geometry of empirical distributions. However, the statistical models are not required as priors; hence, EIG may be applied to a wide range of real signals without existing definitive models. We show that the inferred model is noise-resilient and invariant under different observation and instrumental modalities. In addition, we show that it can be extended efficiently to newly acquired measurements in a sequential manner. These two advantages enable us to revisit the Bayesian approach and incorporate empirical dynamics and intrinsic geometry into a nonlinear filtering framework. We show applications to nonlinear and non-Gaussian tracking problems as well as to acoustic signal localization.
Spectral decompositions of multiple time series: a Bayesian non-parametric approach.
Macaro, Christian; Prado, Raquel
2014-01-01
We consider spectral decompositions of multiple time series that arise in studies where the interest lies in assessing the influence of two or more factors. We write the spectral density of each time series as a sum of the spectral densities associated to the different levels of the factors. We then use Whittle's approximation to the likelihood function and follow a Bayesian non-parametric approach to obtain posterior inference on the spectral densities based on Bernstein-Dirichlet prior distributions. The prior is strategically important as it carries identifiability conditions for the models and allows us to quantify our degree of confidence in such conditions. A Markov chain Monte Carlo (MCMC) algorithm for posterior inference within this class of frequency-domain models is presented. We illustrate the approach by analyzing simulated and real data via spectral one-way and two-way models. In particular, we present an analysis of functional magnetic resonance imaging (fMRI) brain responses measured in individuals who participated in a designed experiment to study pain perception in humans.
Global Warming Estimation from MSU
NASA Technical Reports Server (NTRS)
Prabhakara, C.; Iacovazzi, Robert, Jr.
1999-01-01
In this study, we have developed time series of global temperature from 1980-97 based on the Microwave Sounding Unit (MSU) Ch 2 (53.74 GHz) observations taken from polar-orbiting NOAA operational satellites. In order to create these time series, systematic errors (approx. 0.1 K) in the Ch 2 data arising from inter-satellite differences are removed objectively. On the other hand, smaller systematic errors (approx. 0.03 K) in the data due to orbital drift of each satellite cannot be removed objectively. Such errors are expected to remain in the time series and leave an uncertainty in the inferred global temperature trend. With the help of a statistical method, the error in the MSU inferred global temperature trend resulting from orbital drifts and residual inter-satellite differences of all satellites is estimated to be 0.06 K/decade. Incorporating this error, our analysis shows that the global temperature increased at a rate of 0.13 +/- 0.06 K/decade during 1980-97.
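A figure like 0.13 +/- 0.06 K/decade combines an OLS trend fit with an error budget for orbital drift. A minimal sketch of the OLS part only (the drift contribution is not modeled here); the synthetic data and function name are illustrative.

```python
import numpy as np

def decadal_trend(years, temps):
    """OLS trend in K per decade with its standard error."""
    t = np.asarray(years, dtype=float)
    y = np.asarray(temps, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)          # OLS covariance matrix
    return 10 * beta[1], 10 * np.sqrt(cov[1, 1])   # per year -> per decade

rng = np.random.default_rng(1)
years = np.arange(1980, 1998)
temps = 0.013 * (years - 1980) + rng.normal(0.0, 0.05, len(years))
trend, se = decadal_trend(years, temps)
```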
Palamara, Gian Marco; Childs, Dylan Z; Clements, Christopher F; Petchey, Owen L; Plebani, Marco; Smith, Matthew J
2014-01-01
Understanding and quantifying the temperature dependence of population parameters, such as intrinsic growth rate and carrying capacity, is critical for predicting ecological responses to environmental change. Many studies provide empirical estimates of such temperature dependencies, but a thorough investigation of the methods used to infer them has not yet been performed. We created artificial population time series using a stochastic logistic model parameterized with the Arrhenius equation, so that activation energy drives the temperature dependence of population parameters. We simulated different experimental designs and used different inference methods, varying the likelihood functions and other aspects of the parameter estimation methods. Finally, we applied the best performing inference methods to real data for the species Paramecium caudatum. The relative error of the estimates of activation energy varied between 5% and 30%. The fraction of habitat sampled played the most important role in determining the relative error; sampling at least 1% of the habitat kept it below 50%. We found that methods that simultaneously use all time series data (direct methods) and methods that estimate population parameters separately for each temperature (indirect methods) are complementary. Indirect methods provide a clearer insight into the shape of the functional form describing the temperature dependence of population parameters; direct methods enable a more accurate estimation of the parameters of such functional forms. Using both methods, we found that the growth rate and carrying capacity of Paramecium caudatum scale with temperature according to different activation energies. Our study shows how careful choice of experimental design and inference methods can increase the accuracy of the inferred relationships between temperature and population parameters. The comparison of estimation methods provided here can increase the accuracy of model predictions, with important implications for understanding and predicting the effects of temperature on the dynamics of populations. PMID:25558365
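The simulation design described above, logistic growth with an Arrhenius-scaled rate, can be sketched as follows. This assumes Poisson demographic noise and illustrative parameter values; it is not the authors' estimation code, and all names are hypothetical.

```python
import numpy as np

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius(value_ref, E, temp, temp_ref=293.15):
    """Arrhenius scaling of a rate with activation energy E (in eV)."""
    return value_ref * np.exp(-E / BOLTZMANN_EV * (1 / temp - 1 / temp_ref))

def stochastic_logistic(n0, r, K, steps, rng):
    """Discrete-time logistic growth with Poisson demographic noise."""
    n = np.empty(steps); n[0] = n0
    for t in range(steps - 1):
        mean = n[t] + r * n[t] * (1 - n[t] / K)
        n[t + 1] = rng.poisson(max(mean, 0.0))
    return n

rng = np.random.default_rng(2)
r = arrhenius(0.5, 0.6, 298.15)           # growth rate at 25 C, E = 0.6 eV
traj = stochastic_logistic(10, r, 1000, 100, rng)
```

Trajectories like `traj`, simulated across a range of temperatures, are the raw material on which direct and indirect inference methods can then be compared.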
Detecting dynamic causal inference in nonlinear two-phase fracture flow
NASA Astrophysics Data System (ADS)
Faybishenko, Boris
2017-08-01
Identifying dynamic causal inference involved in flow and transport processes in complex fractured-porous media is generally a challenging task, because nonlinear and chaotic variables may be positively coupled or correlated for some periods of time, but can then become spontaneously decoupled or non-correlated. In a 2002 paper (Faybishenko, 2002), the author performed a nonlinear dynamical and chaotic analysis of time-series data obtained from the fracture flow experiment conducted by Persoff and Pruess (1995), and, based on the visual examination of time series data, hypothesized that the observed pressure oscillations at both inlet and outlet edges of the fracture result from a superposition of both forward and return waves of pressure propagation through the fracture. In the current paper, the author explores the application of a combination of methods for detecting nonlinear chaotic behavior along with the multivariate Granger Causality (G-causality) time series test. Based on the G-causality test, the author infers that his hypothesis is correct, and presents a causation loop diagram of the spatial-temporal distribution of gas, liquid, and capillary pressures measured at the inlet and outlet of the fracture. The causal modeling approach can be used for the analysis of other hydrological processes, for example, infiltration and pumping tests in heterogeneous subsurface media, and climatic processes, for example, to find correlations between various meteorological parameters, such as temperature, solar radiation, barometric pressure, etc.
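A bivariate G-causality test of the kind used above reduces to comparing a restricted autoregression (own lags only) against an unrestricted one that adds lags of the candidate cause. A hedged sketch on synthetic data (the paper's analysis concerns measured pressure series; names and parameters here are illustrative):

```python
import numpy as np

def granger_f(x, y, p=2):
    """F statistic for whether p lags of y improve a p-lag
    autoregression of x (bivariate Granger causality)."""
    n = len(x)
    target = x[p:]
    lags_x = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
    lags_y = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
    ones = np.ones((n - p, 1))
    X_r = np.hstack([ones, lags_x])            # restricted: own lags only
    X_u = np.hstack([ones, lags_x, lags_y])    # unrestricted: adds y lags
    rss = lambda X: np.sum((target - X @ np.linalg.lstsq(X, target, rcond=None)[0]) ** 2)
    rss_r, rss_u = rss(X_r), rss(X_u)
    dof = n - p - X_u.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

rng = np.random.default_rng(3)
n = 500
y = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.3 * x[t - 1] + 0.8 * y[t - 1] + 0.1 * rng.normal()
z = rng.normal(size=n)                         # independent control series
f_causal, f_null = granger_f(x, y), granger_f(x, z)
```

Here y strongly drives x, so `f_causal` is large, while the independent control series yields an F near 1.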
State Space Model with hidden variables for reconstruction of gene regulatory networks.
Wu, Xi; Li, Peng; Wang, Nan; Gong, Ping; Perkins, Edward J; Deng, Youping; Zhang, Chaoyang
2011-01-01
State Space Model (SSM) is a relatively new approach to inferring gene regulatory networks. It requires less computational time than Dynamic Bayesian Networks (DBN). There are two types of variables in the linear SSM, observed variables and hidden variables. SSM uses an iterative method, namely Expectation-Maximization, to infer regulatory relationships from microarray datasets. The hidden variables cannot be directly observed from experiments. How to determine the number of hidden variables has a significant impact on the accuracy of network inference. In this study, we used SSM to infer gene regulatory networks (GRNs) from synthetic time series datasets, investigated Bayesian Information Criterion (BIC) and Principal Component Analysis (PCA) approaches to determining the number of hidden variables in SSM, and evaluated the performance of SSM in comparison with DBN. True GRNs and synthetic gene expression datasets were generated using GeneNetWeaver. Both DBN and linear SSM were used to infer GRNs from the synthetic datasets. The inferred networks were compared with the true networks. Our results show that inference precision varied with the number of hidden variables. For some regulatory networks, the inference precision of DBN was higher but SSM performed better in other cases. Although the overall performance of the two approaches is comparable, SSM is much faster and capable of inferring much larger networks than DBN. This study provides useful information in handling the hidden variables and improving the inference precision.
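The PCA-based choice of the number of hidden variables can be sketched as a variance-explained rule. A simplified illustration, assuming a 90% cumulative-variance threshold; the study's exact criterion may differ, and all names are illustrative.

```python
import numpy as np

def n_hidden_by_pca(data, var_explained=0.9):
    """Pick the number of hidden variables as the smallest number of
    principal components whose cumulative variance exceeds a threshold."""
    centered = data - data.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)    # singular values
    ratios = np.cumsum(s ** 2) / np.sum(s ** 2)      # cumulative variance
    return int(np.searchsorted(ratios, var_explained) + 1)

# Synthetic expression data driven by two hidden regulators.
rng = np.random.default_rng(4)
T, genes = 100, 20
factors = rng.normal(size=(T, 2))
loadings = np.vstack([np.ones(genes),
                      np.concatenate([np.ones(genes // 2), -np.ones(genes // 2)])])
expr = factors @ loadings + 0.05 * rng.normal(size=(T, genes))
k = n_hidden_by_pca(expr)
```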
Conditionals and inferential connections: A hypothetical inferential theory.
Douven, Igor; Elqayam, Shira; Singmann, Henrik; van Wijnbergen-Huitink, Janneke
2018-03-01
Intuition suggests that for a conditional to be evaluated as true, there must be some kind of connection between its component clauses. In this paper, we formulate and test a new psychological theory to account for this intuition. We combined previous semantic and psychological theorizing to propose that the key to the intuition is a relevance-driven, satisficing-bounded inferential connection between antecedent and consequent. To test our theory, we created a novel experimental paradigm in which participants were presented with a soritical series of objects, notably colored patches (Experiments 1 and 4) and spheres (Experiment 2), or both (Experiment 3), and were asked to evaluate related conditionals embodying non-causal inferential connections (such as "If patch number 5 is blue, then so is patch number 4"). All four experiments displayed a unique response pattern, in which (largely determinate) responses were sensitive to parameters determining inference strength, as well as to consequent position in the series, in a way analogous to belief bias. Experiment 3 showed that this guaranteed relevance can be suppressed, with participants reverting to the defective conditional. Experiment 4 showed that this pattern can be partly explained by a measure of inference strength. This pattern supports our theory's "principle of relevant inference" and "principle of bounded inference," highlighting the dual processing characteristics of the inferential connection. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Predictive minimum description length principle approach to inferring gene regulatory networks.
Chaitankar, Vijender; Zhang, Chaoyang; Ghosh, Preetam; Gong, Ping; Perkins, Edward J; Deng, Youping
2011-01-01
Reverse engineering of gene regulatory networks using information theory models has received much attention due to its simplicity, low computational cost, and capability of inferring large networks. One of the major problems with information theory models is to determine the threshold that defines the regulatory relationships between genes. The minimum description length (MDL) principle has been implemented to overcome this problem. The description length of the MDL principle is the sum of model length and data encoding length. A user-specified fine-tuning parameter is used as a control mechanism between model and data encoding, but it is difficult to find the optimal parameter. In this work, we propose a new inference algorithm that incorporates mutual information (MI), conditional mutual information (CMI), and predictive minimum description length (PMDL) principle to infer gene regulatory networks from DNA microarray data. In this algorithm, the information theoretic quantities MI and CMI determine the regulatory relationships between genes and the PMDL principle method attempts to determine the best MI threshold without the need for a user-specified fine-tuning parameter. The performance of the proposed algorithm is evaluated using both synthetic time series data sets and a biological time series data set (Saccharomyces cerevisiae). The results show that the proposed algorithm produced fewer false edges and significantly improved the precision when compared to the existing MDL algorithm.
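The first stage of such information-theoretic inference, scoring pairs of genes by MI, can be sketched with a plug-in histogram estimator. This illustrates MI scoring only; the PMDL-based threshold selection that is the paper's contribution is not reproduced, and names are illustrative.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of mutual information, in nats."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0                          # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(5)
n = 2000
a = rng.normal(size=n)
b = a + 0.3 * rng.normal(size=n)   # "regulated by" a
c = rng.normal(size=n)             # unrelated gene
mi_ab, mi_ac = mutual_information(a, b), mutual_information(a, c)
```

A network method then keeps the edge a-b and discards a-c; choosing that cutoff without a hand-tuned parameter is exactly what the PMDL step addresses.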
Lack of bedrock grain size influence on the soil production rate
NASA Astrophysics Data System (ADS)
Gontier, Adrien; Rihs, Sophie; Chabaux, Francois; Lemarchand, Damien; Pelt, Eric; Turpault, Marie-Pierre
2015-10-01
Our study addresses the role played by bedrock grain size in soil formation rates. U- and Th-series disequilibria were measured in two soil profiles developed from two different facies of the same bedrock, i.e., fine and coarse grain size granites, in the geomorphically flat landscape of the experimental Breuil-Chenue forest site, Morvan, France. The U- and Th-series disequilibria of soil layers and the inferred soil formation rate (1-2 mm ky⁻¹) are nearly identical along the two profiles despite differences in bedrock grain size, variable weathering states and a significant redistribution of U and Th from the uppermost soil layers. This indicates that the soil production rate is more affected by regional geomorphology than by the underlying bedrock texture. Such a production rate inferred from residual soil minerals integrated over the age of the soil is consistent with the flat and slowly eroding geomorphic landscape of the study site. It also compares well to the rate inferred from dissolved solutes integrated over the shorter time scale of solute transport from granitic and basaltic watersheds under similar climates. However, it is significantly lower than the denudation or soil formation rates previously reported from either cosmogenic isotope or U-series measurements from similar climates and lithologies. Our results highlight the particularly low soil production rates of flat terrains in temperate climates. Moreover, they provide evidence that the reactions of mineral weathering actually take place in horizons deeper than 1 m, while a chemical steady state of both concentrations and U-series disequilibria is established in the uppermost soil layers, i.e., above ∼70 cm depth. In such cases, the use of soil surface horizons for determining weathering rates is precluded and illustrates the need to focus instead on the deepest soil horizons.
Introgression Makes Waves in Inferred Histories of Effective Population Size.
Hawks, John
2017-01-01
Human populations have a complex history of introgression and of changing population size. Human genetic variation has been affected by both these processes, so inference of past population size depends upon the pattern of gene flow and introgression among past populations. One remarkable aspect of human population history as inferred from genetics is a consistent "wave" of larger effective population sizes, found in both African and non-African populations, that appears to reflect events prior to the last 100,000 years. I carried out a series of simulations to investigate how introgression and gene flow from genetically divergent ancestral populations affect the inference of ancestral effective population size. Both introgression and gene flow from an extinct, genetically divergent population consistently produce a wave in the history of inferred effective population size. The time and amplitude of the wave reflect the time of origin of the genetically divergent ancestral populations and the strength of introgression or gene flow. These results demonstrate that even small fractions of introgression or gene flow from ancient populations may have visible effects on the inference of effective population size.
Identification of Boolean Network Models From Time Series Data Incorporating Prior Knowledge.
Leifeld, Thomas; Zhang, Zhihua; Zhang, Ping
2018-01-01
Motivation: Mathematical models take an important place in science and engineering. A model can help scientists to explain dynamic behavior of a system and to understand the functionality of system components. Since the length of a time series and the number of replicates are limited by the cost of experiments, Boolean networks, as a structurally simple and parameter-free logical model for gene regulatory networks, have attracted the interest of many scientists. In order to fit the biological context and to lower the data requirements, biological prior knowledge is taken into consideration during the inference procedure. In the literature, the existing identification approaches can only deal with a subset of possible types of prior knowledge. Results: We propose a new approach to identify Boolean networks from time series data incorporating prior knowledge, such as partial network structure, canalizing property, and positive and negative unateness. Using the vector form of Boolean variables and applying a generalized matrix multiplication called the semi-tensor product (STP), each Boolean function can be equivalently converted into a matrix expression. Based on this, the identification problem is reformulated as an integer linear programming problem to reveal the system matrix of the Boolean model in a computationally efficient way, whose dynamics are consistent with the important dynamics captured in the data. By using prior knowledge the number of candidate functions can be reduced during the inference. Hence, identification incorporating prior knowledge is especially suitable for the case of small size time series data and data without sufficient stimuli. The proposed approach is illustrated with the help of a biological model of the network of oxidative stress response.
Conclusions: The combination of efficient reformulation of the identification problem with the possibility to incorporate various types of prior knowledge enables the application of computational model inference to systems with limited amount of time series data. The general applicability of this methodological approach makes it suitable for a variety of biological systems and of general interest for biological and medical research.
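The effect of prior knowledge on the candidate-function space can be illustrated without the semi-tensor product machinery: brute-force enumeration over 2-input Boolean functions, keeping those consistent with observed transitions and, optionally, positive unateness in a chosen input. A toy sketch, not the authors' integer linear programming formulation.

```python
from itertools import product

def consistent_functions(transitions, unate_positive_in=None):
    """Enumerate 2-input Boolean truth tables consistent with observed
    (input_pair -> output) transitions; optionally keep only functions
    that are positive unate (monotonically increasing) in input k."""
    inputs = list(product([0, 1], repeat=2))
    candidates = []
    for table in product([0, 1], repeat=4):        # 16 possible functions
        f = dict(zip(inputs, table))
        if any(f[i] != o for i, o in transitions):
            continue                               # contradicts the data
        if unate_positive_in is not None:
            k = unate_positive_in
            ok = all(f[i] <= f[tuple(1 if j == k else v for j, v in enumerate(i))]
                     for i in inputs if i[k] == 0)
            if not ok:
                continue                           # violates prior knowledge
        candidates.append(table)
    return candidates

# Two observed transitions leave several candidate functions; adding the
# prior that the function is positive unate in the second input narrows
# the set further (the OR function remains admissible).
obs = [((0, 0), 0), ((1, 0), 1)]
free = consistent_functions(obs)
constrained = consistent_functions(obs, unate_positive_in=1)
```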
Meyer, Patrick E; Lafitte, Frédéric; Bontempi, Gianluca
2008-10-29
This paper presents the R/Bioconductor package minet (version 1.1.6) which provides a set of functions to infer mutual information networks from a dataset. Once fed with a microarray dataset, the package returns a network where nodes denote genes, edges model statistical dependencies between genes, and the weight of an edge quantifies the statistical evidence of a specific (e.g., transcriptional) gene-to-gene interaction. Four different entropy estimators are made available in the package minet (empirical, Miller-Madow, Schurmann-Grassberger and shrink) as well as four different inference methods, namely relevance networks, ARACNE, CLR and MRNET. Also, the package integrates accuracy assessment tools, like F-scores, PR-curves and ROC-curves in order to compare the inferred network with a reference one. The package minet provides a series of tools for inferring transcriptional networks from microarray data. It is freely available from the Comprehensive R Archive Network (CRAN) as well as from the Bioconductor website.
Comparison Groups in Short Interrupted Time-Series: An Illustration Evaluating No Child Left Behind
ERIC Educational Resources Information Center
Wong, Manyee; Cook, Thomas D.; Steiner, Peter M.
2009-01-01
Interrupted time-series (ITS) are often used to assess the causal effect of a planned or even unplanned shock introduced into an on-going process. The pre-intervention slope is supposed to index the causal counterfactual, and deviations from it in mean, slope or variance are used to indicate an effect. However, a secure causal inference is only…
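A short interrupted time-series analysis is commonly implemented as a segmented regression with level-change and slope-change terms, where the pre-intervention slope supplies the counterfactual. A minimal sketch on synthetic data (the comparison-group aspect that is the article's focus is not modeled; all names are illustrative):

```python
import numpy as np

def its_fit(t, y, t0):
    """Segmented OLS for an interrupted time series:
    y = b0 + b1*t + b2*D + b3*(t - t0)*D, with D = 1 after the shock.
    b2 estimates the level change, b3 the slope change at t0."""
    D = (t >= t0).astype(float)
    X = np.column_stack([np.ones_like(t), t, D, (t - t0) * D])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(6)
t = np.arange(40.0)
y = 1.0 + 0.2 * t + rng.normal(0.0, 0.3, 40)
y[t >= 20] += 3.0               # level shift of 3 at t0 = 20
beta = its_fit(t, y, 20)
```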
Inference for local autocorrelations in locally stationary models.
Zhao, Zhibiao
2015-04-01
For non-stationary processes, the time-varying correlation structure provides useful insights into the underlying model dynamics. We study estimation and inferences for local autocorrelation process in locally stationary time series. Our constructed simultaneous confidence band can be used to address important hypothesis testing problems, such as whether the local autocorrelation process is indeed time-varying and whether the local autocorrelation is zero. In particular, our result provides an important generalization of the R function acf() to locally stationary Gaussian processes. Simulation studies and two empirical applications are developed. For the global temperature series, we find that the local autocorrelations are time-varying and have a "V" shape during 1910-1960. For the S&P 500 index, we conclude that the returns satisfy the efficient-market hypothesis whereas the magnitudes of returns show significant local autocorrelations.
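The local autocorrelation process can be explored with a simple sliding-window estimator, a crude local analogue of R's acf(); the paper's simultaneous confidence bands are a separate, more delicate construction. A sketch with illustrative names:

```python
import numpy as np

def local_acf1(x, window=50):
    """Lag-1 autocorrelation estimated in a sliding window."""
    out = []
    for start in range(len(x) - window):
        seg = x[start:start + window]
        out.append(np.corrcoef(seg[:-1], seg[1:])[0, 1])
    return np.array(out)

# AR(1) process whose coefficient drops from 0.8 to 0 halfway through.
rng = np.random.default_rng(7)
n = 600
phi = np.where(np.arange(n) < n // 2, 0.8, 0.0)
x = np.empty(n); x[0] = 0.0
for t in range(1, n):
    x[t] = phi[t] * x[t - 1] + rng.normal()
rho = local_acf1(x)
first_half = rho[:200].mean()    # windows fully inside the phi = 0.8 regime
second_half = rho[350:].mean()   # windows fully inside the phi = 0 regime
```

The estimated local autocorrelation tracks the change, which is exactly the kind of time-variation the hypothesis tests above are designed to detect.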
Martian Gullies and Groundwater: A Series of Unfortunate Exceptions
NASA Technical Reports Server (NTRS)
Treiman, A. H.
2005-01-01
Gullies are commonly inferred to represent debris flows, lubricated and carried by liquid water that flowed from underground. The inference of groundwater, based principally on the apparent initiation of gullies at specific bedrock layers, has not been considered for consistency with local geology. Here, I examine gully occurrences for the presence of impermeable layers (aquicludes) in the subsurface, for evidence that those layers do not tilt away from the gully-bearing walls, and for evidence that liquid water could have been available at or above the gully elevations.
Fitting Flux Ropes to a Global MHD Solution: A Comparison of Techniques. Appendix 1
NASA Technical Reports Server (NTRS)
Riley, Pete; Linker, J. A.; Lionello, R.; Mikic, Z.; Odstrcil, D.; Hidalgo, M. A.; Cid, C.; Hu, Q.; Lepping, R. P.; Lynch, B. J.
2004-01-01
Flux rope fitting (FRF) techniques are an invaluable tool for extracting information about the properties of a subclass of CMEs in the solar wind. However, it has proven difficult to assess their accuracy since the underlying global structure of the CME cannot be independently determined from the data. In contrast, large-scale MHD simulations of CME evolution can provide both a global view as well as localized time series at specific points in space. In this study we apply 5 different fitting techniques to 2 hypothetical time series derived from MHD simulation results. Independent teams performed the analysis of the events in "blind tests", for which no information, other than the time series, was provided. From the results, we infer the following: (1) Accuracy decreases markedly with increasingly glancing encounters; (2) Correct identification of the boundaries of the flux rope can be a significant limiter; and (3) Results from techniques that infer global morphology must be viewed with caution. In spite of these limitations, FRF techniques remain a useful tool for describing in situ observations of flux rope CMEs.
Statistical inference of seabed sound-speed structure in the Gulf of Oman Basin.
Sagers, Jason D; Knobles, David P
2014-06-01
Addressed is the statistical inference of the sound-speed depth profile of a thick soft seabed from broadband sound propagation data recorded in the Gulf of Oman Basin in 1977. The acoustic data are in the form of time series signals recorded on a sparse vertical line array and generated by explosive sources deployed along a 280 km track. The acoustic data offer a unique opportunity to study a deep-water bottom-limited thickly sedimented environment because of the large number of time series measurements, very low seabed attenuation, and auxiliary measurements. A maximum entropy method is employed to obtain a conditional posterior probability distribution (PPD) for the sound-speed ratio and the near-surface sound-speed gradient. The multiple data samples allow for a determination of the average error constraint value required to uniquely specify the PPD for each data sample. Two complicating features of the statistical inference study are addressed: (1) the need to develop an error function that can both utilize the measured multipath arrival structure and mitigate the effects of data errors and (2) the effect of small bathymetric slopes on the structure of the bottom interacting arrivals.
Apes are intuitive statisticians.
Rakoczy, Hannes; Clüver, Annette; Saucke, Liane; Stoffregen, Nicole; Gräbener, Alice; Migura, Judith; Call, Josep
2014-04-01
Inductive learning and reasoning, as we use it both in everyday life and in science, is characterized by flexible inferences based on statistical information: inferences from populations to samples and vice versa. Many forms of such statistical reasoning have been found to develop late in human ontogeny, depending on formal education and language, and to be fragile even in adults. Revolutionary new research, however, suggests that even preverbal human infants make use of intuitive statistics. Here, we conducted the first investigation of such intuitive statistical reasoning with non-human primates. In a series of 7 experiments, bonobos, chimpanzees, gorillas, and orangutans drew flexible statistical inferences from populations to samples. These inferences, furthermore, were truly based on statistical information regarding the relative frequency distributions in a population, and not on absolute frequencies. Intuitive statistics in its most basic form is thus an evolutionarily more ancient rather than a uniquely human capacity. Copyright © 2014 Elsevier B.V. All rights reserved.
Inferring time derivatives including cell growth rates using Gaussian processes
NASA Astrophysics Data System (ADS)
Swain, Peter S.; Stevenson, Keiran; Leary, Allen; Montano-Gutierrez, Luis F.; Clark, Ivan B. N.; Vogel, Jackie; Pilizota, Teuta
2016-12-01
Often the time derivative of a measured variable is of as much interest as the variable itself. For a growing population of biological cells, for example, the population's growth rate is typically more important than its size. Here we introduce a non-parametric method to infer first and second time derivatives as a function of time from time-series data. Our approach is based on Gaussian processes and applies to a wide range of data. In tests, the method is at least as accurate as others, but has several advantages: it estimates errors both in the inference and in any summary statistics, such as lag times, and allows interpolation with the corresponding error estimation. As illustrations, we infer growth rates of microbial cells, the rate of assembly of an amyloid fibril and both the speed and acceleration of two separating spindle pole bodies. Our algorithm should thus be broadly applicable.
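The core of this approach, reading a derivative off a Gaussian-process posterior as a linear functional of the data, can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a squared-exponential kernel with hand-picked hyperparameters and omits the error estimates the paper provides.

```python
import numpy as np

def rbf(a, b, ell=1.0, sf=1.0):
    # Squared-exponential kernel k(a, b) = sf^2 * exp(-(a - b)^2 / (2 * ell^2)).
    d = a[:, None] - b[None, :]
    return sf**2 * np.exp(-0.5 * (d / ell) ** 2)

def gp_derivative_mean(xs, ys, xstar, ell=1.0, sf=1.0, noise=1e-4):
    # Posterior mean of f'(x*) is  d/dx* k(x*, X) @ K^{-1} y  under a GP prior.
    K = rbf(xs, xs, ell, sf) + noise * np.eye(len(xs))
    alpha = np.linalg.solve(K, ys)
    # Derivative of the RBF kernel with respect to its first argument.
    dk = -sf**2 * (xstar - xs) / ell**2 * np.exp(-0.5 * ((xstar - xs) / ell) ** 2)
    return dk @ alpha

xs = np.linspace(0.0, 6.0, 60)
ys = np.sin(xs)
print(gp_derivative_mean(xs, ys, 2.0))  # close to cos(2.0) ≈ -0.416
```

The same linear-algebra step, with the kernel differentiated twice, yields second derivatives.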
Scale Mixture Models with Applications to Bayesian Inference
NASA Astrophysics Data System (ADS)
Qin, Zhaohui S.; Damien, Paul; Walker, Stephen
2003-11-01
Scale mixtures of uniform distributions are used to model non-normal data in time series and econometrics in a Bayesian framework. Heteroscedastic and skewed data models are also tackled using scale mixtures of uniform distributions.
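The uniform scale-mixture idea is easy to check by simulation. A minimal sketch under one standard construction (not necessarily the parameterization used in the paper): if u ~ Gamma(shape 3/2, scale 2σ²) and x | u ~ Uniform(−√u, √u), then marginally x ~ N(0, σ²).

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 2.0
n = 200_000

# u ~ Gamma(3/2, scale = 2*sigma^2), then x | u ~ Uniform(-sqrt(u), sqrt(u));
# integrating u out recovers the N(0, sigma^2) density exactly.
u = rng.gamma(3.0 / 2.0, 2.0 * sigma**2, size=n)
x = rng.uniform(-np.sqrt(u), np.sqrt(u))

print(x.mean(), x.var())  # close to 0 and sigma^2 = 4
```

Conditioning on u this way is what makes Gibbs sampling convenient: given u, the likelihood in x is flat on an interval.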
Gopinath, Kaundinya; Krishnamurthy, Venkatagiri; Lacey, Simon; Sathian, K
2018-02-01
In a recent study, Eklund et al. showed that cluster-wise family-wise error (FWE) rate-corrected inferences made in parametric statistical method-based functional magnetic resonance imaging (fMRI) studies over the past couple of decades may have been invalid, particularly for cluster defining thresholds less stringent than p < 0.001. This is principally because the spatial autocorrelation functions (sACFs) of fMRI data had been modeled incorrectly to follow a Gaussian form, whereas empirical data suggest otherwise. Hence, the residuals from general linear model (GLM)-based fMRI activation estimates in these studies may not have possessed a homogeneously Gaussian sACF. Here we propose a method based on the assumption that heterogeneity and non-Gaussianity of the sACF of the first-level GLM analysis residuals, as well as temporal autocorrelations in the first-level voxel residual time-series, are caused by unmodeled MRI signal from neuronal and physiological processes as well as motion and other artifacts, which can be approximated by appropriate decompositions of the first-level residuals with principal component analysis (PCA) and removed. We show that application of this method yields GLM residuals with significantly reduced spatial correlation, a nearly Gaussian sACF and uniform spatial smoothness across the brain, thereby allowing valid cluster-based FWE-corrected inferences based on the assumption of Gaussian spatial noise. We further show that application of this method renders the voxel time-series of first-level GLM residuals independent and identically distributed across time (a necessary condition for appropriate voxel-level GLM inference), without having to fit ad hoc stochastic colored noise models. Furthermore, the detection power of individual subject brain activation analysis is enhanced. This method will be especially useful for case studies, which rely on first-level GLM analysis inferences.
Path-space variational inference for non-equilibrium coarse-grained systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Harmandaris, Vagelis, E-mail: harman@uoc.gr; Institute of Applied and Computational Mathematics; Kalligiannaki, Evangelia, E-mail: ekalligian@tem.uoc.gr
In this paper we discuss information-theoretic tools for obtaining optimized coarse-grained molecular models for both equilibrium and non-equilibrium molecular simulations. The latter are ubiquitous in physicochemical and biological applications, where they are typically associated with coupling mechanisms, multi-physics and/or boundary conditions. In general the non-equilibrium steady states are not known explicitly as they do not necessarily have a Gibbs structure. The presented approach can compare microscopic behavior of molecular systems to parametric and non-parametric coarse-grained models using the relative entropy between distributions on the path space and setting up a corresponding path-space variational inference problem. The methods can become entirely data-driven when the microscopic dynamics are replaced with corresponding correlated data in the form of time series. Furthermore, we present connections and generalizations of force matching methods in coarse-graining with path-space information methods. We demonstrate the enhanced transferability of information-based parameterizations to different observables, at a specific thermodynamic point, due to information inequalities. We discuss methodological connections between information-based coarse-graining of molecular systems and variational inference methods primarily developed in the machine learning community. However, we note that the work presented here addresses variational inference for correlated time series due to the focus on dynamics. The applicability of the proposed methods is demonstrated on high-dimensional stochastic processes given by overdamped and driven Langevin dynamics of interacting particles.
Kimura, Shuhei; Sato, Masanao; Okada-Hatakeyama, Mariko
2013-01-01
The inference of a genetic network is a problem in which mutual interactions among genes are inferred from time series of gene expression levels. While a number of models have been proposed to describe genetic networks, this study focuses on a mathematical model proposed by Vohradský. Because of its advantageous features, several researchers have proposed inference methods based on Vohradský's model. When trying to analyze large-scale networks consisting of dozens of genes, however, these methods must solve high-dimensional non-linear function optimization problems. To resolve the difficulty of estimating the parameters of Vohradský's model, this study proposes a new method that decomposes the problem into several two-dimensional function optimization problems. Through numerical experiments on artificial genetic network inference problems, we showed that, although the computation time of the proposed method is not the shortest, the method estimates the parameters of Vohradský's model more effectively, with sufficiently short computation times. We then applied the proposed method to an actual inference problem for the bacterial SOS DNA repair system, and succeeded in finding several reasonable regulatory interactions. PMID:24386175
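For context, Vohradský's model describes each gene's expression rate as dx_i/dt = k1_i σ(Σ_j w_ij x_j + b_i) − k2_i x_i, with σ a sigmoid. A minimal forward simulation of this model is sketched below on a hypothetical 3-gene network (all weights and rate constants are illustrative, not estimated); the paper's decomposition-based parameter estimation is not reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def vohradsky_rhs(x, w, b, k1, k2):
    # dx_i/dt = k1_i * sigmoid(sum_j w_ij x_j + b_i) - k2_i * x_i
    return k1 * sigmoid(w @ x + b) - k2 * x

# Hypothetical 3-gene network: weights chosen for illustration only.
w = np.array([[0.0, -2.0, 0.0],
              [1.5,  0.0, -1.0],
              [0.0,  2.0,  0.0]])
b = np.array([0.5, -0.5, 0.0])
k1 = np.array([2.0, 1.5, 1.0])   # maximal production rates
k2 = np.array([1.0, 1.0, 0.5])   # degradation rates

x = np.array([0.1, 0.1, 0.1])
dt = 0.01
for _ in range(5000):  # forward-Euler integration to t = 50
    x = x + dt * vohradsky_rhs(x, w, b, k1, k2)
print(x)  # trajectories remain in (0, k1/k2)
```

Since the sigmoid is bounded in (0, 1), each expression level stays between 0 and k1_i/k2_i, which is one of the model's advantageous features.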
Polling the face: prediction and consensus across cultures.
Rule, Nicholas O; Ambady, Nalini; Adams, Reginald B; Ozono, Hiroki; Nakashima, Satoshi; Yoshikawa, Sakiko; Watabe, Motoki
2010-01-01
Previous work has shown that individuals agree across cultures on the traits that they infer from faces. Previous work has also shown that inferences from faces can be predictive of important outcomes within cultures. The current research merges these two lines of work. In a series of cross-cultural studies, the authors asked American and Japanese participants to provide naïve inferences of traits from the faces of U.S. political candidates (Studies 1 and 3) and Japanese political candidates (Studies 2 and 4). Perceivers showed high agreement in their ratings of the faces, regardless of culture, and both sets of judgments were predictive of an important ecological outcome (the percentage of votes that each candidate received in the actual election). The traits predicting electoral success differed, however, depending on the targets' culture. Thus, when American and Japanese participants were asked to provide explicit inferences of how likely each candidate would be to win an election (Studies 3-4), judgments were predictive only for same-culture candidates. Attempts to infer the electoral success for the foreign culture showed evidence of self-projection. Therefore, perceivers can reliably infer predictive information from faces but require knowledge about the target's culture to make these predictions accurately.
The researcher and the consultant: from testing to probability statements.
Hamra, Ghassan B; Stang, Andreas; Poole, Charles
2015-09-01
In the first instalment of this series, Stang and Poole provided an overview of Fisher significance testing (ST), Neyman-Pearson null hypothesis testing (NHT), and their unfortunate and unintended offspring, null hypothesis significance testing. In addition to elucidating the distinction between the first two and the evolution of the third, the authors alluded to alternative models of statistical inference; namely, Bayesian statistics. Bayesian inference has experienced a revival in recent decades, with many researchers advocating for its use as both a complement and an alternative to NHT and ST. This article will continue in the direction of the first instalment, providing practicing researchers with an introduction to Bayesian inference. Our work will draw on the examples and discussion of the previous dialogue.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
NASA Astrophysics Data System (ADS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model, which utilizes the realized volatility as additional information, has been proposed to infer the volatility of financial time series. We consider Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time of the HMC algorithm on a GPU (GTX 760) and a CPU (Intel i7-4770, 3.4 GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve a speedup similar to that of CUDA Fortran.
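The RSV posterior itself is high-dimensional, but the HMC algorithm being parallelized can be illustrated on a toy one-dimensional Gaussian target. A minimal sketch (leapfrog integration plus a Metropolis correction); the target, step size and trajectory length are arbitrary stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy target: log pi(theta) for N(mu = 1.5, sigma = 0.5), a stand-in posterior.
mu, sigma = 1.5, 0.5
logp = lambda th: -0.5 * ((th - mu) / sigma) ** 2
grad = lambda th: -(th - mu) / sigma**2

def hmc_step(theta, eps=0.1, n_leap=20):
    p = rng.normal()                    # resample auxiliary momentum
    th, momentum = theta, p
    # Leapfrog integration of the Hamiltonian dynamics.
    momentum += 0.5 * eps * grad(th)
    for _ in range(n_leap - 1):
        th += eps * momentum
        momentum += eps * grad(th)
    th += eps * momentum
    momentum += 0.5 * eps * grad(th)
    # Metropolis accept/reject keeps the target distribution exact.
    log_accept = logp(th) - logp(theta) - 0.5 * (momentum**2 - p**2)
    return th if np.log(rng.uniform()) < log_accept else theta

samples = []
theta = 0.0
for _ in range(5000):
    theta = hmc_step(theta)
    samples.append(theta)
print(np.mean(samples[500:]))  # close to mu = 1.5
```

The gradient evaluations inside the leapfrog loop dominate the cost and are what the paper offloads to the GPU.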
Spectral likelihood expansions for Bayesian inference
NASA Astrophysics Data System (ADS)
Nagel, Joseph B.; Sudret, Bruno
2016-03-01
A spectral approach to Bayesian inference is presented. It pursues the emulation of the posterior probability density. The starting point is a series expansion of the likelihood function in terms of orthogonal polynomials. From this spectral likelihood expansion all statistical quantities of interest can be calculated semi-analytically. The posterior is formally represented as the product of a reference density and a linear combination of polynomial basis functions. Both the model evidence and the posterior moments are related to the expansion coefficients. This formulation avoids Markov chain Monte Carlo simulation and allows one to make use of linear least squares instead. The pros and cons of spectral Bayesian inference are discussed and demonstrated on the basis of simple applications from classical statistics and inverse modeling.
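The key identity, that the evidence and posterior moments fall out of the expansion coefficients, can be demonstrated on a toy problem. A minimal sketch with a uniform prior on [−1, 1] and Legendre polynomials: writing L(θ) = Σ_k a_k P_k(θ), orthogonality gives Z = a_0 and E[θ] = a_1 / (3 a_0). The Gaussian likelihood and the expansion degree are illustrative choices, not from the paper.

```python
import numpy as np
from numpy.polynomial import legendre

# Toy Bayesian problem: uniform prior on [-1, 1], Gaussian likelihood in theta.
like = lambda th: np.exp(-0.5 * ((th - 0.3) / 0.2) ** 2)

# Least-squares projection of the likelihood onto Legendre polynomials.
grid = np.linspace(-1.0, 1.0, 400)
coef = legendre.legfit(grid, like(grid), deg=30)

# With prior density 1/2, orthogonality of the P_k gives closed forms:
# evidence Z = a_0, posterior mean E[theta] = a_1 / (3 * a_0).
Z = coef[0]
post_mean = coef[1] / (3.0 * coef[0])
print(Z, post_mean)
```

No sampling is involved: once the coefficients are fitted by least squares, every posterior quantity is a short algebraic expression in them, which is the point of the approach.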
mSieve: Differential Behavioral Privacy in Time Series of Mobile Sensor Data.
Saleheen, Nazir; Chakraborty, Supriyo; Ali, Nasir; Mahbubur Rahman, Md; Hossain, Syed Monowar; Bari, Rummana; Buder, Eugene; Srivastava, Mani; Kumar, Santosh
2016-09-01
Differential privacy concepts have been successfully used to protect the anonymity of individuals in population-scale analysis. Sharing of mobile sensor data, especially physiological data, raises a different privacy challenge: protecting private behaviors that can be revealed from time series of sensor data. Existing privacy mechanisms rely on noise addition and data perturbation. But the accuracy requirement on inferences drawn from physiological data, together with well-established limits within which these data values occur, render traditional privacy mechanisms inapplicable. In this work, we define a new behavioral privacy metric based on differential privacy and propose a novel data substitution mechanism to protect behavioral privacy. We evaluate the efficacy of our scheme using 660 hours of ECG, respiration, and activity data collected from 43 participants and demonstrate that it is possible to retain meaningful utility, in terms of inference accuracy (90%), while simultaneously preserving the privacy of sensitive behaviors.
Hepatitis E in Singapore: A Case-Series and Viral Phylodynamics Study.
Teo, Esmeralda Chi-Yuan; Tan, Boon-Huan; Purdy, Michael A; Wong, Pui-San; Ting, Pei-Jun; Chang, Pik-Eu Jason; Oon, Lynette Lin-Ean; Sue, Amanda; Teo, Chong-Gee; Tan, Chee-Kiat
2017-04-01
The incidence of hepatitis E in Singapore appears to be increasing. A retrospective case-series study of patients diagnosed with hepatitis E in a tertiary hospital from 2009 to 2013 was conducted. Of 16 cases, eight (50%) were solid-organ transplant recipients (SOTRs), and 14 (88%) were found infected by genotype 3 hepatitis E virus (HEV-3). Bayesian inferences based on HEV subgenomic sequences from seven cases suggest that HEV-3 strains were introduced to Singapore as two principal lineages. Within limitations of the study, it can be inferred that one lineage, in the 3efg clade, emerged about 83 years ago, probably originating from Japan, whereas the other, in the 3abchij clade, emerged about 40 years ago, from the United States. Establishment and subsequent transmissions of strains from these two lineages likely contribute to the current endemicity of hepatitis E in Singapore.
Identifying Seizure Onset Zone From the Causal Connectivity Inferred Using Directed Information
NASA Astrophysics Data System (ADS)
Malladi, Rakesh; Kalamangalam, Giridhar; Tandon, Nitin; Aazhang, Behnaam
2016-10-01
In this paper, we developed a model-based and a data-driven estimator for directed information (DI) to infer the causal connectivity graph between electrocorticographic (ECoG) signals recorded from the brain and to identify the seizure onset zone (SOZ) in epileptic patients. Directed information, an information theoretic quantity, is a general metric to infer causal connectivity between time series and is not restricted to a particular class of models, unlike the popular metrics based on Granger causality or transfer entropy. The proposed estimators are shown to be almost surely convergent. Causal connectivity between ECoG electrodes in five epileptic patients is inferred using the proposed DI estimators, after validating their performance on simulated data. We then proposed a model-based and a data-driven SOZ identification algorithm to identify the SOZ from the causal connectivity inferred using the model-based and data-driven DI estimators, respectively. The data-driven SOZ identification outperforms the model-based SOZ identification algorithm when benchmarked against visual analysis by a neurologist, the current clinical gold standard. The causal connectivity analysis presented here is a first step towards developing novel non-surgical treatments for epilepsy.
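A full directed-information estimator is beyond a short sketch, but the flavor can be conveyed with a plug-in conditional mutual information I(X_{t-1}; Y_t | Y_{t-1}) for binary series: a one-lag Markov surrogate, closer to transfer entropy than to the paper's DI estimators, with synthetic data in place of ECoG.

```python
import numpy as np
from collections import Counter

def cmi_lag1(x, y):
    # Plug-in estimate of I(X_{t-1}; Y_t | Y_{t-1}) for binary series:
    # a one-step surrogate for directed influence X -> Y.
    triples = list(zip(x[:-1], y[1:], y[:-1]))
    n = len(triples)
    p_xyz = Counter(triples)
    p_yz = Counter((yt, yp) for _, yt, yp in triples)
    p_xz = Counter((xp, yp) for xp, _, yp in triples)
    p_z = Counter(yp for _, _, yp in triples)
    mi = 0.0
    for (xp, yt, yp), c in p_xyz.items():
        mi += (c / n) * np.log2(
            (c / n) * (p_z[yp] / n) / ((p_xz[(xp, yp)] / n) * (p_yz[(yt, yp)] / n))
        )
    return mi

rng = np.random.default_rng(2)
x = rng.integers(0, 2, 5000)
y = np.empty_like(x)
y[0] = 0
for t in range(1, len(x)):
    # y copies x's previous value 90% of the time: strong X -> Y coupling.
    y[t] = x[t - 1] if rng.uniform() < 0.9 else 1 - x[t - 1]
print(cmi_lag1(x, y), cmi_lag1(y, x))  # X -> Y large, Y -> X near zero
```

The asymmetry of the two estimates is what lets a directed metric orient edges in the connectivity graph, something symmetric correlation cannot do.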
Applying dynamic Bayesian networks to perturbed gene expression data.
Dojer, Norbert; Gambin, Anna; Mizera, Andrzej; Wilczyński, Bartek; Tiuryn, Jerzy
2006-05-08
A central goal of molecular biology is to understand the regulatory mechanisms of gene transcription and protein synthesis. Because of their solid basis in statistics, which allows the stochastic aspects of gene expression and noisy measurements to be handled in a natural way, Bayesian networks appear attractive for inferring gene interaction structure from microarray experiment data. However, the basic formalism has some disadvantages, e.g. it is sometimes hard to distinguish between the origin and the target of an interaction. Two kinds of microarray experiments yield data particularly rich in information regarding the direction of interactions: time series and perturbation experiments. In order to handle them correctly, the basic formalism must be modified. For example, dynamic Bayesian networks (DBN) apply to time series microarray data. To our knowledge the DBN technique has not been applied in the context of perturbation experiments. We extend the framework of dynamic Bayesian networks in order to incorporate perturbations. Moreover, an exact algorithm for inferring an optimal network is proposed, and a discretization method specialized for time series data from perturbation experiments is introduced. We apply our procedure to realistic simulated data. The results are compared with those obtained by standard DBN learning techniques. Moreover, the advantages of using an exact learning algorithm instead of heuristic methods are analyzed. We show that the quality of inferred networks dramatically improves when using data from perturbation experiments. We also conclude that the exact algorithm should be used when it is possible, i.e. when the considered set of genes is small enough.
Guatteri, Mariagiovanna; Spudich, P.; Beroza, G.C.
2001-01-01
We consider the applicability of laboratory-derived rate- and state-variable friction laws to the dynamic rupture of the 1995 Kobe earthquake. We analyze the shear stress and slip evolution of Ide and Takeo's [1997] dislocation model, fitting the inferred stress change time histories by calculating the dynamic load and the instantaneous friction at a series of points within the rupture area. For points exhibiting a fast-weakening behavior, the Dieterich-Ruina friction law, with values of dc = 0.01-0.05 m for critical slip, fits the stress change time series well. This range of dc is 10-20 times smaller than the slip distance over which the stress is released, Dc, which previous studies have equated with the slip-weakening distance. The limited resolution and low-pass character of the strong motion inversion degrades the resolution of the frictional parameters and suggests that the actual dc is less than this value. Stress time series at points characterized by a slow-weakening behavior are well fitted by the Dieterich-Ruina friction law with values of dc ??? 0.01-0.05 m. The apparent fracture energy Gc can be estimated from waveform inversions more stably than the other friction parameters. We obtain a Gc = 1.5×10^6 J m-2 for the 1995 Kobe earthquake, in agreement with estimates for previous earthquakes. From this estimate and a plausible upper bound for the local rock strength we infer a lower bound for Dc of about 0.008 m. Copyright 2001 by the American Geophysical Union.
Learning time series for intelligent monitoring
NASA Technical Reports Server (NTRS)
Manganaris, Stefanos; Fisher, Doug
1994-01-01
We address the problem of classifying time series according to their morphological features in the time domain. In a supervised machine-learning framework, we induce a classification procedure from a set of preclassified examples. For each class, we infer a model that captures its morphological features using Bayesian model induction and the minimum message length approach to assign priors. In the performance task, we classify a time series in one of the learned classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such. We report results from experiments in a monitoring domain of interest to NASA.
NASA Astrophysics Data System (ADS)
Nakada, Tomohiro; Takadama, Keiki; Watanabe, Shigeyoshi
This paper proposes a classification method that uses Bayesian analysis to classify time series data from an agent-based simulation of the international emissions trading market, and compares it with a method based on the discrete Fourier transform. The aim is to demonstrate analytical methods that map time series data such as market prices. These analytical methods revealed the following: (1) the classification methods represent time series data as distances in a mapped space, which are easier to understand and draw inferences from than the raw time series; (2) the methods can analyze uncertain time series data, including both stationary and non-stationary processes, through these distances obtained via agent-based simulation; and (3) the Bayesian method can resolve a 1% difference in the agents' emission reduction targets.
Estimation of tool wear length in finish milling using a fuzzy inference algorithm
NASA Astrophysics Data System (ADS)
Ko, Tae Jo; Cho, Dong Woo
1993-10-01
In finish machining, geometric accuracy and surface roughness are mainly affected by flank wear at the minor cutting edge. We introduce a fuzzy estimator, obtained by a fuzzy inference algorithm with a max-min composition rule, to evaluate the minor flank wear length in finish milling. The features sensitive to minor flank wear are extracted from a dispersion analysis of a time-series AR model of the feed-directional acceleration of the spindle housing. Linguistic rules for fuzzy estimation are constructed using these features, and fuzzy inferences are then carried out with test data sets under various cutting conditions. The proposed system proves effective for estimating minor flank wear length, with a mean error of less than 12%.
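A minimal Mamdani-style sketch of max-min fuzzy inference with centroid defuzzification is shown below. The two rules, the membership ranges and the wear scale are hypothetical, chosen only to illustrate the mechanism, not taken from the paper.

```python
import numpy as np

def tri(x, a, b, c):
    # Triangular membership function with support [a, c] and peak at b.
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def infer_wear(feature):
    # Two illustrative rules (ranges are hypothetical):
    #   IF dispersion-feature is LOW  THEN wear is SMALL
    #   IF dispersion-feature is HIGH THEN wear is LARGE
    w_low = tri(feature, -0.5, 0.0, 0.6)   # firing strength of rule 1
    w_high = tri(feature, 0.4, 1.0, 1.6)   # firing strength of rule 2
    wear = np.linspace(0.0, 0.5, 501)      # wear-length axis, mm
    small = tri(wear, -0.25, 0.0, 0.25)
    large = tri(wear, 0.25, 0.5, 0.75)
    # Max-min composition: clip each consequent at its rule's firing
    # strength (min), then aggregate the clipped sets with max.
    agg = np.maximum(np.minimum(w_low, small), np.minimum(w_high, large))
    return (wear * agg).sum() / agg.sum()  # centroid defuzzification

print(infer_wear(0.2), infer_wear(0.8))  # low feature -> small wear, high -> large
```

The estimate varies smoothly and monotonically with the input feature, which is the behavior one wants from a wear monitor.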
Automatic physical inference with information maximizing neural networks
NASA Astrophysics Data System (ADS)
Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.
2018-04-01
Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
NASA Astrophysics Data System (ADS)
Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng
2018-02-01
Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
Knowledge-guided fuzzy logic modeling to infer cellular signaling networks from proteomic data
Liu, Hui; Zhang, Fan; Mishra, Shital Kumar; Zhou, Shuigeng; Zheng, Jie
2016-01-01
Modeling of signaling pathways is crucial for understanding and predicting cellular responses to drug treatments. However, canonical signaling pathways curated from literature are seldom context-specific and thus can hardly predict cell type-specific response to external perturbations; purely data-driven methods also have drawbacks such as limited biological interpretability. Therefore, hybrid methods that can integrate prior knowledge and real data for network inference are highly desirable. In this paper, we propose a knowledge-guided fuzzy logic network model to infer signaling pathways by exploiting both prior knowledge and time-series data. In particular, the dynamic time warping algorithm is employed to measure the goodness of fit between experimental and predicted data, so that our method can model temporally-ordered experimental observations. We evaluated the proposed method on a synthetic dataset and two real phosphoproteomic datasets. The experimental results demonstrate that our model can uncover drug-induced alterations in signaling pathways in cancer cells. Compared with existing hybrid models, our method can model feedback loops so that the dynamical mechanisms of signaling networks can be uncovered from time-series data. By calibrating generic models of signaling pathways against real data, our method supports precise predictions of context-specific anticancer drug effects, which is an important step towards precision medicine. PMID:27774993
Dynamical inference: where phase synchronization and generalized synchronization meet.
Stankovski, Tomislav; McClintock, Peter V E; Stefanovska, Aneta
2014-06-01
Synchronization is a widespread phenomenon that occurs among interacting oscillatory systems. It facilitates their temporal coordination and can lead to the emergence of spontaneous order. The detection of synchronization from the time series of such systems is of great importance for the understanding and prediction of their dynamics, and several methods for doing so have been introduced. However, the common case where the interacting systems have time-variable characteristic frequencies and coupling parameters, and may also be subject to continuous external perturbation and noise, still presents a major challenge. Here we apply recent developments in dynamical Bayesian inference to tackle these problems. In particular, we discuss how to detect phase slips and the existence of deterministic coupling from measured data, and we unify the concepts of phase synchronization and general synchronization. Starting from phase or state observables, we present methods for the detection of both phase and generalized synchronization. The consistency and equivalence of phase and generalized synchronization are further demonstrated, by the analysis of time series from analog electronic simulations of coupled nonautonomous van der Pol oscillators. We demonstrate that the detection methods work equally well on numerically simulated chaotic systems. In all the cases considered, we show that dynamical Bayesian inference can clearly identify noise-induced phase slips and distinguish coherence from intrinsic coupling-induced synchronization.
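The simplest quantitative handle on phase synchronization, which dynamical Bayesian inference refines considerably, is the phase-locking index |⟨exp(i(φ1 − φ2))⟩|. A minimal sketch on two noisy phase oscillators with illustrative frequencies, noise level and coupling strengths:

```python
import numpy as np

rng = np.random.default_rng(3)

def phase_pair(coupling, n=20000, dt=0.01):
    # Euler-Maruyama integration of two noisy, mutually coupled phase
    # oscillators: d(phi_i) = omega_i dt + coupling*sin(phi_j - phi_i) dt + noise.
    w1, w2 = 2 * np.pi * 1.0, 2 * np.pi * 1.1   # slightly detuned frequencies
    p1, p2 = 0.0, 0.0
    diffs = np.empty(n)
    for t in range(n):
        n1, n2 = rng.normal(0.0, 0.1, 2)
        p1 += dt * (w1 + coupling * np.sin(p2 - p1)) + np.sqrt(dt) * n1
        p2 += dt * (w2 + coupling * np.sin(p1 - p2)) + np.sqrt(dt) * n2
        diffs[t] = p1 - p2
    # Phase-locking index |<exp(i*(phi_1 - phi_2))>|, between 0 and 1.
    return np.abs(np.mean(np.exp(1j * diffs)))

plv_coupled = phase_pair(coupling=2.0)
plv_uncoupled = phase_pair(coupling=0.0)
print(plv_coupled, plv_uncoupled)  # near 1 when coupled, small when not
```

The index alone cannot distinguish intrinsic coupling from noise-induced coherence; that discrimination is precisely what the Bayesian inference of the coupling functions adds.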
Forecasting of natural gas consumption with neural network and neuro fuzzy system
NASA Astrophysics Data System (ADS)
Kaynar, Oguz; Yilmaz, Isik; Demirkoparan, Ferhan
2010-05-01
The prediction of natural gas consumption is crucial for Turkey, which depends on foreign suppliers for its natural gas and whose storage capacity is only 5% of total internal consumption. The accuracy of demand predictions influences sectoral investments and agreements for obtaining natural gas, and hence the development of the sector. In recent years, new techniques such as artificial neural networks and fuzzy inference systems have been widely used for natural gas consumption prediction, in addition to classical time series analysis. In this study, the weekly natural gas consumption of Turkey is predicted by means of three different approaches. The first is Autoregressive Integrated Moving Average (ARIMA), a classical time series analysis method. The second is the Artificial Neural Network: two different ANN models, the Multi Layer Perceptron (MLP) and the Radial Basis Function Network (RBFN), are employed to predict natural gas consumption. The last is the Adaptive Neuro Fuzzy Inference System (ANFIS), which combines an ANN with a fuzzy inference system. Different prediction models are constructed, the model with the best forecasting performance is determined for each method, and the resulting predictions are compared.
Keywords: ANN, ANFIS, ARIMA, Natural Gas, Forecasting
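Of the three approaches, the classical time series baseline is the easiest to sketch. Below, a plain AR(p) model fitted by least squares stands in for the full ARIMA procedure; the AR(2) series is synthetic, not the Turkish consumption data.

```python
import numpy as np

rng = np.random.default_rng(4)

def fit_ar(series, p):
    # Ordinary least squares for x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t.
    y = series[p:]
    X = np.column_stack([series[p - k:-k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def forecast_one(series, coef):
    # One-step-ahead forecast from the last p observations.
    p = len(coef)
    return coef @ series[-1:-p - 1:-1]

# Synthetic AR(2) series as a stand-in for weekly consumption data.
x = np.zeros(3000)
for t in range(2, 3000):
    x[t] = 0.6 * x[t - 1] + 0.3 * x[t - 2] + rng.normal(0.0, 0.5)

coef = fit_ar(x, 2)
print(coef)  # close to the true coefficients [0.6, 0.3]
print(forecast_one(x, coef))
```

ARIMA adds differencing and a moving-average part on top of this AR core; the neural and neuro-fuzzy models in the paper replace the linear map with learned nonlinear ones.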
Automated adaptive inference of phenomenological dynamical models
NASA Astrophysics Data System (ADS)
Daniels, Bryan
Understanding the dynamics of biochemical systems can seem impossibly complicated at the microscopic level: detailed properties of every molecular species, including those that have not yet been discovered, could be important for producing macroscopic behavior. The profusion of data in this area has raised the hope that microscopic dynamics might be recovered in an automated search over possible models, yet the combinatorial growth of this space has limited these techniques to systems that contain only a few interacting species. We take a different approach inspired by coarse-grained, phenomenological models in physics. Akin to a Taylor series producing Hooke's Law, forgoing microscopic accuracy allows us to constrain the search over dynamical models to a single dimension. This makes it feasible to infer dynamics with very limited data, including cases in which important dynamical variables are unobserved. We name our method Sir Isaac after its ability to infer the dynamical structure of the law of gravitation given simulated planetary motion data. Applying the method to output from a microscopically complicated but macroscopically simple biological signaling model, it is able to adapt the level of detail to the amount of available data. Finally, using nematode behavioral time series data, the method discovers an effective switch between behavioral attractors after the application of a painful stimulus.
Financial Time-series Analysis: a Brief Overview
NASA Astrophysics Data System (ADS)
Chakraborti, A.; Patriarca, M.; Santhanam, M. S.
Prices of commodities or assets produce what are called time series. Different kinds of financial time series have been recorded and studied for decades. Nowadays, all transactions on a financial market are recorded, leading to a huge amount of data available, either freely on the Internet or commercially. Financial time series analysis is of great interest to practitioners as well as theoreticians for making inferences and predictions. Furthermore, the stochastic uncertainties inherent in financial time series and the theory needed to deal with them make the subject especially interesting not only to economists, but also to statisticians and physicists [1]. While it would be a formidable task to make an exhaustive review of the topic, with this review we try to give a flavor of some of its aspects.
Michael Swyer
2015-02-22
Global Positioning System (GPS) time series from the National Science Foundation (NSF) Earthscope’s Plate Boundary Observatory (PBO) and Central Washington University’s Pacific Northwest Geodetic Array (PANGA). GPS station velocities were used to infer strain rates using the ‘splines in tension’ method. Strain rates were derived separately for subduction zone locking at depth and block rotation near the surface within crustal block boundaries.
Zhang, Bao; Yao, Yibin; Fok, Hok Sum; Hu, Yufeng; Chen, Qiang
2016-09-19
This study uses the observed vertical displacements of Global Positioning System (GPS) time series obtained from the Crustal Movement Observation Network of China (CMONOC), with careful pre- and post-processing, to estimate the seasonal crustal deformation in response to hydrological loading in the lower three-rivers headwater region of southwest China, and then infers the annual equivalent water height (EWH) changes through geodetic inversion methods. The Helmert Variance Component Estimation (HVCE) and the Minimum Mean Square Error (MMSE) criterion were successfully employed. The GPS-inferred EWH changes agree well qualitatively with the Gravity Recovery and Climate Experiment (GRACE)-inferred and the Global Land Data Assimilation System (GLDAS)-inferred EWH changes, with discrepancies of 3.2-3.9 cm and 4.8-5.2 cm, respectively. In the research areas, the EWH changes in the Lancang basin are larger than in the other regions, with a maximum of 21.8-24.7 cm and a minimum of 3.1-6.9 cm.
CauseMap: fast inference of causality from complex time series.
Maher, M Cyrus; Hernandez, Ryan D
2015-01-01
Background. Establishing health-related causal relationships is a central pursuit in biomedical research. Yet, the interdependent non-linearity of biological systems renders causal dynamics laborious and at times impractical to disentangle. This pursuit is further impeded by the dearth of time series that are sufficiently long to observe and understand recurrent patterns of flux. However, as data generation costs plummet and technologies like wearable devices democratize data collection, we anticipate a coming surge in the availability of biomedically relevant time series data. Given the life-saving potential of these burgeoning resources, it is critical to invest in the development of open source software tools that are capable of drawing meaningful insight from vast amounts of time series data. Results. Here we present CauseMap, the first open source implementation of convergent cross mapping (CCM), a method for establishing causality from long time series data (≳25 observations). Compared to existing time series methods, CCM has the advantage of being model-free and robust to unmeasured confounding that could otherwise induce spurious associations. CCM builds on Takens' Theorem, a well-established result from dynamical systems theory that requires only mild assumptions. This theorem allows us to reconstruct high-dimensional system dynamics using a time series of only a single variable. These reconstructions can be thought of as shadows of the true causal system. If reconstructed shadows can predict points from opposing time series, we can infer that the corresponding variables are providing views of the same causal system, and so are causally related. Unlike traditional metrics, this test can establish the directionality of causation, even in the presence of feedback loops. Furthermore, since CCM can extract causal relationships from time series of, e.g., a single individual, it may be a valuable tool for personalized medicine.
We implement CCM in Julia, a high-performance programming language designed for technical computing. Our software package, CauseMap, is platform-independent and freely available as an official Julia package. Conclusions. CauseMap is an efficient implementation of a state-of-the-art algorithm for detecting causality from time series data. We believe this tool will be a valuable resource for biomedical research and personalized medicine.
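The core of CCM is a Takens-style delay embedding plus cross-mapped nearest-neighbour prediction. A deliberately simplified pure-Python sketch; CauseMap itself is written in Julia and uses simplex projection with several neighbours, whereas `cross_map` below uses a single neighbour and reports mean absolute error rather than the usual correlation skill:

```python
def delay_embed(x, dim, tau):
    """Takens-style delay embedding: state vectors (x[t], x[t-tau], ...)."""
    start = (dim - 1) * tau
    return [tuple(x[t - k * tau] for k in range(dim))
            for t in range(start, len(x))]

def cross_map(x, y, dim=2, tau=1):
    """Predict x from the shadow manifold of y via the nearest neighbour.
    Low error suggests x and y are views of the same causal system."""
    offset = (dim - 1) * tau
    shadow = delay_embed(y, dim, tau)
    preds, actual = [], []
    for i, v in enumerate(shadow):
        # nearest neighbour in the shadow manifold, excluding the point itself
        j = min((k for k in range(len(shadow)) if k != i),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(shadow[k], v)))
        preds.append(x[offset + j])
        actual.append(x[offset + i])
    # mean absolute prediction error as a crude inverse skill measure
    return sum(abs(p - a) for p, a in zip(preds, actual)) / len(preds)
```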
NASA Astrophysics Data System (ADS)
Casdagli, M. C.
1997-09-01
We show that recurrence plots (RPs) give detailed characterizations of time series generated by dynamical systems driven by slowly varying external forces. For deterministic systems we show that RPs of the time series can be used to reconstruct the RP of the driving force if it varies sufficiently slowly. If the driving force is one-dimensional, its functional form can then be inferred up to an invertible coordinate transformation. The same results hold for stochastic systems if the RP of the time series is suitably averaged and transformed. These results are used to investigate the nonlinear prediction of time series generated by dynamical systems driven by slowly varying external forces. We also consider the problem of detecting a small change in the driving force, and propose a surrogate data technique for assessing statistical significance. Numerically simulated time series and a time series of respiration rates recorded from a subject with sleep apnea are used as illustrative examples.
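The recurrence plot itself is simple to construct. A minimal sketch for a scalar series; real analyses would typically delay-embed the series first and tune the threshold `eps`:

```python
def recurrence_plot(x, eps):
    """Recurrence plot of a scalar series: R[i][j] = 1 when x[i] and x[j]
    lie within eps of each other (one-dimensional for simplicity)."""
    return [[1 if abs(a - b) <= eps else 0 for b in x] for a in x]
```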
Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi
2017-01-01
We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method, using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals, using three biological replicates, and identified 3,621 periodic genes through wavelet analysis. The expression data are suitable for inferring network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, post-transcriptional modification, and photosynthesis are significantly enriched among the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription-factor-encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that they show a typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.
Probabilistic reconstruction of GPS vertical ground motion and comparison with GIA models
NASA Astrophysics Data System (ADS)
Husson, Laurent; Bodin, Thomas; Choblet, Gael; Kreemer, Corné
2017-04-01
The vertical position time-series of GPS stations have become long enough in many parts of the world to infer modern rates of vertical ground motion. We use the worldwide compilation of GPS trend velocities of the Nevada Geodetic Laboratory. Those rates are inferred by applying the MIDAS algorithm (Blewitt et al., 2016) to time-series obtained from publicly available data from permanent stations. Because MIDAS filters out seasonality and discontinuities, regardless of their causes, it gives robust long-term rates of vertical ground motion (except where there is significant postseismic deformation). As the stations are unevenly distributed, and because data errors are also highly variable, sometimes to an unknown degree, we use a Bayesian inference method to reconstruct 2D maps of vertical ground motion. Our models are based on a Voronoi tessellation and self-adapt to the spatially variable level of information provided by the data. Instead of providing a unique interpolated surface, each point of the reconstructed surface is defined through a probability density function. We apply our method to a series of vast regions covering entire continents. Not surprisingly, at long wavelengths the reconstructed surface is dominated by glacial isostatic adjustment (GIA). This result can be exploited to evaluate whether forward models of GIA reproduce geodetic rates within the uncertainties derived from our interpolation, not only at high latitudes where postglacial rebound is fast, but also at more temperate latitudes where, for instance, such rates may compete with modern sea level rise. At shorter wavelengths, the reconstructed surface of vertical ground motion features a variety of identifiable patterns, whose geometries and rates can be mapped. Examples are transient dynamic topography over the convecting mantle, actively deforming domains (mountain belts and active margins), volcanic areas, or anthropogenic contributions.
Bayesian structural inference for hidden processes.
Strelioff, Christopher C; Crutchfield, James P
2014-04-01
We introduce a Bayesian approach to discovering patterns in structurally complex processes. The proposed method of Bayesian structural inference (BSI) relies on a set of candidate unifilar hidden Markov model (uHMM) topologies for inference of process structure from a data series. We employ a recently developed exact enumeration of topological ε-machines. (A sequel then removes the topological restriction.) This subset of the uHMM topologies has the added benefit that inferred models are guaranteed to be ε-machines, irrespective of estimated transition probabilities. Properties of ε-machines and uHMMs allow for the derivation of analytic expressions for estimating transition probabilities, inferring start states, and comparing the posterior probability of candidate model topologies, despite process internal structure being only indirectly present in data. We demonstrate BSI's effectiveness in estimating a process's randomness, as reflected by the Shannon entropy rate, and its structure, as quantified by the statistical complexity. We also compare using the posterior distribution over candidate models and the single, maximum a posteriori model for point estimation and show that the former more accurately reflects uncertainty in estimated values. We apply BSI to in-class examples of finite- and infinite-order Markov processes, as well as to an out-of-class, infinite-state hidden process.
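The analytic transition-probability estimates that BSI derives are, in spirit, posterior means under Dirichlet priors over each state's outgoing transitions. A simplified stand-in, assuming a plain first-order chain and ignoring the uHMM topology constraints that are the heart of the method:

```python
from collections import Counter

def transition_posterior_mean(symbols, states, alpha=1.0):
    """Posterior-mean transition probabilities for a first-order chain,
    using a symmetric Dirichlet(alpha) prior over outgoing transitions."""
    counts = Counter(zip(symbols, symbols[1:]))
    probs = {}
    for s in states:
        total = sum(counts[(s, t)] for t in states)
        for t in states:
            probs[(s, t)] = (counts[(s, t)] + alpha) / (total + alpha * len(states))
    return probs
```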
Event-related potential correlates of emergent inference in human arbitrary relational learning.
Wang, Ting; Dymond, Simon
2013-01-01
Two experiments investigated the functional-anatomical correlates of cognition supporting untrained, emergent relational inference in a stimulus equivalence task. In Experiment 1, after learning a series of conditional relations involving words and pseudowords, participants performed a relatedness task during which EEG was recorded. Behavioural performance was faster and more accurate on untrained, indirectly related symmetry (i.e., learn AB and infer BA) and equivalence trials (i.e., learn AB and AC and infer CB) than on unrelated trials, regardless of whether or not a formal test for stimulus equivalence relations had been conducted. Consistent with previous results, event related potentials (ERPs) evoked by trained and emergent trials at parietal and occipital sites differed only for those participants who had not received a prior equivalence test. Experiment 2 further replicated and extended these behavioural and ERP findings using arbitrary symbols as stimuli and demonstrated time and frequency differences for trained and untrained relatedness trials. Overall, the findings demonstrate convincingly the ERP correlates of intra-experimentally established stimulus equivalence relations consisting entirely of arbitrary symbols and offer support for a contemporary cognitive-behavioural model of symbolic categorisation and relational inference. Copyright © 2012 Elsevier B.V. All rights reserved.
Duggento, Andrea; Stankovski, Tomislav; McClintock, Peter V E; Stefanovska, Aneta
2012-12-01
Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski et al. [Phys. Rev. Lett. 109, 024101 (2012)] introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time-evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically generated data, data from an analog electronic circuit, and cardiorespiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.
Prophetic Granger Causality to infer gene regulatory networks.
Carlin, Daniel E; Paull, Evan O; Graim, Kiley; Wong, Christopher K; Bivol, Adrian; Ryabinin, Peter; Ellrott, Kyle; Sokolov, Artem; Stuart, Joshua M
2017-01-01
We introduce a novel method called Prophetic Granger Causality (PGC) for inferring gene regulatory networks (GRNs) from protein-level time series data. The method uses an L1-penalized regression adaptation of Granger Causality to model protein levels as a function of time, stimuli, and other perturbations. When combined with a data-independent network prior, the framework outperformed all other methods submitted to the HPN-DREAM 8 breast cancer network inference challenge. Our investigations reveal that PGC provides complementary information to other approaches, raising the performance of ensemble learners, while on its own achieves moderate performance. Thus, PGC serves as a valuable new tool in the bioinformatics toolkit for analyzing temporal datasets. We investigate the general and cell-specific interactions predicted by our method and find several novel interactions, demonstrating the utility of the approach in charting new tumor wiring.
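Granger causality reduces to asking whether lagged values of one series improve prediction of another. A one-lag, two-variable sketch, without the L1 penalty and data-independent network prior that distinguish PGC; the coefficients are solved in closed form from the 2x2 normal equations:

```python
def granger_1lag(x, y):
    """One-lag Granger-causality sketch (y -> x): compare residual variance of
    x[t] ~ x[t-1] against x[t] ~ x[t-1] + y[t-1].  A large drop in residual
    variance suggests that past y helps predict x."""
    z = x[1:]        # regression targets
    xl = x[:-1]      # lagged x
    yl = y[:-1]      # lagged y
    Sxx = sum(a * a for a in xl); Syy = sum(b * b for b in yl)
    Sxy = sum(a * b for a, b in zip(xl, yl))
    Sxz = sum(a * c for a, c in zip(xl, z))
    Syz = sum(b * c for b, c in zip(yl, z))
    # restricted model: x[t] = a * x[t-1]
    a_r = Sxz / Sxx
    rss_r = sum((c - a_r * a) ** 2 for a, c in zip(xl, z))
    # full model: x[t] = a * x[t-1] + b * y[t-1]  (2x2 normal equations, Cramer)
    det = Sxx * Syy - Sxy * Sxy
    a_f = (Sxz * Syy - Syz * Sxy) / det
    b_f = (Syz * Sxx - Sxz * Sxy) / det
    rss_f = sum((c - a_f * a - b_f * b) ** 2 for a, b, c in zip(xl, yl, z))
    # fraction of restricted-model residual variance explained by past y
    return (rss_r - rss_f) / rss_r if rss_r else 0.0
```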
Some Recent Developments on Complex Multivariate Distributions
ERIC Educational Resources Information Center
Krishnaiah, P. R.
1976-01-01
In this paper, the author gives a review of the literature on complex multivariate distributions. Some new results on these distributions are also given. Finally, the author discusses the applications of the complex multivariate distributions in the area of the inference on multiple time series. (Author)
Testing for Granger Causality in the Frequency Domain: A Phase Resampling Method.
Liu, Siwei; Molenaar, Peter
2016-01-01
This article introduces phase resampling, an existing but rarely used surrogate data method for making statistical inferences of Granger causality in frequency domain time series analysis. Granger causality testing is essential for establishing causal relations among variables in multivariate dynamic processes. However, testing for Granger causality in the frequency domain is challenging due to the nonlinear relation between frequency domain measures (e.g., partial directed coherence, generalized partial directed coherence) and time domain data. Through a simulation study, we demonstrate that phase resampling is a general and robust method for making statistical inferences even with short time series. With Gaussian data, phase resampling yields satisfactory type I and type II error rates in all but one condition we examine: when a small effect size is combined with an insufficient number of data points. Violations of normality lead to slightly higher error rates but are mostly within acceptable ranges. We illustrate the utility of phase resampling with two empirical examples involving multivariate electroencephalography (EEG) and skin conductance data.
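Phase resampling builds surrogate series that keep the amplitude spectrum, and hence the linear autocorrelation, while scrambling the Fourier phases. A naive O(n^2) DFT sketch for illustration only; real code would use an FFT:

```python
import cmath
import random

def phase_surrogate(x, seed=0):
    """Phase-randomised surrogate: keep the DFT amplitude spectrum of x but
    scramble the phases, destroying structure tied to phase relations while
    preserving the linear autocorrelation."""
    rng = random.Random(seed)
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]
    # randomise phases in conjugate-symmetric pairs so the result stays real
    for k in range(1, (n + 1) // 2):
        phi = rng.uniform(0.0, 2.0 * cmath.pi)
        spec[k] = abs(spec[k]) * cmath.exp(1j * phi)
        spec[n - k] = abs(spec[n - k]) * cmath.exp(-1j * phi)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]
```

Because the DC bin and the amplitude spectrum are untouched, each surrogate preserves the mean and variance of the original series, which is exactly what a null distribution for the frequency-domain test requires.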
Rapid variability of Antarctic Bottom Water transport into the Pacific Ocean inferred from GRACE
NASA Astrophysics Data System (ADS)
Mazloff, Matthew R.; Boening, Carmen
2016-04-01
Air-ice-ocean interactions in the Antarctic lead to formation of the densest waters on Earth. These waters convect and spread to fill the global abyssal oceans. The heat and carbon storage capacity of these water masses, combined with their abyssal residence times that often exceed centuries, makes this circulation pathway the most efficient sequestering mechanism on Earth. Yet monitoring this pathway has proven challenging due to the nature of the formation processes and the depth of the circulation. The Gravity Recovery and Climate Experiment (GRACE) gravity mission is providing a time series of ocean mass redistribution and offers a transformative view of the abyssal circulation. Here we use the GRACE measurements to infer, for the first time, a 2003-2014 time series of Antarctic Bottom Water export into the South Pacific. We find this export highly variable, with a standard deviation of 1.87 sverdrup (Sv) and a decorrelation timescale of less than 1 month. A significant trend is undetectable.
NASA Astrophysics Data System (ADS)
Lynne, Bridget Y.; Heasler, Henry; Jaworowski, Cheryl; Smith, Gary J.; Smith, Isaac J.; Foley, Duncan
2018-04-01
In April 2015, Ground Penetrating Radar (GPR) was used to characterize the shallow subsurface (< 5 m depth) of the western sinter slope immediately adjacent to Old Faithful Geyser and near the north side of an inferred geyser cavity. A series of time-sequence images were collected between two eruptions of Old Faithful Geyser. Each set of time-sequence GPR recordings consisted of four transects aligned to provide coverage near the potential location of the inferred 15 m deep geyser chamber. However, the deepest penetration we could achieve with a 200 MHz GPR antenna was 5 m. Seven time-sequence events were collected over a 48-minute interval to image changes in the near-surface during pre- and post-eruptive cycles. Time-sequence GPR images revealed a series of possible micro-fractures in a highly porous siliceous sinter in the near-surface that fill and drain repetitively, immediately after an eruption and during the recharge period prior to the next main eruptive event.
A time series approach to inferring groundwater recharge using the water table fluctuation method
NASA Astrophysics Data System (ADS)
Crosbie, Russell S.; Binning, Philip; Kalma, Jetse D.
2005-01-01
The water table fluctuation method for determining recharge from precipitation and water table measurements was originally developed on an event basis. Here a new multievent time series approach is presented for inferring groundwater recharge from long-term water table and precipitation records. Additional new features are the incorporation of a variable specific yield based upon the soil moisture retention curve, proper accounting for the Lisse effect on the water table, and the incorporation of aquifer drainage so that recharge can be detected even if the water table does not rise. A methodology for filtering noise and non-rainfall-related water table fluctuations is also presented. The model has been applied to 2 years of field data collected in the Tomago sand beds near Newcastle, Australia. It is shown that gross recharge estimates are very sensitive to time step size and specific yield. Properly accounting for the Lisse effect is also important to determining recharge.
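At its core, the event-based water table fluctuation method multiplies water-table rises by the specific yield; the paper's contribution is the multievent time series treatment with variable specific yield, aquifer drainage, and noise filtering layered on top. The basic computation, as a sketch:

```python
def wtf_recharge(heads, specific_yield):
    """Event-based water-table-fluctuation estimate: recharge ~ Sy times the
    sum of water-table rises between successive measurements.  Assumes a
    constant specific yield and no correction for drainage or noise."""
    rises = sum(max(h2 - h1, 0.0) for h1, h2 in zip(heads, heads[1:]))
    return specific_yield * rises
```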
Owl Pellet Analysis--A Useful Tool in Field Studies
ERIC Educational Resources Information Center
Medlin, G. C.
1977-01-01
Describes a technique by which the density and hunting habits of owls can be inferred from their pellets. Owl pellets--usually small, cylindrical packages of undigested bone, hair, etc.--are regurgitated by a roosting bird. A series of activities based on owl pellets are provided. (CP)
Research Methods in School Psychology: An Overview.
ERIC Educational Resources Information Center
Keith, Timothy Z.
1988-01-01
This article introduces a mini-series on research methods in school psychology. A conceptual overview of research methods is presented, emphasizing the degree to which each method allows the inference that treatment affects outcome. Experimental and nonexperimental, psychometric, descriptive, and meta-analysis research methods are outlined. (SLD)
NASA Technical Reports Server (NTRS)
Pitts, Michael C.; Thomason, L. W.; Poole, Lamont R.; Winker, David M.
2007-01-01
The role of polar stratospheric clouds in polar ozone loss has been well documented. The CALIPSO satellite mission offers a new opportunity to characterize PSCs on spatial and temporal scales previously unavailable. A PSC detection algorithm based on a single wavelength threshold approach has been developed for CALIPSO. The method appears to accurately detect PSCs of all opacities, including tenuous clouds, with a very low rate of false positives and few missed clouds. We applied the algorithm to CALIPSO data acquired during the 2006 Antarctic winter season from 13 June through 31 October. The spatial and temporal distribution of CALIPSO PSC observations is illustrated with weekly maps of PSC occurrence. The evolution of the 2006 PSC season is depicted by time series of daily PSC frequency as a function of altitude. Comparisons with virtual solar occultation data indicate that CALIPSO provides a different view of the PSC season than attained with previous solar occultation satellites. Measurement-based time series of PSC areal coverage and vertically-integrated PSC volume are computed from the CALIPSO data. The observed area covered with PSCs is significantly smaller than would be inferred from a temperature-based proxy such as T_NAT but is similar in magnitude to that inferred from T_STS. The potential of CALIPSO measurements for investigating PSC microphysics is illustrated using combinations of lidar backscatter coefficient and volume depolarization to infer composition for two CALIPSO PSC scenes.
Model-free information-theoretic approach to infer leadership in pairs of zebrafish.
Butail, Sachit; Mwaffo, Violet; Porfiri, Maurizio
2016-04-01
Collective behavior affords several advantages to fish in avoiding predators, foraging, mating, and swimming. Although fish schools have been traditionally considered egalitarian superorganisms, a number of empirical observations suggest the emergence of leadership in gregarious groups. Detecting and classifying leader-follower relationships is central to elucidate the behavioral and physiological causes of leadership and understand its consequences. Here, we demonstrate an information-theoretic approach to infer leadership from positional data of fish swimming. In this framework, we measure social interactions between fish pairs through the mathematical construct of transfer entropy, which quantifies the predictive power of a time series to anticipate another, possibly coupled, time series. We focus on the zebrafish model organism, which is rapidly emerging as a species of choice in preclinical research for its genetic similarity to humans and reduced neurobiological complexity with respect to mammals. To overcome experimental confounds and generate test data sets on which we can thoroughly assess our approach, we adapt and calibrate a data-driven stochastic model of zebrafish motion for the simulation of a coupled dynamical system of zebrafish pairs. In this synthetic data set, the extent and direction of the coupling between the fish are systematically varied across a wide parameter range to demonstrate the accuracy and reliability of transfer entropy in inferring leadership. Our approach is expected to aid in the analysis of collective behavior, providing a data-driven perspective to understand social interactions.
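Transfer entropy for symbolic series can be computed directly from joint counts. A history-length-1 sketch; the paper works with continuous positional data from a calibrated zebrafish motion model, so discretization of the trajectories into symbols is assumed here:

```python
from collections import Counter
from math import log2

def transfer_entropy(src, dst):
    """Transfer entropy TE(src -> dst) with history length 1: how much knowing
    src[t] reduces uncertainty about dst[t+1] beyond what dst[t] provides."""
    triples = list(zip(dst[1:], dst[:-1], src[:-1]))  # (future, dst past, src past)
    n = len(triples)
    p_xyz = Counter(triples)
    p_yz = Counter((y, z) for _, y, z in triples)
    p_xy = Counter((x, y) for x, y, _ in triples)
    p_y = Counter(y for _, y, _ in triples)
    # TE = sum p(x,y,z) * log2[ p(x|y,z) / p(x|y) ], expressed with raw counts
    return sum((c / n) * log2(c * p_y[y] / (p_yz[y, z] * p_xy[x, y]))
               for (x, y, z), c in p_xyz.items())
```

An asymmetry between TE(a -> b) and TE(b -> a) is what flags one fish as the leader.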
2011-01-01
Background Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes, coupled with short and often noisy time-resolved read-outs of the system, renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications. Results Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the 21 considered measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data, by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank- and symbol-based measures have the highest performance in inferring regulatory interactions. In addition, the proposed asymmetric-weighting scoring scheme has proven valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study.
Conclusions Our study is intended to serve as a guide for choosing a combination of similarity measures and scoring schemes suitable for the reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained if one considers measures rooted in the study of symbolic dynamics or ranks, in contrast to common similarity measures which do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric-weighting scoring scheme together with symbol-based measures (for low noise levels) and rank-based measures (for high noise levels) are the most suitable choices. PMID:21771321
Jang, Sumin; Choubey, Sandeep; Furchtgott, Leon; Zou, Ling-Nan; Doyle, Adele; Menon, Vilas; Loew, Ethan B; Krostag, Anne-Rachel; Martinez, Refugio A; Madisen, Linda; Levi, Boaz P; Ramanathan, Sharad
2017-01-01
The complexity of gene regulatory networks that lead multipotent cells to acquire different cell fates makes a quantitative understanding of differentiation challenging. Using a statistical framework to analyze single-cell transcriptomics data, we infer the gene expression dynamics of early mouse embryonic stem (mES) cell differentiation, uncovering discrete transitions across nine cell states. We validate the predicted transitions across discrete states using flow cytometry. Moreover, using live-cell microscopy, we show that individual cells undergo abrupt transitions from a naïve to primed pluripotent state. Using the inferred discrete cell states to build a probabilistic model for the underlying gene regulatory network, we further predict and experimentally verify that these states have unique response to perturbations, thus defining them functionally. Our study provides a framework to infer the dynamics of differentiation from single cell transcriptomics data and to build predictive models of the gene regulatory networks that drive the sequence of cell fate decisions during development. DOI: http://dx.doi.org/10.7554/eLife.20487.001 PMID:28296635
Inference of financial networks using the normalised mutual information rate.
Goh, Yong Kheng; Hasim, Haslifah M; Antonopoulos, Chris G
2018-01-01
In this paper, we study data from financial markets, using the normalised Mutual Information Rate. We show how to use it to infer the underlying network structure of interrelations in the foreign currency exchange rates and stock indices of 15 currency areas. We first present the mathematical method, discuss its computational aspects, and apply it to artificial data from chaotic dynamics and to correlated normal-variates data. We then apply the method to infer the structure of the financial system from the time series of currency exchange rates and stock indices. In particular, we study and reveal the interrelations among the various foreign currency exchange rates and stock indices in two separate networks, whose structural properties we also study. Our results show that both inferred networks are small-world networks, sharing similar properties but differing in terms of assortativity. Importantly, our work shows that global economies tend to connect with other economies world-wide, rather than creating small groups of local economies. Finally, the consistent interrelations depicted among the 15 currency areas are further supported by a discussion from the viewpoint of economics. PMID:29420644
Zhang, Bao; Yao, Yibin; Fok, Hok Sum; Hu, Yufeng; Chen, Qiang
2016-01-01
This study uses the observed vertical displacements of Global Positioning System (GPS) time series obtained from the Crustal Movement Observation Network of China (CMONOC), with careful pre- and post-processing, to estimate the seasonal crustal deformation in response to hydrological loading in the lower three-rivers headwater region of southwest China, followed by inferring the annual equivalent water height (EWH) changes through geodetic inversion methods. The Helmert Variance Component Estimation (HVCE) and the Minimum Mean Square Error (MMSE) criterion were successfully employed. The GPS-inferred EWH changes agree well qualitatively with the Gravity Recovery and Climate Experiment (GRACE)-inferred and the Global Land Data Assimilation System (GLDAS)-inferred EWH changes, with discrepancies of 3.2–3.9 cm and 4.8–5.2 cm, respectively. In the research areas, the EWH changes in the Lancang basin are larger than in the other regions, with a maximum of 21.8–24.7 cm and a minimum of 3.1–6.9 cm. PMID:27657064
Effects of spatial training on transitive inference performance in humans and rhesus monkeys
Gazes, Regina Paxton; Lazareva, Olga F.; Bergene, Clara N.; Hampton, Robert R.
2015-01-01
It is often suggested that transitive inference (TI; if A>B and B>C then A>C) involves mentally representing overlapping pairs of stimuli in a spatial series. However, there is little direct evidence to unequivocally determine the role of spatial representation in TI. We tested whether humans and rhesus monkeys use spatial representations in TI by training them to organize seven images in a vertical spatial array. Then, we presented subjects with a TI task using these same images. The implied TI order was either congruent or incongruent with the order of the trained spatial array. Humans in the congruent condition learned premise pairs more quickly, and were faster and more accurate in critical probe tests, suggesting that the spatial arrangement of images learned during spatial training influenced subsequent TI performance. Monkeys first trained in the congruent condition also showed higher test trial accuracy when the spatial and inferred orders were congruent. These results directly support the hypothesis that humans solve TI problems by spatial organization, and suggest that this cognitive mechanism for inference may have ancient evolutionary roots. PMID:25546105
Minas, Giorgos; Momiji, Hiroshi; Jenkins, Dafyd J; Costa, Maria J; Rand, David A; Finkenstädt, Bärbel
2017-06-26
Given the development of high-throughput experimental techniques, an increasing number of whole genome transcription profiling time series data sets, with good temporal resolution, are becoming available to researchers. The ReTrOS toolbox (Reconstructing Transcription Open Software) provides MATLAB-based implementations of two related methods, namely ReTrOS-Smooth and ReTrOS-Switch, for reconstructing the temporal transcriptional activity profile of a gene from given mRNA expression time series or protein reporter time series. The methods are based on fitting a differential equation model incorporating the processes of transcription, translation and degradation. The toolbox provides a framework for model fitting along with statistical analyses of the model with a graphical interface and model visualisation. We highlight several applications of the toolbox, including the reconstruction of the temporal cascade of transcriptional activity inferred from mRNA expression data and protein reporter data in the core circadian clock in Arabidopsis thaliana, and how such reconstructed transcription profiles can be used to study the effects of different cell lines and conditions. The ReTrOS toolbox allows users to analyse gene and/or protein expression time series where, with appropriate formulation of prior information about a minimum of kinetic parameters, in particular rates of degradation, users are able to infer timings of changes in transcriptional activity. Data from any organism and obtained from a range of technologies can be used as input due to the flexible and generic nature of the model and implementation. The output from this software provides a useful analysis of time series data and can be incorporated into further modelling approaches or in hypothesis generation.
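The model class ReTrOS fits can be illustrated with a toy version of its core idea: if mRNA obeys dM/dt = τ(t) − δm·M and the degradation rate δm is known (the kind of prior information the toolbox requires), the transcriptional activity τ(t) can be recovered from the expression series. The rates, switch time, and finite-difference inversion below are illustrative simplifications, not the toolbox's smoothing or switch algorithms.

```python
import numpy as np

DELTA_M = 0.5  # assumed-known mRNA degradation rate (1/h); hypothetical value

def simulate_mrna(tau, dt, delta_m=DELTA_M):
    """Forward Euler simulation of dM/dt = tau(t) - delta_m * M, with M(0) = 0."""
    m = np.zeros(len(tau))
    for t in range(1, len(tau)):
        m[t] = m[t - 1] + dt * (tau[t - 1] - delta_m * m[t - 1])
    return m

def infer_transcription(m, dt, delta_m=DELTA_M):
    """Invert the model: tau(t) ~= dM/dt + delta_m * M(t)."""
    return np.gradient(m, dt) + delta_m * m

dt = 0.1  # hours
tau_true = np.where(np.arange(0.0, 24.0, dt) < 12.0, 2.0, 0.2)  # switch off at 12 h
m = simulate_mrna(tau_true, dt)
tau_hat = infer_transcription(m, dt)
```

On the plateaus away from the switch, the reconstructed activity recovers the true transcription rates.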
An Excel sheet for inferring children's number-knower levels from give-N data.
Negen, James; Sarnecka, Barbara W; Lee, Michael D
2012-03-01
Number-knower levels are a series of stages of number concept development in early childhood. A child's number-knower level is typically assessed using the give-N task. Although the task procedure has been highly refined, the standard ways of analyzing give-N data remain somewhat crude. Lee and Sarnecka (Cogn Sci 34:51-67, 2010, in press) have developed a Bayesian model of children's performance on the give-N task that allows knower level to be inferred in a more principled way. However, this model requires considerable expertise and computational effort to implement and apply to data. Here, we present an approximation to the model's inference that can be computed with Microsoft Excel. We demonstrate the accuracy of the approximation and provide instructions for its use. This makes the powerful inferential capabilities of the Bayesian model accessible to developmental researchers interested in estimating knower levels from give-N data.
NASA Astrophysics Data System (ADS)
Duggento, Andrea; Stankovski, Tomislav; McClintock, Peter V. E.; Stefanovska, Aneta
2012-12-01
Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski et al. [Phys. Rev. Lett. 109, 024101 (2012)] introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time-evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically generated data, data from an analog electronic circuit, and cardiorespiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.
Introducing Analysis of Conflict Theory Into the Social Science Classroom.
ERIC Educational Resources Information Center
Harris, Thomas E.
The paper provides a simplified introduction to conflict theory through a series of in-class exercises. Conflict resolution, defined as negotiated settlement, can occur through three forms of communication: tacit, implicit, and explicit. Tacit communication, taking place without face-to-face or written interaction, refers to inferences made and…
Automatic Diagnosis of Fetal Heart Rate: Comparison of Different Methodological Approaches
2001-10-25
Apgar score). Each recording lasted at least 30 minutes and contained both the cardiotocographic series and the toco trace. We focused on four...inference rules automatically generated by the learning procedure showed that the number of rules can be manually reduced to 37 without deteriorating so much the
A Story-Based Simulation for Teaching Sampling Distributions
ERIC Educational Resources Information Center
Turner, Stephen; Dabney, Alan R.
2015-01-01
Statistical inference relies heavily on the concept of sampling distributions. However, sampling distributions are difficult to teach. We present a series of short animations that are story-based, with associated assessments. We hope that our contribution can be useful as a tool to teach sampling distributions in the introductory statistics…
NASA Astrophysics Data System (ADS)
Corzo, H. H.; Velasco, A. M.; Lavín, C.; Ortiz, J. V.
2018-02-01
Vertical excitation energies belonging to several Rydberg series of MgH have been inferred from 3+ electron-propagator calculations of the electron affinities of MgH+ and are in close agreement with experiment. Many electronically excited states with n > 3 are reported for the first time and new insight is given on the assignment of several Rydberg series. Valence and Rydberg excited states of MgH are distinguished respectively by high and low pole strengths corresponding to Dyson orbitals of electron attachment to the cation. By applying the Molecular Quantum Defect Orbital method, oscillator strengths for electronic transitions involving Rydberg states also have been determined.
MODIS and GIMMS Inferred Northern Hemisphere Spring Greenup in Responses to Preseason Climate
NASA Astrophysics Data System (ADS)
Xu, X.; Riley, W. J.; Koven, C.; Jia, G.
2017-12-01
We compare the discrepancies in Normalized Difference Vegetation Index (NDVI)-inferred spring greenup (SG) between the Terra Moderate Resolution Imaging Spectroradiometer (MODIS) instrument and the Advanced Very High Resolution Radiometer (AVHRR) record compiled by the Global Inventory Monitoring and Modeling Studies (GIMMS) in the Northern Hemisphere. The interannual variation of SG inferred from MODIS and GIMMS NDVI is well correlated in the mid to high latitudes. However, NDVI discrepancies lead to discrepancies in SG with marked latitudinal characteristics: compared with GIMMS, MODIS NDVI infers later SG in the high latitudes and earlier SG in the mid to low latitudes. MODIS NDVI-inferred SG is better correlated with preseason climate, and the interannual variation of SG is sensitive only to preseason temperature. The GIMMS SG-to-temperature sensitivity over two periods implies that the inter-biome sensitivity is relatively stable but decreases over time. Over the same period, the MODIS SG-to-temperature sensitivity is much higher than that of GIMMS. This decreased sensitivity corroborates findings from previous studies based on continuous GIMMS NDVI analysis, namely that the sensitivity of vegetation growth (indicated by growing-season NDVI) to temperature is reduced over time and that the SG advance trend ceased after the 2000s. Our results also explain the contradictory finding that SG advance accelerated after the 2000s according to the merged GIMMS and MODIS NDVI time series. Despite the discrepancies found, the quality of NDVI products and the SG inferred from them cannot be effectively evaluated without supporting ground data. The discrepancies and uncertainties in different NDVI products and their inferred SG may bias the inferred climate-vegetation relationship. Different NDVI products should therefore be evaluated and harmonized before being used together.
Geoelectrical inference of mass transfer parameters using temporal moments
Day-Lewis, Frederick D.; Singha, Kamini
2008-01-01
We present an approach to infer mass transfer parameters based on (1) an analytical model that relates the temporal moments of mobile and bulk concentration and (2) a bicontinuum modification to Archie's law. Whereas conventional geochemical measurements preferentially sample from the mobile domain, electrical resistivity tomography (ERT) is sensitive to bulk electrical conductivity and, thus, electrolytic solute in both the mobile and immobile domains. We demonstrate the new approach, in which temporal moments of collocated mobile domain conductivity (i.e., conventional sampling) and ERT‐estimated bulk conductivity are used to calculate heterogeneous mass transfer rate and immobile porosity fractions in a series of numerical column experiments.
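The temporal-moment calculation underpinning this approach can be sketched as follows; the synthetic breakthrough curves, times, and lag below are illustrative stand-ins, not the authors' column data.

```python
import numpy as np

def temporal_moments(t, c):
    """Zeroth moment (mass) and normalized first moment (mean arrival time)
    of a concentration breakthrough curve c(t), by trapezoidal integration."""
    dt = np.diff(t)
    m0 = float(np.sum((c[1:] + c[:-1]) * dt) / 2.0)
    m1 = float(np.sum((t[1:] * c[1:] + t[:-1] * c[:-1]) * dt) / 2.0)
    return m0, m1 / m0

# Synthetic curves: the bulk (ERT-like) signal lags the mobile-domain
# concentration, as solute exchanges with the immobile domain.
t = np.linspace(0.0, 50.0, 1001)
mobile = np.exp(-0.5 * ((t - 10.0) / 2.0) ** 2)
bulk = np.exp(-0.5 * ((t - 14.0) / 3.0) ** 2)

_, t_mobile = temporal_moments(t, mobile)
_, t_bulk = temporal_moments(t, bulk)
lag = t_bulk - t_mobile  # the moment difference carries the mass-transfer signal
```

The difference in mean arrival time between the collocated mobile and bulk signals is the quantity from which mass transfer parameters are then inferred.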
An algebra-based method for inferring gene regulatory networks.
Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard
2014-03-26
The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. 
Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network, demonstrating that our method has good precision and recall for the network reconstruction task while also predicting several of the dynamic patterns present in the network. Boolean polynomial dynamical systems provide a powerful modeling framework for the reverse engineering of gene regulatory networks, one that enables a rich mathematical structure on the model search space. A C++ implementation of the method, distributed under the LGPL license, is available, together with the source code, at http://www.paola-vera-licona.net/Software/EARevEng/REACT.html.
FBST for Cointegration Problems
NASA Astrophysics Data System (ADS)
Diniz, M.; Pereira, C. A. B.; Stern, J. M.
2008-11-01
In order to estimate causal relations, time series econometrics has to be aware of spurious correlation, a problem first mentioned by Yule [21]. To solve the problem, one can work with differenced series or use multivariate models like VAR or VEC models. In the latter case, the analysed series will present a long-run relation, i.e., a cointegration relation. Even though the Bayesian literature on inference for VAR/VEC models is quite advanced, Bauwens et al. [2] highlight that "the topic of selecting the cointegrating rank has not yet given very useful and convincing results." This paper presents the Full Bayesian Significance Test applied to cointegration rank selection in multivariate (VAR/VEC) time series models and shows how to implement it using data sets from the literature as well as simulated ones. A standard non-informative prior is assumed.
Lee, E Henry; Wickham, Charlotte; Beedlow, Peter A; Waschmann, Ronald S; Tingey, David T
2017-10-01
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for climate and forest disturbances (i.e., pests, diseases, fire). The statistical method is illustrated with a tree-ring width time series for a mature closed-canopy Douglas-fir stand on the west slopes of the Cascade Mountains of Oregon, USA, that is impacted by Swiss needle cast disease caused by the foliar fungus Phaeocryptopus gaeumannii (Rhode) Petrak. The likelihood-based TSIA method is proposed for the field of dendrochronology to understand the interaction of temperature, water, and forest disturbances that are important in forest ecology and climate change studies.
Statistical inference for classification of RRIM clone series using near IR reflectance properties
NASA Astrophysics Data System (ADS)
Ismail, Faridatul Aima; Madzhi, Nina Korlina; Hashim, Hadzli; Abdullah, Noor Ezan; Khairuzzaman, Noor Aishah; Azmi, Azrie Faris Mohd; Sampian, Ahmad Faiz Mohd; Harun, Muhammad Hafiz
2015-08-01
RRIM clones are rubber breeding series produced by the RRIM (Rubber Research Institute of Malaysia) through its rubber breeding program to improve latex yield and produce clones attractive to farmers. The objective of this work is to analyse measurements from an optical sensing device on the latex of selected clone series. The device transmits NIR light, and the reflectance is converted into a voltage. The obtained reflectance index values were analyzed using statistical techniques in order to discriminate among the clones. From the statistical results, using error plots and a one-way ANOVA test, there is overwhelming evidence of discrimination among the RRIM 2002, RRIM 2007 and RRIM 3001 clone series, with p value = 0.000. RRIM 2008 cannot be discriminated from RRIM 2014; however, both of these groups are distinct from the other clones.
A robust interrupted time series model for analyzing complex health care intervention data.
Cruz, Maricela; Bender, Miriam; Ombao, Hernando
2017-12-20
Current health policy calls for greater use of evidence-based care delivery services to improve patient quality and safety outcomes. Care delivery is complex, with interacting and interdependent components that challenge traditional statistical analytic techniques, in particular, when modeling a time series of outcomes data that might be "interrupted" by a change in a particular method of health care delivery. Interrupted time series (ITS) is a robust quasi-experimental design with the ability to infer the effectiveness of an intervention that accounts for data dependency. Current standardized methods for analyzing ITS data do not model changes in variation and correlation following the intervention. This is a key limitation since it is plausible for data variability and dependency to change because of the intervention. Moreover, present methodology either assumes a prespecified interruption time point with an instantaneous effect or removes data for which the effect of intervention is not fully realized. In this paper, we describe and develop a novel robust interrupted time series (robust-ITS) model that overcomes these omissions and limitations. The robust-ITS model formally performs inference on (1) identifying the change point; (2) differences in preintervention and postintervention correlation; (3) differences in the outcome variance preintervention and postintervention; and (4) differences in the mean preintervention and postintervention. We illustrate the proposed method by analyzing patient satisfaction data from a hospital that implemented and evaluated a new nursing care delivery model as the intervention of interest. The robust-ITS model is implemented in an R Shiny toolbox, which is freely available to the community. Copyright © 2017 John Wiley & Sons, Ltd.
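The change-point piece of this kind of inference can be illustrated with a toy piecewise-constant model. The full robust-ITS model additionally handles trends, correlation, and formal inference, so the grid search below is only a sketch under simplified assumptions, with hypothetical data.

```python
import numpy as np

def fit_change_point(y):
    """Grid-search the interruption time k minimizing the total squared error
    of a piecewise-constant mean model; also report pre/post means and variances."""
    best = None
    for k in range(2, len(y) - 2):
        pre, post = y[:k], y[k:]
        sse = np.sum((pre - pre.mean()) ** 2) + np.sum((post - post.mean()) ** 2)
        if best is None or sse < best[0]:
            best = (sse, k, pre.mean(), post.mean(),
                    pre.var(ddof=1), post.var(ddof=1))
    return best[1:]

# Hypothetical outcome series: at t = 60 the intervention raises the mean
# and reduces the variance, the kind of change robust-ITS is built to detect.
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(10.0, 2.0, 60), rng.normal(14.0, 1.0, 60)])
k, mu_pre, mu_post, var_pre, var_post = fit_change_point(y)
```

The estimated change point lands near the true interruption, and the pre/post summaries expose both the mean shift and the variance change.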
Ethnic-Racial Attitudes, Images, and Behavior by Verbal Associations. Technical Report.
ERIC Educational Resources Information Center
Szalay, Lorand B.; And Others
The investigations focused on two main subject areas. The first series of experiments explored the validity of verbal association based inferences as an attitude measure and predictor of behavior. When compared with paper-and-pencil methods, the association based attitude index (EDI) showed high positive correlation as a group measure and medium…
Computer-Based Instruction in Statistical Inference; Final Report. Technical Memorandum (TM Series).
ERIC Educational Resources Information Center
Rosenbaum, J.; And Others
A two-year investigation into the development of computer-assisted instruction (CAI) for the improvement of undergraduate training in statistics was undertaken. The first year was largely devoted to designing PLANIT (Programming LANguage for Interactive Teaching) which reduces, or completely eliminates, the need an author of CAI lessons would…
Observing and Producing Sounds, Elementary School Science, Level Four, Teaching Manual.
ERIC Educational Resources Information Center
Hale, Helen E.
This pilot teaching unit is one of a series developed for use in elementary school science programs. This unit is designed to help children discover specific concepts which relate to sound, such as volume, pitch, and echo. The student activities employ important scientific processes, such as observation, communication, inference, classification,…
Convergent cross-mapping and pairwise asymmetric inference.
McCracken, James M; Weigel, Robert S
2014-12-01
Convergent cross-mapping (CCM) is a technique for computing specific kinds of correlations between sets of time series. It was introduced by Sugihara et al. [Science 338, 496 (2012)] and is reported to be "a necessary condition for causation" capable of distinguishing causality from standard correlation. We show that the relationships between CCM correlations proposed by Sugihara et al. do not, in general, agree with intuitive concepts of "driving" and as such should not be considered indicative of causality. For simple linear and nonlinear systems, whether the CCM algorithm implies causality turns out to depend on system parameters. For example, in a circuit containing a single resistor and inductor, either voltage or current can be identified as the driver, depending on the frequency of the source voltage. We show, however, that the CCM algorithm can be modified so that, for the example systems in which CCM causality analysis yielded nonintuitive driver identifications, it identifies relationships between pairs of time series that are consistent with intuition. This modification of the CCM algorithm is introduced as "pairwise asymmetric inference" (PAI), and examples of its use are presented.
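A stripped-down CCM estimator illustrates the construction: cross-map one series from the delay embedding of the other and score the prediction. The fixed library size, embedding dimension E = 2, and the unidirectionally coupled logistic maps below are standard illustrative choices, not the full convergence test of Sugihara et al.

```python
import numpy as np

def embed(x, E):
    """Delay embedding with lag 1: rows are [x_t, x_{t-1}, ..., x_{t-E+1}]."""
    n = len(x) - E + 1
    return np.column_stack([x[E - 1 - i: E - 1 - i + n] for i in range(E)])

def ccm_skill(source, target, E=2, k=3):
    """Cross-map `source` from the shadow manifold of `target`; return the
    correlation between true and cross-mapped values. High skill is read as
    evidence that `source` drives `target`."""
    M = embed(target, E)
    s = source[E - 1:]
    pred = np.empty(len(M))
    for i in range(len(M)):
        d = np.linalg.norm(M - M[i], axis=1)
        d[i] = np.inf                          # exclude the query point itself
        nbrs = np.argsort(d)[:k]
        w = np.exp(-d[nbrs] / (d[nbrs].min() + 1e-12))
        pred[i] = np.dot(w, s[nbrs]) / w.sum()
    return float(np.corrcoef(s, pred)[0, 1])

# Unidirectionally coupled logistic maps: x drives y, never the reverse.
N = 700  # total steps; the first 100 are discarded as transient
x = np.empty(N)
y = np.empty(N)
x[0], y[0] = 0.4, 0.2
for t in range(N - 1):
    x[t + 1] = x[t] * (3.8 - 3.8 * x[t])
    y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.3 * x[t])
x, y = x[100:], y[100:]
```

Because x is imprinted in y's dynamics (but not vice versa), cross-mapping x from y's manifold succeeds while the reverse does not, reproducing the asymmetry the abstract discusses.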
LASSIM-A network inference toolbox for genome-wide mechanistic modeling.
Magnusson, Rasmus; Mariotti, Guido Pio; Köpsén, Mattias; Lövfors, William; Gawel, Danuta R; Jörnsten, Rebecka; Linde, Jörg; Nordling, Torbjörn E M; Nyman, Elin; Schulze, Sylvie; Nestor, Colm E; Zhang, Huan; Cedersund, Gunnar; Benson, Mikael; Tjärnberg, Andreas; Gustafsson, Mika
2017-06-01
Recent technological advancements have made time-resolved, quantitative, multi-omics data available for many model systems, which could be integrated for systems pharmacokinetic use. Here, we present large-scale simulation modeling (LASSIM), which is a novel mathematical tool for performing large-scale inference using mechanistically defined ordinary differential equations (ODE) for gene regulatory networks (GRNs). LASSIM integrates structural knowledge about regulatory interactions and non-linear equations with multiple steady state and dynamic response expression datasets. The rationale behind LASSIM is that biological GRNs can be simplified using a limited subset of core genes that are assumed to regulate all other gene transcription events in the network. The LASSIM method is implemented as a general-purpose toolbox using the PyGMO Python package to make the most of multicore computers and high performance clusters, and is available at https://gitlab.com/Gustafsson-lab/lassim. As a method, LASSIM works in two steps, where it first infers a non-linear ODE system of the pre-specified core gene expression. Second, LASSIM in parallel optimizes the parameters that model the regulation of peripheral genes by core system genes. We showed the usefulness of this method by applying LASSIM to infer a large-scale non-linear model of naïve Th2 cell differentiation, made possible by integrating Th2 specific bindings, time-series together with six public and six novel siRNA-mediated knock-down experiments. ChIP-seq showed significant overlap for all tested transcription factors. Next, we performed novel time-series measurements of total T-cells during differentiation towards Th2 and verified that our LASSIM model could monitor those data significantly better than comparable models that used the same Th2 bindings. 
In summary, the LASSIM toolbox opens the door to a new type of model-based data analysis that combines the strengths of reliable mechanistic models with truly systems-level data. We demonstrate the power of this approach by inferring a mechanistically motivated, genome-wide model of the Th2 transcription regulatory system, which plays an important role in several immune related diseases.
Swetapadma, Aleena; Yadav, Anamika
2015-01-01
Many schemes have been reported for shunt fault location estimation, but fault location estimation for series (open conductor) faults has not been dealt with so far. Existing numerical relays only detect an open conductor (series) fault and indicate the faulty phase(s); they are unable to locate the series fault, so the repair crew must patrol the complete line to find it. In this paper, fuzzy-based fault detection/classification and location schemes in the time domain are proposed for series faults, shunt faults, and simultaneous series and shunt faults. The fault simulation studies and fault location algorithm have been developed using Matlab/Simulink. Synchronized phasors of the voltage and current signals at both ends of the line are used as input to the proposed fuzzy-based fault location scheme. The percentage error in location is within 1% for series faults and within 5% for shunt faults for all tested fault cases. The percentage error in location estimation is validated using a Chi-square test at both the 1% and 5% levels of significance. PMID:26413088
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marzouk, Youssef; Fast P.; Kraus, M.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error, i.e., situations for which the model used in the inverse problem is only a partially accurate representation of the outbreak; here, the model predictions and the observations differ by more than random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
Bayesian microsaccade detection
Mihali, Andra; van Opheusden, Bas; Ma, Wei Ji
2017-01-01
Microsaccades are high-velocity fixational eye movements, with special roles in perception and cognition. The default microsaccade detection method is to determine when the smoothed eye velocity exceeds a threshold. We have developed a new method, Bayesian microsaccade detection (BMD), which performs inference based on a simple statistical model of eye positions. In this model, a hidden state variable changes between drift and microsaccade states at random times. The eye position is a biased random walk with different velocity distributions for each state. BMD generates samples from the posterior probability distribution over the eye state time series given the eye position time series. Applied to simulated data, BMD recovers the “true” microsaccades with fewer errors than alternative algorithms, especially at high noise. Applied to EyeLink eye tracker data, BMD detects almost all the microsaccades detected by the default method, but also apparent microsaccades embedded in high noise—although these can also be interpreted as false positives. Next we apply the algorithms to data collected with a Dual Purkinje Image eye tracker, whose higher precision justifies defining the inferred microsaccades as ground truth. When we add artificial measurement noise, the inferences of all algorithms degrade; however, at noise levels comparable to EyeLink data, BMD recovers the “true” microsaccades with 54% fewer errors than the default algorithm. Though unsuitable for online detection, BMD has other advantages: It returns probabilities rather than binary judgments, and it can be straightforwardly adapted as the generative model is refined. We make our algorithm available as a software package. PMID:28114483
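The generative model described above can be sketched in a few lines. This is a minimal simulation (not the authors' BMD code): a hidden state switches between drift and microsaccade at random times, and eye position follows a random walk whose velocity distribution depends on the state; all parameter values are hypothetical.

```python
# Minimal sketch of the two-state generative model for eye position:
# state 0 = drift (unbiased slow diffusion), state 1 = microsaccade (biased fast steps).
import random

random.seed(0)

def simulate_eye_trace(n_steps, p_switch=0.02, drift_sigma=0.05, ms_velocity=1.5):
    """Return (positions, states) for a 1-D eye trace."""
    pos, state, positions, states = 0.0, 0, [], []
    for _ in range(n_steps):
        if random.random() < p_switch:
            state = 1 - state                 # hidden state changes at random times
        if state == 0:
            pos += random.gauss(0.0, drift_sigma)                 # slow diffusive drift
        else:
            pos += ms_velocity + random.gauss(0.0, drift_sigma)   # fast biased step
        positions.append(pos)
        states.append(state)
    return positions, states

positions, states = simulate_eye_trace(1000)
print(f"{sum(states)} of {len(states)} samples spent in the microsaccade state")
```

BMD's inference task is the inverse of this simulation: given only `positions`, sample the posterior over the hidden `states` sequence.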
Hu, Yanzhu; Ai, Xinbo
2016-01-01
Complex network methodology is very useful for exploring complex systems. However, the relationships among variables in a complex system are usually not clear. Therefore, inferring association networks among variables from their observed data has become a popular research topic. We propose a synthetic method, named small-shuffle partial symbolic transfer entropy spectrum (SSPSTES), for inferring an association network from multivariate time series. The method combines surrogate data, partial symbolic transfer entropy (PSTE) and Granger causality. Proper threshold selection is crucial for common correlation identification methods and is not easy for users. The proposed method not only identifies strong correlations without selecting a threshold but also offers correlation quantification, direction identification and temporal relation identification. The method can be divided into three layers: the data layer, the model layer and the network layer. In the model layer, the method identifies all possible pair-wise correlations. In the network layer, we introduce a filter algorithm to remove indirect weak correlations and retain strong ones. Finally, we build a weighted adjacency matrix, with the value of each entry representing the correlation level between a pair of variables, and obtain the weighted directed association network. Two numerical simulations, one from a linear system and one from a nonlinear system, illustrate the steps and performance of the proposed approach. The ability of the proposed method is finally demonstrated in an application. PMID:27832153
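Transfer entropy underpins the PSTE component above. As a minimal illustration (not the SSPSTES pipeline itself, which adds surrogate data, partial conditioning and a filtering layer), the sketch below computes a simple symbolic transfer entropy between two coupled series; the coupling, symbolization and parameters are all hypothetical.

```python
# Minimal symbolic transfer entropy sketch: y is driven by x with a one-step
# lag, so TE(x -> y) should exceed TE(y -> x).
from collections import Counter
import math, random

random.seed(3)

def symbolize(series):
    """Binary up/down symbols, a simple stand-in for ordinal-pattern symbols."""
    return [1 if b > a else 0 for a, b in zip(series, series[1:])]

def transfer_entropy(x, y):
    """TE (bits) from symbol series x to symbol series y, history length 1."""
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))   # (y_next, y_now, x_now)
    pairs_yy = Counter(zip(y[1:], y[:-1]))
    pairs_yx = Counter(zip(y[:-1], x[:-1]))
    singles = Counter(y[:-1])
    n = len(y) - 1
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p = c / n
        te += p * math.log2((c * singles[y0]) / (pairs_yy[(y1, y0)] * pairs_yx[(y0, x0)]))
    return te

# Hypothetical coupled pair: y follows x with a one-step lag, plus noise
x = [random.gauss(0, 1) for _ in range(3000)]
y = [0.0] + [0.8 * xv + random.gauss(0, 0.3) for xv in x[:-1]]
sx, sy = symbolize(x), symbolize(y)
te_xy, te_yx = transfer_entropy(sx, sy), transfer_entropy(sy, sx)
print(f"TE x->y = {te_xy:.3f}, TE y->x = {te_yx:.3f}")
```

The asymmetry between the two TE values is what allows direction identification without a user-chosen threshold on correlation strength.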
NASA Astrophysics Data System (ADS)
Lohani, A. K.; Kumar, Rakesh; Singh, R. D.
2012-06-01
Time series modeling is necessary for the planning and management of reservoirs. More recently, soft computing techniques have been used in hydrological modeling and forecasting. In this study, the potential of artificial neural networks and neuro-fuzzy systems for monthly reservoir inflow forecasting is examined by developing and comparing monthly reservoir inflow prediction models based on autoregressive (AR) models, artificial neural networks (ANNs) and an adaptive neural-based fuzzy inference system (ANFIS). To account for the effect of monthly periodicity in the flow data, cyclic terms are also included in the ANN and ANFIS models. Working with time series flow data of the Sutlej River at Bhakra Dam, India, several ANN and adaptive neuro-fuzzy models are trained with different input vectors. To evaluate the performance of the selected ANN and adaptive neural fuzzy inference system (ANFIS) models, a comparison is made with the autoregressive (AR) models. The ANFIS model trained with an input data vector including previous inflows and cyclic terms of monthly periodicity shows a significant improvement in forecast accuracy over the ANFIS models trained with input vectors considering only previous inflows. In all cases ANFIS gives more accurate forecasts than the AR and ANN models. The proposed ANFIS model coupled with the cyclic terms is shown to provide a better representation of monthly inflow forecasting for the planning and operation of reservoirs.
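The "cyclic terms" above are typically sine/cosine features of the month index. As a minimal sketch (with a hypothetical inflow series, not the Sutlej data), the input vector for such a forecasting model might be built like this:

```python
# Minimal sketch: building an input vector of lagged inflows plus cyclic terms
# (sine/cosine of the month) that encode the 12-month periodicity.
import math

def make_input_vector(inflows, t, n_lags=2):
    """Features for forecasting inflow at time t: lagged flows + cyclic terms."""
    lags = [inflows[t - k] for k in range(1, n_lags + 1)]
    phase = 2.0 * math.pi * (t % 12) / 12.0
    return lags + [math.sin(phase), math.cos(phase)]

# Hypothetical monthly inflow series (arbitrary units), three repeated years
inflows = [80, 95, 120, 200, 340, 510, 480, 300, 180, 120, 90, 85] * 3

x = make_input_vector(inflows, t=24)
print(x)  # two lagged inflows followed by sin/cos of the month phase
```

The same feature vector can feed an AR regression, an ANN, or an ANFIS; the comparison in the study varies the model while keeping such inputs fixed.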
Baglietto, Gabriel; Gigante, Guido; Del Giudice, Paolo
2017-01-01
Two partially interwoven hot topics in the analysis and statistical modeling of neural data are the development of efficient and informative representations of the time series derived from multiple neural recordings, and the extraction of information about the connectivity structure of the underlying neural network from the recorded neural activities. In the present paper we show that state-space clustering can provide an easy and effective option for reducing the dimensionality of multiple neural time series, that it can improve inference of synaptic couplings from neural activities, and that it can also allow the construction of a compact representation of the multi-dimensional dynamics that easily lends itself to complexity measures. We apply a variant of the 'mean-shift' algorithm to perform state-space clustering, and validate it on a Hopfield network in the glassy phase, in which metastable states are largely uncorrelated with memories embedded in the synaptic matrix. In this context, we show that the neural states identified as cluster centroids offer a parsimonious parametrization of the synaptic matrix, which allows a significant improvement in inferring the synaptic couplings from the neural activities. Moving to the more realistic case of a multi-modular spiking network, with spike-frequency adaptation inducing history-dependent effects, we propose a procedure inspired by Boltzmann learning, but extending its domain of application, to learn inter-module synaptic couplings so that the spiking network reproduces a prescribed pattern of spatial correlations; we then illustrate, in the spiking network, how clustering is effective in extracting relevant features of the network's state-space landscape.
Finally, we show that the knowledge of the cluster structure allows casting the multi-dimensional neural dynamics in the form of a symbolic dynamics of transitions between clusters; as an illustration of the potential of such reduction, we define and analyze a measure of complexity of the neural time series.
Inference of quantitative models of bacterial promoters from time-series reporter gene data.
Stefan, Diana; Pinel, Corinne; Pinhal, Stéphane; Cinquemani, Eugenio; Geiselmann, Johannes; de Jong, Hidde
2015-01-01
The inference of regulatory interactions and quantitative models of gene regulation from time-series transcriptomics data has been extensively studied and applied to a range of problems in drug discovery, cancer research, and biotechnology. The application of existing methods is commonly based on implicit assumptions on the biological processes under study. First, the measurements of mRNA abundance obtained in transcriptomics experiments are taken to be representative of protein concentrations. Second, the observed changes in gene expression are assumed to be solely due to transcription factors and other specific regulators, while changes in the activity of the gene expression machinery and other global physiological effects are neglected. While convenient in practice, these assumptions are often not valid and bias the reverse engineering process. Here we systematically investigate, using a combination of models and experiments, the importance of this bias and possible corrections. We measure in real time and in vivo the activity of genes involved in the FliA-FlgM module of the E. coli motility network. From these data, we estimate protein concentrations and global physiological effects by means of kinetic models of gene expression. Our results indicate that correcting for the bias of commonly-made assumptions improves the quality of the models inferred from the data. Moreover, we show by simulation that these improvements are expected to be even stronger for systems in which protein concentrations have longer half-lives and the activity of the gene expression machinery varies more strongly across conditions than in the FliA-FlgM module. The approach proposed in this study is broadly applicable when using time-series transcriptome data to learn about the structure and dynamics of regulatory networks. 
In the case of the FliA-FlgM module, our results demonstrate the importance of global physiological effects and the active regulation of FliA and FlgM half-lives for the dynamics of FliA-dependent promoters.
Inversion of residual stress profiles from ultrasonic Rayleigh wave dispersion data
NASA Astrophysics Data System (ADS)
Mora, P.; Spies, M.
2018-05-01
We investigate theoretically and with synthetic data the performance of several inversion methods for inferring a residual stress state from ultrasonic surface wave dispersion data. We show that this particular problem may reveal, in relevant materials, undesired behaviors of methods that can otherwise be reliably applied to infer other properties. We focus on two methods: one based on a Taylor expansion, and another based on a piecewise linear expansion regularized by a singular value decomposition. We explain the instabilities of the Taylor-based method by highlighting singularities in the series of coefficients. At the same time, we show that the other method can successfully provide performance that depends only weakly on the material.
A continuous analog of run length distributions reflecting accumulated fractionation events.
Yu, Zhe; Sankoff, David
2016-11-11
We propose a new, continuous model of the fractionation process (duplicate gene deletion after polyploidization) on the real line. The aim is to infer how much DNA is deleted at a time, based on segment lengths for alternating deleted (invisible) and undeleted (visible) regions. After deriving a number of analytical results for "one-sided" fractionation, we undertake a series of simulations that help us identify the distribution of segment lengths as a gamma with shape and rate parameters evolving over time. This leads to an inference procedure based on observed length distributions for visible and invisible segments. We suggest extensions of this mathematical and simulation work to biologically realistic discrete models, including two-sided fractionation.
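The inference step above keys on the segment-length distribution being a gamma whose parameters evolve over time. As a minimal sketch (hypothetical lengths, not the paper's simulations), the shape and rate can be estimated from observed segments by the method of moments:

```python
# Minimal sketch: method-of-moments gamma fit to visible-segment lengths.
# shape = mean^2 / variance, rate = mean / variance.
def gamma_moments(lengths):
    n = len(lengths)
    mean = sum(lengths) / n
    var = sum((x - mean) ** 2 for x in lengths) / n
    return mean * mean / var, mean / var  # (shape, rate)

# Hypothetical lengths of visible (undeleted) segments, arbitrary units
visible_segments = [1.2, 0.8, 2.5, 1.9, 0.4, 1.1, 3.0, 0.7]
shape, rate = gamma_moments(visible_segments)
print(f"shape = {shape:.3f}, rate = {rate:.3f}")
```

Tracking how the fitted shape and rate drift across simulated time steps is what lets the procedure infer how much DNA is deleted per fractionation event.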
Mineral and Geochemical Classification From Spectroscopy/Diffraction Through Neural Networks
NASA Astrophysics Data System (ADS)
Ferralis, N.; Grossman, J.; Summons, R. E.
2017-12-01
Spectroscopy and diffraction techniques are essential for understanding the structural, chemical and functional properties of geological materials in the Earth and planetary sciences. Beyond data collection, quantitative insight relies on experimentally assembled or computationally derived spectra. Inference on the geochemical or geophysical properties (such as crystallographic order, chemical functionality, elemental composition, etc.) of a particular geological material (mineral, organic matter, etc.) is based on fitting unknown spectra and comparing the fit with consolidated databases. The complexity of fitting highly convoluted spectra often limits the ability to infer geochemical characteristics, and limits throughput for extensive datasets. With the emergence of heuristic approaches to pattern recognition through machine learning, in this work we investigate the possibility and potential of using supervised neural networks, trained on publicly available spectroscopic databases, to directly infer geochemical parameters from unknown spectra. Using Raman spectroscopy, infrared spectroscopy and powder X-ray diffraction from the publicly available RRUFF database, we train neural network models to classify minerals and organic compounds (pure or mixtures) based on crystallographic structure from diffraction, and on chemical functionality, elemental composition and bonding from spectroscopy. As expected, the accuracy of the inference is strongly dependent on the quality and extent of the training data. We identify a series of requirements and guidelines for the training dataset needed to achieve consistently high-accuracy inference, along with methods to compensate for limited data.
Predicting disease progression from short biomarker series using expert advice algorithm
NASA Astrophysics Data System (ADS)
Morino, Kai; Hirata, Yoshito; Tomioka, Ryota; Kashima, Hisashi; Yamanishi, Kenji; Hayashi, Norihiro; Egawa, Shin; Aihara, Kazuyuki
2015-05-01
Well-trained clinicians may be able to provide diagnosis and prognosis from very short biomarker series using information and experience gained from previous patients. Although mathematical methods can potentially help clinicians to predict the progression of diseases, there is no method so far that estimates the patient state from very short time-series of a biomarker for making diagnosis and/or prognosis by employing the information of previous patients. Here, we propose a mathematical framework for integrating other patients' datasets to infer and predict the state of the disease in the current patient based on their short history. We extend a machine-learning framework of "prediction with expert advice" to deal with unstable dynamics. We construct this mathematical framework by combining expert advice with a mathematical model of prostate cancer. Our model predicted well the individual biomarker series of patients with prostate cancer that are used as clinical samples.
NASA Astrophysics Data System (ADS)
Yuan, Y.; Meng, Y.; Chen, Y. X.; Jiang, C.; Yue, A. Z.
2018-04-01
In this study, we proposed a method to map urban encroachment onto farmland using satellite image time series (SITS) based on the hierarchical hidden Markov model (HHMM). In this method, the farmland change process is decomposed into three hierarchical levels, i.e., the land cover level, the vegetation phenology level, and the SITS level. Then a three-level HHMM is constructed to model the multi-level semantic structure of farmland change process. Once the HHMM is established, a change from farmland to built-up could be detected by inferring the underlying state sequence that is most likely to generate the input time series. The performance of the method is evaluated on MODIS time series in Beijing. Results on both simulated and real datasets demonstrate that our method improves the change detection accuracy compared with the HMM-based method.
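The decoding step described above can be illustrated with a plain two-state HMM, a simplified stand-in for the paper's three-level HHMM; the states, observations and probabilities below are hypothetical.

```python
# Minimal Viterbi sketch: infer the most likely hidden land-cover sequence
# (farmland vs built-up) from a binarized vegetation-index time series.
import math

states = ("farmland", "built-up")
log_init = {"farmland": math.log(0.9), "built-up": math.log(0.1)}
log_trans = {("farmland", "farmland"): math.log(0.95),
             ("farmland", "built-up"): math.log(0.05),
             ("built-up", "built-up"): math.log(0.999),
             ("built-up", "farmland"): math.log(0.001)}
# Emission: probability of observing "high NDVI" (1) vs "low NDVI" (0)
log_emit = {("farmland", 1): math.log(0.8), ("farmland", 0): math.log(0.2),
            ("built-up", 1): math.log(0.1), ("built-up", 0): math.log(0.9)}

def viterbi(obs):
    V = [{s: log_init[s] + log_emit[(s, obs[0])] for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: V[-1][p] + log_trans[(p, s)])
            col[s] = V[-1][prev] + log_trans[(prev, s)] + log_emit[(s, o)]
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    path = [max(states, key=lambda s: V[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

obs = [1, 1, 1, 0, 0, 0, 0]  # vegetation signal drops mid-series
print(viterbi(obs))
```

A sustained drop in the vegetation signal is decoded as a farmland-to-built-up transition, which is exactly the change event the full HHMM detects on MODIS time series.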
Time-dependent limited penetrable visibility graph analysis of nonstationary time series
NASA Astrophysics Data System (ADS)
Gao, Zhong-Ke; Cai, Qing; Yang, Yu-Xuan; Dang, Wei-Dong
2017-06-01
Recent years have witnessed the development of visibility graph theory, which allows us to analyze a time series from the perspective of complex networks. In this paper we develop a novel time-dependent limited penetrable visibility graph (TDLPVG). Two examples using nonstationary time series, from RR intervals and from gas-liquid flows, are provided to demonstrate the effectiveness of our approach. The results of the first example suggest that our TDLPVG method allows characterizing time-varying behaviors and classifying the heart states of healthy, congestive heart failure and atrial fibrillation subjects from RR interval time series. For the second example, we infer TDLPVGs from gas-liquid flow signals and find, interestingly, that the deviation of node degree of TDLPVGs effectively uncovers the time-varying dynamical flow behaviors of gas-liquid slug and bubble flow patterns. All these results render our TDLPVG method particularly powerful for characterizing the time-varying features underlying realistic complex systems from time series.
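The core construction can be sketched compactly. In a limited penetrable visibility graph (LPVG), two time points are linked if at most L intermediate points block the straight line between them; L = 0 recovers the ordinary natural visibility graph. The time-dependent variant in the paper applies this within sliding windows; the toy series below is illustrative only.

```python
# Minimal LPVG sketch: edge (i, j) exists if at most L intermediate samples
# rise to or above the line of sight between points i and j.
def lpvg_edges(series, L=1):
    n, edges = len(series), set()
    for i in range(n):
        for j in range(i + 1, n):
            blockers = sum(
                1 for k in range(i + 1, j)
                if series[k] >= series[i] + (series[j] - series[i]) * (k - i) / (j - i)
            )
            if blockers <= L:
                edges.add((i, j))
    return edges

series = [3.0, 1.0, 2.0, 0.5, 4.0]
print(sorted(lpvg_edges(series, L=0)))  # ordinary natural visibility graph
print(sorted(lpvg_edges(series, L=1)))  # one penetration allowed
```

Allowing a small penetrable distance makes the mapping more robust to noise, which is why degree statistics of such graphs can track the time-varying dynamics of noisy signals.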
How Preschoolers Use Cues of Dominance to Make Sense of Their Social Environment
ERIC Educational Resources Information Center
Charafeddine, Rawan; Mercier, Hugo; Clément, Fabrice; Kaufmann, Laurence; Berchtold, André; Reboul, Anne; Van der Henst, Jean-Baptiste
2015-01-01
A series of four experiments investigated preschoolers' abilities to make sense of dominance relations. Experiments 1 and 2 showed that as early as 3 years old, preschoolers are able to infer dominance not only from physical supremacy but also from decision power, age, and resources. Experiments 3 and 4 showed that preschoolers have expectations…
ERIC Educational Resources Information Center
Haertel, Edward H.
2013-01-01
Policymakers and school administrators have embraced value-added models of teacher effectiveness as tools for educational improvement. Teacher value-added estimates may be viewed as complicated scores of a certain kind. This suggests using a test validation model to examine their reliability and validity. Validation begins with an interpretive…
Applications of statistics to medical science (1) Fundamental concepts.
Watanabe, Hiroshi
2011-01-01
The conceptual framework of statistical tests and statistical inference is discussed, and the epidemiological background of statistics is briefly reviewed. This study is one of a series in which we survey the basics of statistics and practical methods used in medical statistics. Arguments related to actual statistical analysis procedures will be made in subsequent papers.
Keeping Up with the Joneses. A Soap Opera for Adult ESL Students.
ERIC Educational Resources Information Center
Phillips, Chiquita
A series of high-interest, low English-language-learning-level stories developed for adult students of English as a second language are combined as a soap opera for classroom use. An introductory section outlines techniques for presentation of the texts on tape and in written form and for exercises in listening, making inferences, reading, and…
Reliable inference of light curve parameters in the presence of systematics
NASA Astrophysics Data System (ADS)
Gibson, Neale P.
2016-10-01
Time-series photometry and spectroscopy of transiting exoplanets allow us to study their atmospheres. Unfortunately, the required precision to extract atmospheric information surpasses the design specifications of most general purpose instrumentation. This results in instrumental systematics in the light curves that are typically larger than the target precision. Systematics must therefore be modelled, leaving the inference of light-curve parameters conditioned on the subjective choice of systematics models and model-selection criteria. Here, I briefly review the use of systematics models commonly used for transmission and emission spectroscopy, including model selection, marginalisation over models, and stochastic processes. These form a hierarchy of models with increasing degree of objectivity. I argue that marginalisation over many systematics models is a minimal requirement for robust inference. Stochastic models provide even more flexibility and objectivity, and therefore produce the most reliable results. However, no systematics models are perfect, and the best strategy is to compare multiple methods and repeat observations where possible.
Copy-number analysis and inference of subclonal populations in cancer genomes using Sclust.
Cun, Yupeng; Yang, Tsun-Po; Achter, Viktor; Lang, Ulrich; Peifer, Martin
2018-06-01
The genomes of cancer cells constantly change during pathogenesis. This evolutionary process can lead to the emergence of drug-resistant mutations in subclonal populations, which can hinder therapeutic intervention in patients. Data derived from massively parallel sequencing can be used to infer these subclonal populations using tumor-specific point mutations. The accurate determination of copy-number changes and tumor impurity is necessary to reliably infer subclonal populations by mutational clustering. This protocol describes how to use Sclust, a copy-number analysis method with a recently developed mutational clustering approach. In a series of simulations and comparisons with alternative methods, we have previously shown that Sclust accurately determines copy-number states and subclonal populations. Performance tests show that the method is computationally efficient, with copy-number analysis and mutational clustering taking <10 min. Sclust is designed such that even non-experts in computational biology or bioinformatics with basic knowledge of the Linux/Unix command-line syntax should be able to carry out analyses of subclonal populations.
Inferring multi-scale neural mechanisms with brain network modelling
Schirner, Michael; McIntosh, Anthony Randal; Jirsa, Viktor; Deco, Gustavo
2018-01-01
The neurophysiological processes underlying non-invasive brain activity measurements are incompletely understood. Here, we developed a connectome-based brain network model that integrates individual structural and functional data with neural population dynamics to support multi-scale neurophysiological inference. Simulated populations were linked by structural connectivity and, as a novelty, driven by electroencephalography (EEG) source activity. Simulations not only predicted subjects' individual resting-state functional magnetic resonance imaging (fMRI) time series and spatial network topologies over 20 minutes of activity, but more importantly, they also revealed precise neurophysiological mechanisms that underlie and link six empirical observations from different scales and modalities: (1) resting-state fMRI oscillations, (2) functional connectivity networks, (3) excitation-inhibition balance, (4, 5) inverse relationships between α-rhythms, spike-firing and fMRI on short and long time scales, and (6) fMRI power-law scaling. These findings underscore the potential of this new modelling framework for general inference and integration of neurophysiological knowledge to complement empirical studies. PMID:29308767
The link between social cognition and self-referential thought in the medial prefrontal cortex.
Mitchell, Jason P; Banaji, Mahzarin R; Macrae, C Neil
2005-08-01
The medial prefrontal cortex (mPFC) has been implicated in seemingly disparate cognitive functions, such as understanding the minds of other people and processing information about the self. This functional overlap would be expected if humans use their own experiences to infer the mental states of others, a basic postulate of simulation theory. Neural activity was measured while participants attended to either the mental or physical aspects of a series of other people. To permit a test of simulation theory's prediction that inferences based on self-reflection should only be made for similar others, targets were subsequently rated for their degree of similarity to self. Parametric analyses revealed a region of the ventral mPFC--previously implicated in self-referencing tasks--in which activity correlated with perceived self/other similarity, but only for mentalizing trials. These results suggest that self-reflection may be used to infer the mental states of others when they are sufficiently similar to self.
Duration of serum antibody response to rabies vaccination in horses.
Harvey, Alison M; Watson, Johanna L; Brault, Stephanie A; Edman, Judy M; Moore, Susan M; Kass, Philip H; Wilson, W David
2016-08-15
OBJECTIVE To investigate the impact of age and inferred prior vaccination history on the persistence of vaccine-induced antibody against rabies in horses. DESIGN Serologic response evaluation. ANIMALS 48 horses with an undocumented vaccination history. PROCEDURES Horses were vaccinated against rabies once. Blood samples were collected prior to vaccination, 3 to 7 weeks after vaccination, and at 6-month intervals for 2 to 3 years. Serum rabies virus-neutralizing antibody (RVNA) values were measured. An RVNA value of ≥ 0.5 U/mL was used to define a predicted protective immune response on the basis of World Health Organization recommendations for humans. Values were compared between horses < 20 and ≥ 20 years of age and between horses inferred to have been previously vaccinated and those inferred to be immunologically naïve. RESULTS A protective RVNA value (≥ 0.5 U/mL) was maintained for 2 to 3 years in horses inferred to have been previously vaccinated on the basis of prevaccination RVNA values. No significant difference was evident in response to rabies vaccination or duration of protective RVNA values between horses < 20 and ≥ 20 years of age. Seven horses were poor responders to vaccination. Significant differences were identified between horses inferred to have been previously vaccinated and horses inferred to be naïve prior to the study. CONCLUSIONS AND CLINICAL RELEVANCE A rabies vaccination interval > 1 year may be appropriate for previously vaccinated horses but not for horses vaccinated only once. Additional research is required to confirm this finding and characterize the optimal primary dose series for rabies vaccination.
Statistical inference methods for sparse biological time series data.
Ndukum, Juliet; Fonseca, Luís L; Santos, Helena; Voit, Eberhard O; Datta, Susmita
2011-04-25
Comparing metabolic profiles under different biological perturbations has become a powerful approach to investigating the functioning of cells. The profiles can be taken as single snapshots of a system, but more information is gained if they are measured longitudinally over time. The results are short time series consisting of relatively sparse data that cannot be analyzed effectively with standard time series techniques, such as autocorrelation and frequency domain methods. In this work, we study longitudinal time series profiles of glucose consumption in the yeast Saccharomyces cerevisiae under different temperatures and preconditioning regimens, which we obtained with methods of in vivo nuclear magnetic resonance (NMR) spectroscopy. For the statistical analysis we first fit several nonlinear mixed effect regression models to the longitudinal profiles and then used an ANOVA likelihood ratio method in order to test for significant differences between the profiles. The proposed methods are capable of distinguishing metabolic time trends resulting from different treatments and associate significance levels to these differences. Among several nonlinear mixed-effects regression models tested, a three-parameter logistic function represents the data with highest accuracy. ANOVA and likelihood ratio tests suggest that there are significant differences between the glucose consumption rate profiles for cells that had been--or had not been--preconditioned by heat during growth. Furthermore, pair-wise t-tests reveal significant differences in the longitudinal profiles for glucose consumption rates between optimal conditions and heat stress, optimal and recovery conditions, and heat stress and recovery conditions (p-values <0.0001). We have developed a nonlinear mixed effects model that is appropriate for the analysis of sparse metabolic and physiological time profiles. 
The model permits sound statistical inference procedures, based on ANOVA likelihood ratio tests, for testing the significance of differences between short time course data under different biological perturbations.
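The model form selected above is a three-parameter logistic curve. As a minimal sketch (hypothetical glucose data, and a coarse grid search standing in for the nonlinear mixed-effects fit), the fitting idea looks like this:

```python
# Minimal sketch: fit a three-parameter logistic to a sparse time course.
import math

def logistic3(t, asym, rate, t_mid):
    """Three-parameter logistic: asymptote, growth rate, midpoint time."""
    return asym / (1.0 + math.exp(-rate * (t - t_mid)))

# Hypothetical sparse time course (minutes, cumulative glucose consumed, mM)
times = [0, 5, 10, 15, 20, 30]
data = [0.2, 1.1, 4.8, 8.9, 9.8, 10.0]

def sse(asym, rate, t_mid):
    return sum((y - logistic3(t, asym, rate, t_mid)) ** 2
               for t, y in zip(times, data))

# Coarse grid search stands in for the nonlinear mixed-effects estimation
best = min(((a, r, m) for a in (9.0, 10.0, 11.0)
            for r in (0.2, 0.4, 0.6)
            for m in (8.0, 10.0, 12.0)), key=lambda p: sse(*p))
print("best (asymptote, rate, midpoint):", best)
```

In the paper's framework, profiles fitted this way under different treatments are then compared with ANOVA likelihood ratio tests to decide whether the treatment differences are significant.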
de Vocht, Frank
2016-12-01
Mobile phone use has been increasing rapidly in the past decades and, in parallel, so has the annual incidence of certain types of brain cancer. However, it remains unclear whether this correlation is coincidental or whether use of mobile phones may cause the development, promotion or progression of specific cancers. The 1985-2014 incidence of selected brain cancer subtypes in England was analyzed and compared to counterfactual 'synthetic control' time series. Annual 1985-2014 incidence of malignant glioma, glioblastoma multiforme, and malignant neoplasms of the temporal and parietal lobes in England was modelled based on population-level covariates using Bayesian structural time series models, assuming 5-, 10- and 15-year minimal latency periods. Post-latency counterfactual 'synthetic England' time series were nowcast based on covariate trends. The impact of mobile phone use was inferred from differences between measured and modelled time series. There is no evidence of an increase in malignant glioma, glioblastoma multiforme, or malignant neoplasms of the parietal lobe not predicted in the 'synthetic England' time series. Malignant neoplasms of the temporal lobe, however, have increased faster than expected. A latency period of 10 years was the earliest latency period at which this was measurable and related to mobile phone penetration rates, and indicated an additional increase of 35% (95% credible interval 9%:59%) during 2005-2014, corresponding to an additional 188 (95% CI 48-324) cases annually. A causal factor, of which mobile phone use (and possibly other wireless equipment) is in agreement with the hypothesized temporal association, is related to an increased risk of developing malignant neoplasms in the temporal lobe. Copyright © 2016 The Author. Published by Elsevier Ltd. All rights reserved.
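The counterfactual logic above can be sketched simply: fit a trend to pre-period incidence, project it forward as a "synthetic" series, and read the effect off the difference between observed and projected counts. The paper uses Bayesian structural time series models with covariates; plain least-squares extrapolation and hypothetical counts stand in here.

```python
# Minimal counterfactual sketch: linear trend fitted to a pre-period,
# extrapolated as the synthetic series, with excess = observed - synthetic.
def linear_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

pre_years = list(range(10))                 # pre-latency training period
pre_cases = [100, 103, 105, 109, 111, 114, 118, 120, 123, 126]  # hypothetical
slope, intercept = linear_fit(pre_years, pre_cases)

post_years = list(range(10, 15))
observed_post = [132, 138, 145, 150, 158]   # hypothetical post-latency counts
synthetic_post = [slope * t + intercept for t in post_years]
excess = [o - s for o, s in zip(observed_post, synthetic_post)]
print([round(e, 1) for e in excess])
```

A consistently positive, growing excess after the assumed latency period is the signature the study looks for against its 'synthetic England' baseline.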
NASA Astrophysics Data System (ADS)
Jackson, Brian; Lorenz, Ralph; Davis, Karan
2018-01-01
Dust devils are likely the dominant source of dust for the martian atmosphere, but the amount and frequency of dust-lifting depend on the statistical distribution of dust devil parameters. Dust devils exhibit pressure perturbations and, if they pass near a barometric sensor, they may register as a discernible dip in a pressure time series. Leveraging this fact, several surveys using barometric sensors on landed spacecraft have revealed dust devil structures and occurrence rates. However powerful they are, such surveys suffer from non-trivial biases that skew the inferred dust devil properties. For example, they are most sensitive to dust devils with the widest and deepest pressure profiles, but the recovered profiles will be distorted, broader and shallower than the actual profiles. In addition, such surveys often do not provide wind speed measurements alongside the pressure time series, so the durations of the dust devil signals cannot be directly converted to profile widths. Fortunately, simple statistical and geometric considerations can de-bias these surveys, allowing conversion of the durations of dust devil signals into physical widths, given only a distribution of likely translation velocities, and recovery of the underlying distributions of physical parameters. In this study, we develop a scheme for de-biasing such surveys. Applying our model to an in-situ survey using data from the Phoenix lander suggests a larger dust flux and a dust devil occurrence rate about ten times larger than previously inferred. Comparing our results to dust devil track surveys suggests that only about one in five low-pressure cells lifts sufficient dust to leave a visible track.
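The duration-to-width conversion above can be sketched with a short Monte Carlo: sample a translation velocity from an assumed distribution and multiply by the dip duration. All numbers below are hypothetical, not the Phoenix-survey values.

```python
# Minimal de-biasing sketch: width = translation velocity x dip duration,
# with velocity drawn from an assumed (truncated) Gaussian of ambient speeds.
import random

random.seed(1)

def widths_from_durations(durations_s, n_samples=10000,
                          v_mean=5.0, v_sigma=2.0, v_min=0.5):
    """Monte Carlo over an assumed translation-speed distribution (m/s)."""
    widths = []
    for _ in range(n_samples):
        d = random.choice(durations_s)
        v = max(v_min, random.gauss(v_mean, v_sigma))
        widths.append(v * d)
    return widths

durations = [3.0, 5.0, 8.0, 12.0]  # hypothetical dip durations in seconds
w = widths_from_durations(durations)
print(f"mean inferred width: {sum(w) / len(w):.1f} m")
```

The resulting width distribution, rather than any single converted value, is what feeds the recovery of the underlying dust devil parameter distributions.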
NASA Astrophysics Data System (ADS)
Fatichi, S.; Ivanov, V. Y.; Caporali, E.
2013-04-01
This study extends a stochastic downscaling methodology to generation of an ensemble of hourly time series of meteorological variables that express possible future climate conditions at a point scale. The stochastic downscaling uses general circulation model (GCM) realizations and an hourly weather generator, the Advanced WEather GENerator (AWE-GEN). Marginal distributions of factors of change are computed for several climate statistics using a Bayesian methodology that can weight GCM realizations based on each model's relative performance with respect to a historical climate and the degree of disagreement in projecting future conditions. A Monte Carlo technique is used to sample the factors of change from their respective marginal distributions. As a comparison with traditional approaches, factors of change are also estimated by averaging GCM realizations. With either approach, the derived factors of change are applied to the climate statistics inferred from historical observations to re-evaluate parameters of the weather generator. The re-parameterized generator yields hourly time series of meteorological variables that can be considered to be representative of future climate conditions. In this study, the time series are generated in an ensemble mode to fully reflect the uncertainty of GCM projections, climate stochasticity, as well as uncertainties of the downscaling procedure. Applications of the methodology in reproducing future climate conditions for the periods 2000-2009, 2046-2065 and 2081-2100, using 1962-1992 as the historical baseline, are discussed for the location of Firenze (Italy). The inferences of the methodology for the period 2000-2009 are tested against observations to assess reliability of the stochastic downscaling procedure in reproducing statistics of meteorological variables at different time scales.
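A minimal sketch of the factors-of-change idea, with invented numbers (these are assumptions, not AWE-GEN output): the traditional route averages the GCM deltas, while the Bayesian-flavoured route weights models and samples the deltas, so the ensemble carries inter-model uncertainty into the re-parameterised statistic.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative 'factors of change' for a mean temperature statistic
# from five hypothetical GCM realizations (degC, assumed values).
gcm_deltas = np.array([1.8, 2.4, 2.1, 3.0, 2.6])

# Traditional approach: simple average across models.
delta_avg = gcm_deltas.mean()

# Bayesian-flavoured alternative (sketch): weight models by assumed
# performance and sample from the resulting mixture, adding a small
# jitter to mimic within-model uncertainty.
weights = np.array([0.3, 0.25, 0.2, 0.15, 0.1])
samples = rng.choice(gcm_deltas, size=10000, p=weights) + rng.normal(0, 0.2, 10000)

hist_mean = 14.0                       # historical statistic (assumed)
future_ensemble = hist_mean + samples  # re-parameterised statistic
print(round(delta_avg, 2), round(future_ensemble.mean(), 2))
```

In the full methodology the sampled factors of change would be applied to many statistics at once and fed back into the weather generator's parameter estimation.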
tsiR: An R package for time-series Susceptible-Infected-Recovered models of epidemics.
Becker, Alexander D; Grenfell, Bryan T
2017-01-01
tsiR is an open source software package implemented in the R programming language designed to analyze infectious disease time-series data. The software extends a well-studied and widely-applied algorithm, the time-series Susceptible-Infected-Recovered (TSIR) model, to infer parameters from incidence data, such as contact seasonality, and to forward simulate the underlying mechanistic model. The tsiR package aggregates a number of different fitting features previously described in the literature in a user-friendly way, providing support for their broader adoption in infectious disease research. Also included in tsiR are a number of diagnostic tools to assess the fit of the TSIR model. This package should be useful for researchers analyzing incidence data for fully-immunizing infectious diseases.
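A deterministic skeleton of the TSIR recursion that the package fits can be sketched as follows. This is not tsiR code (tsiR is an R package); the population size is folded into the contact rate for brevity and all parameter values are assumed.

```python
import numpy as np

def tsir_simulate(beta, alpha, S0, I0, births, steps):
    """Deterministic skeleton of the TSIR model:
    I[t+1] = beta[t mod period] * S[t] * I[t]**alpha, with the
    population size N folded into beta (an assumption for brevity);
    S is replenished by births and depleted by new infections."""
    S, I = [S0], [I0]
    for t in range(steps):
        b = beta[t % len(beta)]
        I_next = b * S[-1] * I[-1] ** alpha
        S_next = max(S[-1] + births - I_next, 0.0)
        I.append(I_next)
        S.append(S_next)
    return np.array(S), np.array(I)

# Illustrative seasonal contact rate over 26 biweeks (assumed values).
beta = 1.2e-5 * (1 + 0.3 * np.cos(2 * np.pi * np.arange(26) / 26))
S, I = tsir_simulate(beta, alpha=0.97, S0=1e5, I0=50.0, births=120.0, steps=260)
print(int(I.max()))  # peak incidence under these toy parameters
```

Fitting inverts this logic: susceptible reconstruction and regression on log-incidence recover the seasonal beta and alpha from reported case counts, which is the part tsiR automates.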
NASA Astrophysics Data System (ADS)
Meredith, Michael P.; Nicholls, Keith W.; Renfrew, Ian A.; Boehme, Lars; Biuw, Martin; Fedak, Mike
2011-07-01
A serendipitous >8-month time series of hydrographic properties was obtained from the vicinity of the South Orkney Islands, Southern Ocean, by tagging a southern elephant seal (Mirounga leonina) on Signy Island with a Conductivity-Temperature-Depth/Satellite-Relay Data Logger (CTD-SRDL) in March 2007. Such a time series (including data from the austral autumn and winter) would have been extremely difficult to obtain via other means, and it illustrates with unprecedented temporal resolution the seasonal progression of upper-ocean water mass properties and stratification at this location. Sea ice production values of around 0.15-0.4 m month⁻¹ for April to July were inferred from the progression of salinity, with significant levels still in September (around 0.2 m month⁻¹). However, these values presume that advective processes have negligible effect on the salinity changes observed locally; this presumption is seen to be inappropriate in this case, and it is argued that the ice production rates inferred are better considered as "smeared averages" for the region of the northwestern Weddell Sea upstream from the South Orkneys. The impact of such advective effects is illustrated by contrasting the observed hydrographic series with the output of a one-dimensional model of the upper ocean forced with local fluxes. It is found that the difference in magnitude between local (modelled) and regional (inferred) ice production is significant, with estimates differing by around a factor of two. A halo of markedly low sea ice concentration around the South Orkneys during the austral winter offers at least a partial explanation for this, since it enabled stronger atmosphere/ocean fluxes to persist and hence stronger ice production to prevail locally compared with the upstream region.
The year of data collection was an El Niño year, and it is well established that this phenomenon can impact strongly on the surface ocean and ice field in this sector of the Southern Ocean; thus, the possibility that our time series is atypical cannot be ruled out. Longer-term collection of in situ ocean data from this locality would be desirable to address issues relating to interannual variability and long-term change.
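The ice-production numbers above come from a salt budget on the mixed layer; a minimal version, ignoring advection exactly as the abstract cautions, looks like this. The ice salinity and ice/seawater density ratio are assumed illustrative values, not those used in the study.

```python
def ice_production(delta_S, h, S_ml, S_ice=6.0, rho_ratio=0.9):
    """Ice growth (m) implied by a mixed-layer salinity increase
    delta_S over a mixed layer of depth h (m): brine rejection during
    freezing raises mixed-layer salinity, so
        h * delta_S = G * rho_ratio * (S_ml - S_ice)
    is solved for the ice growth G. Advection is neglected, which is
    the assumption the abstract flags as problematic here."""
    return h * delta_S / (rho_ratio * (S_ml - S_ice))

# E.g. a 0.05 salinity-unit rise over a 100 m mixed layer at S_ml = 34.5
monthly = ice_production(0.05, 100.0, 34.5)
print(round(monthly, 2))  # m of ice per month
```

With these toy inputs the estimate lands in the 0.15-0.4 m month⁻¹ range quoted above, which is why local salinity progressions are a useful, if advection-smeared, proxy for regional ice production.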
An algebra-based method for inferring gene regulatory networks
2014-01-01
Background: The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. Results: This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster.
Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the dynamic patterns present in the network. Conclusions: Boolean polynomial dynamical systems provide a powerful modeling framework for the reverse engineering of gene regulatory networks that enables a rich mathematical structure on the model search space. A C++ implementation of the method, distributed under the LGPL license, is available, together with the source code, at http://www.paola-vera-licona.net/Software/EARevEng/REACT.html. PMID:24669835
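A Boolean polynomial dynamical system of the kind used here is easy to state concretely. The three-variable network below is invented (it is not the segment polarity model); it only shows the F2 arithmetic (addition = XOR, multiplication = AND) and the synchronous update that the evolutionary search optimises over.

```python
from itertools import product

def bpds_step(state, polys):
    """One synchronous update of a Boolean polynomial dynamical system:
    each coordinate is a polynomial over F2 evaluated at the current
    state, with arithmetic reduced mod 2."""
    return tuple(p(state) % 2 for p in polys)

# Toy 3-gene network (assumed for illustration):
#   x1' = x2 + x3,  x2' = x1*x3,  x3' = x1 + x2*x3   (over F2)
polys = [
    lambda s: s[1] + s[2],
    lambda s: s[0] * s[2],
    lambda s: s[0] + s[1] * s[2],
]

# Exhaustively compute the state transition table (2^3 states).
table = {s: bpds_step(s, polys) for s in product((0, 1), repeat=3)}
print(table[(1, 0, 1)])
```

Because the state space is finite, the full dynamics (fixed points, cycles) can be enumerated exactly, which is what makes the algebraic framework attractive for model search.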
Szöllősi, Gergely J.; Boussau, Bastien; Abby, Sophie S.; Tannier, Eric; Daubin, Vincent
2012-01-01
The timing of the evolution of microbial life has largely remained elusive due to the scarcity of the prokaryotic fossil record and the confounding effects of the exchange of genes among possibly distant species. The history of gene transfer events, however, is not a series of individual oddities; it records which lineages were concurrent and thus provides information on the timing of species diversification. Here, we use a probabilistic model of genome evolution that accounts for differences between gene phylogenies and the species tree as a series of duplication, transfer, and loss events to reconstruct chronologically ordered species phylogenies. Using simulations we show that we can robustly recover accurate chronologically ordered species phylogenies in the presence of gene tree reconstruction errors and realistic rates of duplication, transfer, and loss. Using genomic data we demonstrate that we can infer rooted species phylogenies using homologous gene families from complete genomes of 10 bacterial and archaeal groups. Focusing on cyanobacteria, distinguished among prokaryotes by a relative abundance of fossils, we infer the maximum likelihood chronologically ordered species phylogeny based on 36 genomes with 8,332 homologous gene families. We find the order of speciation events to be in full agreement with the fossil record and the inferred phylogeny of cyanobacteria to be consistent with the phylogeny recovered from established phylogenomics methods. Our results demonstrate that lateral gene transfers, detected by probabilistic models of genome evolution, can be used as a source of information on the timing of evolution, providing a valuable complement to the limited prokaryotic fossil record. PMID:23043116
Inference on periodicity of circadian time series.
Costa, Maria J; Finkenstädt, Bärbel; Roche, Véronique; Lévi, Francis; Gould, Peter D; Foreman, Julia; Halliday, Karen; Hall, Anthony; Rand, David A
2013-09-01
Estimation of the period length of time-course data from cyclical biological processes, such as those driven by the circadian pacemaker, is crucial for inferring the properties of the biological clock found in many living organisms. We propose a methodology for period estimation based on spectrum resampling (SR) techniques. Simulation studies show that SR is more accurate and more robust to non-sinusoidal and noisy cycles than a currently used routine based on Fourier approximations. In addition, a simple fit to the oscillations using linear least squares is available, together with a non-parametric test for detecting changes in period length which allows for period estimates with different variances, as frequently encountered in practice. The proposed methods are motivated by and applied to various data examples from chronobiology.
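The Fourier baseline in the comparison above can be sketched in a few lines. One hedge: true spectrum resampling resamples the spectrum itself; the noise-perturbation bootstrap below is only a crude stand-in to show how a spread on the period estimate arises. All signal parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Noisy, slightly non-sinusoidal 'circadian' signal with a 24 h period:
# 10 days of data, 30 min sampling, plus a harmonic and noise.
t = np.arange(0, 240, 0.5)
y = np.sin(2 * np.pi * t / 24) + 0.3 * np.sin(4 * np.pi * t / 24)
y += rng.normal(0, 0.3, t.size)

def peak_period(y, dt):
    """Period at the periodogram peak (ignoring the zero frequency)."""
    freqs = np.fft.rfftfreq(y.size, d=dt)
    power = np.abs(np.fft.rfft(y - y.mean())) ** 2
    return 1.0 / freqs[1:][np.argmax(power[1:])]

est = peak_period(y, 0.5)
# Re-estimate on perturbed series to get a spread on the period.
boots = [peak_period(y + rng.normal(0, 0.3, y.size), 0.5) for _ in range(200)]
print(est, np.percentile(boots, [2.5, 97.5]))
```

The frequency-grid resolution of the periodogram (here 1/240 h⁻¹) is one reason resampling-based interval estimates are preferable to a single Fourier peak.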
NASA Astrophysics Data System (ADS)
Mohammad-Djafari, Ali
2015-01-01
The main objective of this tutorial article is first to review the main inference tools using the Bayesian approach, entropy, information theory and their corresponding geometries. This review is focused mainly on the ways these tools have been used in data, signal and image processing. After a short introduction to the different quantities related to Bayes' rule, entropy and the Maximum Entropy Principle (MEP), relative entropy and the Kullback-Leibler divergence, and Fisher information, we will study their use in different fields of data and signal processing, such as entropy in source separation, Fisher information in model order selection, different maximum-entropy-based methods in time series spectral estimation and, finally, general linear inverse problems.
Li, Shi; Mukherjee, Bhramar; Batterman, Stuart; Ghosh, Malay
2013-12-01
Case-crossover designs are widely used to study short-term exposure effects on the risk of acute adverse health events. While the frequentist literature on this topic is vast, there is no Bayesian work in this general area. The contribution of this paper is twofold. First, the paper establishes Bayesian equivalence results that require characterization of the set of priors under which the posterior distributions of the risk ratio parameters based on a case-crossover and time-series analysis are identical. Second, the paper studies inferential issues under case-crossover designs in a Bayesian framework. Traditionally, a conditional logistic regression is used for inference on risk-ratio parameters in case-crossover studies. We consider instead a more general full likelihood-based approach which makes less restrictive assumptions on the risk functions. Formulation of a full likelihood leads to growth in the number of parameters proportional to the sample size. We propose a semi-parametric Bayesian approach using a Dirichlet process prior to handle the random nuisance parameters that appear in a full likelihood formulation. We carry out a simulation study to compare the Bayesian methods based on full and conditional likelihood with the standard frequentist approaches for case-crossover and time-series analysis. The proposed methods are illustrated through the Detroit Asthma Morbidity, Air Quality and Traffic study, which examines the association between acute asthma risk and ambient air pollutant concentrations. © 2013, The International Biometric Society.
ERIC Educational Resources Information Center
Cheng, Kun-Hung; Tsai, Chin-Chung
2016-01-01
Following a previous study (Cheng & Tsai, 2014. "Computers & Education"), this study aimed to probe the interaction of child-parent shared reading with the augmented reality (AR) picture book in more depth. A series of sequential analyses were thus conducted to infer the behavioral transition diagrams and visualize the continuity…
Final Report for Dynamic Models for Causal Analysis of Panel Data. Preface.
ERIC Educational Resources Information Center
Hannan, Michael T.; Tuma, Nancy Brandon
This document introduces research aimed to explore methods that could be used to make inferences about causal effects of educational change over time when data are from an educational panel. This preface, the first in a series of 14 chapters described in SO 011 760-772, discusses an educational research project designed to examine effects of…
Designing and Validating a Measure of Teacher Knowledge of Universal Design for Assessment (UDA)
ERIC Educational Resources Information Center
Jamgochian, Elisa Megan
2010-01-01
The primary purpose of this study was to design and validate a measure of teacher knowledge of Universal Design for Assessment (TK-UDA). Guided by a validity framework, a number of inferences, assumptions, and evidences supported this investigation. By addressing a series of research questions, evidence was garnered for the use of the measure to…
ERIC Educational Resources Information Center
Snyder, Patricia A.; Thompson, Bruce
The use of tests of statistical significance was explored, first by reviewing some criticisms of contemporary practice in the use of statistical tests as reflected in a series of articles in the "American Psychologist" and in the appointment of a "Task Force on Statistical Inference" by the American Psychological Association…
Fuzzy/Neural Software Estimates Costs of Rocket-Engine Tests
NASA Technical Reports Server (NTRS)
Douglas, Freddie; Bourgeois, Edit Kaminsky
2005-01-01
The Highly Accurate Cost Estimating Model (HACEM) is a software system for estimating the costs of testing rocket engines and components at Stennis Space Center. HACEM is built on a foundation of adaptive-network-based fuzzy inference systems (ANFIS), a hybrid software concept that combines the adaptive capabilities of neural networks with the ease of development and additional benefits of fuzzy-logic-based systems. In ANFIS, fuzzy inference systems are trained by use of neural networks. HACEM includes selectable subsystems that utilize various numbers and types of inputs, various numbers of fuzzy membership functions, and various input-preprocessing techniques. The inputs to HACEM are parameters of specific tests or series of tests. These parameters include test type (component or engine test), number and duration of tests, and thrust level(s) (in the case of engine tests). The ANFIS in HACEM are trained by use of sets of these parameters, along with costs of past tests. Thereafter, the user feeds HACEM a simple input text file that contains the parameters of a planned test or series of tests, the user selects the desired HACEM subsystem, and the subsystem processes the parameters into an estimate of cost(s).
STELLAR DYNAMOS AND CYCLES FROM NUMERICAL SIMULATIONS OF CONVECTION
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dubé, Caroline; Charbonneau, Paul, E-mail: dube@astro.umontreal.ca, E-mail: paulchar@astro.umontreal.ca
We present a series of kinematic axisymmetric mean-field αΩ dynamo models applicable to solar-type stars, for 20 distinct combinations of rotation rates and luminosities. The internal differential rotation and kinetic helicity profiles required to calculate source terms in these dynamo models are extracted from a corresponding series of global three-dimensional hydrodynamical simulations of solar/stellar convection, so that the resulting dynamo models end up involving only one free parameter, namely, the turbulent magnetic diffusivity in the convecting layers. Even though the αΩ dynamo solutions exhibit a broad range of morphologies, and sometimes even double cycles, these models manage to reproduce relatively well the observationally inferred relationship between cycle period and rotation rate. On the other hand, they fail in capturing the observed increase of magnetic activity levels with rotation rate. This failure is due to our use of a simple algebraic α-quenching formula as the sole amplitude-limiting nonlinearity. This suggests that α-quenching is not the primary mechanism setting the amplitude of stellar magnetic cycles, with magnetic reaction on large-scale flows emerging as the more likely candidate. This inference is coherent with analyses of various recent global magnetohydrodynamical simulations of solar/stellar convection.
Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models.
Daunizeau, J; Friston, K J; Kiebel, S J
2009-11-01
In this paper, we describe a general variational Bayesian approach for approximate inference on nonlinear stochastic dynamic models. This scheme extends established approximate inference on hidden states to cover: (i) nonlinear evolution and observation functions, (ii) unknown parameters and (precision) hyperparameters and (iii) model comparison and prediction under uncertainty. Model identification or inversion entails the estimation of the marginal likelihood or evidence of a model. This difficult integration problem can be finessed by optimising a free-energy bound on the evidence using results from variational calculus. This yields a deterministic update scheme that optimises an approximation to the posterior density on the unknown model variables. We derive such a variational Bayesian scheme in the context of nonlinear stochastic dynamic hierarchical models, for both model identification and time-series prediction. The computational complexity of the scheme is comparable to that of an extended Kalman filter, which is critical when inverting high-dimensional models or long time-series. Using Monte-Carlo simulations, we assess the estimation efficiency of this variational Bayesian approach using three stochastic variants of chaotic dynamic systems. We also demonstrate the model comparison capabilities of the method, its self-consistency and its predictive power.
Kernel methods and flexible inference for complex stochastic dynamics
NASA Astrophysics Data System (ADS)
Capobianco, Enrico
2008-07-01
Approximation theory suggests that series expansions and projections represent standard tools for random process applications from both numerical and statistical standpoints. Such instruments emphasize the role of both sparsity and smoothness for compression purposes, the decorrelation power achieved in the expansion coefficients space compared to the signal space, and the reproducing kernel property when some special conditions are met. We consider these three aspects central to the discussion in this paper, and attempt to analyze the characteristics of some known approximation instruments employed in a complex application domain such as financial market time series. Volatility models are often built ad hoc, parametrically and through very sophisticated methodologies. But they can hardly deal with stochastic processes with regard to non-Gaussianity, covariance non-stationarity or complex dependence without paying a big price in terms of either model mis-specification or computational efficiency. It is thus a good idea to look at other more flexible inference tools; hence the strategy of combining greedy approximation and space dimensionality reduction techniques, which are less dependent on distributional assumptions and more targeted to achieve computationally efficient performances. Advantages and limitations of their use will be evaluated by looking at algorithmic and model building strategies, and by reporting statistical diagnostics.
Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar
2017-08-01
Correct inference of the genetic regulations inside a cell from biological databases such as time-series microarray data is one of the greatest challenges of the post-genomic era for biologists and researchers. The Recurrent Neural Network (RNN) is one of the most popular and simple approaches to model the dynamics as well as to infer correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic, namely the Elephant Swarm Water Search Algorithm (ESWSA), to infer Gene Regulatory Networks (GRNs). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing different types of communication techniques. Initially, the algorithm is tested against benchmark small- and medium-scale artificial genetic networks, without and with the presence of different noise levels, and its efficiency is observed in terms of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulations, etc. Next, the proposed algorithm is tested against real gene expression time-series data of the Escherichia coli SOS network, and the results are compared with other state-of-the-art optimization methods. The experimental results suggest that ESWSA is very efficient for the GRN inference problem and performs better than other methods in many ways.
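The RNN model of gene dynamics that the metaheuristic parameterises is commonly written as a sigmoid of weighted interactions; a toy three-gene version (all weights, biases, and time constants assumed for illustration) looks like this:

```python
import numpy as np

def rnn_gene_step(x, W, tau, beta):
    """One step of the recurrent-neural-network GRN model often used
    in this literature: each gene's next expression relaxes toward a
    sigmoid of a weighted sum of current expressions, at rate 1/tau."""
    drive = 1.0 / (1.0 + np.exp(-(W @ x + beta)))
    return x + (drive - x) / tau

# Toy 3-gene network: gene 0 activates gene 1, gene 1 represses gene 2
# (W, beta, and tau are assumed values, not inferred ones).
W = np.array([[0.0,  0.0, 0.0],
              [4.0,  0.0, 0.0],
              [0.0, -4.0, 0.0]])
beta = np.array([0.5, -2.0, 2.0])
x = np.array([0.8, 0.1, 0.9])
for _ in range(200):
    x = rnn_gene_step(x, W, tau=5.0, beta=beta)
print(np.round(x, 2))  # steady-state expression levels
```

Inference in this framing means searching for the W, beta, and tau that make simulated trajectories match the observed time series, which is the fitness the swarm algorithm minimises.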
Functional neuroanatomy of intuitive physical inference
Mikhael, John G.; Tenenbaum, Joshua B.; Kanwisher, Nancy
2016-01-01
To engage with the world—to understand the scene in front of us, plan actions, and predict what will happen next—we must have an intuitive grasp of the world’s physical structure and dynamics. How do the objects in front of us rest on and support each other, how much force would be required to move them, and how will they behave when they fall, roll, or collide? Despite the centrality of physical inferences in daily life, little is known about the brain mechanisms recruited to interpret the physical structure of a scene and predict how physical events will unfold. Here, in a series of fMRI experiments, we identified a set of cortical regions that are selectively engaged when people watch and predict the unfolding of physical events—a “physics engine” in the brain. These brain regions are selective to physical inferences relative to nonphysical but otherwise highly similar scenes and tasks. However, these regions are not exclusively engaged in physical inferences per se or, indeed, even in scene understanding; they overlap with the domain-general “multiple demand” system, especially the parts of that system involved in action planning and tool use, pointing to a close relationship between the cognitive and neural mechanisms involved in parsing the physical content of a scene and preparing an appropriate action. PMID:27503892
The challenges to inferring the regulators of biodiversity in deep time.
Ezard, Thomas H G; Quental, Tiago B; Benton, Michael J
2016-04-05
Attempts to infer the ecological drivers of macroevolution in deep time have long drawn inspiration from work on extant systems, but long-term evolutionary and geological changes complicate the simple extrapolation of such theory. Recent efforts to incorporate a more informed ecology into macroevolution have moved beyond the descriptive, seeking to isolate generating mechanisms and produce testable hypotheses of how groups of organisms usurp each other or coexist over vast timespans. This theme issue aims to exemplify this progress, providing a series of case studies of how novel modelling approaches are helping infer the regulators of biodiversity in deep time. In this Introduction, we explore the challenges of these new approaches. First, we discuss how our choices of taxonomic units have implications for the conclusions drawn. Second, we emphasize the need to embrace the interdependence of biotic and abiotic changes, because no living organism ignores its environment. Third, in the light of parts 1 and 2, we discuss the set of dynamic signatures that we might expect to observe in the fossil record. Finally, we ask whether these dynamics represent the most ecologically informative foci for research efforts aimed at inferring the regulators of biodiversity in deep time. The papers in this theme issue contribute in each of these areas. © 2016 The Author(s).
Functional neuroanatomy of intuitive physical inference.
Fischer, Jason; Mikhael, John G; Tenenbaum, Joshua B; Kanwisher, Nancy
2016-08-23
To engage with the world-to understand the scene in front of us, plan actions, and predict what will happen next-we must have an intuitive grasp of the world's physical structure and dynamics. How do the objects in front of us rest on and support each other, how much force would be required to move them, and how will they behave when they fall, roll, or collide? Despite the centrality of physical inferences in daily life, little is known about the brain mechanisms recruited to interpret the physical structure of a scene and predict how physical events will unfold. Here, in a series of fMRI experiments, we identified a set of cortical regions that are selectively engaged when people watch and predict the unfolding of physical events-a "physics engine" in the brain. These brain regions are selective to physical inferences relative to nonphysical but otherwise highly similar scenes and tasks. However, these regions are not exclusively engaged in physical inferences per se or, indeed, even in scene understanding; they overlap with the domain-general "multiple demand" system, especially the parts of that system involved in action planning and tool use, pointing to a close relationship between the cognitive and neural mechanisms involved in parsing the physical content of a scene and preparing an appropriate action.
Inferring animal social networks and leadership: applications for passive monitoring arrays.
Jacoby, David M P; Papastamatiou, Yannis P; Freeman, Robin
2016-11-01
Analyses of animal social networks have frequently benefited from techniques derived from other disciplines. Recently, machine learning algorithms have been adopted to infer social associations from time-series data gathered using remote, telemetry systems situated at provisioning sites. We adapt and modify existing inference methods to reveal the underlying social structure of wide-ranging marine predators moving through spatial arrays of passive acoustic receivers. From six months of tracking data for grey reef sharks (Carcharhinus amblyrhynchos) at Palmyra atoll in the Pacific Ocean, we demonstrate that some individuals emerge as leaders within the population and that this behavioural coordination is predicted by both sex and the duration of co-occurrences between conspecifics. In doing so, we provide the first evidence of long-term, spatially extensive social processes in wild sharks. To achieve these results, we interrogate simulated and real tracking data with the explicit purpose of drawing attention to the key considerations in the use and interpretation of inference methods and their impact on resultant social structure. We provide a modified translation of the GMMEvents method for R, including new analyses quantifying the directionality and duration of social events with the aim of encouraging the careful use of these methods more widely in less tractable social animal systems but where passive telemetry is already widespread. © 2016 The Authors.
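The raw material for this kind of inference can be illustrated simply. The sketch below only counts windowed co-detections at a receiver (the GMMEvents method proper fits Gaussian mixtures to co-occurrence event durations); all detection times and animal IDs are invented.

```python
from itertools import combinations

# Detection times (hours) of tagged individuals at one acoustic
# receiver (toy data for illustration only).
detections = {
    "shark_A": [1.0, 1.2, 5.0, 9.1],
    "shark_B": [1.1, 5.2, 9.0],
    "shark_C": [14.0, 20.0],
}

def association_count(d1, d2, window=0.5):
    """Number of detection pairs closer together in time than `window`,
    a crude proxy for co-occurrence at the receiver."""
    return sum(1 for a in d1 for b in d2 if abs(a - b) <= window)

pairs = {
    (i, j): association_count(detections[i], detections[j])
    for i, j in combinations(detections, 2)
}
print(pairs)
```

Thresholded counts like these already sketch a social network (A and B co-occur, C does not); the mixture-model machinery adds principled event boundaries, durations, and directionality.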
Inferring animal social networks and leadership: applications for passive monitoring arrays
Papastamatiou, Yannis P.; Freeman, Robin
2016-01-01
Analyses of animal social networks have frequently benefited from techniques derived from other disciplines. Recently, machine learning algorithms have been adopted to infer social associations from time-series data gathered using remote, telemetry systems situated at provisioning sites. We adapt and modify existing inference methods to reveal the underlying social structure of wide-ranging marine predators moving through spatial arrays of passive acoustic receivers. From six months of tracking data for grey reef sharks (Carcharhinus amblyrhynchos) at Palmyra atoll in the Pacific Ocean, we demonstrate that some individuals emerge as leaders within the population and that this behavioural coordination is predicted by both sex and the duration of co-occurrences between conspecifics. In doing so, we provide the first evidence of long-term, spatially extensive social processes in wild sharks. To achieve these results, we interrogate simulated and real tracking data with the explicit purpose of drawing attention to the key considerations in the use and interpretation of inference methods and their impact on resultant social structure. We provide a modified translation of the GMMEvents method for R, including new analyses quantifying the directionality and duration of social events with the aim of encouraging the careful use of these methods more widely in less tractable social animal systems but where passive telemetry is already widespread. PMID:27881803
Statistical inference of the generation probability of T-cell receptors from sequence repertoires.
Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G
2012-10-02
Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
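The central identity above, that a sequence's generation probability is a sum over the hidden recombination scenarios producing it, can be shown with a toy model (gene segments, insertions, and their probabilities are all assumed, and the real model also includes D/J choices and deletions):

```python
# Toy VDJ-style generative model: a target sequence can arise from
# multiple (gene choice, insertion) scenarios; its generation
# probability is the sum of the scenario probabilities.
P_gene = {"V1": 0.6, "V2": 0.4}          # gene-choice probabilities (assumed)
gene_seq = {"V1": "CAT", "V2": "CA"}     # sequence contributed by each gene
P_ins = {"": 0.5, "T": 0.3, "TT": 0.2}   # inserted nucleotides (assumed)

def generation_probability(target):
    """Sum scenario probabilities over every (gene, insertion) pair
    whose concatenation equals the target sequence."""
    total = 0.0
    for g, pg in P_gene.items():
        for ins, pi in P_ins.items():
            if gene_seq[g] + ins == target:
                total += pg * pi
    return total

print(generation_probability("CAT"))  # V1+"" and V2+"T" both contribute
```

Because the scenario is hidden, maximum likelihood inference of the event distributions must work with exactly these scenario sums rather than with observed events, which is the technical core of the paper's method.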
Cusimano, Natalie; Sousa, Aretuza; Renner, Susanne S.
2012-01-01
Background and Aims For 84 years, botanists have relied on calculating the highest common factor for series of haploid chromosome numbers to arrive at a so-called basic number, x. This was done without consistent (reproducible) reference to species relationships and frequencies of different numbers in a clade. Likelihood models that treat polyploidy, chromosome fusion and fission as events with particular probabilities now allow reconstruction of ancestral chromosome numbers in an explicit framework. We have used a modelling approach to reconstruct chromosome number change in the large monocot family Araceae and to test earlier hypotheses about basic numbers in the family. Methods Using a maximum likelihood approach and chromosome counts for 26 % of the 3300 species of Araceae and representative numbers for each of the other 13 families of Alismatales, polyploidization events and single chromosome changes were inferred on a genus-level phylogenetic tree for 113 of the 117 genera of Araceae. Key Results The previously inferred basic numbers x = 14 and x = 7 are rejected. Instead, maximum likelihood optimization revealed an ancestral haploid chromosome number of n = 16, Bayesian inference of n = 18. Chromosome fusion (loss) is the predominant inferred event, whereas polyploidization events occurred less frequently and mainly towards the tips of the tree. Conclusions The bias towards low basic numbers (x) introduced by the algebraic approach to inferring chromosome number changes, prevalent among botanists, may have contributed to an unrealistic picture of ancestral chromosome numbers in many plant clades. The availability of robust quantitative methods for reconstructing ancestral chromosome numbers on molecular phylogenetic trees (with or without branch length information), with confidence statistics, makes the calculation of x an obsolete approach, at least when applied to large clades. PMID:22210850
Fisher, Charles K.; Mehta, Pankaj
2014-01-01
Human-associated microbial communities exert tremendous influence over human health and disease. With modern metagenomic sequencing methods it is now possible to follow the relative abundance of microbes in a community over time. These microbial communities exhibit rich ecological dynamics, and an important goal of microbial ecology is to infer the ecological interactions between species directly from sequence data. Any algorithm for inferring ecological interactions must overcome three major obstacles: 1) a correlation between the abundances of two species does not imply that those species are interacting, 2) the sum constraint on the relative abundances obtained from metagenomic studies makes it difficult to infer the parameters in time-series models, and 3) errors due to experimental uncertainty, or mis-assignment of sequencing reads into operational taxonomic units, bias inferences of species interactions due to a statistical problem called "errors-in-variables". Here we introduce an approach, Learning Interactions from MIcrobial Time Series (LIMITS), that overcomes these obstacles. LIMITS uses sparse linear regression with bootstrap aggregation to infer a discrete-time Lotka-Volterra model for microbial dynamics. We tested LIMITS on synthetic data and showed that it could reliably infer the topology of the inter-species ecological interactions. We then used LIMITS to characterize the species interactions in the gut microbiomes of two individuals and found that the interaction networks varied significantly between individuals. Furthermore, we found that the interaction networks of the two individuals are dominated by distinct "keystone species", Bacteroides fragilis and Bacteroides stercoris, that have a disproportionate influence on the structure of the gut microbiome even though they are only found in moderate abundance.
Based on our results, we hypothesize that the abundances of certain keystone species may be responsible for individuality in the human gut microbiome. PMID:25054627
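The regression at the heart of LIMITS can be illustrated with a short, self-contained sketch. This is not the authors' code; the interaction matrix, noise level, and threshold below are invented for illustration. A discrete-time Lotka-Volterra model implies log(x[t+1]/x[t]) = mu + A x[t], so each row of A can be estimated by regressing log-abundance ratios on abundances, with bootstrap aggregation stabilizing the estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 3-species discrete-time Lotka-Volterra system (values invented).
# A_true[i, j] is the effect of species j on the growth of species i.
A_true = np.array([[-0.6,  0.0,  0.3],
                   [ 0.0, -0.7,  0.0],
                   [-0.3,  0.0, -0.6]])
mu = np.array([0.3, 0.7, 0.9])          # chosen so the equilibrium is (1, 1, 1)

# Simulate abundances: log(x[t+1] / x[t]) = mu + A x[t] + noise.
T = 2000
x = np.empty((T, 3))
x[0] = 1.2
for t in range(T - 1):
    x[t + 1] = x[t] * np.exp(mu + A_true @ x[t] + rng.normal(0, 0.05, 3))

# LIMITS-style step: regress log-ratios on abundances, with bootstrap
# aggregation (median over resampled least-squares fits).
y = np.log(x[1:] / x[:-1])                         # (T-1, 3) responses
X = np.column_stack([np.ones(T - 1), x[:-1]])      # intercept + abundances

def bagged_coefficients(X, y, n_boot=50):
    n = len(y)
    fits = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                # bootstrap resample
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        fits.append(beta)
    return np.median(fits, axis=0)

B = bagged_coefficients(X, y)     # row 0: growth rates mu; rows 1-3: A transposed
A_hat = B[1:].T

# Recover the interaction topology by thresholding small coefficients.
topology = (np.abs(A_hat) > 0.1).astype(int)
```

Thresholding the bagged coefficients recovers the interaction topology; LIMITS itself adds sparsity-promoting stepwise model selection on top of this basic scheme.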
Zeng, Nianyin; Wang, Zidong; Li, Yurong; Du, Min; Cao, Jie; Liu, Xiaohui
2013-12-01
In this paper, the expectation maximization (EM) algorithm is applied to the modeling of the nano-gold immunochromatographic assay (nano-GICA) via available time series of the measured signal intensities of the test and control lines. The model for the nano-GICA is developed as the stochastic dynamic model that consists of a first-order autoregressive stochastic dynamic process and a noisy measurement. By using the EM algorithm, the model parameters, the actual signal intensities of the test and control lines, as well as the noise intensity can be identified simultaneously. Three different time series data sets concerning the target concentrations are employed to demonstrate the effectiveness of the introduced algorithm. Several indices are also proposed to evaluate the inferred models. It is shown that the model fits the data very well.
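The model class described here, a first-order autoregressive latent signal observed in noise, admits a standard EM treatment. Below is a generic, self-contained sketch of EM for a scalar linear-Gaussian state-space model, not the paper's implementation; all parameter values are illustrative. The E-step runs a Kalman filter and a Rauch-Tung-Striebel smoother; the M-step updates the AR coefficient and both noise variances in closed form:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate an AR(1) latent signal observed in noise (parameters illustrative):
# s[t] = a*s[t-1] + w,  y[t] = s[t] + v.
a_true, q_true, r_true = 0.95, 0.1, 0.5
T = 500
s = np.zeros(T)
for t in range(1, T):
    s[t] = a_true * s[t - 1] + rng.normal(0, np.sqrt(q_true))
y = s + rng.normal(0, np.sqrt(r_true), T)

def em_lgssm(y, n_iter=50):
    """EM for a scalar linear-Gaussian state-space model."""
    a, q, r = 0.5, 1.0, 1.0                  # crude initial guesses
    T = len(y)
    for _ in range(n_iter):
        # E-step, part 1: Kalman filter (predicted and filtered moments).
        mp, vp = np.zeros(T), np.zeros(T)
        mf, vf = np.zeros(T), np.zeros(T)
        vp[0] = 10.0                         # diffuse initial state
        for t in range(T):
            if t > 0:
                mp[t], vp[t] = a * mf[t - 1], a * a * vf[t - 1] + q
            k = vp[t] / (vp[t] + r)          # Kalman gain
            mf[t] = mp[t] + k * (y[t] - mp[t])
            vf[t] = (1 - k) * vp[t]
        # E-step, part 2: Rauch-Tung-Striebel smoother.
        ms, vs = mf.copy(), vf.copy()
        cov = np.zeros(T)                    # Cov(s[t], s[t-1] | all data)
        for t in range(T - 2, -1, -1):
            j = vf[t] * a / vp[t + 1]        # smoother gain
            ms[t] = mf[t] + j * (ms[t + 1] - mp[t + 1])
            vs[t] = vf[t] + j * j * (vs[t + 1] - vp[t + 1])
            cov[t + 1] = j * vs[t + 1]
        # M-step: closed-form updates from the smoothed moments.
        s2 = vs + ms ** 2                            # E[s[t]^2]
        cross = cov[1:] + ms[1:] * ms[:-1]           # E[s[t] s[t-1]]
        a = cross.sum() / s2[:-1].sum()
        q = np.mean(s2[1:] - 2 * a * cross + a * a * s2[:-1])
        r = np.mean((y - ms) ** 2 + vs)
    return a, q, r, ms

a_hat, q_hat, r_hat, s_hat = em_lgssm(y)
```

As in the paper, a single EM loop yields the model parameters and the denoised signal trajectory simultaneously.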
W. Cohen; H. Andersen; S. Healey; G. Moisen; T. Schroeder; C. Woodall; G. Domke; Z. Yang; S. Stehman; R. Kennedy; C. Woodcock; Z. Zhu; J. Vogelmann; D. Steinwand; C. Huang
2014-01-01
The authors are developing a REDD+ MRV system that tests different biomass estimation frameworks and components. Design-based inference from a costly field plot network was compared to sampling with LiDAR strips and a smaller set of plots in combination with Landsat for disturbance monitoring. Biomass estimation uncertainties associated with these different data sets...
GATE: software for the analysis and visualization of high-dimensional time series expression data.
MacArthur, Ben D; Lachmann, Alexander; Lemischka, Ihor R; Ma'ayan, Avi
2010-01-01
We present Grid Analysis of Time series Expression (GATE), an integrated computational software platform for the analysis and visualization of high-dimensional biomolecular time series. GATE uses a correlation-based clustering algorithm to arrange molecular time series on a two-dimensional hexagonal array and dynamically colors individual hexagons according to the expression level of the molecular component to which they are assigned, to create animated movies of systems-level molecular regulatory dynamics. In order to infer potential regulatory control mechanisms from patterns of correlation, GATE also allows interactive interrogation of movies against a wide variety of prior knowledge datasets. GATE movies can be paused and are interactive, allowing users to reconstruct networks and perform functional enrichment analyses. Movies created with GATE can be saved in Flash format and can be inserted directly into PDF manuscript files as interactive figures. GATE is available for download and is free for academic use from http://amp.pharm.mssm.edu/maayan-lab/gate.htm
Modelling short time series in metabolomics: a functional data analysis approach.
Montana, Giovanni; Berk, Maurice; Ebbels, Tim
2011-01-01
Metabolomics is the study of the complement of small molecule metabolites in cells, biofluids and tissues. Many metabolomic experiments are designed to compare changes observed over time under two or more experimental conditions (e.g. a control and drug-treated group), thus producing time course data. Models from traditional time series analysis are often unsuitable because, by design, only very few time points are available and there are a high number of missing values. We propose a functional data analysis approach for modelling short time series arising in metabolomic studies which overcomes these obstacles. Our model assumes that each observed time series is a smooth random curve, and we propose a statistical approach for inferring this curve from repeated measurements taken on the experimental units. A test statistic for detecting differences between temporal profiles associated with two experimental conditions is then presented. The methodology has been applied to NMR spectroscopy data collected in a pre-clinical toxicology study.
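As a toy illustration of this functional approach (a sketch on invented data, not the authors' model, which uses spline bases and mixed effects), one can fit a smooth curve per subject, compare group-mean curves via an integrated squared difference, and calibrate the test statistic by permuting subjects across groups:

```python
import numpy as np

rng = np.random.default_rng(7)

# Short, sparse time courses: 5 time points, 6 subjects per group (invented).
t = np.linspace(0, 1, 5)
n_sub = 6
group_a = np.array([t ** 2 + rng.normal(0, 0.1, 5) for _ in range(n_sub)])   # rising profile
group_b = np.array([0.0 * t + rng.normal(0, 0.1, 5) for _ in range(n_sub)])  # flat profile

def smooth_curve(y):
    """Per-subject smooth fit (a quadratic stand-in for a spline basis)."""
    return np.polynomial.polynomial.polyfit(t, y, 2)

def group_stat(a, b):
    """Integrated squared difference between group-mean fitted curves."""
    ca = np.mean([smooth_curve(y) for y in a], axis=0)
    cb = np.mean([smooth_curve(y) for y in b], axis=0)
    grid = np.linspace(0, 1, 50)
    da = np.polynomial.polynomial.polyval(grid, ca)
    db = np.polynomial.polynomial.polyval(grid, cb)
    return float(np.mean((da - db) ** 2))

obs = group_stat(group_a, group_b)

# Permutation test: shuffle subject labels across the two groups.
pooled = np.vstack([group_a, group_b])
n_perm, count = 200, 0
for _ in range(n_perm):
    idx = rng.permutation(len(pooled))
    if group_stat(pooled[idx[:n_sub]], pooled[idx[n_sub:]]) >= obs:
        count += 1
p_value = (count + 1) / (n_perm + 1)
```

With only a handful of time points, the smooth-curve fit plays the role that a full parametric time-series model cannot, which is the paper's motivation.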
Targeted numerical simulations of binary black holes for GW170104
NASA Astrophysics Data System (ADS)
Healy, J.; Lange, J.; O'Shaughnessy, R.; Lousto, C. O.; Campanelli, M.; Williamson, A. R.; Zlochower, Y.; Calderón Bustillo, J.; Clark, J. A.; Evans, C.; Ferguson, D.; Ghonge, S.; Jani, K.; Khamesra, B.; Laguna, P.; Shoemaker, D. M.; Boyle, M.; García, A.; Hemberger, D. A.; Kidder, L. E.; Kumar, P.; Lovelace, G.; Pfeiffer, H. P.; Scheel, M. A.; Teukolsky, S. A.
2018-03-01
In response to LIGO's observation of GW170104, we performed a series of full numerical simulations of binary black holes, each designed to replicate likely realizations of its dynamics and radiation. These simulations have been performed at multiple resolutions and with two independent techniques to solve Einstein's equations. For the nonprecessing and precessing simulations, we demonstrate that the two techniques agree mode by mode, at a precision substantially exceeding the statistical uncertainties of current LIGO observations. Conversely, we demonstrate that our full numerical solutions contain information which is not accurately captured by the approximate phenomenological models commonly used to infer compact binary parameters. To quantify the impact of these differences on parameter inference for GW170104 specifically, we compare the predictions of our simulations and these approximate models to LIGO's observations of GW170104.
Carbon monoxide mixing ratio inference from gas filter radiometer data
NASA Technical Reports Server (NTRS)
Wallio, H. A.; Reichle, H. G., Jr.; Casas, J. C.; Saylor, M. S.; Gormsen, B. B.
1983-01-01
A new algorithm has been developed which permits, for the first time, real time data reduction of nadir measurements taken with a gas filter correlation radiometer to determine tropospheric carbon monoxide concentrations. The algorithm significantly reduces the complexity of the equations to be solved while providing accuracy comparable to line-by-line calculations. The method is based on a regression analysis technique using a truncated power series representation of the primary instrument output signals to infer directly a weighted average of trace gas concentration. The results produced by a microcomputer-based implementation of this technique are compared with those produced by the more rigorous line-by-line methods. This algorithm has been used in the reduction of Measurement of Air Pollution from Satellites, Shuttle, and aircraft data.
Distinguishing signatures of determinism and stochasticity in spiking complex systems
Aragoneses, Andrés; Rubido, Nicolás; Tiana-Alsina, Jordi; Torrent, M. C.; Masoller, Cristina
2013-01-01
We describe a method to infer signatures of determinism and stochasticity in the sequence of apparently random intensity dropouts emitted by a semiconductor laser with optical feedback. The method uses ordinal time-series analysis to classify experimental data of inter-dropout-intervals (IDIs) in two categories that display statistically significant different features. Despite the apparent randomness of the dropout events, one IDI category is consistent with waiting times in a resting state until noise triggers a dropout, and the other is consistent with dropouts occurring during the return to the resting state, which have a clear deterministic component. The method we describe can be a powerful tool for inferring signatures of determinism in the dynamics of complex systems in noisy environments, at an event-level description of their dynamics.
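Ordinal time-series analysis, the core tool here, reduces each short window to the permutation describing the rank order of its values. A minimal sketch on illustrative data (not the laser experiment): a deterministic ramp produces a single pattern, while white noise spreads probability uniformly over all patterns:

```python
from collections import Counter
from itertools import permutations
import numpy as np

def ordinal_pattern(window):
    """Rank-order pattern of a window, e.g. [0.1, 0.5, 0.3] -> (0, 2, 1)."""
    return tuple(np.argsort(np.argsort(window)))

def ordinal_histogram(series, d=3):
    """Relative frequency of each length-d ordinal pattern in the series."""
    n = len(series) - d + 1
    counts = Counter(ordinal_pattern(series[i:i + d]) for i in range(n))
    return {p: counts[p] / n for p in permutations(range(d))}

def permutation_entropy(hist):
    """Normalized Shannon entropy of the pattern distribution (0 to 1)."""
    ps = np.array([p for p in hist.values() if p > 0])
    return float(-(ps * np.log(ps)).sum() / np.log(len(hist)))

rng = np.random.default_rng(2)
h_noise = ordinal_histogram(rng.normal(size=5000))        # stochastic signal
h_ramp = ordinal_histogram(np.arange(100, dtype=float))   # deterministic signal
```

In the paper's setting the histogram is computed over inter-dropout intervals, and statistically significant deviations from a uniform pattern distribution signal a deterministic component.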
Inferring Stop-Locations from WiFi.
Wind, David Kofoed; Sapiezynski, Piotr; Furman, Magdalena Anna; Lehmann, Sune
2016-01-01
Human mobility patterns are inherently complex. In terms of understanding these patterns, the process of converting raw data into series of stop-locations and transitions is an important first step which greatly reduces the volume of data, thus simplifying the subsequent analyses. Previous research into the mobility of individuals has focused on inferring 'stop locations' (places of stationarity) from GPS or CDR data, or on detection of state (static/active). In this paper we bridge the gap between the two approaches: we introduce methods for detecting both mobility state and stop-locations. In addition, our methods are based exclusively on WiFi data. We study two months of WiFi data collected every two minutes by a smartphone, and infer stop-locations in the form of labelled time-intervals. For this purpose, we investigate two algorithms, both of which scale to large datasets: a greedy approach to select the most important routers and one which uses a density-based clustering algorithm to detect router fingerprints. We validate our results using participants' GPS data as well as ground truth data collected during a two month period.
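A greedy stop-detection pass of the kind described can be sketched in a few lines. The data format (time-ordered (timestamp, set-of-router-IDs) scans), the similarity threshold, and the minimum run length are all invented for illustration, not taken from the paper:

```python
def jaccard(a, b):
    """Similarity between two sets of observed router IDs."""
    return len(a & b) / len(a | b) if a | b else 0.0

def stop_intervals(scans, sim=0.5, min_scans=3):
    """Merge consecutive WiFi scans with similar router fingerprints into
    labelled stop intervals; short runs are treated as transitions.
    scans: time-ordered list of (timestamp, set_of_router_ids)."""
    intervals, run = [], [scans[0]]
    for t, routers in scans[1:]:
        if jaccard(routers, run[-1][1]) >= sim:
            run.append((t, routers))
            continue
        if len(run) >= min_scans:                     # flush a completed stop
            fingerprint = set.intersection(*(r for _, r in run))
            intervals.append((run[0][0], run[-1][0], fingerprint))
        run = [(t, routers)]
    if len(run) >= min_scans:                         # flush the final run
        fingerprint = set.intersection(*(r for _, r in run))
        intervals.append((run[0][0], run[-1][0], fingerprint))
    return intervals

# Toy trace: a stay near routers a/b/c, a transition past transient routers,
# then a stay near routers x/y.
scans = [(0, {"a", "b", "c"}), (2, {"a", "b"}), (4, {"a", "b", "c"}),
         (6, {"d"}), (8, {"e"}),
         (10, {"x", "y"}), (12, {"x", "y"}), (14, {"x", "y", "z"})]
stops = stop_intervals(scans)
```

The intersection of router sets within a run serves as the stop-location fingerprint; the paper's density-based variant clusters such fingerprints across the whole dataset instead of greedily.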
Reconstructing lake ice cover in subarctic lakes using a diatom-based inference model
NASA Astrophysics Data System (ADS)
Weckström, Jan; Hanhijärvi, Sami; Forsström, Laura; Kuusisto, Esko; Korhola, Atte
2014-03-01
A new quantitative diatom-based lake ice cover inference model was developed to reconstruct past ice cover histories and was applied to four subarctic lakes. The ice cover model is based on a calculated melting degree-day value of +130 and a freezing degree-day value of -30 for each lake. The reconstructed Holocene ice cover duration histories show trends similar to the independently reconstructed regional air temperature history. The ice cover duration was around 7 days shorter than the average during the warmer early Holocene (approximately 10 to 6.5 calibrated kyr B.P.) and around 3-5 days longer during the cool Little Ice Age (approximately 500 to 100 calibrated yr B.P.). Although recent climate warming is represented by only 2-3 samples in the sediment series, these show a trend toward ice-free periods prolonged by up to 2 days. Diatom-based ice cover inference models can provide a powerful tool for reconstructing past ice cover histories in remote and sensitive areas where no measured data are available.
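The degree-day rule can be made concrete with a small sketch. We assume here (our reading of the abstract, with invented temperature data) that freeze-up occurs once accumulated freezing degree days reach -30 and ice-off once melting degree days accumulated after freeze-up reach +130:

```python
def ice_cover_days(daily_temp, fdd_freeze=-30.0, mdd_melt=130.0):
    """Infer (freeze-up day, ice-off day) from daily mean temperatures for one
    hydrological year (day 0 = autumn). Freeze-up: the running sum of sub-zero
    degree days reaches fdd_freeze. Ice-off: the running sum of above-zero
    degree days accumulated after freeze-up reaches mdd_melt."""
    fdd, freeze_day = 0.0, None
    for day, temp in enumerate(daily_temp):
        fdd += min(temp, 0.0)
        if fdd <= fdd_freeze:
            freeze_day = day
            break
    if freeze_day is None:
        return None                      # lake never froze this year
    mdd = 0.0
    for day in range(freeze_day + 1, len(daily_temp)):
        mdd += max(daily_temp[day], 0.0)
        if mdd >= mdd_melt:
            return freeze_day, day
    return freeze_day, None              # ice persisted to the end of the record

# Invented year: warm autumn, cold snap, long winter, steady spring melt.
temps = [5.0] * 30 + [-5.0] * 10 + [-10.0] * 100 + [10.0] * 50
result = ice_cover_days(temps)
```

Running the same rule over reconstructed temperature series is what converts the diatom-inferred climate history into an ice cover duration history.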
Network Inference via the Time-Varying Graphical Lasso
Hallac, David; Park, Youngsuk; Boyd, Stephen; Leskovec, Jure
2018-01-01
Many important problems can be modeled as a system of interconnected entities, where each entity is recording time-dependent observations or measurements. In order to spot trends, detect anomalies, and interpret the temporal dynamics of such data, it is essential to understand the relationships between the different entities and how these relationships evolve over time. In this paper, we introduce the time-varying graphical lasso (TVGL), a method of inferring time-varying networks from raw time series data. We cast the problem in terms of estimating a sparse time-varying inverse covariance matrix, which reveals a dynamic network of interdependencies between the entities. Since dynamic network inference is a computationally expensive task, we derive a scalable message-passing algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve this problem in an efficient way. We also discuss several extensions, including a streaming algorithm to update the model and incorporate new observations in real time. Finally, we evaluate our TVGL algorithm on both real and synthetic datasets, obtaining interpretable results and outperforming state-of-the-art baselines in terms of both accuracy and scalability. PMID:29770256
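TVGL solves a penalized, temporally coupled graphical-lasso problem with ADMM; a much cruder stand-in still conveys the underlying idea of a sparse, time-varying inverse covariance. The sketch below (invented data and thresholds; no ADMM, no temporal coupling) simply estimates a regularized precision matrix per window and thresholds its off-diagonal entries:

```python
import numpy as np

def windowed_precision(X, window, alpha=0.1, thresh=0.2):
    """Per-window regularized covariance -> precision -> thresholded adjacency.
    A crude illustration of 'time-varying inverse covariance', not TVGL itself."""
    T, p = X.shape
    nets = []
    for start in range(0, T - window + 1, window):
        S = np.cov(X[start:start + window].T) + alpha * np.eye(p)
        theta = np.linalg.inv(S)                    # precision matrix
        off = theta - np.diag(np.diag(theta))       # keep off-diagonal entries
        nets.append((np.abs(off) > thresh).astype(int))
    return nets

rng = np.random.default_rng(3)
half = 400
z = rng.normal(size=(2 * half, 3))
X = z.copy()
X[:half, 1] += z[:half, 0]     # regime 1: variables 0 and 1 partially correlated
X[half:, 2] += z[half:, 1]     # regime 2: variables 1 and 2 partially correlated

nets = windowed_precision(X, window=half)
```

The per-window adjacency matrices recover the regime change; TVGL improves on this by sharing statistical strength across adjacent windows through an evolutionary penalty.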
Sambo, Francesco; de Oca, Marco A Montes; Di Camillo, Barbara; Toffolo, Gianna; Stützle, Thomas
2012-01-01
Reverse engineering is the problem of inferring the structure of a network of interactions between biological variables from a set of observations. In this paper, we propose an optimization algorithm, called MORE, for the reverse engineering of biological networks from time series data. The model inferred by MORE is a sparse system of nonlinear differential equations, complex enough to realistically describe the dynamics of a biological system. MORE tackles separately the discrete component of the problem, the determination of the biological network topology, and the continuous component of the problem, the strength of the interactions. This approach allows us both to enforce system sparsity, by globally constraining the number of edges, and to integrate a priori information about the structure of the underlying interaction network. Experimental results on simulated and real-world networks show that the mixed discrete/continuous optimization approach of MORE significantly outperforms standard continuous optimization and that MORE is competitive with the state of the art in terms of accuracy of the inferred networks.
NASA Astrophysics Data System (ADS)
Ma, T.; Chen, H.; Patel, P. K.; Schneider, M.; Barrios, M.; Berzak Hopkins, L.; Casey, D.; Chung, H.-K.; Hammel, B.; Jarrott, C.; Nora, R.; Pak, A.; Scott, H.; Spears, B.; Weber, C.
2015-11-01
The inference of ion temperature from neutron spectral measurements in indirect-drive ICF implosions is known to be sensitive to non-thermal velocity distributions in the fuel. The electron temperature (Te) inferred from dopant line ratios should not be sensitive to these bulk motions and hence may be a better measure of the thermal temperature of the hot spot. Here we describe a series of experiments to be conducted on the NIF where a small concentration of a mid-Z dopant (krypton) is added to the fuel gas. The x-ray spectrum is measured and the electron temperature is inferred from Kr line ratios. We also quantify the level of radiative cooling in the hot spot due to this mid-Z dopant. These experiments represent the first direct measurement of hot spot Te using spectroscopy, and we will describe the considerations for applying x-ray spectroscopy in such dense and non-uniform hot spots. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Unusually large earthquakes inferred from tsunami deposits along the Kuril trench
Nanayama, F.; Satake, K.; Furukawa, R.; Shimokawa, K.; Atwater, B.F.; Shigeno, K.; Yamaki, S.
2003-01-01
The Pacific plate converges with northeastern Eurasia at a rate of 8-9 m per century along the Kamchatka, Kuril and Japan trenches. Along the southern Kuril trench, which faces the Japanese island of Hokkaido, this fast subduction has recurrently generated earthquakes with magnitudes of up to ~8 over the past two centuries. These historical events, on rupture segments 100-200 km long, have been considered characteristic of Hokkaido's plate-boundary earthquakes. But here we use deposits of prehistoric tsunamis to infer the infrequent occurrence of larger earthquakes generated from longer ruptures. Many of these tsunami deposits form sheets of sand that extend kilometres inland from the deposits of historical tsunamis. Stratigraphic series of extensive sand sheets, intercalated with dated volcanic-ash layers, show that such unusually large tsunamis occurred about every 500 years on average over the past 2,000-7,000 years, most recently ~350 years ago. Numerical simulations of these tsunamis are best explained by earthquakes that individually rupture multiple segments along the southern Kuril trench. We infer that such multi-segment earthquakes persistently recur among a larger number of single-segment events.
Inference of topology and the nature of synapses, and the flow of information in neuronal networks
NASA Astrophysics Data System (ADS)
Borges, F. S.; Lameu, E. L.; Iarosz, K. C.; Protachevicz, P. R.; Caldas, I. L.; Viana, R. L.; Macau, E. E. N.; Batista, A. M.; Baptista, M. S.
2018-02-01
The characterization of neuronal connectivity is one of the most important matters in neuroscience. In this work, we show that a recently proposed informational quantity, the causal mutual information, employed with an appropriate methodology, can be used not only to correctly infer the direction of the underlying physical synapses, but also to identify their excitatory or inhibitory nature, considering easy to handle and measure bivariate time series. The success of our approach relies on a surprising property found in neuronal networks by which nonadjacent neurons do "understand" each other (positive mutual information), however, this exchange of information is not capable of causing effect (zero transfer entropy). Remarkably, inhibitory connections, responsible for enhancing synchronization, transfer more information than excitatory connections, known to enhance entropy in the network. We also demonstrate that our methodology can be used to correctly infer directionality of synapses even in the presence of dynamic and observational Gaussian noise, and is also successful in providing the effective directionality of intermodular connectivity, when only mean fields can be measured.
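Transfer entropy, the quantity contrasted with mutual information above, can be estimated for discrete bivariate series directly from plug-in counts. A minimal sketch on synthetic binary data (a noisy one-step copy from x to y; not neuronal recordings):

```python
from collections import Counter
import numpy as np

def transfer_entropy(x, y):
    """Plug-in transfer entropy TE(x -> y) in bits for discrete series:
    how much x's past improves prediction of y beyond y's own past."""
    n = len(x) - 1
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))   # (y[t+1], y[t], x[t])
    pair_yx = Counter(zip(y[:-1], x[:-1]))
    pair_yy = Counter(zip(y[1:], y[:-1]))
    marg_y = Counter(y[:-1])
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_joint = c / n
        p_cond_full = c / pair_yx[(y0, x0)]             # p(y1 | y0, x0)
        p_cond_self = pair_yy[(y1, y0)] / marg_y[y0]    # p(y1 | y0)
        te += p_joint * np.log2(p_cond_full / p_cond_self)
    return te

rng = np.random.default_rng(4)
n = 20000
x = rng.integers(0, 2, n)
y = np.empty(n, dtype=int)
y[0] = 0
flip = rng.random(n) < 0.05                 # 5% transmission noise
y[1:] = np.where(flip[1:], 1 - x[:-1], x[:-1])   # y copies x with one-step delay

te_xy = transfer_entropy(x.tolist(), y.tolist())
te_yx = transfer_entropy(y.tolist(), x.tolist())
```

The asymmetry te_xy >> te_yx is what identifies the direction of coupling; the paper's point is that pairing such directional measures with mutual information additionally separates direct synapses from mere "understanding" between nonadjacent neurons.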
Sobel, Michael E; Lindquist, Martin A
2014-07-01
Functional magnetic resonance imaging (fMRI) has facilitated major advances in understanding human brain function. Neuroscientists are interested in using fMRI to study the effects of external stimuli on brain activity and causal relationships among brain regions, but have not stated what is meant by causation or defined the effects they purport to estimate. Building on Rubin's causal model, we construct a framework for causal inference using blood oxygenation level dependent (BOLD) fMRI time series data. In the usual statistical literature on causal inference, potential outcomes, assumed to be measured without systematic error, are used to define unit and average causal effects. However, in general the potential BOLD responses are measured with stimulus dependent systematic error. Thus we define unit and average causal effects that are free of systematic error. In contrast to the usual case of a randomized experiment where adjustment for intermediate outcomes leads to biased estimates of treatment effects (Rosenbaum, 1984), here the failure to adjust for task dependent systematic error leads to biased estimates. We therefore adjust for systematic error using measured "noise covariates", using a linear mixed model to estimate the effects and the systematic error. Our results are important for neuroscientists, who typically do not adjust for systematic error. They should also prove useful to researchers in other areas where responses are measured with error and in fields where large amounts of data are collected on relatively few subjects. To illustrate our approach, we re-analyze data from a social evaluative threat task, comparing the findings with results that ignore systematic error.
Detectability of Granger causality for subsampled continuous-time neurophysiological processes.
Barnett, Lionel; Seth, Anil K
2017-01-01
Granger causality is well established within the neurosciences for inference of directed functional connectivity from neurophysiological data. These data usually consist of time series which subsample a continuous-time biophysiological process. While it is well known that subsampling can lead to imputation of spurious causal connections where none exist, less is known about the effects of subsampling on the ability to reliably detect causal connections which do exist. We present a theoretical analysis of the effects of subsampling on Granger-causal inference. Neurophysiological processes typically feature signal propagation delays on multiple time scales; accordingly, we base our analysis on a distributed-lag, continuous-time stochastic model, and consider Granger causality in continuous time at finite prediction horizons. Via exact analytical solutions, we identify relationships among sampling frequency, underlying causal time scales and detectability of causalities. We reveal complex interactions between the time scale(s) of neural signal propagation and sampling frequency. We demonstrate that detectability decays exponentially as the sample time interval increases beyond causal delay times, identify detectability "black spots" and "sweet spots", and show that downsampling may potentially improve detectability. We also demonstrate that the invariance of Granger causality under causal, invertible filtering fails at finite prediction horizons, with particular implications for inference of Granger causality from fMRI data. Our analysis emphasises that sampling rates for causal analysis of neurophysiological time series should be informed by domain-specific time scales, and that state-space modelling should be preferred to purely autoregressive modelling. 
On the basis of a very general model that captures the structure of neurophysiological processes, we are able to help identify confounds, and offer practical insights, for successful detection of causal connectivity from neurophysiological recordings.
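The central phenomenon, loss of detectability under subsampling, is easy to reproduce with a toy vector autoregression (a generic sketch, not the paper's distributed-lag continuous-time model; all coefficients are invented). Here x drives y with a one-step delay; fitting lag-1 models with and without the candidate driver gives a log-variance-ratio Granger statistic, which shrinks when the series is subsampled:

```python
import numpy as np

rng = np.random.default_rng(5)

# Bivariate VAR(1): x drives y with a one-step delay; y does not drive x.
T = 20000
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.7 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.6 * x[t - 1] + rng.normal()

def granger_stat(src, dst, lag=1):
    """Log-ratio Granger statistic ln(restricted/full residual variance) for
    one-step prediction of dst, with and without src's past."""
    d1, d0, s0 = dst[lag:], dst[:-lag], src[:-lag]
    Xr = np.column_stack([np.ones_like(d0), d0])        # dst's own past only
    rr = d1 - Xr @ np.linalg.lstsq(Xr, d1, rcond=None)[0]
    Xf = np.column_stack([np.ones_like(d0), d0, s0])    # plus src's past
    rf = d1 - Xf @ np.linalg.lstsq(Xf, d1, rcond=None)[0]
    return np.log(np.var(rr) / np.var(rf))

gc_xy = granger_stat(x, y)                  # clearly positive: x -> y
gc_yx = granger_stat(y, x)                  # near zero: no y -> x coupling
gc_xy_sub = granger_stat(x[::5], y[::5])    # subsampled: causality attenuated
```

Because the causal delay is one fast time step, sampling every fifth point moves the interaction below the lag-1 model's horizon, which is exactly the sampling-rate/causal-time-scale interaction the analysis formalizes.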
Krishnan, Neeraja M; Seligmann, Hervé; Stewart, Caro-Beth; De Koning, A P Jason; Pollock, David D
2004-10-01
Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood (ML) is generally thought to provide relatively accurate reconstructed sequences compared to parsimony, but both methods lead to the inference of multiple directional changes in nucleotide frequencies in primate mitochondrial DNA (mtDNA). To better understand this surprising result, as well as to better understand how parsimony and ML differ, we constructed a series of computationally simple "conditional pathway" methods that differed in the number of substitutions allowed per site along each branch, and we also evaluated the entire Bayesian posterior frequency distribution of reconstructed ancestral states. We analyzed primate mitochondrial cytochrome b (Cyt-b) and cytochrome oxidase subunit I (COI) genes and found that ML reconstructs ancestral frequencies that are often more different from tip sequences than are parsimony reconstructions. In contrast, frequency reconstructions based on the posterior ensemble more closely resemble extant nucleotide frequencies. Simulations indicate that these differences in ancestral sequence inference are probably due to deterministic bias caused by high uncertainty in the optimization-based ancestral reconstruction methods (parsimony, ML, Bayesian maximum a posteriori). In contrast, ancestral nucleotide frequencies based on an average of the Bayesian set of credible ancestral sequences are much less biased. 
The methods involving simpler conditional pathway calculations have slightly reduced likelihood values compared to full likelihood calculations, but they can provide fairly unbiased nucleotide reconstructions and may be useful in more complex phylogenetic analyses than considered here due to their speed and flexibility. To determine whether biased reconstructions using optimization methods might affect inferences of functional properties, ancestral primate mitochondrial tRNA sequences were inferred and helix-forming propensities for conserved pairs were evaluated in silico. For ambiguously reconstructed nucleotides at sites with high base composition variability, ancestral tRNA sequences from Bayesian analyses were more compatible with canonical base pairing than were those inferred by other methods. Thus, nucleotide bias in reconstructed sequences apparently can lead to serious bias and inaccuracies in functional predictions.
Inferring genetic interactions via a nonlinear model and an optimization algorithm.
Chen, Chung-Ming; Lee, Chih; Chuang, Cheng-Long; Wang, Chia-Chang; Shieh, Grace S
2010-02-26
Biochemical pathways are gradually becoming recognized as central to complex human diseases and recently genetic/transcriptional interactions have been shown to be able to predict partial pathways. With the abundant information made available by microarray gene expression data (MGED), nonlinear modeling of these interactions is now feasible. Two of the latest advances in nonlinear modeling used sigmoid models to depict transcriptional interaction of a transcription factor (TF) for a target gene, but did not model cooperative or competitive interactions of several TFs for a target. An S-shape model and an optimization algorithm (GASA) were developed to infer genetic interactions/transcriptional regulation of several genes simultaneously using MGED. GASA consists of a genetic algorithm (GA) and a simulated annealing (SA) algorithm, which is enhanced by a steepest gradient descent algorithm to avoid being trapped in local minima. Using simulated data with various degrees of noise, we studied how GASA with two model selection criteria and two search spaces performed. Furthermore, GASA was shown to outperform network component analysis, the time series network inference algorithm (TSNI), GA with regular GA (GAGA) and GA with regular SA. Two applications are demonstrated. First, GASA is applied to infer a subnetwork of human T-cell apoptosis. Several of the predicted interactions are supported by the literature. Second, GASA was applied to infer the transcriptional factors of 34 cell cycle regulated targets in S. cerevisiae, and GASA performed better than one of the latest advances in nonlinear modeling, GAGA and TSNI. Moreover, GASA is able to predict multiple transcription factors for certain targets, and these results coincide with experimentally confirmed data in YEASTRACT. GASA is shown to infer both genetic interactions and transcriptional regulatory interactions well.
In particular, GASA seems able to characterize the nonlinear mechanism of transcriptional regulatory interactions (TIs) in yeast, and may be applied to infer TIs in other organisms. The predicted genetic interactions of a subnetwork of human T-cell apoptosis coincide with existing partial pathways, suggesting the potential of GASA on inferring biochemical pathways.
GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns
Senin, Pavel; Lin, Jessica; Wang, Xing; ...
2018-02-23
The problems of recurrent and anomalous pattern discovery in time series, e.g., motifs and discords, respectively, have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and have their lengths known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure, but of different lengths, may co-exist in a time series. In order to address these issues, we have developed algorithms for variable-length time series pattern discovery that are based on symbolic discretization and grammar inference—two techniques whose combination enables the structured reduction of the search space and discovery of the candidate patterns in linear time. In this work, we present GrammarViz 3.0—a software package that provides implementations of the proposed algorithms and a graphical user interface for interactive variable-length time series pattern discovery. The current version of the software provides an alternative grammar inference algorithm that improves the time series motif discovery workflow, and introduces an experimental procedure for automated discretization parameter selection that builds upon the minimum cardinality maximum cover principle and aids the time series recurrent and anomalous pattern discovery.
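The discretization half of this pipeline, SAX, is compact enough to sketch. Below, a series is z-normalized, reduced by piecewise aggregate approximation, and mapped to letters via equiprobable Gaussian breakpoints; counting repeated digrams in the resulting string is only a toy stand-in for the grammar induction step (GrammarViz uses Sequitur-style grammar inference on SAX words):

```python
from collections import Counter
import numpy as np

# Breakpoints splitting N(0,1) into four equiprobable regions (letters a-d).
BREAKPOINTS = np.array([-0.6745, 0.0, 0.6745])

def sax(series, n_segments):
    """Symbolic Aggregate approXimation: z-normalize, piecewise-aggregate into
    n_segments means, map each mean to a letter. Assumes divisible length."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / x.std()
    paa = x.reshape(n_segments, -1).mean(axis=1)
    return "".join("abcd"[i] for i in np.searchsorted(BREAKPOINTS, paa))

def repeated_digrams(word):
    """Toy stand-in for grammar inference: repeated digrams = candidate motifs."""
    counts = Counter(word[i:i + 2] for i in range(len(word) - 1))
    return {d: c for d, c in counts.items() if c > 1}

# Two periods of a sine wave yield a repeated symbolic word.
t = np.arange(160) * (np.pi / 40)          # exactly two periods of sin
word = sax(np.sin(t), n_segments=16)
motifs = repeated_digrams(word)
```

The repeated substrings of the symbolic word are where grammar rules form, and mapping those rules back to their original time spans is what lets the full system find motifs of varying length in linear time.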
DOE Office of Scientific and Technical Information (OSTI.GOV)
NASA Astrophysics Data System (ADS)
Ma, Lin; Chabaux, Francois; Pelt, Eric; Blaes, Estelle; Jin, Lixin; Brantley, Susan
2010-08-01
In the Critical Zone where rocks and life interact, bedrock equilibrates to Earth surface conditions, transforming to regolith. The factors that control the rates and mechanisms of formation of regolith, defined here as material that can be augered, are still not fully understood. To quantify regolith formation rates on shale lithology, we measured uranium-series (U-series) isotopes (238U, 234U, and 230Th) in three weathering profiles along a planar hillslope at the Susquehanna/Shale Hills Observatory (SSHO) in central Pennsylvania. All regolith samples show significant U-series disequilibrium: (234U/238U) and (230Th/238U) activity ratios range from 0.934 to 1.072 and from 0.903 to 1.096, respectively. These values display depth trends that are consistent with fractionation of U-series isotopes during chemical weathering and element transport, i.e., the relative mobility decreases in the order 234U > 238U > 230Th. The activity ratios observed in the regolith samples are explained by i) loss of U-series isotopes during water-rock interactions and ii) re-deposition of U-series isotopes downslope. Loss of U and Th initiates in the meter-thick zone of "bedrock" that cannot be augered but that nonetheless consists of up to 40% clay/silt/sand inferred to have lost K, Mg, Al, and Fe. Apparent equivalent regolith production rates calculated with these isotopes for these profiles decrease exponentially from 45 m/Myr to 17 m/Myr, with increasing regolith thickness from the ridge top to the valley floor. With increasing distance from the ridge top toward the valley, apparent equivalent regolith residence times increase from 7 kyr to 40 kyr. Given that the SSHO experienced peri-glacial climate ~15 kyr ago and has a catchment-wide averaged erosion rate of ~15 m/Myr as inferred from cosmogenic 10Be, we conclude that the hillslope retains regolith formed before the peri-glacial period and is not at geomorphologic steady state.
Both chemical weathering reactions of clay minerals and translocation of fine particles/colloids are shown to contribute to mass loss of U and Th from the regolith, consistent with major element data at SSHO. This research documents a case study where U-series isotopes are used to constrain the time scales of chemical weathering and regolith production rates. Regolith production rates at the SSHO should be useful as a reference value for future work at other weathering localities.
Quasi-experimental study designs series-paper 1: introduction: two historical lineages.
Bärnighausen, Till; Røttingen, John-Arne; Rockers, Peter; Shemilt, Ian; Tugwell, Peter
2017-09-01
The objective of this study was to contrast the historical development of experiments and quasi-experiments and provide the motivation for a journal series on quasi-experimental designs in health research. A short historical narrative, with concrete examples, and arguments based on an understanding of the practice of health research and evidence synthesis. Health research has played a key role in developing today's gold standard for causal inference-the randomized controlled multiply blinded trial. Historically, allocation approaches developed from convenience and purposive allocation to alternate and, finally, to random allocation. This development was motivated both by concerns for manipulation in allocation as well as statistical and theoretical developments demonstrating the power of randomization in creating counterfactuals for causal inference. In contrast to the sequential development of experiments, quasi-experiments originated at very different points in time, from very different scientific perspectives, and with frequent and long interruptions in their methodological development. Health researchers have only recently started to recognize the value of quasi-experiments for generating novel insights on causal relationships. While quasi-experiments are unlikely to replace experiments in generating the efficacy and safety evidence required for clinical guidelines and regulatory approval of medical technologies, quasi-experiments can play an important role in establishing the effectiveness of health care practice, programs, and policies. The papers in this series describe and discuss a range of important issues in utilizing quasi-experimental designs for primary research and quasi-experimental results for evidence synthesis. Copyright © 2017 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
West, N.; Kirby, E.; Ma, L.; Bierman, P. R.
2013-12-01
Regolith-mantled hillslopes are ubiquitous features of most temperate landscapes, and their morphology reflects the climatically, biologically, and tectonically mediated interplay between regolith production and downslope transport. Despite intensive research, few studies have quantified both of these mass fluxes in the same field site. Here, we exploit two isotopic systems to quantify regolith production and transport within the Susquehanna Shale Hills Critical Zone Observatory (SSHO), in central Pennsylvania. We present an analysis of 131 meteoric 10Be measurements from regolith and bedrock to quantify rates of regolith transport, and compare these data with previously determined regolith production rates, measured using uranium-series isotopes. Regolith flux inferred from meteoric 10Be varies linearly with topographic gradient (determined from high-resolution LiDAR-based topography) along the upper portions of hillslopes in and adjacent to SSHO. However, regolith flux appears to depend on the product of gradient and regolith depth where regolith is thick, near the base of hillslopes. Meteoric 10Be inventories along 4 ridgetops within and adjacent to the SSHO indicate regolith residence times ranging from ~9-15 ky, similar to residence times inferred from U-series isotopes (6.7 ± 3 ky to 15 ± 8 ky). Similarly, the downslope flux of regolith (~500-1,000 m2/My) nearly balances production (850 ± 22 m2/My to 960 ± 530 m2/My). The combination of our results with U-series derived regolith production rates implies that regolith production and erosion rates along ridgecrests in the SSHO may be approaching steady state conditions over the Holocene.
Using permutation tests to enhance causal inference in interrupted time series analysis.
Linden, Ariel
2018-06-01
Interrupted time series analysis (ITSA) is an evaluation methodology in which a single treatment unit's outcome is studied serially over time and the intervention is expected to "interrupt" the level and/or trend of that outcome. The internal validity is strengthened considerably when the treated unit is contrasted with a comparable control group. In this paper, we introduce a robustness check based on permutation tests to further improve causal inference. We evaluate the effect of California's Proposition 99 for reducing cigarette sales by iteratively casting each nontreated state into the role of "treated," creating a comparable control group using the ITSAMATCH package in Stata, and then evaluating treatment effects using ITSA regression. If statistically significant "treatment effects" are estimated for pseudotreated states, then any significant changes in the outcome of the actual treatment unit (California) cannot be attributed to the intervention. We perform these analyses setting the cutpoint significance level to P > .40 for identifying balanced matches (the highest threshold possible for which controls could still be found for California) and use the difference in differences of trends as the treatment effect estimator. Only California attained a statistically significant treatment effect, strengthening confidence in the conclusion that Proposition 99 reduced cigarette sales. The proposed permutation testing framework provides an additional robustness check to either support or refute a treatment effect identified for the true treated unit in ITSA. Given its value and ease of implementation, this framework should be considered as a standard robustness test in all multiple group interrupted time series analyses. © 2018 John Wiley & Sons, Ltd.
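The placebo logic described above can be sketched generically: estimate a pre/post trend change for the treated unit, repeat with every control cast as pseudo-treated, and ask how extreme the real effect is within that placebo distribution. This is an illustration of the idea only, not the ITSAMATCH/ITSA Stata implementation (no covariate matching, simple OLS slopes), and the names are illustrative:

```python
from statistics import mean

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

def trend_change(series, cut):
    """Difference in fitted linear trends after vs. before the interruption."""
    pre, post = series[:cut], series[cut:]
    return slope(range(cut, len(series)), post) - slope(range(len(pre)), pre)

def placebo_test(treated, controls, cut):
    """Permutation-style check: the treated unit's trend change is credible
    only if it is extreme relative to pseudo-effects in the control units.
    Returns the effect and a permutation-style p-value."""
    effect = trend_change(treated, cut)
    placebo = [trend_change(c, cut) for c in controls]
    n_extreme = sum(abs(p) >= abs(effect) for p in placebo)
    return effect, (n_extreme + 1) / (len(controls) + 1)
```

With nine controls, the smallest attainable p-value is 1/10, which mirrors the paper's point that inference strength grows with the size of the donor pool of nontreated units.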
NASA Astrophysics Data System (ADS)
Birkel, Christian; Soulsby, Chris; Malcolm, Iain; Tetzlaff, Doerthe
2013-09-01
We inferred in-stream ecosystem processes in terms of photosynthetic productivity (P), system respiration (R), and reaeration capacity (RC) from a five parameter numerical oxygen mass balance model driven by radiation, stream and air temperature, and stream depth. This was calibrated to high-resolution (15 min), long-term (2.5 years) dissolved oxygen (DO) time series for moorland and forest reaches of a third-order montane stream in Scotland. The model was multicriteria calibrated to continuous 24 h periods within the time series to identify behavioral simulations representative of ecosystem functioning. Results were evaluated using a seasonal regional sensitivity analysis and a colinearity index for parameter sensitivity. This showed that >95 % of the behavioral models for the moorland and forest sites were identifiable and able to infer in-stream processes from the DO time series for around 40% and 32% of the time period, respectively. Monthly P/R ratios <1 indicate a heterotrophic system with both sites exhibiting similar temporal patterns; with a maximum in February and a second peak during summer months. However, the estimated net ecosystem productivity suggests that the moorland reach without riparian tree cover is likely to be a much larger source of carbon to the atmosphere (122 mmol C m-2 d-1) compared to the forested reach (64 mmol C m-2 d-1). We conclude that such process-based oxygen mass balance models may be transferable tools for investigating other systems; specifically, well-oxygenated upland channels with high hydraulic roughness and lacking reaeration measurements.
NASA Technical Reports Server (NTRS)
Gomberg, R. I.; Buglia, J. J.
1979-01-01
An iterative technique which recovers density profiles in a nonhomogeneous absorbing atmosphere is derived. The technique is based on the concept of factoring a function of the density profile into the product of a known term and a term which is not known, but whose power series expansion can be found. This series converges rapidly under a wide range of conditions. A demonstration example of simulated data from a high resolution infrared heterodyne instrument is inverted. For the examples studied, the technique is shown to be capable of extracting features of ozone profiles in the troposphere and to be particularly stable.
Strakova, Eva; Zikova, Alice; Vohradsky, Jiri
2014-01-01
A computational model of gene expression was applied to a novel test set of microarray time series measurements to reveal regulatory interactions between transcriptional regulators represented by 45 sigma factors and the genes expressed during germination of a prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded in 13 time points, which provided a database of gene expression time series on genome-wide scale. The computational modeling of the kinetic relations between the sigma factors, individual genes and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.
McCartan, L.; Owens, J.P.; Blackwelder, B. W.; Szabo, B. J.; Belknap, D.F.; Kriausakul, N.; Mitterer, R.M.; Wehmiller, J.F.
1982-01-01
The results of an integrated study comprising litho- and biostratigraphic investigations, uranium-series coral dating, amino acid racemization in molluscs, and paleomagnetic measurements are compared to ascertain relative and absolute ages of Pleistocene deposits of the Atlantic Coastal Plain in North and South Carolina. Four depositional events are inferred for South Carolina and two for North Carolina by all methods. The data suggest that there are four Pleistocene units containing corals that have been dated at about 100,000 yr, 200,000 yr, 450,000 yr, and over 1,000,000 yr. Some conflicts exist between the different methods regarding the correlation of the younger of these depositional events between Charleston and Myrtle Beach. Lack of good uranium-series dates for the younger material at Myrtle Beach makes the correlation with the deposits at Charleston more difficult. ?? 1982.
Estimating the effective spatial resolution of an AVHRR time series
Meyer, D.J.
1996-01-01
A method is proposed to estimate the spatial degradation of geometrically rectified AVHRR data resulting from misregistration and off-nadir viewing, and to infer the cumulative effect of these degradations over time. Misregistrations are measured using high resolution imagery as a geometric reference, and pixel sizes are computed directly from satellite zenith angles. The influence or neighbouring features on a nominal 1 km by 1 km pixel over a given site is estimated from the above information, and expressed as a spatial distribution whose spatial frequency response is used to define an effective field-of-view (EFOV) for a time series. In a demonstration of the technique applied to images from the Conterminous U.S. AVHRR data set, an EFOV of 3·1km in the east-west dimension and 19 km in the north-south dimension was estimated for a time series accumulated over a grasslands test site.
NASA Astrophysics Data System (ADS)
McKinney, B. A.; Crowe, J. E., Jr.; Voss, H. U.; Crooke, P. S.; Barney, N.; Moore, J. H.
2006-02-01
We introduce a grammar-based hybrid approach to reverse engineering nonlinear ordinary differential equation models from observed time series. This hybrid approach combines a genetic algorithm to search the space of model architectures with a Kalman filter to estimate the model parameters. Domain-specific knowledge is used in a context-free grammar to restrict the search space for the functional form of the target model. We find that the hybrid approach outperforms a pure evolutionary algorithm method, and we observe features in the evolution of the dynamical models that correspond with the emergence of favorable model components. We apply the hybrid method to both artificially generated time series and experimentally observed protein levels from subjects who received the smallpox vaccine. From the observed data, we infer a cytokine protein interaction network for an individual’s response to the smallpox vaccine.
A Theory of Diagnostic Inference: Judging Causality.
1983-08-01
received considerable attention from a variety of perspectives, e.g., child development (Piaget, 1974; Shultz, 1982); social psychology (Kelley, 1973...tives? Following the development of a theory to answer these questions, we present a series of experiments to test the various components of the... development of schemas and, conditional on such schemas, they are used to modify and expand on prior theories. This implies that the relations between
ERIC Educational Resources Information Center
Bassiri, Dina
2016-01-01
One outcome of the implementation of No Child Left Behind Act of 2001 and its call for better accountability in public schools across the nation has been the use of student assessment data in measuring schools' effectiveness. In general, inferences about schools' effectiveness depend on the type of statistical model used to link student assessment…
Reliability of a Measure of Institutional Discrimination against Minorities
1979-12-01
samples are presented. The first is based upon classical statistical theory and the second derives from a series of computer-generated Monte Carlo...Institutional racism and sexism. Englewood Cliffs, N. J.: Prentice-Hall, Inc., 1978. Hays, W. L. and Winkler, R. L. Statistics: probability, inference...statistical measure of the e of institutional discrimination are discussed. Two methods of dealing with the problem of reliability of the measure in small
Reveal, A General Reverse Engineering Algorithm for Inference of Genetic Network Architectures
NASA Technical Reports Server (NTRS)
Liang, Shoudan; Fuhrman, Stefanie; Somogyi, Roland
1998-01-01
Given the immanent gene expression mapping covering whole genomes during development, health and disease, we seek computational methods to maximize functional inference from such large data sets. Is it possible, in principle, to completely infer a complex regulatory network architecture from input/output patterns of its variables? We investigated this possibility using binary models of genetic networks. Trajectories, or state transition tables of Boolean nets, resemble time series of gene expression. By systematically analyzing the mutual information between input states and output states, one is able to infer the sets of input elements controlling each element or gene in the network. This process is unequivocal and exact for complete state transition tables. We implemented this REVerse Engineering ALgorithm (REVEAL) in a C program, and found the problem to be tractable within the conditions tested so far. For n = 50 (elements) and k = 3 (inputs per element), the analysis of incomplete state transition tables (100 state transition pairs out of a possible 10^15) reliably produced the original rule and wiring sets. While this study is limited to synchronous Boolean networks, the algorithm is generalizable to include multi-state models, essentially allowing direct application to realistic biological data sets. The ability to adequately solve the inverse problem may enable in-depth analysis of complex dynamic systems in biology and other fields.
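The core mutual-information test can be sketched compactly: for each target gene, search candidate input sets of increasing size until M(inputs; output) equals H(output), at which point the inputs fully determine the target's next state. A minimal sketch for binary time series, with exhaustive search and no noise handling (function names are illustrative, not from the REVEAL C program):

```python
from itertools import combinations
from math import log2

def entropy(cols):
    """Shannon entropy of the joint distribution of one or more binary columns."""
    rows = list(zip(*cols))
    counts = {}
    for r in rows:
        counts[r] = counts.get(r, 0) + 1
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in counts.values())

def reveal_inputs(states_t, target_next, max_k=3):
    """Return the smallest set of candidate regulators whose time-t states
    fully determine the target's state at t+1, i.e. the mutual information
    M(inputs; target) equals H(target). `states_t` is a list of per-gene
    columns over observed transitions; `target_next` is the target column."""
    h_out = entropy([target_next])
    for k in range(1, max_k + 1):
        for combo in combinations(range(len(states_t)), k):
            ins = [states_t[i] for i in combo]
            mi = entropy(ins) + h_out - entropy(ins + [target_next])
            if abs(mi - h_out) < 1e-9:  # inputs fully determine the output
                return combo
        # no k-input set suffices; widen the search
    return None
```

For a target wired as XOR of genes 0 and 1, no single gene carries any information about the output, but the pair does, so the search correctly returns the two-element set.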
Sawatzky, Richard; Chan, Eric K H; Zumbo, Bruno D; Ahmed, Sara; Bartlett, Susan J; Bingham, Clifton O; Gardner, William; Jutai, Jeffrey; Kuspinar, Ayse; Sajobi, Tolulope; Lix, Lisa M
2017-09-01
Obtaining the patient's view about the outcome of care is an essential component of patient-centered care. Many patient-reported outcome (PRO) instruments for different purposes have been developed since the 1960s. Measurement validation is fundamental in the development, evaluation, and use of PRO instruments. This paper provides a review of modern perspectives of measurement validation in relation to the following three questions as applied to PROs: (1) What evidence is needed to warrant comparisons between groups and individuals? (2) What evidence is needed to warrant comparisons over time? and (3) What are the value implications, including personal and societal consequences, of using PRO scores? Measurement validation is an ongoing process that involves the accumulation of evidence regarding the justification of inferences, actions, and decisions based on measurement scores. These include inferences pertaining to comparisons between groups and comparisons over time as well as consideration of value implications of using PRO scores. Personal and societal consequences must be examined as part of a comprehensive approach to measurement validation. The answers to these three questions are fundamental to the validity of different types of inferences, actions, and decisions made on PRO scores in health research, health care administration, and clinical practice. Copyright © 2016 Elsevier Inc. All rights reserved.
Brüniche-Olsen, Anna; Austin, Jeremy J.; Jones, Menna E.; Holland, Barbara R.; Burridge, Christopher P.
2016-01-01
Detecting loci under selection is an important task in evolutionary biology. In conservation genetics detecting selection is key to investigating adaptation to the spread of infectious disease. Loci under selection can be detected on a spatial scale, accounting for differences in demographic history among populations, or on a temporal scale, tracing changes in allele frequencies over time. Here we use these two approaches to investigate selective responses to the spread of an infectious cancer—devil facial tumor disease (DFTD)—that since 1996 has ravaged the Tasmanian devil (Sarcophilus harrisii). Using time-series ‘restriction site associated DNA’ (RAD) markers from populations pre- and post DFTD arrival, and DFTD free populations, we infer loci under selection due to DFTD and investigate signatures of selection that are incongruent among methods, populations, and times. The lack of congruence among populations influenced by DFTD with respect to inferred loci under selection, and the direction of that selection, fail to implicate a consistent selective role for DFTD. Instead genetic drift is more likely driving the observed allele frequency changes over time. Our study illustrates the importance of applying methods with different performance optima e.g. accounting for population structure and background selection, and assessing congruence of the results. PMID:26930198
Design of fuzzy cognitive maps using neural networks for predicting chaotic time series.
Song, H J; Miao, C Y; Shen, Z Q; Roel, W; Maja, D H; Francky, C
2010-12-01
As a powerful paradigm for knowledge representation and a simulation mechanism applicable to numerous research and application fields, Fuzzy Cognitive Maps (FCMs) have attracted a great deal of attention from various research communities. However, the traditional FCMs do not provide efficient methods to determine the states of the investigated system and to quantify causalities, which are the very foundation of FCM theory. Therefore, in many cases, constructing FCMs for complex causal systems greatly depends on expert knowledge. The manually developed models have a substantial shortcoming due to model subjectivity and difficulties in assessing their reliability. In this paper, we propose a fuzzy neural network to enhance the learning ability of FCMs so that the automatic determination of membership functions and quantification of causalities can be incorporated with the inference mechanism of conventional FCMs. In this manner, FCM models of the investigated systems can be automatically constructed from data, and therefore are independent of the experts. Furthermore, we employ mutual subsethood to define and describe the causalities in FCMs. It provides more explicit interpretation for causalities in FCMs and makes the inference process easier to understand. To validate the performance, the proposed approach is tested in predicting chaotic time series. The simulation studies show the effectiveness of the proposed approach. Copyright © 2010 Elsevier Ltd. All rights reserved.
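The conventional FCM inference mechanism that the paper augments is itself simple: each concept's next activation is a squashed weighted sum of the activations of its causes. A minimal sketch, assuming a sigmoid transfer function and synchronous updates; the steepness value and function names are illustrative, and this omits the paper's fuzzy-neural learning of the weights:

```python
import math

def fcm_step(state, weights, steepness=5.0):
    """One synchronous FCM update: weights[i][j] in [-1, 1] is the causal
    influence of concept i on concept j; activations stay in (0, 1)."""
    n = len(state)
    return [1.0 / (1.0 + math.exp(-steepness * sum(weights[i][j] * state[i]
                                                   for i in range(n))))
            for j in range(n)]

def fcm_run(state, weights, steps=50):
    """Iterate the map; depending on the weights the trajectory settles to
    a fixed point, a limit cycle, or chaotic behavior."""
    for _ in range(steps):
        state = fcm_step(state, weights)
    return state
```

With two mutually reinforcing concepts the iteration contracts to a stable fixed point, which is the kind of steady state a hand-built FCM is typically read off from.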
TIME-DOMAIN METHODS FOR DIFFUSIVE TRANSPORT IN SOFT MATTER
Fricks, John; Yao, Lingxing; Elston, Timothy C.; Gregory Forest, And M.
2015-01-01
Passive microrheology [12] utilizes measurements of noisy, entropic fluctuations (i.e., diffusive properties) of micron-scale spheres in soft matter to infer bulk frequency-dependent loss and storage moduli. Here, we are concerned exclusively with diffusion of Brownian particles in viscoelastic media, for which the Mason-Weitz theoretical-experimental protocol is ideal, and the more challenging inference of bulk viscoelastic moduli is decoupled. The diffusive theory begins with a generalized Langevin equation (GLE) with a memory drag law specified by a kernel [7, 16, 22, 23]. We start with a discrete formulation of the GLE as an autoregressive stochastic process governing microbead paths measured by particle tracking. For the inverse problem (recovery of the memory kernel from experimental data) we apply time series analysis (maximum likelihood estimators via the Kalman filter) directly to bead position data, an alternative to formulas based on mean-squared displacement statistics in frequency space. For direct modeling, we present statistically exact GLE algorithms for individual particle paths as well as statistical correlations for displacement and velocity. Our time-domain methods rest upon a generalization of well-known results for a single-mode exponential kernel [1, 7, 22, 23] to an arbitrary M-mode exponential series, for which the GLE is transformed to a vector Ornstein-Uhlenbeck process. PMID:26412904
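For the single-mode exponential kernel mentioned above, the GLE reduces to an Ornstein-Uhlenbeck process, which discretizes exactly to an AR(1) series. A minimal sketch of that reduction, with a lag-1 autocorrelation estimator standing in for the paper's full Kalman-filter maximum likelihood machinery (names and parameter values are illustrative):

```python
import math
import random

def simulate_ou(theta, sigma, dt, n, x0=0.0, seed=0):
    """Exact discrete-time simulation of an Ornstein-Uhlenbeck process
    dX = -theta X dt + sigma dW: the transition over a step dt is AR(1)
    with coefficient exp(-theta dt) and matched stationary noise."""
    rng = random.Random(seed)
    a = math.exp(-theta * dt)                          # AR(1) coefficient
    s = sigma * math.sqrt((1 - a * a) / (2 * theta))   # per-step noise scale
    xs, x = [x0], x0
    for _ in range(n - 1):
        x = a * x + s * rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs

def estimate_theta(xs, dt):
    """Recover the relaxation rate from the lag-1 autocorrelation of the
    path (the AR(1) least-squares estimate, ignoring edge terms)."""
    num = sum(x0 * x1 for x0, x1 in zip(xs, xs[1:]))
    den = sum(x * x for x in xs[:-1])
    return -math.log(num / den) / dt
```

Working in the time domain like this avoids the frequency-space mean-squared-displacement route, which is the contrast the abstract draws; a multi-mode kernel turns the scalar recursion into a vector one.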
Modeling Individual Cyclic Variation in Human Behavior.
Pierson, Emma; Althoff, Tim; Leskovec, Jure
2018-04-01
Cycles are fundamental to human health and behavior. Examples include mood cycles, circadian rhythms, and the menstrual cycle. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present Cyclic Hidden Markov Models (CyHMMs) for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered in modeling real-world cycles: they can model multivariate data with both discrete and continuous dimensions; they explicitly model and are robust to missing data; and they can share information across individuals to accommodate variation both within and between individual time series. Experiments on synthetic and real-world health-tracking data demonstrate that CyHMMs infer cycle lengths more accurately than existing methods, with 58% lower error on simulated data and 63% lower error on real-world data compared to the best-performing baseline. CyHMMs can also perform functions which baselines cannot: they can model the progression of individual features/symptoms over the course of the cycle, identify the most variable features, and cluster individual time series into groups with distinct characteristics. Applying CyHMMs to two real-world health-tracking datasets, of human menstrual cycle symptoms and physical activity tracking data, yields important insights including which symptoms to expect at each point during the cycle. We also find that people fall into several groups with distinct cycle patterns, and that these groups differ along dimensions not provided to the model. For example, by modeling missing data in the menstrual cycles dataset, we are able to discover a medically relevant group of birth control users even though information on birth control is not given to the model.
PMID:29780976
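The "cyclic" structure in a CyHMM-style model can be illustrated by its hidden-state transition matrix: phase states sit on a ring, each either dwelling or advancing, so the cycle must traverse all phases in order and the expected dwell times determine the cycle length. A minimal sketch of that structure only; the actual CyHMM additionally has emission distributions, missing-data handling, and per-individual variation, and the names here are illustrative:

```python
def cyclic_transition_matrix(advance_probs):
    """Transition matrix of a minimal cyclic HMM: from phase i the chain
    stays with probability 1 - p_i or advances to phase (i + 1) mod k with
    probability p_i, so all phases are visited in order each cycle."""
    k = len(advance_probs)
    T = [[0.0] * k for _ in range(k)]
    for i, p in enumerate(advance_probs):
        T[i][i] = 1.0 - p            # dwell in the current phase
        T[i][(i + 1) % k] = p        # advance to the next phase
    return T

def expected_cycle_length(advance_probs):
    """Mean number of time steps per full cycle: the dwell time in each
    phase is geometric with mean 1 / p, summed around the ring."""
    return sum(1.0 / p for p in advance_probs)
```

Fitting the advance probabilities to data (e.g. by Baum-Welch over this constrained matrix) is what turns observed symptom streams into an inferred cycle length.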
Quantifying temporal change in biodiversity: challenges and opportunities
Dornelas, Maria; Magurran, Anne E.; Buckland, Stephen T.; Chao, Anne; Chazdon, Robin L.; Colwell, Robert K.; Curtis, Tom; Gaston, Kevin J.; Gotelli, Nicholas J.; Kosnik, Matthew A.; McGill, Brian; McCune, Jenny L.; Morlon, Hélène; Mumby, Peter J.; Øvreås, Lise; Studeny, Angelika; Vellend, Mark
2013-01-01
Growing concern about biodiversity loss underscores the need to quantify and understand temporal change. Here, we review the opportunities presented by biodiversity time series, and address three related issues: (i) recognizing the characteristics of temporal data; (ii) selecting appropriate statistical procedures for analysing temporal data; and (iii) inferring and forecasting biodiversity change. With regard to the first issue, we draw attention to defining characteristics of biodiversity time series—lack of physical boundaries, uni-dimensionality, autocorrelation and directionality—that inform the choice of analytic methods. Second, we explore methods of quantifying change in biodiversity at different timescales, noting that autocorrelation can be viewed as a feature that sheds light on the underlying structure of temporal change. Finally, we address the transition from inferring to forecasting biodiversity change, highlighting potential pitfalls associated with phase-shifts and novel conditions. PMID:23097514
Infrared photometry of the dwarf nova V2051 Ophiuchi - I. The mass-donor star and the distance
NASA Astrophysics Data System (ADS)
Wojcikiewicz, Eduardo; Baptista, Raymundo; Ribeiro, Tiago
2018-04-01
We report the analysis of time series of infrared JHKs photometry of the dwarf nova V2051 Oph in quiescence. We modelled the ellipsoidal variations caused by the distorted mass-donor star to infer its JHKs fluxes. From its infrared colours, we estimate a spectral type of M(8.0 ± 1.5) and an equivalent blackbody temperature of TBB = (2700 ± 270) K. We used the Barnes & Evans relation to infer a photometric parallax distance of dBE = (102 ± 16) pc to the binary. At this short distance, the corresponding accretion disc temperatures in outburst are too low to be explained by the disc-instability model for dwarf nova outbursts, underscoring a previous suggestion that the outbursts of this binary are powered by mass-transfer bursts.
Watanabe, Hiroshi
2012-01-01
Procedures of statistical analysis are reviewed to provide an overview of applications of statistics for general use. Topics that are dealt with are inference on a population, comparison of two populations with respect to means and probabilities, and multiple comparisons. This study is the second part of a series in which we survey medical statistics. Arguments related to statistical associations and regressions will be made in subsequent papers.
2009-06-01
isolation. In addition to being inherently multi-modal, human perception takes advantage of multiple sources of information within a single modality...restriction was reasonable for the applications we looked at. However, consider using a TIM to model a teacher-student relationship among moving objects...That is, imagine one teacher object demonstrating a behavior for a student object. The student can observe the teacher and then recreate the behavior
The QBO modulation of the occurrence of the Counter Electrojet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pei-Ren Chen; Yi Luo; Jun Ma
1995-10-15
The authors report long term studies of the geomagnetic field made in India, looking at variations in the horizontal component of the field. In particular they look at the counter electrojet (CEJ), which is an observed reversal in the equatorial electrojet inferred from its impact on the geomagnetic field. They use their long time series datasets to correlate the CEJ with solar cycle variations and with the quasi-biennial oscillation.
ERIC Educational Resources Information Center
Brophy, Jere; And Others
This is the fourth in a series of four reports describing a study of 1,614 junior high school mathematics and English students and 69 of their teachers that was undertaken to discover the effects of different teaching behaviors on cognitive and affective student outcomes. This booklet is the working manual used for coder training and includes…
Shin, Junha; Lee, Insuk
2015-01-01
Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life—Archaea, Bacteria, and Eukaryota—suggesting that pathways inherit primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. 
Taken together, we propose that co-inheritance analysis within the domains of life will greatly potentiate the use of the expected onslaught of sequenced genomes in the study of molecular pathways in higher eukaryotes. PMID:26394049
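The erosion of within-domain co-inheritance signal by irrelevant taxonomic groups, which motivates the domain-wise analysis above, can be illustrated with a toy simulation (all profiles, species counts, and probabilities below are synthetic, not the study's 2,144 reference species):

```python
import numpy as np

# Simulated presence/absence profiles over 1,200 reference species:
# two pathway genes co-inherit perfectly within "Bacteria" but
# independently within "Eukaryota".
rng = np.random.default_rng(8)
bact, euk = 800, 400
shared = rng.random(bact) < 0.5                 # common inheritance pattern
g1 = np.concatenate([shared, rng.random(euk) < 0.5])
g2 = np.concatenate([shared, rng.random(euk) < 0.5])

r_all = np.corrcoef(g1, g2)[0, 1]               # full-profile correlation
r_bact = np.corrcoef(g1[:bact], g2[:bact])[0, 1]  # within-domain correlation
print(r_bact > r_all)  # the within-domain signal is stronger
```

Restricting the profile comparison to the domain in which the pathway actually co-inherits recovers the full signal that the pooled profile dilutes.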
Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.
Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina
2015-01-01
Discovering gene regulatory networks from data is one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. They are used for modeling each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.
Energy conservation indicators. 1982 annual report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Belzer, D.B.
A series of Energy Conservation Indicators were developed for the Department of Energy to assist in the evaluation of current and proposed conservation strategies. As descriptive statistics that signify current conditions and trends related to efficiency of energy use, indicators provide a way of measuring, monitoring, or inferring actual responses by consumers in markets for energy services. Related sets of indicators are presented in some 40 one-page indicator summaries. Indicators are shown graphically, followed by several paragraphs that explain their derivation and highlight key findings. Indicators are classified according to broad end-use sectors: Aggregate (economy), Residential, Commercial, Industrial, Transportation and Electric Utilities. In most cases annual time series information is presented covering the period 1960 through 1981.
Zhang, Wangshu; Coba, Marcelo P; Sun, Fengzhu
2016-01-11
Protein domains can be viewed as portable units of biological function that define the functional properties of proteins. Therefore, if a protein is associated with a disease, protein domains might also be associated and define disease endophenotypes. However, knowledge about such domain-disease relationships is rarely available. Thus, identification of domains associated with human diseases would greatly improve our understanding of the mechanism of human complex diseases and further improve the prevention, diagnosis and treatment of these diseases. Based on phenotypic similarities among diseases, we first group diseases into overlapping modules. We then develop a framework to infer associations between domains and diseases through known relationships between diseases and modules, domains and proteins, as well as proteins and disease modules. Different methods including Association, Maximum likelihood estimation (MLE), Domain-disease pair exclusion analysis (DPEA), Bayesian, and Parsimonious explanation (PE) approaches are developed to predict domain-disease associations. We demonstrate the effectiveness of all the five approaches via a series of validation experiments, and show the robustness of the MLE, Bayesian and PE approaches to the involved parameters. We also study the effects of disease modularization in inferring novel domain-disease associations. Through validation, the AUC (area under the receiver operating characteristic curve) scores for Bayesian, MLE, DPEA, PE, and Association approaches are 0.86, 0.84, 0.83, 0.83 and 0.79, respectively, indicating the usefulness of these approaches for predicting domain-disease relationships. Finally, we choose the Bayesian approach to infer domains associated with two common diseases, Crohn's disease and type 2 diabetes. The Bayesian approach has the best performance for the inference of domain-disease relationships. 
The predicted landscape between domains and diseases provides a more detailed view about the disease mechanisms.
NASA Astrophysics Data System (ADS)
Freed, A. M.; Dickinson, H.; Huang, M. H.; Fielding, E. J.; Burgmann, R.; Andronicos, C.
2015-12-01
The Mw 7.2 El Mayor-Cucapah (EMC) earthquake ruptured a ~120 km long series of faults striking northwest from the Gulf of California to the Sierra Cucapah. Five years after the EMC event, a dense network of GPS stations in southern California and a sparse array of sites installed after the earthquake in northern Mexico measure ongoing surface deformation as coseismic stresses relax. We use 3D finite element models of seismically inferred crustal and mantle structure with earthquake slip constrained by GPS, InSAR range change and SAR and SPOT image sub-pixel offset measurements to infer the rheologic structure of the region. Model complexity, including 3D Moho structure and distinct geologic regions such as the Peninsular Ranges and Salton Trough, enable us to explore vertical and lateral heterogeneities of crustal and mantle rheology. We find that postseismic displacements can be explained by relaxation of a laterally varying, stratified rheologic structure controlled by temperature and crustal thickness. In the Salton Trough region, particularly large postseismic displacements require a relatively weak mantle column that weakens with depth, consistent with a strong but thin (22 km thick) crust and high regional temperatures. In contrast, beneath the neighboring Peninsular Ranges a strong, thick (up to 35 km) crust and cooler temperatures lead to a rheologically stronger mantle column. Thus, we find that the inferred rheologic structure corresponds with observed seismic structure and thermal variations. Significant afterslip is not required to explain postseismic displacements, but cannot be ruled out. Combined with isochemical phase diagrams, our results enable us to go beyond rheologic structure and infer some basic properties about the regional mantle, including composition, water content, and the degree of partial melting.
Quantum cognition based on an ambiguous representation derived from a rough set approximation.
Gunji, Yukio-Pegio; Sonoda, Kohei; Basios, Vasileios
2016-03-01
Over the last years, in a series of papers by Arecchi and others, a model for the cognitive processes involved in decision making has been proposed and investigated. The key element of this model is the expression of apprehension and judgment, basic cognitive processes of decision making, as an inverse Bayes inference classifying the information content of neuron spike trains. It has been shown that for successive plural stimuli this inference, equipped with basic non-algorithmic jumps, is affected by quantum-like characteristics. We show here that such a decision making process is related consistently with an ambiguous representation by an observer within a universe of discourse. In our work the ambiguous representation of an object or a stimulus is defined as a pair of maps from objects of a set to their representations, where these two maps are interrelated in a particular structure. The a priori and a posteriori hypotheses in Bayes inference are replaced by the upper and lower approximations, correspondingly, for the initial data sets that are derived with respect to each map. Upper and lower approximations herein are defined in the context of "rough set" analysis. The inverse Bayes inference is implemented by the lower approximation with respect to one map and the upper approximation with respect to the other map for a given data set. We show further that, due to the particular structural relation between the two maps, the logical structure of such combined approximations can only be expressed as an orthomodular lattice and therefore can be represented by a quantum rather than a Boolean logic. To our knowledge, this is the first investigation aiming to reveal the concrete logic structure of inverse Bayes inference in cognitive processes. Copyright © 2016. Published by Elsevier Ireland Ltd.
The Heuristic Value of p in Inductive Statistical Inference
Krueger, Joachim I.; Heck, Patrick R.
2017-01-01
Many statistical methods yield the probability of the observed data – or data more extreme – under the assumption that a particular hypothesis is true. This probability is commonly known as ‘the’ p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say. PMID:28649206
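A minimal simulation in the spirit of these experiments shows the sense in which p is a useful heuristic cue; the base rate of true effects (50%), the effect size (0.5 SD), and the sample size (n = 30) below are invented for illustration, not taken from the paper:

```python
import math
import numpy as np

rng = np.random.default_rng(4)
n, studies = 30, 2000
true_effect = rng.random(studies) < 0.5   # half the "studies" have a real effect
p_vals = np.empty(studies)
for i in range(studies):
    mu = 0.5 if true_effect[i] else 0.0
    x = rng.normal(mu, 1.0, n)
    z = x.mean() * math.sqrt(n)                   # one-sample z test, known sd = 1
    p_vals[i] = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

sig = p_vals < 0.05
ppv = true_effect[sig].mean()      # P(effect is real | p < .05)
print(ppv > true_effect.mean())    # True: significance raises the probability
```

Under these (assumed) conditions a small p substantially raises the probability that the effect is real, while remaining far from a guarantee, which is the heuristic-cue reading the authors defend.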
NASA Astrophysics Data System (ADS)
Zhou, X.; Albertson, J. D.
2016-12-01
Natural gas is considered a bridge fuel towards clean energy due to its potentially lower greenhouse gas emissions compared with other fossil fuels. Despite numerous efforts, an efficient and cost-effective approach to monitor fugitive methane emissions along the natural gas production-supply chain has not yet been developed. Recently, mobile methane measurement has been introduced, which applies a Bayesian approach to probabilistically infer methane emission rates and update estimates recursively when new measurements become available. However, the likelihood function, especially the error term which determines the shape of the estimate uncertainty, is not rigorously defined and evaluated with field data. To address this issue, we performed a series of near-source (< 30 m) controlled methane release experiments using a specialized vehicle mounted with fast response methane analyzers and a GPS unit. Methane concentrations were measured at two different heights along mobile traversals downwind of the sources, and concurrent wind and temperature data were recorded by nearby 3-D sonic anemometers. With known methane release rates, the measurements were used to determine the functional form and the parameterization of the likelihood function in the Bayesian inference scheme under different meteorological conditions.
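The recursive Bayesian update scheme referred to here can be sketched on a discrete grid of candidate emission rates. The linear forward model c = k·q + noise and every number below are stand-ins; the paper's point is precisely that sigma, the likelihood error term, must be calibrated from field data under different meteorological conditions:

```python
import numpy as np

rng = np.random.default_rng(5)
q_grid = np.linspace(0.0, 10.0, 201)             # candidate emission rates
posterior = np.ones_like(q_grid) / len(q_grid)   # flat prior
q_true, k, sigma = 4.0, 1.3, 0.8                 # illustrative values only

for _ in range(25):                              # 25 mobile traversals
    c_obs = k * q_true + rng.normal(0.0, sigma)  # simulated concentration
    likelihood = np.exp(-0.5 * ((c_obs - k * q_grid) / sigma) ** 2)
    posterior *= likelihood                      # posterior becomes next prior
    posterior /= posterior.sum()

q_map = q_grid[np.argmax(posterior)]
print(abs(q_map - q_true) < 0.5)  # True: the estimate concentrates on q_true
```

Each new traversal multiplies the current belief by the likelihood and renormalizes, so the uncertainty shrinks as measurements accumulate, provided sigma is well calibrated.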
Metacognition and abstract reasoning.
Markovits, Henry; Thompson, Valerie A; Brisson, Janie
2015-05-01
The nature of people's meta-representations of deductive reasoning is critical to understanding how people control their own reasoning processes. We conducted two studies to examine whether people have a metacognitive representation of abstract validity and whether familiarity alone acts as a separate metacognitive cue. In Study 1, participants were asked to make a series of (1) abstract conditional inferences, (2) concrete conditional inferences with premises having many potential alternative antecedents and thus specifically conducive to the production of responses consistent with conditional logic, or (3) concrete problems with premises having relatively few potential alternative antecedents. Participants gave confidence ratings after each inference. Results show that confidence ratings were positively correlated with logical performance on abstract problems and concrete problems with many potential alternatives, but not with concrete problems with content less conducive to normative responses. Confidence ratings were higher with few alternatives than for abstract content. Study 2 used a generation of contrary-to-fact alternatives task to improve levels of abstract logical performance. The resulting increase in logical performance was mirrored by increases in mean confidence ratings. Results provide evidence for a metacognitive representation based on logical validity, and show that familiarity acts as a separate metacognitive cue.
Gene regulatory network inference using fused LASSO on multiple data sets
Omranian, Nooshin; Eloundou-Mbebi, Jeanne M. O.; Mueller-Roeber, Bernd; Nikoloski, Zoran
2016-01-01
Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed in a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions. PMID:26864687
NASA Astrophysics Data System (ADS)
Varotsos, Costas A.; Efstathiou, Maria N.
2017-05-01
A substantial weakness of several climate studies on long-range dependence is the conclusion of long-term memory of the climate conditions, without considering it necessary to establish the power-law scaling and to reject a simple exponential decay of the autocorrelation function. We herewith show one paradigmatic case, where a strong long-range dependence could be wrongly inferred from incomplete data analysis. We first apply the DFA method on the solar and volcanic forcing time series over the tropical Pacific, during the past 1000 years, and the results show a statistically significant straight-line fit to the fluctuation function in a log-log representation, with a slope higher than 0.5, which may wrongly be taken as an indication of persistent long-range correlations in the time series. We argue that long-range dependence cannot be concluded just from this straight-line fit; it requires two additional prerequisites to be fulfilled, i.e., rejecting the exponential decay of the autocorrelation function and establishing the power-law scaling. In fact, the investigation of the validity of these prerequisites showed that a DFA exponent higher than 0.5 does not justify the existence of persistent long-range correlations in the temporal evolution of the solar and volcanic forcing during the last millennium. In other words, we show that empirical analyses based on these two prerequisites must not be considered as a panacea for a direct proof of scaling, but only as evidence that the scaling hypothesis is plausible. We also discuss the scaling behaviour of solar and volcanic forcing data based on the Haar tool, which recently proved its ability to reliably detect the existence of the scaling effect in climate series.
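The fluctuation-function fit that the authors caution about can be sketched as follows: a bare-bones DFA with linear detrending, applied here to uncorrelated noise (the series and window scales are illustrative, not the forcing data). For white noise the log-log slope is near 0.5, and a slope above 0.5 on real data is, as the abstract stresses, only necessary rather than sufficient evidence of long-range dependence:

```python
import numpy as np

def dfa(x, scales):
    """Fluctuation function F(n) of detrended fluctuation analysis."""
    y = np.cumsum(x - np.mean(x))            # integrated profile
    F = []
    for n in scales:
        rms = []
        for i in range(len(y) // n):         # non-overlapping windows
            seg = y[i * n:(i + 1) * n]
            t = np.arange(n)
            coef = np.polyfit(t, seg, 1)     # linear detrend per window
            rms.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        F.append(np.sqrt(np.mean(rms)))
    return np.array(F)

rng = np.random.default_rng(0)
white = rng.standard_normal(4096)
scales = np.array([8, 16, 32, 64, 128])
F = dfa(white, scales)
alpha = np.polyfit(np.log(scales), np.log(F), 1)[0]  # log-log slope
print(0.3 < alpha < 0.7)  # True: ~0.5 for uncorrelated noise
```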
Linden, Ariel
2018-05-11
Interrupted time series analysis (ITSA) is an evaluation methodology in which a single treatment unit's outcome is studied serially over time and the intervention is expected to "interrupt" the level and/or trend of that outcome. ITSA is commonly evaluated using methods which may produce biased results if model assumptions are violated. In this paper, treatment effects are alternatively assessed by using forecasting methods to closely fit the preintervention observations and then forecast the post-intervention trend. A treatment effect may be inferred if the actual post-intervention observations diverge from the forecasts by some specified amount. The forecasting approach is demonstrated using the effect of California's Proposition 99 for reducing cigarette sales. Three forecast models are fit to the preintervention series: linear regression (REG), Holt-Winters (HW) non-seasonal smoothing, and autoregressive moving average (ARIMA). Forecasts are then generated into the post-intervention period, and the actual observations are compared with the forecasts to assess intervention effects. The preintervention data were fit best by HW, followed closely by ARIMA. REG fit the data poorly. The actual post-intervention observations were above the forecasts in HW and ARIMA, suggesting no intervention effect, but below the forecasts in REG (suggesting a treatment effect), thereby raising doubts about any definitive conclusion of a treatment effect. In a single-group ITSA, treatment effects are likely to be biased if the model is misspecified. Therefore, evaluators should consider using forecast models to accurately fit the preintervention data and generate plausible counterfactual forecasts, thereby improving causal inference of treatment effects in single-group ITSA studies. © 2018 John Wiley & Sons, Ltd.
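The forecast-as-counterfactual logic can be sketched with the simplest of the three models, a linear regression fit to the preintervention segment. The monthly series below is simulated with a deliberate level drop at the intervention (Proposition 99 data are not used):

```python
import numpy as np

# Hypothetical 36-month series with an intervention at month 24.
rng = np.random.default_rng(1)
t = np.arange(36)
y = 50.0 - 0.5 * t + rng.normal(0.0, 1.0, 36)
y[24:] -= 5.0                                    # post-intervention level drop

# REG counterfactual: fit the preintervention trend, forecast it forward.
pre_t, pre_y = t[:24], y[:24]
slope, intercept = np.polyfit(pre_t, pre_y, 1)
forecast = intercept + slope * t[24:]

# Infer an effect if observations diverge from the forecast by, say, 2 SDs.
resid_sd = np.std(pre_y - (intercept + slope * pre_t))
effect = np.mean(y[24:] - forecast)
print(effect < -2 * resid_sd)  # True: observations fall well below the forecast
```

HW or ARIMA would replace the `polyfit` step with a better-fitting preintervention model; the comparison of actual observations against the forecast band is the same.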
Stepwise inference of likely dynamic flux distributions from metabolic time series data.
Faraji, Mojdeh; Voit, Eberhard O
2017-07-15
Most metabolic pathways contain more reactions than metabolites and therefore have a wide stoichiometric matrix that corresponds to infinitely many possible flux distributions that are perfectly compatible with the dynamics of the metabolites in a given dataset. This under-determinedness poses a challenge for the quantitative characterization of flux distributions from time series data and thus for the design of adequate, predictive models. Here we propose a method that reduces the degrees of freedom in a stepwise manner and leads to a dynamic flux distribution that is, in a statistical sense, likely to be close to the true distribution. We applied the proposed method to the lignin biosynthesis pathway in switchgrass. The system consists of 16 metabolites and 23 enzymatic reactions. It has seven degrees of freedom and therefore admits a large space of dynamic flux distributions that all fit a set of metabolic time series data equally well. The proposed method reduces this space in a systematic and biologically reasonable manner and converges to a likely dynamic flux distribution in just a few iterations. The estimated solution and the true flux distribution, which is known in this case, show excellent agreement and thereby lend support to the method. The computational model was implemented in MATLAB (version R2014a, The MathWorks, Natick, MA). The source code is available at https://github.gatech.edu/VoitLab/Stepwise-Inference-of-Likely-Dynamic-Flux-Distributions and www.bst.bme.gatech.edu/research.php. Contact: mojdeh@gatech.edu or eberhard.voit@bme.gatech.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
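A toy version of the Hamiltonian Monte Carlo building block the paper extends, namely leapfrog integration followed by a Metropolis accept step, on a one-dimensional Gaussian target. The polymer reinterpretation and multiple time-scale integration of the paper are not reproduced; this only shows the base sampler:

```python
import numpy as np

rng = np.random.default_rng(7)

def neg_log_post(q):          # standard normal "posterior": U(q) = q^2 / 2
    return 0.5 * q * q

def grad(q):                  # dU/dq
    return q

def hmc_step(q, eps=0.2, n_leap=20):
    p = rng.standard_normal()                 # resample momentum
    q_new, p_new = q, p
    p_new -= 0.5 * eps * grad(q_new)          # leapfrog: initial half kick
    for _ in range(n_leap - 1):
        q_new += eps * p_new                  # drift
        p_new -= eps * grad(q_new)            # full kick
    q_new += eps * p_new
    p_new -= 0.5 * eps * grad(q_new)          # final half kick
    h0 = neg_log_post(q) + 0.5 * p * p        # Hamiltonian before/after
    h1 = neg_log_post(q_new) + 0.5 * p_new * p_new
    return q_new if rng.random() < np.exp(h0 - h1) else q  # Metropolis step

q, samples = 0.0, []
for _ in range(5000):
    q = hmc_step(q)
    samples.append(q)
s = np.array(samples[500:])                   # drop burn-in
print(abs(s.mean()) < 0.2 and 0.8 < s.std() < 1.2)  # True: matches N(0, 1)
```

The paper's contribution sits on top of this scheme: the target is a posterior over SDE paths and parameters, and the fast harmonic modes between measurement points are handled analytically or on a finer time scale.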
Recursive regularization for inferring gene networks from time-course gene expression profiles
Shimamura, Teppei; Imoto, Seiya; Yamaguchi, Rui; Fujita, André; Nagasaki, Masao; Miyano, Satoru
2009-01-01
Background Inferring gene networks from time-course microarray experiments with a vector autoregressive (VAR) model is the process of identifying functional associations between genes through multivariate time series. This problem can be cast as a variable selection problem in Statistics. One of the promising methods for variable selection is the elastic net proposed by Zou and Hastie (2005). However, VAR modeling with the elastic net increases the number of true positives, but it also increases the number of false positives. Results By incorporating relative importance of the VAR coefficients into the elastic net, we propose a new class of regularization, called recursive elastic net, to increase the capability of the elastic net and estimate gene networks based on the VAR model. The recursive elastic net can reduce the number of false positives gradually by updating the importance. Numerical simulations and comparisons demonstrate that the proposed method succeeds in reducing the number of false positives drastically while keeping the high number of true positives in the network inference, and achieves a true discovery rate (the proportion of true positives among the selected edges) two or more times higher than those of the competing methods even when the number of time points is small. We also compared our method with various reverse-engineering algorithms on experimental data of MCF-7 breast cancer cells stimulated with two ErbB ligands, EGF and HRG. Conclusion The recursive elastic net is a powerful tool for inferring gene networks from time-course gene expression profiles. PMID:19386091
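As a baseline for what the recursive elastic net regularizes, a VAR(1) model can be estimated by ordinary least squares: each expression vector is regressed on the previous time point, and nonzero coefficients are read as candidate regulatory links. The coefficient matrix below is invented, and the elastic-net shrinkage itself is not implemented; this only shows the VAR estimation step:

```python
import numpy as np

# Sparse "gene network" coefficients for a 3-gene VAR(1) system.
A_true = np.array([[0.8, 0.0, 0.0],
                   [0.5, 0.6, 0.0],
                   [0.0, 0.4, 0.7]])
rng = np.random.default_rng(2)
T, p = 200, 3
X = np.zeros((T, p))
for t in range(1, T):
    X[t] = X[t - 1] @ A_true.T + rng.normal(0.0, 0.1, p)  # simulate series

# VAR(1): regress X[t] on X[t-1]. An (recursive) elastic-net penalty
# would replace this unregularized lstsq solve.
B, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
A_hat = B.T
print(np.allclose(A_hat, A_true, atol=0.2))  # True: coefficients recovered
```

With short time series (small T) and many genes (large p), the unpenalized solve above becomes unstable or under-determined, which is exactly the regime where the elastic-net regularization and its recursive reweighting matter.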
Functional imaging with low-resolution brain electromagnetic tomography (LORETA): a review.
Pascual-Marqui, R D; Esslen, M; Kochi, K; Lehmann, D
2002-01-01
This paper reviews several recent publications that have successfully used the functional brain imaging method known as LORETA. Emphasis is placed on the electrophysiological and neuroanatomical basis of the method, on the localization properties of the method, and on the validation of the method in real experimental human data. Papers that criticize LORETA are briefly discussed. LORETA publications in the 1994-1997 period based localization inference on images of raw electric neuronal activity. In 1998, a series of papers appeared that based localization inference on the statistical parametric mapping methodology applied to high-time resolution LORETA images. Starting in 1999, quantitative neuroanatomy was added to the methodology, based on the digitized Talairach atlas provided by the Brain Imaging Centre, Montreal Neurological Institute. The combination of these methodological developments has placed LORETA at a level that compares favorably to the more classical functional imaging methods, such as PET and fMRI.
Wang, Shijun; Liu, Peter; Turkbey, Baris; Choyke, Peter; Pinto, Peter; Summers, Ronald M
2012-01-01
In this paper, we propose a new pharmacokinetic model for parameter estimation of dynamic contrast-enhanced (DCE) MRI by using Gaussian process inference. Our model is based on the Tofts dual-compartment model for the description of tracer kinetics and the observed time series from DCE-MRI is treated as a Gaussian stochastic process. The parameter estimation is done through a maximum likelihood approach and we propose a variant of the coordinate descent method to solve this likelihood maximization problem. The new model was shown to outperform a baseline method on simulated data. Parametric maps generated on prostate DCE data with the new model also provided better enhancement of tumors, lower intensity on false positives, and better boundary delineation when compared with the baseline method. New statistical parameter maps from the process model were also found to be informative, particularly when paired with the PK parameter maps.
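The Tofts forward model underlying such PK parameter estimation expresses the tissue concentration curve as a convolution of the arterial input function with an exponential kernel governed by Ktrans and ve. A discrete sketch follows; the input function and all parameter values are illustrative, and the Gaussian-process likelihood of the paper is not reproduced:

```python
import numpy as np

# Discrete Tofts model: Ct(t) = Ktrans * int_0^t Cp(tau) exp(-(Ktrans/ve)(t-tau)) dtau
dt = 0.1                             # time step, minutes
t = np.arange(0.0, 5.0, dt)
Cp = 5.0 * t * np.exp(-t / 0.5)      # toy arterial input function (gamma-like)
Ktrans, ve = 0.25, 0.4               # transfer constant (1/min), EES fraction

kernel = np.exp(-(Ktrans / ve) * t)
Ct = Ktrans * np.convolve(Cp, kernel)[:len(t)] * dt  # discrete convolution
print(len(Ct) == len(t) and Ct.max() > 0.0)  # True
```

Fitting Ktrans and ve to an observed Ct is the inverse problem; the paper's contribution is treating the observed series as a Gaussian process and maximizing the resulting likelihood by coordinate descent.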
Campbell, Kieran R.
2016-01-01
Single cell gene expression profiling can be used to quantify transcriptional dynamics in temporal processes, such as cell differentiation, using computational methods to label each cell with a ‘pseudotime’ where true time series experimentation is too difficult to perform. However, owing to the high variability in gene expression between individual cells, there is an inherent uncertainty in the precise temporal ordering of the cells. Pre-existing methods for pseudotime estimation have predominantly given point estimates precluding a rigorous analysis of the implications of uncertainty. We use probabilistic modelling techniques to quantify pseudotime uncertainty and propagate this into downstream differential expression analysis. We demonstrate that reliance on a point estimate of pseudotime can lead to inflated false discovery rates and that probabilistic approaches provide greater robustness and measures of the temporal resolution that can be obtained from pseudotime inference. PMID:27870852
Geeraert, Nicolas; Yzerbyt, Vincent Y
2007-06-01
Although social observers have been found to rely heavily on dispositions in their causal analysis, it has been proposed that culture strongly affects this tendency. Recent research has shown that suppressing dispositional inferences during social judgment can lead to a dispositional rebound, that is, relying more on dispositional information in subsequent judgments. In the present research, we investigated whether culture also affects this rebound tendency. First, Thai and Belgian participants took part in a typical attitude attribution paradigm. Next, dispositional rebound was assessed by having participants describe a series of pictures. The dispositional rebound occurred for both Belgian and Thai participants when confronted with a forced target, but disappeared for Thai participants when the situational constraints of the target were made salient. The findings are discussed in light of the current cultural models of attribution theory.
Cloern, James E.; Jassby, Alan D.; Carstensen, Jacob; Bennett, William A.; Kimmerer, Wim; Mac Nally, Ralph; Schoellhamer, David H.; Winder, Monika
2012-01-01
We comment on a nonstandard statistical treatment of time-series data first published by Breton et al. (2006) in Limnology and Oceanography and, more recently, used by Glibert (2010) in Reviews in Fisheries Science. In both papers, the authors make strong inferences about the underlying causes of population variability based on correlations between cumulative sum (CUSUM) transformations of organism abundances and environmental variables. Breton et al. (2006) reported correlations between CUSUM-transformed values of diatom biomass in Belgian coastal waters and the North Atlantic Oscillation, and between meteorological and hydrological variables. Each correlation of CUSUM-transformed variables was judged to be statistically significant. On the basis of these correlations, Breton et al. (2006) developed "the first evidence of synergy between climate and human-induced river-based nitrate inputs with respect to their effects on the magnitude of spring Phaeocystis colony blooms and their dominance over diatoms."
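The CUSUM transform at the center of this critique is simple to state; a minimal pure-Python sketch (illustrative only, not the code of Breton et al. or Glibert) is:

```python
def cusum(series):
    """Cumulative sum of deviations from the series mean (CUSUM transform)."""
    mean = sum(series) / len(series)
    out, total = [], 0.0
    for x in series:
        total += x - mean
        out.append(total)
    return out
```

Because cumulative summation heavily smooths a series and inflates its serial correlation, two unrelated CUSUM-transformed series can show strikingly high, spurious correlations, which is the statistical concern raised here.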
Field dynamics inference via spectral density estimation
NASA Astrophysics Data System (ADS)
Frank, Philipp; Steininger, Theo; Enßlin, Torsten A.
2017-11-01
Stochastic differential equations are of utmost importance in various scientific and industrial areas. They are the natural description of dynamical processes whose precise equations of motion are either not known or too expensive to solve, e.g., when modeling Brownian motion. In some cases, the equations governing the dynamics of a physical system on macroscopic scales turn out to be unknown, since they typically cannot be deduced from general principles. In this work, we describe how the underlying laws of a stochastic process can be approximated by the spectral density of the corresponding process. Furthermore, we show how the density can be inferred from possibly very noisy and incomplete measurements of the dynamical field. Generally, inverse problems like these can be tackled with the help of Information Field Theory. For now, we restrict ourselves to linear and autonomous processes. To demonstrate its applicability, we employ our reconstruction algorithm on time series and spatiotemporal processes.
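A naive spectral density estimate from a sampled time series can be sketched in a few lines (a pure-Python periodogram; a didactic stand-in for the quantity being inferred, not the Information Field Theory reconstruction described above):

```python
import cmath

def periodogram(x):
    """Naive periodogram: squared DFT magnitude normalised by the series
    length; a basic estimator of a stationary process's spectral density."""
    n = len(x)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        spec.append(abs(s) ** 2 / n)
    return spec
```

Real reconstructions must additionally handle noise and missing data, which is where the probabilistic machinery above comes in.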
1981-08-01
RATIO TEST STATISTIC FOR SPHERICITY OF COMPLEX MULTIVARIATE NORMAL DISTRIBUTION. Fang, C.; Krishnaiah, P. R.; Nagarsenker, B. N. August 1981. Technical ... and their applications in time series, the reader is referred to Krishnaiah (1976). Motivated by the applications in the area of inference on multiple ... for practical purposes. Here, we note that Krishnaiah, Lee and Chang (1976) approximated the null distribution of a certain power of the likelihood ...
Inference of vessel intent and behaviour for maritime security operations
NASA Astrophysics Data System (ADS)
van den Broek, Bert; Smith, Arthur; den Breejen, Eric; van de Voorde, Imelda
2014-10-01
Coastguard and Navy assets are increasingly involved in Maritime Security Operations (MSO) for countering piracy, weapons and drugs smuggling, terrorism and illegal trafficking. Persistent tracking of vessels in interrupted time series over long distances and the modelling of intent and behaviour from multiple data sources are key enablers for Situation Assessment in MSO. Results of situation assessment are presented for AIS/VTS observations in the Dutch North Sea and for simulated scenarios in the Gulf of Oman.
Daily water level forecasting using wavelet decomposition and artificial intelligence techniques
NASA Astrophysics Data System (ADS)
Seo, Youngmin; Kim, Sungwon; Kisi, Ozgur; Singh, Vijay P.
2015-01-01
Reliable water level forecasting for reservoir inflow is essential for reservoir operation. The objective of this paper is to develop and apply two hybrid models for daily water level forecasting and investigate their accuracy. These two hybrid models are the wavelet-based artificial neural network (WANN) and the wavelet-based adaptive neuro-fuzzy inference system (WANFIS). Wavelet decomposition is employed to decompose an input time series into approximation and detail components. The decomposed time series are used as inputs to artificial neural networks (ANN) and an adaptive neuro-fuzzy inference system (ANFIS) for the WANN and WANFIS models, respectively. Based on statistical performance indexes, the WANN and WANFIS models are found to be more efficient than the ANN and ANFIS models, with WANFIS7-sym10 yielding the best performance among all models. It is found that wavelet decomposition improves the accuracy of ANN and ANFIS. This study evaluates the accuracy of the WANN and WANFIS models for different mother wavelets, including Daubechies, Symmlet and Coiflet wavelets. It is found that model performance depends on the input sets and mother wavelets, and that wavelet decomposition using the mother wavelet db10 can further improve the efficiency of the ANN and ANFIS models. Results obtained from this study indicate that the conjunction of wavelet decomposition and artificial intelligence models can be a useful tool for accurately forecasting daily water levels and can yield better efficiency than conventional forecasting models.
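The decomposition step can be illustrated with the simplest wavelet, the Haar wavelet (the paper itself uses Daubechies, Symmlet and Coiflet families; this one-level sketch only shows how approximation and detail inputs arise):

```python
def haar_dwt(x):
    """One level of the Haar discrete wavelet transform: splits an
    even-length series into approximation and detail components, the
    kind of inputs fed to the WANN/WANFIS models."""
    approx = [(x[i] + x[i + 1]) / 2 ** 0.5 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 ** 0.5 for i in range(0, len(x), 2)]
    return approx, detail
```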
Observations of the marine environment from spaceborne side-looking real aperture radars
NASA Technical Reports Server (NTRS)
Kalmykov, A. I.; Velichko, S. A.; Tsymbal, V. N.; Kuleshov, Yu. A.; Weinman, J. A.; Jurkevich, I.
1993-01-01
Real aperture, side-looking X-band radars have been operated from the Soviet Cosmos-1500, -1602, -1766 and Ocean satellites since 1984. Wind velocities were inferred from sea surface radar scattering for speeds ranging from approximately 2 m/s to those of hurricane proportions. The wind speeds were within 10-20 percent of the measured in situ values, and the direction of the wind velocity agreed with in situ direction measurements within 20-50 deg. Various atmospheric mesoscale eddies and tropical cyclones were thus located, and their strengths were inferred from sea surface reflectivity measurements. Rain cells were observed over both land and sea with these spaceborne radars. Algorithms to retrieve rainfall rates from spaceborne radar measurements were also developed. Spaceborne radars have been used to monitor various marine hazards. For example, information derived from these radars was used to plan rescue operations for distressed ships trapped in sea ice. Icebergs have also been monitored, and oil spills were mapped. Tsunamis produced by underwater earthquakes were also observed from space by the radars on the Cosmos-1500 series of satellites. The Cosmos-1500 satellite series has provided all-weather radar imagery of the earth's surface to a user community in real time by means of a 137.4 MHz Automatic Picture Transmission channel. This feature enabled the radar information to be used in direct support of Soviet polar maritime activities.
Fisher information framework for time series modeling
NASA Astrophysics Data System (ADS)
Venkatesan, R. C.; Plastino, A.
2017-08-01
A robust prediction model invoking the Takens embedding theorem, whose working hypothesis is obtained via an inference procedure based on the minimum Fisher information principle, is presented. The coefficients of the ansatz, central to the working hypothesis, satisfy a time-independent Schrödinger-like equation in a vector setting. The inference of (i) the probability density function of the coefficients of the working hypothesis and (ii) the constraint-driven pseudo-inverse condition for the modeling phase of the prediction scheme is made, for the case of normal distributions, with the aid of the quantum mechanical virial theorem. The well-known reciprocity relations and the associated Legendre transform structure for the Fisher information measure (FIM, hereafter)-based model in a vector setting (with least-squares constraints) are self-consistently derived. These relations are demonstrated to yield an intriguing form of the FIM for the modeling phase, which defines the working hypothesis solely in terms of the observed data. Prediction cases employ time series obtained from (i) the Mackey-Glass delay-differential equation, (ii) an ECG signal from the MIT-Beth Israel Deaconess Hospital (MIT-BIH) cardiac arrhythmia database, and (iii) an ECG signal from the Creighton University ventricular tachyarrhythmia database; the ECG samples were obtained from the PhysioNet online repository. These examples demonstrate the efficiency of the prediction model. Numerical examples for exemplary cases are provided.
Revisiting a possible relationship between solar activity and Earth rotation variability
NASA Astrophysics Data System (ADS)
Abarca del Rio, R.; Gambis, D.
2011-10-01
A variety of studies have sought to establish a possible relationship between solar activity and Earth rotation variations (Danjon, 1958-1962; Challinor, 1971; Currie, 1980; Gambis, 1990). We revisit previous studies (Bourget et al., 1992; Abarca del Rio et al., 2003; Marris et al., 2004) concerning the possible relationship between solar activity variability and length of day (LOD) variations at decadal time scales. Assuming that changes in AAM for the entire atmosphere are accompanied by equal, but opposite, changes in the angular momentum of the earth, it is possible to infer changes in LOD from global AAM time series through the relation delta(LOD) = 1.68 × 10^-29 delta(AAM), with delta(LOD) in seconds and delta(AAM) in kg m^2/s (Rosen and Salstein, 1983). Given the close relationship at seasonal to interannual time scales between LOD and the atmospheric angular momentum (AAM) (see Abarca del Rio et al., 2003), it is possible to infer from century-long atmospheric simulations what the associated LOD variability may have been throughout the last century. In the absence of a homogeneous century-long LOD time series, we take advantage of the recent atmospheric reanalyses extending back to 1871 (Compo, Whitaker and Sardeshmukh, 2006). The atmospheric data (winds) of these reanalyses allow computing AAM up to the top of the atmosphere, though here only troposphere data (up to 100 hPa) were taken into account.
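The Rosen and Salstein proportionality can be rederived from conservation of angular momentum; in the sketch below, the rotation rate and the moment of inertia of the solid Earth are standard approximate values assumed for illustration, and the resulting factor is about 1.68 × 10^-29 seconds of LOD per kg m^2/s of AAM:

```python
import math

OMEGA = 7.292115e-5   # Earth's mean rotation rate, rad/s (standard value)
C_MANTLE = 7.04e37    # axial moment of inertia of the solid Earth, kg m^2 (approximate)

def delta_lod_seconds(delta_aam):
    """Change in length of day (seconds) implied by a change in atmospheric
    angular momentum (kg m^2/s).  From C * d(omega) = -d(AAM) and
    LOD = 2*pi/omega, so d(LOD) = (2*pi/omega^2) * d(AAM) / C."""
    return (2 * math.pi / OMEGA ** 2) * delta_aam / C_MANTLE
```

An AAM change of 10^25 kg m^2/s, a typical interannual fluctuation, then maps to roughly 0.17 ms of LOD.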
Including trait-based early warning signals helps predict population collapse
Clements, Christopher F.; Ozgul, Arpat
2016-01-01
Foreseeing population collapse is an on-going target in ecology, and this has led to the development of early warning signals based on expected changes in leading indicators before a bifurcation. Such signals have been sought in abundance time-series data on a population of interest, with varying degrees of success. Here we move beyond these established methods by including parallel time-series data of abundance and fitness-related trait dynamics. Using data from a microcosm experiment, we show that including information on the dynamics of phenotypic traits such as body size in composite early warning indices can produce more accurate inferences of whether a population is approaching a critical transition than using abundance time series alone. By including fitness-related trait information alongside traditional abundance-based early warning signals in a single metric of risk, our generalizable approach provides a powerful new way to assess which populations may be on the verge of collapse. PMID:27009968
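Two classic abundance-based leading indicators, rolling variance and lag-1 autocorrelation, can be computed with a short sketch (illustrative only; the study's composite indices additionally fold in trait time series such as body size, which could be processed the same way and combined after standardization):

```python
def rolling_stats(series, window):
    """Rolling variance and lag-1 autocorrelation, two abundance-based
    early warning indicators; both are expected to rise as a population
    approaches a critical transition."""
    var, ac1 = [], []
    for i in range(len(series) - window + 1):
        w = series[i:i + window]
        m = sum(w) / window
        v = sum((x - m) ** 2 for x in w) / window
        var.append(v)
        num = sum((w[t] - m) * (w[t + 1] - m) for t in range(window - 1))
        ac1.append(num / (v * window) if v > 0 else 0.0)
    return var, ac1
```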
NASA Astrophysics Data System (ADS)
Hopcroft, Peter O.; Gallagher, Kerry; Pain, Christopher C.
2009-08-01
Collections of suitably chosen borehole profiles can be used to infer large-scale trends in ground-surface temperature (GST) histories for the past few hundred years. These reconstructions are based on a large database of carefully selected borehole temperature measurements from around the globe. Since non-climatic thermal influences are difficult to identify, representative temperature histories are derived by averaging individual reconstructions to minimize the influence of these perturbing factors. This may lead to three potentially important drawbacks: the net signal of non-climatic factors may not be zero, meaning that the average does not reflect the best estimate of past climate; the averaging over large areas restricts the useful amount of more local climate change information available; and the inversion methods used to reconstruct the past temperatures at each site must be mathematically identical and are therefore not necessarily best suited to all data sets. In this work, we avoid these issues by using a Bayesian partition model (BPM), which is computed using a trans-dimensional form of a Markov chain Monte Carlo algorithm. This then allows the number and spatial distribution of different GST histories to be inferred from a given set of borehole data by partitioning the geographical area into discrete partitions. Profiles that are heavily influenced by non-climatic factors will be partitioned separately. Conversely, profiles with climatic information, which is consistent with neighbouring profiles, will then be inferred to lie in the same partition. The geographical extent of these partitions then leads to information on the regional extent of the climatic signal. In this study, three case studies are described using synthetic and real data. The first demonstrates that the Bayesian partition model method is able to correctly partition a suite of synthetic profiles according to the inferred GST history. 
In the second, more realistic case, a series of temperature profiles are calculated using surface air temperatures of a global climate model simulation. In the final case, 23 real boreholes from the United Kingdom, previously used for climatic reconstructions, are examined and the results compared with a local instrumental temperature series and the previous estimate derived from the same borehole data. The results indicate that the majority (17) of the 23 boreholes are unsuitable for climatic reconstruction purposes, at least without including other thermal processes in the forward model.
NASA Technical Reports Server (NTRS)
Molnar, Gyula I.; Susskind, Joel; Iredell, Lena
2011-01-01
In the beginning, a good measure of a GCM's performance was its ability to simulate the observed mean seasonal cycle. That is, a reasonable simulation of the means (i.e., small biases) and standard deviations of TODAY'S climate would suffice. Here, we argue that coupled GCM (CGCM for short) simulations of FUTURE climates should be evaluated in much more detail, both spatially and temporally. Arguably, it is not the bias, but rather the reliability of the model-generated anomaly time series, even down to the [C]GCM grid scale, that really matters. This statement is underlined by the societal need to address potential REGIONAL climate variability and climate drifts/changes in a manner suitable for policy decisions.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
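The method of least squares described above reduces to two closed-form expressions for the slope and intercept; a minimal sketch:

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit of y = a + b*x: the intercept and slope
    minimising the sum of squared residuals.
    b = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2); a = mean_y - b*mean_x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b
```

For example, fitting the points (0, 1), (1, 3), (2, 5) recovers intercept 1 and slope 2.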
Inferring phase equations from multivariate time series.
Tokuda, Isao T; Jain, Swati; Kiss, István Z; Hudson, John L
2007-08-10
An approach is presented for extracting phase equations from multivariate time series data recorded from a network of weakly coupled limit cycle oscillators. Our aim is to estimate important properties of the phase equations including natural frequencies and interaction functions between the oscillators. Our approach requires the measurement of an experimental observable of the oscillators; in contrast with previous methods it does not require measurements in isolated single or two-oscillator setups. This noninvasive technique can be advantageous in biological systems, where extraction of few oscillators may be a difficult task. The method is most efficient when data are taken from the nonsynchronized regime. Applicability to experimental systems is demonstrated by using a network of electrochemical oscillators; the obtained phase model is utilized to predict the synchronization diagram of the system.
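One ingredient of such phase-model fitting, extracting a phase and natural frequency from a measured observable, can be illustrated with a two-dimensional embedding (a crude sketch, not the authors' estimation procedure; the embedding coordinates here are assumed given):

```python
import math

def natural_frequency(x, y, dt):
    """Crude natural-frequency estimate from a 2-D embedding (x(t), y(t))
    of an oscillator's observable: take the protophase atan2(y, x),
    unwrap it, and average its rate of increase."""
    phi = [math.atan2(b, a) for a, b in zip(x, y)]
    total = 0.0
    for p0, p1 in zip(phi, phi[1:]):
        d = p1 - p0
        # unwrap jumps across the -pi/pi boundary
        while d > math.pi:
            d -= 2 * math.pi
        while d < -math.pi:
            d += 2 * math.pi
        total += d
    return total / ((len(phi) - 1) * dt)
```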
Pollution and Climate Effects on Tree-Ring Nitrogen Isotopes
NASA Astrophysics Data System (ADS)
Savard, M. M.; Bégin, C.; Marion, J.; Smirnoff, A.
2009-04-01
BACKGROUND Monitoring of nitrous oxide concentration only started during the last 30 years in North America, but anthropogenic atmospheric nitrogen has been emitted in significant quantities over the last 150 years. Can geochemical characteristics of tree rings be used to infer past changes in the nitrogen cycle of temperate regions? To address this question we use nitrogen stable isotopes in 125-year-long ring series from beech specimens (Fagus grandifolia) of the Georgian Bay Islands National Park (eastern Ontario), and pine (Pinus strobus) and beech trees of the Morgan Arboretum near Montreal (western Quebec). To evaluate the reliability of the N stable isotopes in wood treated for removal of soluble materials, we tested both tree species from the Montreal area. The reproducibility from tree to tree was excellent for both pine and beech trees, the isotopic trends were strongly concordant, and they were not influenced by the heartwood-sapwood transition zone. The coherence of the isotopic changes observed for the two species suggests that their tree-ring N isotopic values can serve as an environmental indicator. RESULTS AND INTERPRETATION In Montreal and Georgian Bay, the N isotopes show strong and similar parallel agreement (Gleichläufigkeit test) with the climatic parameters: the short-term isotopic fluctuations correlate directly with summer precipitation and inversely with summer and spring temperature. A long-term decreasing isotope trend in Montreal indicates progressive changes in soil chemistry after 1951. A pedochemical change is also inferred for the Georgian Bay site on the basis of a positive N isotopic trend initiated after 1971. At both sites, the long-term ^15N series correlate with a proxy for NOx emissions (Pearson correlation), and carbon-isotope ring series suggest that the same trees have been impacted by phytotoxic pollutants (Savard et al., 2009a).
We propose that the contrasted long-term nitrogen-isotope changes of Montreal and Georgian Bay reflect deposition of NOx emissions from cars and coal-power plants, with higher proportions from coal burning in Georgian Bay (Savard et al., 2009b). This interpretation is plausible because recent monitoring indicates that coal-power plant NOx emissions play an important role in the annual N budget in Ontario, whereas they are negligible on the Quebec side. CONCLUSION Interpretations of long tree-ring N isotopic series in terms of effects generated by airborne N-species have been previously advocated. Here we further propose that the contrasted isotopic trends obtained for wood samples from two regions reflect different regional anthropogenic N deposition combined with variations in climatic conditions. This research suggests that nitrogen tree-ring series may record both regional climatic conditions and anthropogenic perturbations of the N cycle. REFERENCES Savard, M.M., Bégin, C., Marion, J., Aznar, J.-C., Smirnoff, A., 2009a. Changes of air quality in an urban region as inferred from tree-ring width and stable isotopes. Chapter 9 in "Relating Atmospheric Source Apportionment to Vegetation Effects: Establishing Cause Effect Relationships" (A. Legge, ed.). Elsevier, Amsterdam; doi: 10.1016/S1474-8177(08)00209x. Savard, M.M., Bégin, C., Smirnoff, A., Marion, J., Rioux-Paquette, E., 2009b. Tree-ring nitrogen isotopes reflect climatic effects and anthropogenic NOx emissions. Environ. Sci. Technol. (doi: 10.1021/es802437k).
NASA Astrophysics Data System (ADS)
Wright, Ashley J.; Walker, Jeffrey P.; Pauwels, Valentijn R. N.
2017-08-01
Floods are devastating natural hazards. To provide accurate, precise, and timely flood forecasts, there is a need to understand the uncertainties associated within an entire rainfall time series, even when rainfall was not observed. The estimation of an entire rainfall time series and model parameter distributions from streamflow observations in complex dynamic catchments adds skill to current areal rainfall estimation methods, allows for the uncertainty of entire rainfall input time series to be considered when estimating model parameters, and provides the ability to improve rainfall estimates from poorly gauged catchments. Current methods to estimate entire rainfall time series from streamflow records are unable to adequately invert complex nonlinear hydrologic systems. This study aims to explore the use of wavelets in the estimation of rainfall time series from streamflow records. Using the Discrete Wavelet Transform (DWT) to reduce rainfall dimensionality for the catchment of Warwick, Queensland, Australia, it is shown that model parameter distributions and an entire rainfall time series can be estimated. Including rainfall in the estimation process improves streamflow simulations by a factor of up to 1.78. This is achieved while estimating an entire rainfall time series, inclusive of days when none was observed. It is shown that the choice of wavelet can have a considerable impact on the robustness of the inversion. Combining the use of a likelihood function that considers rainfall and streamflow errors with the use of the DWT as a model data reduction technique allows the joint inference of hydrologic model parameters along with rainfall.
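The role of the DWT as a dimensionality-reduction device can be illustrated with a one-level Haar transform that keeps only the largest detail coefficients (a toy sketch; the study's wavelet choices and inversion machinery are far richer):

```python
def haar_compress(x, keep):
    """Sketch of wavelet-based dimensionality reduction: one-level Haar
    transform of an even-length series, zeroing all but the `keep`
    largest-magnitude detail coefficients before inverting, so the
    series is represented by fewer numbers."""
    a = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    d = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    thresh = sorted((abs(v) for v in d), reverse=True)[keep - 1] if keep else float('inf')
    d = [v if abs(v) >= thresh else 0.0 for v in d]
    out = []
    for ai, di in zip(a, d):
        out.extend([ai + di, ai - di])  # invert the transform
    return out
```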
1987-06-01
number of series among the 63 which were identified as a particular ARIMA form and were "best" modeled by a particular technique. Figure 1 illustrates a ... th time from xe's. The integrated autoregressive-moving average model, denoted by ARIMA(p,d,q), is a result of combining a d-th differencing process ... Experiments, (4) Data Analysis and Modeling, (5) Theory and Probabilistic Inference, (6) Fuzzy Statistics, (7) Forecasting and Prediction, (8) Small Sample
Data Sources for Biosurveillance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walters, Ronald A.; Harlan, MS, Pete A.; Nelson, Noele P.
2010-03-01
Biosurveillance analyzes timely data, often fusing time series of many different types of data, to infer the status of public health rather than solely exploiting data having diagnostic specificity. Integrated biosurveillance requires a synthesis of analytic approaches derived from the natural disaster, public health, medical, meteorological, and social science communities, among others, and it is the cornerstone of early disease detection. This paper summarizes major systems dedicated to such an endeavor and emphasizes system capabilities that if creatively exploited could contribute to creation of an effective global biosurveillance enterprise.
Measuring attitude with a gradiometer
NASA Technical Reports Server (NTRS)
Sonnabend, David; Born, George H.
1994-01-01
Static attitude estimation and dynamic attitude estimation are used to describe a gradiometer composed of a number of accelerometers that measure a combination of the local gravity gradient and instrument rotation effects. After a series of measures to isolate the gradient, a global mesh of measurements can be obtained that determines the planetary external gravity potential. Orbital and spacecraft models are developed to determine whether, when the gravity potential is known, the same measurements, unsupported by any other information, can be used to infer the spacecraft attitude.
Horizon sensor errors calculated by computer models compared with errors measured in orbit
NASA Technical Reports Server (NTRS)
Ward, K. A.; Hogan, R.; Andary, J.
1982-01-01
Using a computer program to model the earth's horizon and to duplicate the signal processing procedure employed by the ESA (Earth Sensor Assembly), errors due to radiance variation have been computed for a particular time of the year. Errors actually occurring in flight at the same time of year are inferred from integrated rate gyro data for a satellite of the TIROS series of NASA weather satellites (NOAA-A). The predicted performance is compared with actual flight history.
Error analysis of Dobson spectrophotometer measurements of the total ozone content
NASA Technical Reports Server (NTRS)
Holland, A. C.; Thomas, R. W. L.
1975-01-01
A study of techniques for measuring atmospheric ozone is reported. This study represents the second phase of a program designed to improve techniques for the measurement of atmospheric ozone. This phase of the program studied the sensitivity of Dobson direct-sun measurements, and the ozone amounts inferred from those measurements, to variation in the atmospheric temperature profile. The study used the plane-parallel Monte Carlo model developed and tested under the initial phase of this program, and a series of standard model atmospheres.
Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals
Fourment, Mathieu; Claywell, Brian C; Dinh, Vu; McCoy, Connor; Matsen IV, Frederick A; Darling, Aaron E
2018-01-01
Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop “guided” proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy. PMID:29186587
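The generic SMC update underlying OPSMC (propose, reweight by the new data's likelihood, resample) can be sketched abstractly; particle states, the proposal, and the likelihood below are opaque placeholders, not the phylogenetic machinery of the paper:

```python
import math
import random

def smc_step(particles, weights, log_likelihood, proposal, rng=None):
    """One generic sequential Monte Carlo update: move each particle with
    the proposal, reweight by the likelihood of newly arrived data, then
    resample proportionally to weight (multinomial resampling)."""
    rng = rng or random.Random(0)
    moved = [proposal(p, rng) for p in particles]
    w = [wt * math.exp(log_likelihood(p)) for p, wt in zip(moved, weights)]
    total = sum(w)
    w = [x / total for x in w]
    resampled = rng.choices(moved, weights=w, k=len(moved))
    return resampled, [1.0 / len(moved)] * len(moved)
```

The "guided" proposals of the paper bias `proposal` toward the posterior, which is exactly the part this sketch leaves abstract.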
Hafemeister, Christoph; Nicotra, Adrienne B.; Jagadish, S.V. Krishna; Bonneau, Richard; Purugganan, Michael
2016-01-01
Environmental gene regulatory influence networks (EGRINs) coordinate the timing and rate of gene expression in response to environmental signals. EGRINs encompass many layers of regulation, which culminate in changes in accumulated transcript levels. Here, we inferred EGRINs for the response of five tropical Asian rice (Oryza sativa) cultivars to high temperatures, water deficit, and agricultural field conditions by systematically integrating time-series transcriptome data, patterns of nucleosome-free chromatin, and the occurrence of known cis-regulatory elements. First, we identified 5447 putative target genes for 445 transcription factors (TFs) by connecting TFs with genes harboring known cis-regulatory motifs in nucleosome-free regions proximal to their transcriptional start sites. We then used network component analysis to estimate the regulatory activity for each TF based on the expression of its putative target genes. Finally, we inferred an EGRIN using the estimated transcription factor activity (TFA) as the regulator. The EGRINs include regulatory interactions between 4052 target genes regulated by 113 TFs. We resolved distinct regulatory roles for members of the heat shock factor family, including a putative regulatory connection between abiotic stress and the circadian clock. TFA estimation using network component analysis is an effective way of incorporating multiple genome-scale measurements into network inference. PMID:27655842
Value-Based Standards Guide Sexism Inferences for Self and Others.
Mitamura, Chelsea; Erickson, Lynnsey; Devine, Patricia G
2017-09-01
People often disagree about what constitutes sexism, and these disagreements can be both socially and legally consequential. It is unclear, however, why or how people come to different conclusions about whether something or someone is sexist. Previous research on judgments about sexism has focused on the perceiver's gender and attitudes, but neither of these variables identifies comparative standards that people use to determine whether any given behavior (or person) is sexist. Extending Devine and colleagues' values framework (Devine, Monteith, Zuwerink, & Elliot, 1991; Plant & Devine, 1998), we argue that, when evaluating others' behavior, perceivers rely on the morally-prescriptive values that guide their own behavior toward women. In a series of 3 studies we demonstrate that (1) people's personal standards for sexism in their own and others' behavior are each related to their values regarding sexism, (2) these values predict how much behavioral evidence people need to infer sexism, and (3) people with stringent, but not lenient, value-based standards get angry and try to regulate a sexist perpetrator's behavior to reduce sexism. Furthermore, these personal values are related to all outcomes in the present work above and beyond other person characteristics previously used to predict sexism inferences. We discuss the implications of differing value-based standards for explaining and reconciling disputes over what constitutes sexist behavior.
PREFACE: ELC International Meeting on Inference, Computation, and Spin Glasses (ICSG2013)
NASA Astrophysics Data System (ADS)
Kabashima, Yoshiyuki; Hukushima, Koji; Inoue, Jun-ichi; Tanaka, Toshiyuki; Watanabe, Osamu
2013-12-01
The close relationship between probability-based inference and statistical mechanics of disordered systems has been noted for some time. This relationship has provided researchers with a theoretical foundation in various fields of information processing for analytical performance evaluation and construction of efficient algorithms based on message-passing or Monte Carlo sampling schemes. The ELC International Meeting on 'Inference, Computation, and Spin Glasses (ICSG2013)' was held in Sapporo, 28-30 July 2013. The meeting was organized as a satellite meeting of STATPHYS25 in order to offer a forum where concerned researchers can assemble and exchange information on the latest results and newly established methodologies, and discuss future directions of the interdisciplinary studies between statistical mechanics and information sciences. Financial support from the Grant-in-Aid for Scientific Research on Innovative Areas, MEXT, Japan, 'Exploring the Limits of Computation (ELC)' is gratefully acknowledged. We are pleased to publish 23 papers contributed by invited speakers of ICSG2013 in this volume of Journal of Physics: Conference Series. We hope that this volume will promote further development of this highly vigorous interdisciplinary field between statistical mechanics and information/computer science. Editors and ICSG2013 Organizing Committee: Koji Hukushima; Jun-ichi Inoue (Local Chair of ICSG2013); Yoshiyuki Kabashima (Editor-in-Chief); Toshiyuki Tanaka; Osamu Watanabe (General Chair of ICSG2013)
The Role of Working Memory in the Probabilistic Inference of Future Sensory Events.
Cashdollar, Nathan; Ruhnau, Philipp; Weisz, Nathan; Hasson, Uri
2017-05-01
The ability to represent the emerging regularity of sensory information from the external environment has been thought to allow one to probabilistically infer future sensory occurrences and thus optimize behavior. However, the underlying neural implementation of this process is still not comprehensively understood. Through a convergence of behavioral and neurophysiological evidence, we establish that the probabilistic inference of future events is critically linked to people's ability to maintain the recent past in working memory. Magnetoencephalography recordings demonstrated that when visual stimuli occurring over an extended time series had a greater statistical regularity, individuals with higher working-memory capacity (WMC) displayed enhanced slow-wave neural oscillations in the θ frequency band (4-8 Hz) prior to, but not during, stimulus appearance. This prestimulus neural activity was specifically linked to contexts where information could be anticipated and influenced the preferential sensory processing for this visual information after its appearance. A separate behavioral study demonstrated that this process intrinsically emerges during continuous perception and underpins a realistic advantage for efficient behavioral responses. In this way, WMC optimizes the anticipation of higher level semantic concepts expected to occur in the near future. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Time series modeling of live-cell shape dynamics for image-based phenotypic profiling.
Gordonov, Simon; Hwang, Mun Kyung; Wells, Alan; Gertler, Frank B; Lauffenburger, Douglas A; Bathe, Mark
2016-01-01
Live-cell imaging can be used to capture spatio-temporal aspects of cellular responses that are not accessible to fixed-cell imaging. As the use of live-cell imaging continues to increase, new computational procedures are needed to characterize and classify the temporal dynamics of individual cells. For this purpose, here we present the general experimental-computational framework SAPHIRE (Stochastic Annotation of Phenotypic Individual-cell Responses) to characterize phenotypic cellular responses from time series imaging datasets. Hidden Markov modeling is used to infer and annotate morphological state and state-switching properties from image-derived cell shape measurements. Time series modeling is performed on each cell individually, making the approach broadly useful for analyzing asynchronous cell populations. Two-color fluorescent cells simultaneously expressing actin and nuclear reporters enabled us to profile temporal changes in cell shape following pharmacological inhibition of cytoskeleton-regulatory signaling pathways. Results are compared with existing approaches conventionally applied to fixed-cell imaging datasets, and indicate that time series modeling captures heterogeneous dynamic cellular responses that can improve drug classification and offer additional important insight into mechanisms of drug action. The software is available at http://saphire-hcs.org.
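The hidden Markov modeling at the core of this approach can be illustrated with a minimal Viterbi decoder for a Gaussian-emission HMM, which recovers the most likely sequence of morphological states from a scalar shape measurement. This is an illustrative numpy sketch, not the SAPHIRE software; all function names and parameter values are ours:

```python
import numpy as np

def gaussian_logpdf(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

def viterbi_gaussian(obs, mus, sigmas, log_pi, log_A):
    """Most likely hidden-state path of a Gaussian-emission HMM."""
    n, k = len(obs), len(mus)
    logp = np.empty((n, k))                 # best log-prob of a path ending in state j
    back = np.zeros((n, k), dtype=int)      # backpointers for path recovery
    logp[0] = log_pi + gaussian_logpdf(obs[0], mus, sigmas)
    for t in range(1, n):
        for j in range(k):
            cand = logp[t - 1] + log_A[:, j]
            back[t, j] = np.argmax(cand)
            logp[t, j] = cand[back[t, j]] + gaussian_logpdf(obs[t], mus[j], sigmas[j])
    path = np.empty(n, dtype=int)
    path[-1] = np.argmax(logp[-1])
    for t in range(n - 2, -1, -1):          # trace back the optimal path
        path[t] = back[t + 1, path[t + 1]]
    return path
```

Applied per cell, such a decoder yields the state and state-switching annotations the abstract describes.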
The coupling analysis between stock market indices based on permutation measures
NASA Astrophysics Data System (ADS)
Shi, Wenbin; Shang, Pengjian; Xia, Jianan; Yeh, Chien-Hung
2016-04-01
Many information-theoretic methods have been proposed for analyzing the coupling dependence between time series. Quantifying the correlation between financial sequences is important, since the financial market is a complex, evolving dynamic system. Recently, we developed a new permutation-based entropy, called cross-permutation entropy (CPE), to detect the coupling structures between two synchronous time series. In this paper, we extend the CPE method to weighted cross-permutation entropy (WCPE), to address some of CPE's limitations, mainly its inability to differentiate between distinct patterns of a certain motif and the sensitivity of patterns close to the noise floor. It shows more stable and reliable results than CPE does when applied to spiky data and AR(1) processes. Besides, we adapt the CPE method to infer the complexity of short-length time series by freely changing the time delay, and test it with Gaussian random series and random walks. The modified method shows an advantage in reducing the deviation of entropy estimation compared with the conventional one. Finally, the weighted cross-permutation entropy of eight important stock indices from the world financial markets is investigated, and some useful and interesting empirical results are obtained.
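The ordinal-pattern counting that underlies CPE and WCPE can be sketched with plain single-series permutation entropy (Bandt-Pompe); the cross- and weighted variants build on the same pattern statistics. A minimal sketch (function name and defaults are ours):

```python
import numpy as np
from math import factorial

def permutation_entropy(x, m=3, delay=1):
    """Normalized Bandt-Pompe permutation entropy of a 1-D series."""
    x = np.asarray(x)
    n = len(x) - (m - 1) * delay
    counts = {}
    for i in range(n):
        window = x[i:i + m * delay:delay]
        pattern = tuple(np.argsort(window))   # ordinal pattern of the window
        counts[pattern] = counts.get(pattern, 0) + 1
    p = np.array(list(counts.values()), dtype=float) / n
    # Shannon entropy of pattern frequencies, normalized to [0, 1]
    return -np.sum(p * np.log(p)) / np.log(factorial(m))
```

A monotone series yields entropy 0 (one pattern), while white noise approaches 1.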
Hybrid wavelet-support vector machine approach for modelling rainfall-runoff process.
Komasi, Mehdi; Sharghi, Soroush
2016-01-01
Because of the importance of water resources management, the need for accurate modeling of the rainfall-runoff process has rapidly grown in the past decades. Recently, the support vector machine (SVM) approach has been used by hydrologists for rainfall-runoff modeling and other fields of hydrology. Like other artificial intelligence models, such as the artificial neural network (ANN) and the adaptive neuro-fuzzy inference system, the SVM model relies on autoregressive properties. In this paper, wavelet analysis was linked to the SVM model concept for modeling the rainfall-runoff process of the Aghchai and Eel River watersheds. In this way, the main time series of the two variables, rainfall and runoff, were decomposed into multiple frequency-scale time series by wavelet analysis; these time series were then used as inputs to the SVM model in order to predict the runoff discharge one day ahead. The obtained results show that the wavelet-SVM model can predict both short- and long-term runoff discharges by considering the seasonality effects. Also, the proposed hybrid model is relatively more appropriate than classical autoregressive ones such as ANN and SVM because it uses the multi-scale time series of rainfall and runoff data in the modeling process.
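The decomposition step can be illustrated with a single level of the Haar transform, the simplest wavelet; the abstract does not specify which wavelet family the authors used, so treat this as a generic sketch (assuming an even-length series):

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform.

    Splits a series into a low-frequency approximation and a
    high-frequency detail, each half the original length.
    """
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # smooth (low-frequency) part
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # fluctuation (high-frequency) part
    return approx, detail
```

Applying this recursively to the approximation yields the multi-scale subseries that a hybrid model would feed to the SVM. The transform is orthogonal, so energy is preserved across the two parts.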
Across language families: Genome diversity mirrors linguistic variation within Europe
Longobardi, Giuseppe; Ghirotto, Silvia; Guardiano, Cristina; Tassi, Francesca; Benazzo, Andrea; Ceolin, Andrea
2015-01-01
ABSTRACT Objectives: The notion that patterns of linguistic and biological variation may cast light on each other and on population histories dates back to Darwin's times; yet, turning this intuition into a proper research program has met with serious methodological difficulties, especially affecting language comparisons. This article takes advantage of two new tools of comparative linguistics: a refined list of Indo‐European cognate words, and a novel method of language comparison estimating linguistic diversity from a universal inventory of grammatical polymorphisms, and hence enabling comparison even across different families. We corroborated the method and used it to compare patterns of linguistic and genomic variation in Europe. Materials and Methods: Two sets of linguistic distances, lexical and syntactic, were inferred from these data and compared with measures of geographic and genomic distance through a series of matrix correlation tests. Linguistic and genomic trees were also estimated and compared. A method (Treemix) was used to infer migration episodes after the main population splits. Results: We observed significant correlations between genomic and linguistic diversity, the latter inferred from data on both Indo‐European and non‐Indo‐European languages. Contrary to previous observations, on the European scale, language proved a better predictor of genomic differences than geography. Inferred episodes of genetic admixture following the main population splits found convincing correlates also in the linguistic realm. Discussion: These results pave the ground for previously unfeasible cross‐disciplinary analyses at the worldwide scale, encompassing populations of distant language families. Am J Phys Anthropol 157:630–640, 2015. © 2015 Wiley Periodicals, Inc. PMID:26059462
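The matrix correlation tests mentioned above are typically Mantel-style permutation tests: correlate the off-diagonal entries of two distance matrices, then build a null distribution by jointly permuting rows and columns of one matrix. A minimal sketch (function name and defaults are ours):

```python
import numpy as np

def mantel_test(d1, d2, n_perm=999, seed=0):
    """Permutation (Mantel) test of correlation between two distance matrices."""
    rng = np.random.default_rng(seed)
    n = d1.shape[0]
    iu = np.triu_indices(n, k=1)              # use each unordered pair once
    r_obs = np.corrcoef(d1[iu], d2[iu])[0, 1]
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        d2p = d2[np.ix_(perm, perm)]          # relabel taxa in d2
        if np.corrcoef(d1[iu], d2p[iu])[0, 1] >= r_obs:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)     # one-sided permutation p-value
    return r_obs, p_value
```

With linguistic and genomic distance matrices as inputs, a small p-value supports the kind of language-genome correlation the study reports.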
NASA Astrophysics Data System (ADS)
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Measured time-series of both precipitation and runoff are known to exhibit highly non-trivial statistical properties. For making reliable probabilistic predictions in hydrology, it is therefore desirable to have stochastic models with output distributions that share these properties. When parameters of such models have to be inferred from data, we also need to quantify the associated parametric uncertainty. For non-trivial stochastic models, however, this latter step is typically very demanding, both conceptually and numerically, and almost never done in hydrology. Here, we demonstrate that methods developed in statistical physics make a large class of stochastic differential equation (SDE) models amenable to a full-fledged Bayesian parameter inference. For concreteness, we demonstrate these methods by means of a simple yet non-trivial toy SDE model. We consider a natural catchment that can be described by a linear reservoir, at the scale of observation. All the neglected processes are assumed to happen at much shorter time-scales and are therefore modeled with a Gaussian white noise term, the standard deviation of which is assumed to scale linearly with the system state (water volume in the catchment). Even for constant input, the outputs of this simple non-linear SDE model show a wealth of desirable statistical properties, such as fat-tailed distributions and long-range correlations. Standard algorithms for Bayesian inference fail, for models of this kind, because their likelihood functions are extremely high-dimensional intractable integrals over all possible model realizations. The use of Kalman filters is illegitimate due to the non-linearity of the model. Particle filters could be used but become increasingly inefficient with a growing number of data points.
Hamiltonian Monte Carlo algorithms allow us to translate this inference problem into the problem of simulating the dynamics of a statistical mechanics system, and give us access to the most sophisticated methods that have been developed in the statistical physics community over the last few decades. We demonstrate that such methods, along with automated differentiation algorithms, allow us to perform a full-fledged Bayesian inference, for a large class of SDE models, in a highly efficient and largely automated manner. Furthermore, our algorithm is highly parallelizable. For our toy model, discretized with a few hundred points, a full Bayesian inference can be performed in a matter of seconds on a standard PC.
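The core Hamiltonian Monte Carlo update (leapfrog integration of auxiliary momentum dynamics followed by a Metropolis accept/reject on the total energy) can be sketched for a one-dimensional target; the hydrological application would replace `logp` with the much higher-dimensional posterior over model realizations. All names and tuning values below are ours:

```python
import numpy as np

def hmc(logp, grad_logp, q0, n_samples=5000, eps=0.15, n_leap=20, seed=0):
    """Minimal 1-D Hamiltonian Monte Carlo: leapfrog + Metropolis correction."""
    rng = np.random.default_rng(seed)
    q, out = q0, []
    for _ in range(n_samples):
        p = rng.normal()                        # resample auxiliary momentum
        q_new, p_new = q, p
        p_new += 0.5 * eps * grad_logp(q_new)   # leapfrog half-step
        for _ in range(n_leap - 1):
            q_new += eps * p_new
            p_new += eps * grad_logp(q_new)
        q_new += eps * p_new
        p_new += 0.5 * eps * grad_logp(q_new)
        # accept/reject on the change in total energy H = -logp(q) + p^2/2
        log_acc = (logp(q_new) - 0.5 * p_new**2) - (logp(q) - 0.5 * p**2)
        if np.log(rng.random()) < log_acc:
            q = q_new
        out.append(q)
    return np.array(out)
```

For a standard-normal target (`logp = -q**2/2`), the sampler reproduces zero mean and unit variance.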
Jang, Gijeong; Yoon, Shin-ae; Lee, Sung-Eun; Park, Haeil; Kim, Joohan; Ko, Jeong Hoon; Park, Hae-Jeong
2013-11-01
In ordinary conversations, literal meanings of an utterance are often quite different from implicated meanings, and inferring implicated meanings is essential for successful comprehension of the speaker's utterances. Inferring implicated meanings rests on the listener's assumption that the conversational partner says only relevant matters, according to the maxim of relevance in Grice's theory of conversational implicature. To investigate the neural correlates of comprehending implicated meanings under the maxim of relevance, a total of 23 participants underwent an fMRI task with a series of conversational pairs, each consisting of a question and an answer. The experimental paradigm was composed of three conditions: explicit answers, moderately implicit answers, and highly implicit answers. Participants were asked to decide whether the answer to the Yes/No question meant 'Yes' or 'No'. Longer reaction time was required for the highly implicit answers than for the moderately implicit answers, without affecting accuracy. The fMRI results show that the left anterior temporal lobe, left angular gyrus, and left posterior middle temporal gyrus had stronger activation in both moderately and highly implicit conditions than in the explicit condition. Comprehension of highly implicit answers elicited increased activations in additional regions including the left inferior frontal gyrus, left medial prefrontal cortex, left posterior cingulate cortex and right anterior temporal lobe. The activation results indicate involvement of these regions in the inference process that builds coherence between literally irrelevant but pragmatically associated utterances under the maxim of relevance. In particular, the left anterior temporal lobe was highly sensitive to the level of implicitness, showing increased activation for highly versus moderately implicit conditions, which implies its central role in inference processes such as semantic integration.
The right hemisphere activation, uniquely found in the anterior temporal lobe for highly implicit utterances, suggests its competence for integrating distant concepts in implied utterances under the relevance principle. Copyright © 2013 Elsevier Inc. All rights reserved.
Hasegawa, Takanori; Yamaguchi, Rui; Nagasaki, Masao; Miyano, Satoru; Imoto, Seiya
2014-01-01
Comprehensive understanding of gene regulatory networks (GRNs) is a major challenge in the field of systems biology. Currently, there are two main approaches in GRN analysis using time-course observation data, namely an ordinary differential equation (ODE)-based approach and a statistical model-based approach. The ODE-based approach can generate complex dynamics of GRNs according to biologically validated nonlinear models. However, it cannot be applied to ten or more genes to simultaneously estimate system dynamics and regulatory relationships due to the computational difficulties. The statistical model-based approach uses highly abstract models to simply describe biological systems and to infer relationships among several hundreds of genes from the data. However, the high abstraction generates false regulations that are not permitted biologically. Thus, when dealing with several tens of genes of which the relationships are partially known, a method that can infer regulatory relationships based on a model with low abstraction and that can emulate the dynamics of ODE-based models while incorporating prior knowledge is urgently required. To accomplish this, we propose a method for inference of GRNs using a state space representation of a vector auto-regressive (VAR) model with L1 regularization. This method can estimate the dynamic behavior of genes based on linear time-series modeling constructed from an ODE-based model and can infer the regulatory structure among several tens of genes maximizing prediction ability for the observational data. Furthermore, the method is capable of incorporating various types of existing biological knowledge, e.g., drug kinetics and literature-recorded pathways. The effectiveness of the proposed method is shown through a comparison of simulation studies with several previous methods. 
As an application example, we evaluated mRNA expression profiles over time upon corticosteroid stimulation in rats, incorporating corticosteroid kinetics/dynamics, literature-recorded pathways and transcription factor (TF) information. PMID:25162401
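A stripped-down version of the VAR(1) estimation step might look as follows, with an L2 (ridge) penalty standing in for the paper's L1 regularization and without the state-space layer or prior-knowledge constraints:

```python
import numpy as np

def fit_var1(X, lam=0.1):
    """Ridge-regularized least-squares fit of a VAR(1) model
    x_{t+1} = A x_t + noise, from a (T, p) data matrix X.

    Note: the paper uses an L1 penalty for sparsity; L2 is used
    here only because it has a closed-form solution.
    """
    Y, Z = X[1:], X[:-1]                        # responses and regressors
    p = X.shape[1]
    # solve (Z'Z + lam I) B = Z'Y, then A = B'
    A = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ Y).T
    return A
```

On simulated data from a known stable coefficient matrix, the estimate recovers the true regulatory weights.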
Does climate have heavy tails?
NASA Astrophysics Data System (ADS)
Bermejo, Miguel; Mudelsee, Manfred
2013-04-01
When we speak about a distribution with heavy tails, we mean that the probability of extreme values is relatively large. Several heavy-tail models are constructed from Poisson processes, which are the most tractable models. Among such processes, some of the most important are the Lévy processes, which are those processes with independent, stationary increments and stochastic continuity. If the random component of a climate process that generates the data exhibits a heavy-tail distribution, and if that fact is ignored by assuming a finite-variance distribution, then there would be serious consequences (in the form, e.g., of bias) for the analysis of extreme values. Yet, it appears to be an open question to what extent and degree climate data exhibit heavy-tail phenomena. We present a study of statistical inference in the presence of heavy-tail distributions. In particular, we explore (1) the estimation of the tail index of the marginal distribution using several estimation techniques (e.g., the Hill estimator, the Pickands estimator) and (2) the power of hypothesis tests. The performance of the different methods is compared on artificial time series by means of Monte Carlo experiments. We systematically apply heavy-tail inference to observed climate data, with a focus on time series. We study several proxy and directly observed climate variables from the instrumental period, the Holocene and the Pleistocene. This work receives financial support from the European Commission (Marie Curie Initial Training Network LINC, No. 289447, within the 7th Framework Programme).
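The Hill estimator mentioned above estimates the tail index from the k largest order statistics; a minimal sketch (the choice of k is the usual practical difficulty and is illustrative here):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimator of the tail index alpha from the k largest values."""
    xs = np.sort(np.asarray(x))[::-1]             # descending order statistics
    log_excess = np.log(xs[:k]) - np.log(xs[k])   # log-spacings above threshold
    return 1.0 / np.mean(log_excess)
```

For exact Pareto data with tail index 2, the estimate concentrates around 2.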
NASA Astrophysics Data System (ADS)
Jajcay, N.; Kravtsov, S.; Tsonis, A.; Palus, M.
2017-12-01
A better understanding of dynamics in complex systems, such as the Earth's climate, is one of the key challenges for contemporary science and society. A large amount of experimental data requires new mathematical and computational approaches. Natural complex systems vary on many temporal and spatial scales, often exhibiting recurring patterns and quasi-oscillatory phenomena. The statistical inference of causal interactions and synchronization between dynamical phenomena evolving on different temporal scales is of vital importance for a better understanding of underlying mechanisms and a key to modeling and prediction of such systems. This study introduces and applies information theory diagnostics to phase and amplitude time series of different wavelet components of the observed data that characterizes El Niño. A suite of significant interactions between processes operating on different time scales was detected, and intermittent synchronization among different time scales has been associated with the extreme El Niño events. The mechanisms of these nonlinear interactions were further studied in conceptual low-order and state-of-the-art dynamical, as well as statistical, climate models. Observed and simulated interactions exhibit substantial discrepancies, whose understanding may be the key to an improved prediction. Moreover, the statistical framework we apply here is suitable for directly inferring cross-scale interactions in nonlinear time series from complex systems such as the terrestrial magnetosphere, solar-terrestrial interactions, seismic activity or even human brain dynamics.
Shape Distributions of Nonlinear Dynamical Systems for Video-Based Inference.
Venkataraman, Vinay; Turaga, Pavan
2016-12-01
This paper presents a shape-theoretic framework for dynamical analysis of the nonlinear dynamical systems that appear frequently in video-based inference tasks. Traditional approaches to dynamical modeling have included linear and nonlinear methods, each with their respective drawbacks. The novel approach we propose is to use descriptors of the shape of the dynamical attractor as a feature representation of the nature of the dynamics. The proposed framework has two main advantages over traditional approaches: a) the representation of the dynamical system is derived directly from the observational data, without any inherent assumptions, and b) the proposed features remain stable under different time-series lengths, where traditional dynamical invariants fail. We illustrate our idea using nonlinear dynamical models such as the Lorenz and Rossler systems, where our feature representations (shape distributions) support our hypothesis that the local shape of the reconstructed phase space can be used as a discriminative feature. Our experimental analyses on these models also indicate that the proposed framework is stable for different time-series lengths, which is useful when the available number of samples is small or variable. The specific applications of interest in this paper are: 1) activity recognition using motion capture and RGBD sensors, 2) activity quality assessment for applications in stroke rehabilitation, and 3) dynamical scene classification. We provide experimental validation through action and gesture recognition experiments on motion capture and Kinect datasets. In all these scenarios, we show experimental evidence of the favorable properties of the proposed representation.
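Reconstructing the attractor's phase space from a scalar series is done by Takens delay embedding, the first step before any shape descriptor can be computed. A minimal sketch (the embedding dimension m and delay tau are illustrative choices, not the paper's):

```python
import numpy as np

def delay_embed(x, m, tau):
    """Takens delay embedding: map a scalar series to the m-dimensional
    delay vectors (x[i], x[i+tau], ..., x[i+(m-1)*tau])."""
    x = np.asarray(x)
    n = len(x) - (m - 1) * tau                # number of complete delay vectors
    return np.array([x[i:i + (m - 1) * tau + 1:tau] for i in range(n)])
```

The rows of the returned array are points in the reconstructed phase space, whose local shape the proposed descriptors then summarize.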
Towards a global historical biogeography of Palms
NASA Astrophysics Data System (ADS)
Couvreur, Thomas; Baker, William J.; Frigerio, Jean-Marc; Sepulchre, Pierre; Franc, Alain
2017-04-01
Four mechanisms are at work in deciphering the historical biogeography of plants: speciation, extinction, migration, and drift (a sort of neutral speciation). The first three mechanisms are under selection pressure of the environment, mainly the climate and the connectivity of land masses. Hence, an accurate history of climate and of connectivity or non-connectivity between landmasses, as well as of orogenesis processes, can shed new light on the most likely speciation events and migration routes driven by paleogeography and paleoclimatology. Currently, some models exist (like DIVA) to infer the most parsimonious history (in the number of migration events) knowing the speciation history given by phylogenies (extinctions are mostly unknown), in a given setting of climate and landmass connectivity. In a previous project, we built, in collaboration with LSCE, a series of paleogeographic and paleoclimatic maps since the Early Cretaceous. We have developed a program, called Aran, which enables us to extend DIVA to a time series of varying paleoclimatic and paleogeographic conditions. We apply these new methods and data to unravel the biogeographic history of palms (Arecaceae), a pantropical family of 182 genera and >2600 species whose divergence is dated to the Late Cretaceous (100 My). Based on a robust dated molecular phylogeny and novel paleoclimatic and paleogeographic maps, we will generate an updated biogeographic history of Arecaceae inferred from the most parsimonious history using Aran. We will discuss the results and put them in context with what is known and needed to provide a global biogeographic history of tropical palms.
Kumaraswamy autoregressive moving average models for double bounded environmental data
NASA Astrophysics Data System (ADS)
Bayer, Fábio Mariano; Bayer, Débora Missio; Pumi, Guilherme
2017-12-01
In this paper we introduce the Kumaraswamy autoregressive moving average models (KARMA), a dynamic class of models for time series taking values in the doubly bounded interval (a,b) and following the Kumaraswamy distribution. The Kumaraswamy family of distributions is widely applied in many areas, especially hydrology and related fields. Classical examples are time series representing rates and proportions observed over time. In the proposed KARMA model, the median is modeled by a dynamic structure containing autoregressive and moving average terms, time-varying regressors, unknown parameters and a link function. We introduce the new class of models and discuss conditional maximum likelihood estimation, hypothesis testing inference, diagnostic analysis and forecasting. In particular, we provide closed-form expressions for the conditional score vector and the conditional Fisher information matrix. An application to real environmental data is presented and discussed.
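What makes median-based modeling convenient here is that the Kumaraswamy distribution has closed-form quantiles: on (0,1), F(x) = 1 - (1 - x^a)^b, so the median is (1 - 2^(-1/b))^(1/a). A sketch of these two facts (the KARMA link function and dynamic structure are omitted):

```python
import numpy as np

def kumaraswamy_median(a, b):
    """Closed-form median of the Kumaraswamy(a, b) distribution on (0, 1)."""
    return (1.0 - 0.5 ** (1.0 / b)) ** (1.0 / a)

def kumaraswamy_sample(a, b, size, rng):
    """Inverse-CDF sampling using F(x) = 1 - (1 - x**a)**b."""
    u = rng.random(size)
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)
```

In a KARMA model, the median (rather than the mean) is what the autoregressive structure drives through the link function.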
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, J.; Feng, W., E-mail: fengwen69@sina.cn
Extended time series of Solar Activity Indices (ESAI) extended the Greenwich series of sunspot area from the year 1874 back to 1821. The ESAI's yearly sunspot area in the northern and southern hemispheres from 1821 to 2013 is utilized to investigate characteristics of the north–south hemispherical asymmetry of sunspot activity. Periodical behavior of about 12 solar cycles is also confirmed from the ESAI data set to exist in dominant hemispheres, linear regression lines of yearly asymmetry values, and cumulative counts of yearly sunspot areas in the hemispheres for solar cycles. The period is also inferred to appear both in the cumulative difference in the yearly sunspot areas in the hemispheres over the entire time interval and in its statistical Student's t-test. The hemispherical bias of sunspot activity is therefore unlikely to be a purely stochastic phenomenon over a long time period.
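The north-south asymmetry of sunspot activity is conventionally quantified by the normalized index (N - S)/(N + S); the abstract does not state the exact normalization used, so treat this as the standard textbook form:

```python
import numpy as np

def asymmetry_index(north, south):
    """Normalized north-south asymmetry (N - S) / (N + S), in [-1, 1].

    +1 means all activity is in the northern hemisphere, -1 all southern,
    0 perfect balance.
    """
    north = np.asarray(north, dtype=float)
    south = np.asarray(south, dtype=float)
    return (north - south) / (north + south)
```

Yearly asymmetry values computed this way are what the linear regression lines and cumulative counts in the study summarize.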
NASA Astrophysics Data System (ADS)
Alshawaf, Fadwa; Dick, Galina; Heise, Stefan; Balidakis, Kyriakos; Schmidt, Torsten; Wickert, Jens
2017-04-01
Ground-based GNSS (Global Navigation Satellite Systems) have been used efficiently since the 1990s as a meteorological observing system. Recently, scientists have used GNSS time series of precipitable water vapor (PWV) for climate research, although these series may not be sufficiently long. In this work, we compare the trend estimated from GNSS time series with that estimated from European Centre for Medium-Range Weather Forecasts reanalysis (ERA-Interim) data and meteorological measurements. We aim at evaluating climate evolution in Central Europe by monitoring different atmospheric variables such as temperature and PWV. PWV time series were obtained by three methods: 1) estimated from ground-based GNSS observations using the method of precise point positioning, 2) inferred from ERA-Interim data, and 3) determined based on daily surface measurements of temperature and relative humidity. The other variables are available from surface meteorological stations or received from ERA-Interim. The PWV trend component estimated from GNSS data strongly correlates (>70%) with that estimated from the other data sets. The linear trend is estimated by straight-line fitting over 30 years of seasonally adjusted PWV time series obtained using the meteorological measurements. The results show a positive trend in the PWV time series, with an increase of 0.2-0.7 mm/decade and a mean standard deviation of 0.016 mm/decade. In this paper, we present the results at three GNSS stations. The temporal increment of the PWV correlates with the temporal increase in the temperature levels.
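The trend-estimation step (seasonal adjustment followed by straight-line fitting) can be sketched on synthetic monthly PWV-like data; every number below is illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(360) / 12.0                 # 30 years of monthly samples
# synthetic PWV: baseline + linear trend (0.5 mm/decade) + annual cycle + noise
pwv = 15 + 0.05 * years + 2 * np.sin(2 * np.pi * years) + rng.normal(0, 0.5, 360)

# seasonal adjustment: subtract the mean annual cycle (monthly climatology)
climatology = pwv.reshape(30, 12).mean(axis=0)
adjusted = pwv - np.tile(climatology, 30)

# straight-line fit of the adjusted series; slope is the trend in units/year
slope, intercept = np.polyfit(years, adjusted, 1)
```

The recovered slope matches the injected trend of 0.05 mm/year (0.5 mm/decade), within noise.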
A Combined Length-of-Day Series Spanning 1832-1997
NASA Technical Reports Server (NTRS)
Gross, Richard S.
1999-01-01
The Earth's rotation is not constant but exhibits minute changes on all observable time scales ranging from subdaily to secular. This rich spectrum of observed Earth rotation changes reflects the rich variety of astronomical and geophysical phenomena that are causing the Earth's rotation to change, including, but not limited to, ocean and solid body tides, atmospheric wind and pressure changes, oceanic current and sea level height changes, post-glacial rebound, and torques acting at the core-mantle boundary. In particular, the decadal-scale variations of the Earth's rotation are thought to be largely caused by interactions between the Earth's outer core and mantle. Comparing the inferred Earth rotation variations caused by the various core-mantle interactions to observed variations requires Earth rotation observations spanning decades, if not centuries. During the past century many different techniques have been used to observe the Earth's rotation. By combining the individual Earth rotation series determined by each of these techniques, a series of the Earth's rotation can be obtained that is based upon independent measurements spanning the greatest possible time interval. In this study, independent observations of the Earth's rotation are combined to generate a length-of-day series spanning 1832-1997. The observations combined include lunar occultation measurements spanning 1832-1955, optical astrometric measurements spanning 1956-1982, lunar laser ranging measurements spanning 1970-1997, and very long baseline interferometric measurements spanning 1978-1998. These series are combined using a Kalman filter developed at JPL for just this purpose. The resulting combined length-of-day series will be presented and compared with other available length-of-day series of similar duration.
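The combination idea, a Kalman filter fusing measurement series of differing accuracies into one estimate, can be sketched in one dimension with a random-walk state; JPL's actual filter is considerably more elaborate, and all variances below are illustrative:

```python
import numpy as np

def kalman_combine(y1, r1, y2, r2, q=1e-4):
    """Fuse two noisy measurement series of one signal with a
    random-walk-state Kalman filter.

    y1, y2 : measurement series; r1, r2 : their noise variances;
    q : process (random-walk) variance of the underlying state.
    """
    n = len(y1)
    x, p = y1[0], r1                          # initialize from the first datum
    out = np.empty(n)
    for t in range(n):
        p = p + q                             # predict: state may drift
        for y, r in ((y1[t], r1), (y2[t], r2)):
            k = p / (p + r)                   # Kalman gain for this measurement
            x = x + k * (y - x)               # sequential measurement update
            p = (1 - k) * p
        out[t] = x
    return out
```

Weighting each technique by its noise variance, the fused series is more accurate than either input alone, which is the point of combining the historical length-of-day series.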
[Social exchange and inference: an experimental study with the Wason selection task].
Hayashi, N
2001-04-01
Social contract theory (Cosmides, 1989) posits that the human mind is equipped with an inference faculty specialized for cheater detection. Cosmides (1989) conducted a series of experiments employing the Wason selection task to demonstrate that her social contract theory could account for the content effects reported in the literature. The purpose of this study was to investigate the possibility that the results were due to experimental artifacts. In the current experiment, the subject was given two versions of the Wason task that contained no social exchange context but included an instruction implying that he/she should look for something, together with the cassava root and the abstract versions used by Cosmides (1989). Results showed that the two versions with no social exchange context produced the same response pattern observed in the original study. It may be concluded that the subject's perception of the rule as a social contract was not necessary to obtain the original results, and that an instruction implying that he/she should look for something was sufficient.
Bayesian inference for dynamic transcriptional regulation; the Hes1 system as a case study.
Heron, Elizabeth A; Finkenstädt, Bärbel; Rand, David A
2007-10-01
In this study, we address the problem of estimating the parameters of regulatory networks and provide the first application of Markov chain Monte Carlo (MCMC) methods to experimental data. As a case study, we consider a stochastic model of the Hes1 system expressed in terms of stochastic differential equations (SDEs) to which rigorous likelihood methods of inference can be applied. When fitting continuous-time stochastic models to discretely observed time series, the lengths of the sampling intervals are important, and much of our study addresses the problem when the data are sparse. We estimate the parameters of an autoregulatory network, providing results both for simulated and real experimental data from the Hes1 system. We develop an estimation algorithm using MCMC techniques which are flexible enough to allow for the imputation of latent data on a finer time scale and the presence of prior information about parameters which may be informed from other experiments as well as additional measurement error.
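The MCMC-for-SDE idea can be sketched on an Ornstein-Uhlenbeck toy model: discretize with Euler-Maruyama so transition densities are Gaussian, then run random-walk Metropolis on the drift parameter. This is a stand-in for the Hes1 model, with a flat prior and all values illustrative:

```python
import numpy as np

def ou_loglik(theta, x, dt, sigma):
    """Log-likelihood of an Euler-discretized OU path for drift rate theta."""
    resid = x[1:] - x[:-1] * (1.0 - theta * dt)   # Gaussian transition residuals
    var = sigma**2 * dt
    return -0.5 * np.sum(resid**2) / var - 0.5 * (len(x) - 1) * np.log(2 * np.pi * var)

rng = np.random.default_rng(1)
dt, sigma, theta_true = 0.1, 0.5, 1.0
x = np.zeros(2000)
for t in range(1999):                             # simulate data by Euler-Maruyama
    x[t + 1] = x[t] * (1 - theta_true * dt) + sigma * np.sqrt(dt) * rng.normal()

# random-walk Metropolis on theta with a flat prior
theta, ll, samples = 0.5, ou_loglik(0.5, x, dt, sigma), []
for _ in range(5000):
    prop = theta + 0.1 * rng.normal()
    ll_prop = ou_loglik(prop, x, dt, sigma)
    if np.log(rng.random()) < ll_prop - ll:       # Metropolis acceptance
        theta, ll = prop, ll_prop
    samples.append(theta)
posterior_mean = np.mean(samples[1000:])          # discard burn-in
```

The posterior concentrates around the true drift rate; the paper's algorithm additionally imputes latent data between sparse observations, which this sketch omits.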
Noise-induced relations between network connectivity and dynamics
NASA Astrophysics Data System (ADS)
Ching, Emily S. C.
Many biological systems of interest can be represented as networks of many nodes that are interacting with one another. Often these systems are subject to external influence or noise. One of the central issues is to understand the relation between dynamics and the interaction pattern of the system or the connectivity structure of the network. In particular, a challenging problem is to infer the network connectivity structure from the dynamics. In this talk, we show that for stochastic dynamical systems subjected to noise, the presence of noise gives rise to mathematical relations between the network connectivity structure and quantities that can be calculated using solely the time-series measurements of the dynamics of the nodes. We present these relations for both undirected networks with bidirectional coupling and directed networks with directional coupling and discuss how such relations can be utilized to infer the network connectivity structure of the systems. Work supported by the Hong Kong Research Grants Council under Grant No. CUHK 14300914.
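For linear stochastic dynamics with symmetric coupling and identical independent noise, one such noise-induced relation is that the stationary covariance C satisfies AC + CA = 2I, so the precision matrix C⁻¹ recovers the coupling matrix A. This special case is standard; it is not necessarily the talk's exact formula, and the network below is our own example:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = 2.0 * np.eye(n)                           # self-damping on each node
for i in range(n):                            # symmetric ring coupling
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = -0.5

# Euler-Maruyama simulation of dx = -A x dt + sqrt(2) dW
dt, steps = 0.01, 200000
noise = np.sqrt(2 * dt) * rng.normal(size=(steps, n))
x = np.zeros(n)
traj = np.empty((steps, n))
for t in range(steps):
    x = x + dt * (-A @ x) + noise[t]
    traj[t] = x

# stationary covariance obeys A C + C A = 2 I here, so C = A^{-1};
# inverting the empirical covariance estimates the coupling structure
C = np.cov(traj[steps // 10 :].T)             # discard transient, rows = time
A_est = np.linalg.inv(C)
```

Ring neighbors show up as the nonzero off-diagonal entries of `A_est`, while non-neighbor entries stay near zero, illustrating connectivity inference from time series alone.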
NASA Astrophysics Data System (ADS)
Rings, Thorsten; Lehnertz, Klaus
2016-09-01
We investigate the relative merit of phase-based methods for inferring directional couplings in complex networks of weakly interacting dynamical systems from multivariate time-series data. We compare the evolution map approach and its partialized extension to each other with respect to their ability to correctly infer the network topology in the presence of indirect directional couplings for various simulated experimental situations using coupled model systems. In addition, we investigate whether the partialized approach allows for additional or complementary indications of directional interactions in evolving epileptic brain networks using intracranial electroencephalographic recordings from an epilepsy patient. For such networks, both direct and indirect directional couplings can be expected, given the brain's connection structure and effects that may arise from limitations inherent to the recording technique. Our findings indicate that particularly in larger networks (number of nodes ≫10 ), the partialized approach does not provide information about directional couplings extending the information gained with the evolution map approach.
Brunamonti, Emiliano; Costanzo, Floriana; Mammì, Anna; Rufini, Cristina; Veneziani, Diletta; Pani, Pierpaolo; Vicari, Stefano; Ferraina, Stefano; Menghini, Deny
2017-02-01
Here we explored whether children with ADHD have a deficit in relational reasoning, a skill subtending the acquisition of many cognitive abilities and social rules. We analyzed the performance of a group of children with ADHD during a transitive inference task, a task requiring first to learn the reciprocal relationship between adjacent items of a rank-ordered series (e.g., A>B; B>C; C>D; D>E; E>F), and second, to deduce the relationship between novel pairs of items never matched during the learning (e.g., B>D; C>E). As a main result, we observed that children with ADHD were impaired in performing inferential reasoning problems. The deficit in relational reasoning was found to be related to the difficulty in managing a unified representation of ordered items. The present finding documents a novel deficit in ADHD, contributing to a better understanding of the disorder.
NASA Astrophysics Data System (ADS)
Almog, Assaf; Garlaschelli, Diego
2014-09-01
The dynamics of complex systems, from financial markets to the brain, can be monitored in terms of multiple time series of activity of the constituent units, such as stocks or neurons, respectively. While the main focus of time series analysis is on the magnitude of temporal increments, a significant piece of information is encoded into the binary projection (i.e. the sign) of such increments. In this paper we provide further evidence of this by showing strong nonlinear relations between binary and non-binary properties of financial time series. These relations are a novel quantification of the fact that extreme price increments occur more often when most stocks move in the same direction. We then introduce an information-theoretic approach to the analysis of the binary signature of single and multiple time series. Through the definition of maximum-entropy ensembles of binary matrices and their mapping to spin models in statistical physics, we quantify the information encoded into the simplest binary properties of real time series and identify the most informative property given a set of measurements. Our formalism is able to accurately replicate, and mathematically characterize, the observed binary/non-binary relations. We also obtain a phase diagram allowing us to identify, based only on the instantaneous aggregate return of a set of multiple time series, a regime where the so-called ‘market mode’ has an optimal interpretation in terms of collective (endogenous) effects, a regime where it is parsimoniously explained by pure noise, and a regime where it can be regarded as a combination of endogenous and exogenous factors. Our approach allows us to connect spin models, simple stochastic processes, and ensembles of time series inferred from partial information.
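To make the notion of a binary projection concrete, here is a minimal, self-contained sketch (the synthetic data and all parameter choices are ours, not the authors'): it strips a set of co-moving series down to the signs of their increments and checks that sign alignment across series tracks the aggregate magnitude, the relation the abstract describes.

```python
import random

random.seed(0)

# Illustrative synthetic "returns": a common factor plus idiosyncratic noise,
# mimicking stocks that partly co-move (hypothetical data, not the paper's).
n_series, n_steps = 10, 500
common = [random.gauss(0, 1) for _ in range(n_steps)]
increments = [[0.6 * common[t] + random.gauss(0, 1) for t in range(n_steps)]
              for _ in range(n_series)]

# Binary projection: keep only the sign of each increment.
signs = [[1 if x > 0 else 0 for x in series] for series in increments]

# Per time step: how aligned the signs are, and the aggregate magnitude.
comove = [abs(sum(s[t] for s in signs) / n_series - 0.5) for t in range(n_steps)]
agg_mag = [abs(sum(series[t] for series in increments)) for t in range(n_steps)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

corr = pearson(comove, agg_mag)
print(f"sign alignment vs aggregate magnitude: r = {corr:.2f}")
```

With a positive common factor loading, large aggregate moves coincide with most series sharing a sign, so the correlation comes out clearly positive.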
Lutaif, N.A.; Palazzo, R.; Gontijo, J.A.R.
2014-01-01
Maintenance of thermal homeostasis in rats fed a high-fat diet (HFD) is associated with changes in their thermal balance. The thermodynamic relationship between heat dissipation and energy storage is altered by the ingestion of high-energy diet content. Observation of thermal registers of core temperature behavior, in humans and rodents, permits identification of some characteristics of time series, such as autoreference and stationarity, that fit adequately to a stochastic analysis. To identify this change, we used, for the first time, a stochastic autoregressive model, whose assumptions match those of the physiological systems involved, and applied it to male HFD rats compared with age-matched male controls fed standard food (n=7 per group). By analyzing a recorded temperature time series, we were able to identify when thermal homeostasis would be affected by a new diet. The autoregressive time series model (AR model) was used to predict the occurrence of thermal homeostasis, and this model proved to be very effective in distinguishing such a physiological disorder. Thus, we infer from the results of our study that maximum entropy distribution as a means for stochastic characterization of temperature time series registers may be established as an important and early tool to aid in the diagnosis and prevention of metabolic diseases due to their ability to detect small variations in thermal profile. PMID:24519093
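The AR-model idea above can be illustrated with a small sketch (hypothetical data; the baseline, coefficient, and noise level are illustrative assumptions, not the study's values): simulate AR(1) fluctuations around a fixed core-temperature baseline and recover the autoregressive coefficient by least squares.

```python
import random

random.seed(1)

# Synthetic core-temperature-like series: AR(1) fluctuations around a
# 37 C baseline (all parameter values are illustrative assumptions).
phi_true, baseline, noise_sd = 0.8, 37.0, 0.05
n = 2000
x = [baseline]
for _ in range(n - 1):
    x.append(baseline + phi_true * (x[-1] - baseline) + random.gauss(0, noise_sd))

# Estimate the AR(1) coefficient by ordinary least squares on deviations
# from the baseline.
dev = [v - baseline for v in x]
num = sum(dev[t] * dev[t - 1] for t in range(1, n))
den = sum(d * d for d in dev[:-1])
phi_hat = num / den
print(f"true phi = {phi_true}, estimated phi = {phi_hat:.3f}")
```

A shift in the fitted coefficient or noise level between diet groups is the kind of change such a model can flag.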
Bayesian Analysis of Non-Gaussian Long-Range Dependent Processes
NASA Astrophysics Data System (ADS)
Graves, Timothy; Watkins, Nicholas; Franzke, Christian; Gramacy, Robert
2013-04-01
Recent studies [e.g. the Antarctic study of Franzke, J. Climate, 2010] have strongly suggested that surface temperatures exhibit long-range dependence (LRD). The presence of LRD would hamper the identification of deterministic trends and the quantification of their significance. It is well established that LRD processes exhibit stochastic trends over rather long periods of time. Thus, accurate methods for discriminating between physical processes that possess long memory and those that do not are an important adjunct to climate modeling. As we briefly review, the LRD idea originated at the same time as H-self-similarity, so it is often not realised that a model does not have to be H-self-similar to show LRD [e.g. Watkins, GRL Frontiers, 2013]. We have used Markov chain Monte Carlo algorithms to perform a Bayesian analysis of Auto-Regressive Fractionally-Integrated Moving-Average ARFIMA(p,d,q) processes, which are capable of modeling LRD. Our principal aim is to obtain inference about the long memory parameter, d, with secondary interest in the scale and location parameters. We have developed a reversible-jump method enabling us to integrate over different model forms for the short memory component. We initially assume Gaussianity, and have tested the method on both synthetic and physical time series. Many physical processes, for example the Faraday Antarctic time series, are significantly non-Gaussian. We have therefore extended this work by weakening the Gaussianity assumption, assuming an alpha-stable distribution for the innovations, and performing joint inference on d and alpha. Such a modified ARFIMA(p,d,q) process is a flexible, initial model for non-Gaussian processes with long memory. We will present a study of the dependence of the posterior variance of the memory parameter d on the length of the time series considered. This will be compared with equivalent error diagnostics for other measures of d.
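As a concrete aside on the ARFIMA machinery (a standard textbook recursion, not the authors' code): the fractional difference operator (1 - B)^d has weights that can be generated recursively, and their slow hyperbolic decay, roughly k^(-d-1), is what encodes long memory for 0 < d < 0.5.

```python
# Weights of the fractional difference operator (1 - B)^d,
# generated by the standard recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k.
def frac_diff_weights(d, n_terms):
    w = [1.0]
    for k in range(1, n_terms):
        w.append(w[-1] * (k - 1 - d) / k)
    return w

d = 0.3  # illustrative long-memory parameter
w = frac_diff_weights(d, 200)

# The weights decay hyperbolically (about k**(-d-1)), far more slowly than
# the geometric decay of a short-memory AR component.
print(w[:4])
```

Truncating this expansion is one simple way to simulate or filter an ARFIMA(0,d,0) process.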
Determination of Orbital Parameters for Visual Binary Stars Using a Fourier-Series Approach
NASA Astrophysics Data System (ADS)
Brown, D. E.; Prager, J. R.; DeLeo, G. G.; McCluskey, G. E., Jr.
2001-12-01
We expand on the Fourier transform method of Monet (ApJ 234, 275, 1979) to infer the orbital parameters of visual binary stars, and we present results for several systems, both simulated and real. Although originally developed to address binary systems observed through at least one complete period, we have extended the method to deal explicitly with cases where the orbital data is less complete. This is especially useful in cases where the period is so long that only a fragment of the orbit has been recorded. We utilize Fourier-series fitting methods appropriate to data sets covering less than one period and containing random measurement errors. In so doing, we address issues of over-determination in fitting the data and the reduction of other deleterious Fourier-series artifacts. We developed our algorithm using the MAPLE mathematical software code, and tested it on numerous "synthetic" systems, and several real binaries, including Xi Boo, 24 Aqr, and Bu 738. This work was supported at Lehigh University by the Delaware Valley Space Grant Consortium and by NSF-REU grant PHY-9820301.
Disentangling the stochastic behavior of complex time series
NASA Astrophysics Data System (ADS)
Anvari, Mehrnaz; Tabar, M. Reza Rahimi; Peinke, Joachim; Lehnertz, Klaus
2016-10-01
Complex systems involving a large number of degrees of freedom generally exhibit non-stationary dynamics, which can result in either continuous or discontinuous sample paths of the corresponding time series. The latter sample paths may be caused by discontinuous events - or jumps - with some distributed amplitudes, and disentangling effects caused by such jumps from effects caused by normal diffusion processes is a main problem for a detailed understanding of stochastic dynamics of complex systems. Here we introduce a non-parametric method to address this general problem. By means of a stochastic dynamical jump-diffusion modelling, we separate deterministic drift terms from different stochastic behaviors, namely diffusive and jumpy ones, and show that all of the unknown functions and coefficients of this modelling can be derived directly from measured time series. We demonstrate applicability of our method to empirical observations by a data-driven inference of the deterministic drift term and of the diffusive and jumpy behavior in brain dynamics from ten epilepsy patients. In particular, these different stochastic behaviors provide extra information that can be regarded as valuable for diagnostic purposes.
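A minimal forward simulation of the kind of jump-diffusion model described above (an Euler scheme with assumed mean-reverting drift, diffusion, and jump parameters; the paper's contribution, the non-parametric estimation of these terms from data, is not reproduced here):

```python
import random

random.seed(2)

# Euler scheme for a jump-diffusion: mean-reverting drift, Gaussian diffusion,
# plus Poisson-arriving jumps with Gaussian amplitudes (illustrative values).
theta, sigma = 1.0, 0.5          # drift strength, diffusion amplitude
jump_rate, jump_sd = 0.1, 2.0    # jumps per unit time, jump amplitude sd
dt, n = 0.01, 5000
x = [0.0]
n_jumps = 0
for _ in range(n):
    dx = -theta * x[-1] * dt + sigma * (dt ** 0.5) * random.gauss(0, 1)
    if random.random() < jump_rate * dt:  # a jump arrives in this step
        dx += random.gauss(0, jump_sd)
        n_jumps += 1
    x.append(x[-1] + dx)
print(f"simulated {len(x)} points with {n_jumps} jumps")
```

The jumps produce the discontinuous sample paths whose contribution the authors' method separates from ordinary diffusion.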
DOE Office of Scientific and Technical Information (OSTI.GOV)
McLoughlin, K.
2016-01-22
The software application “MetaQuant” was developed by our group at Lawrence Livermore National Laboratory (LLNL). It is designed to profile microbial populations in a sample using data from whole-genome shotgun (WGS) metagenomic DNA sequencing. Several other metagenomic profiling applications have been described in the literature. We ran a series of benchmark tests to compare the performance of MetaQuant against that of a few existing profiling tools, using real and simulated sequence datasets. This report describes our benchmarking procedure and results.
Atomic hydrogen and diatomic titanium-monoxide molecular spectroscopy in laser-induced plasma
NASA Astrophysics Data System (ADS)
Parigger, Christian G.; Woods, Alexander C.
2017-03-01
This article gives a brief review of experimental studies of hydrogen Balmer series emission spectra. Ongoing research aims to evaluate early plasma evolution following optical breakdown in laboratory air. Also of interest are laser ablation of metallic titanium and the characterization of plasma evolution. Emission of titanium monoxide is discussed together with modeling of diatomic spectra to infer temperature. The behavior of titanium particles in plasma draws research interest ranging from the modeling of stellar atmospheres to the enhancement of thin-film production via pulsed laser deposition.
JCell--a Java-based framework for inferring regulatory networks from time series data.
Spieth, C; Supper, J; Streichert, F; Speer, N; Zell, A
2006-08-15
JCell is a Java-based application for reconstructing gene regulatory networks from experimental data. The framework provides several algorithms to identify genetic and metabolic dependencies based on experimental data conjoint with mathematical models to describe and simulate regulatory systems. Owing to the modular structure, researchers can easily implement new methods. JCell is a pure Java application with additional scripting capabilities and thus widely usable, e.g. on parallel or cluster computers. The software is freely available for download at http://www-ra.informatik.uni-tuebingen.de/software/JCell.
Permanent draft genomes of the Rhodopirellula maiorica strain SM1.
Richter, Michael; Richter-Heitmann, Tim; Klindworth, Anna; Wegner, Carl-Eric; Frank, Carsten S; Harder, Jens; Glöckner, Frank Oliver
2014-02-01
The genome of Rhodopirellula maiorica strain SM1 was sequenced as a permanent draft to complement the full genome sequence of the type strain Rhodopirellula baltica SH1(T). This isolate is part of a larger study to infer the biogeography of Rhodopirellula species in European marine waters, as well as to amend the genus description of R. baltica. This genomics resource article is the fifth of a series of five publications reporting in total eight new permanent draft genomes of Rhodopirellula species.
1980-12-01
Mountains Anticline north of the Butter Creek drainage (sec. 4, T1, R2SE) at Service Butte, and is inferred to continue along a series of small, isolated...Anticline. The Butter Creek drainage lies on the northern flank of the east-west trending Reith Anticline and associated folds in an area where regional...Anticline. South of Butter Creek, about 1 mile (1.6 km) west of Vey Ranch, a major change in upland maximum elevation was observed with the eastern
A study of remote sensing as applied to regional and small watersheds. Volume 1: Summary report
NASA Technical Reports Server (NTRS)
Ambaruch, R.
1974-01-01
The accuracy of remotely sensed measurements to provide inputs to hydrologic models of watersheds is studied. A series of sensitivity analyses on continuous simulation models of three watersheds determined: (1) optimal values and permissible tolerances of inputs to achieve accurate simulation of streamflow from the watersheds; (2) which model inputs can be quantified from remote sensing, directly, indirectly, or by inference; and (3) how accurate remotely sensed measurements (from spacecraft or aircraft) must be to provide a basis for quantifying model inputs within permissible tolerances.
Local dependence in random graph models: characterization, properties and statistical inference
Schweinberger, Michael; Handcock, Mark S.
2015-01-01
Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference. We take first steps to characterize local dependence in random graph models, inspired by the notion of finite neighbourhoods in spatial statistics and M-dependence in time series, and we show that local dependence endows random graph models with desirable properties which make them amenable to statistical inference. We show that random graph models with local dependence satisfy a natural domain consistency condition which every model should satisfy, but conventional exponential family random graph models do not satisfy. In addition, we establish a central limit theorem for random graph models with local dependence, which suggests that random graph models with local dependence are amenable to statistical inference. We discuss how random graph models with local dependence can be constructed by exploiting either observed or unobserved neighbourhood structure. In the absence of observed neighbourhood structure, we take a Bayesian view and express the uncertainty about the neighbourhood structure by specifying a prior on a set of suitable neighbourhood structures. We present simulation results and applications to two real world networks with ‘ground truth’. PMID:26560142
Statistical inference for noisy nonlinear ecological dynamic systems.
Wood, Simon N
2010-08-26
Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
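The synthetic-likelihood recipe above (reduce the data to summary statistics, simulate replicates to estimate the statistics' mean and variance, then score parameters with a Gaussian likelihood) can be sketched in a toy setting. All choices here are illustrative assumptions: an AR(1) stand-in for the dynamic model and a single summary statistic instead of the paper's vector of phase-insensitive statistics.

```python
import math
import random

random.seed(3)

# Toy "dynamic model": AR(1) with unknown coefficient phi.
def simulate(phi, n=200):
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + random.gauss(0, 1))
    return x

# Phase-insensitive summary statistic: the lag-1 autocorrelation.
def lag1_autocorr(x):
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    den = sum((v - m) ** 2 for v in x)
    return num / den

s_obs = lag1_autocorr(simulate(0.6))  # pretend this is the observed data

def synthetic_loglik(phi, n_sim=200):
    # Simulate replicates, fit a Gaussian to the summary statistic, and
    # evaluate the observed statistic under that Gaussian.
    stats = [lag1_autocorr(simulate(phi)) for _ in range(n_sim)]
    mu = sum(stats) / len(stats)
    var = sum((s - mu) ** 2 for s in stats) / (len(stats) - 1)
    return -0.5 * (math.log(2 * math.pi * var) + (s_obs - mu) ** 2 / var)

best = max([0.1, 0.3, 0.5, 0.6, 0.7, 0.9], key=synthetic_loglik)
print(f"phi maximising the synthetic likelihood: {best}")
```

In the paper the grid search is replaced by MCMC over the synthetic likelihood, and the Gaussian is fitted to a vector of statistics with a full covariance matrix.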
Godsey, Brian; Heiser, Diane; Civin, Curt
2012-01-01
MicroRNAs (miRs) are known to play an important role in mRNA regulation, often by binding to complementary sequences in "target" mRNAs. Recently, several methods have been developed by which existing sequence-based target predictions can be combined with miR and mRNA expression data to infer true miR-mRNA targeting relationships. It has been shown that the combination of these two approaches gives more reliable results than either by itself. While a few such algorithms give excellent results, none fully addresses expression data sets with a natural ordering of the samples. If the samples in an experiment can be ordered or partially ordered by their expected similarity to one another, such as for time series or studies of developmental processes, stages, or types (e.g. cell type, disease, growth, aging), there are unique opportunities to infer miR-mRNA interactions that may be specific to the underlying processes, and existing methods do not exploit this. We propose an algorithm which specifically addresses [partially] ordered expression data and takes advantage of sample similarities based on the ordering structure. This is done within a Bayesian framework which specifies posterior distributions, and therefore statistical significance, for each model parameter and latent variable. We apply our model to a previously published expression data set of paired miR and mRNA arrays in five partially ordered conditions, with biological replicates, related to multiple myeloma, and we show how considering potential orderings can improve the inference of miR-mRNA interactions, as measured by existing knowledge about the involved transcripts.
Kepler light-curve analysis of the blazar W2R 1926+42
NASA Astrophysics Data System (ADS)
Mohan, P.; Gupta, Alok C.; Bachev, Rumen; Strigachev, Anton
2016-02-01
We study the long-term Kepler light curve of the blazar W2R 1926+42 (˜1.6 yr), which indicates a variety of variability properties during different intervals of observation. The normalized excess variance, Fvar, ranges from 1.8 per cent in the quiescent phase to 43.3 per cent in the outburst phase. We find no significant deviation from linearity in the Fvar-flux relation. Time series analysis is conducted using the Fourier power spectrum and the wavelet analysis methods to study the power spectral density (PSD) shape, infer characteristic time-scales and statistically significant quasi-periodic oscillations (QPOs). A bending power law with an associated time-scale of T_B = 6.2^{+6.4}_{-3.1} hours is inferred in the PSD analysis. We obtain a black hole mass of M• = (1.5-5.9) × 10^7 M⊙ for the first time using Fvar and the bend time-scale for this source. From a mean outburst lifetime of days, we infer a distance from the jet base r ≤ 1.75 pc, indicating that the outburst originates due to a shock. A possible QPO peaked at 9.1 d and lasting 3.4 cycles is inferred from the wavelet analysis. Assuming that the QPO is a true feature, we infer r = (152-378) GM•/c^2; supported by the other timing analysis products, such as a weighted mean PSD slope of -1.5 ± 0.2 from the PSD analysis, we argue that the observed variability and the weak, short-duration QPO could be due to jet-based processes, including orbital features in a relativistic helical jet as well as shocks and turbulence.
Effective network inference through multivariate information transfer estimation
NASA Astrophysics Data System (ADS)
Dahlqvist, Carl-Henrik; Gnabo, Jean-Yves
2018-06-01
Network representation has steadily gained in popularity over the past decades. In many disciplines, such as finance, genetics, neuroscience or human travel to cite a few, the network may not be directly observable and needs to be inferred from time-series data, leading to the issue of separating direct interactions between two entities forming the network from indirect interactions coming through its remaining part. Drawing on recent contributions proposing strategies to deal with this problem, such as the so-called "global silencing" approach of Barzel and Barabasi or the "network deconvolution" of Feizi et al. (2013), we propose a novel methodology to infer an effective network structure from multivariate conditional information transfers. Its core principle is to test the information transfer between two nodes through a step-wise approach by conditioning the transfer for each pair on a specific set of relevant nodes, as identified by our algorithm from the rest of the network. The methodology is model-free and can be applied to high-dimensional networks with both inter-lag and intra-lag relationships. It outperforms state-of-the-art approaches for eliminating redundancies and, more generally, for retrieving simulated artificial networks in our Monte Carlo experiments. We apply the method to stock market data at different frequencies (15 min, 1 h, 1 day) to retrieve the network of the largest US financial institutions and then document how banks' centrality measurements relate to banks' systemic vulnerability.
Reverse engineering biological networks: applications in immune responses to bio-toxins.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martino, Anthony A.; Sinclair, Michael B.; Davidson, George S.
Our aim is to determine the network of events, or the regulatory network, that defines an immune response to a bio-toxin. As a model system, we are studying the T cell regulatory network triggered through tyrosine kinase receptor activation using a combination of pathway stimulation and time-series microarray experiments. Our approach is composed of five steps: (1) microarray experiments and data error analysis, (2) data clustering, (3) data smoothing and discretization, (4) network reverse engineering, and (5) network dynamics analysis and fingerprint identification. The technological outcome of this study is a suite of experimental protocols and computational tools that reverse engineer regulatory networks provided gene expression data. The practical biological outcome of this work is an immune response fingerprint in terms of gene expression levels. Inferring regulatory networks from microarray data is a new field of investigation that is no more than five years old. To the best of our knowledge, this work is the first attempt that integrates experiments, error analyses, data clustering, inference, and network analysis to solve a practical problem. Our systematic approach of counting, enumeration, and sampling networks matching experimental data is new to the field of network reverse engineering. The resulting mathematical analyses and computational tools lead to new results on their own and should be useful to others who analyze and infer networks.
A new data-driven model for post-transplant antibody dynamics in high risk kidney transplantation.
Zhang, Yan; Briggs, David; Lowe, David; Mitchell, Daniel; Daga, Sunil; Krishnan, Nithya; Higgins, Robert; Khovanova, Natasha
2017-02-01
The dynamics of donor specific human leukocyte antigen antibodies during early stage after kidney transplantation are of great clinical interest as these antibodies are considered to be associated with short and long term clinical outcomes. The limited number of antibody time series and their diverse patterns have made the task of modelling difficult. Focusing on one typical post-transplant dynamic pattern with rapid falls and stable settling levels, a novel data-driven model has been developed for the first time. A variational Bayesian inference method has been applied to select the best model and learn its parameters for 39 time series from two groups of graft recipients, i.e. patients with and without acute antibody-mediated rejection (AMR) episodes. Linear and nonlinear dynamic models of different order were attempted to fit the time series, and the third order linear model provided the best description of the common features in both groups. Both deterministic and stochastic parameters are found to be significantly different in the AMR and no-AMR groups showing that the time series in the AMR group have significantly higher frequency of oscillations and faster dissipation rates. This research may potentially lead to better understanding of the immunological mechanisms involved in kidney transplantation.
Muhs, Daniel; Simmons, Kathleen R.
2017-01-01
Although uranium-series (U-series) ages of growth-position fossil corals are important to Quaternary sea-level history, coral clast reworking from storms can yield ages on a terrace dating to more than one high-sea stand, confounding interpretations of sea-level history. On northern Barbados, U-series ages of corals from a thick storm deposit are not always younger with successively higher stratigraphic positions, but all date to the last interglacial period (~127 ka to ~112 ka), Marine Isotope Substage (MIS) 5.5. The storm deposit ages are consistent with the ages of growth-position corals found at the base of the section and at landward localities on this terrace. Thus, in this case, analysis of only a few corals would not have led to an error in interpreting sea-level history. In contrast, a notch cut into older Pleistocene limestone below the MIS 5.5 terrace contains corals that date to both MIS 5.5 (~125 ka) and MIS 5.3 (~108 ka). We infer that the notch formed during MIS 5.3 and the MIS 5.5 corals are reworked. Similar multiple ages of corals on terraces have been reported elsewhere on Barbados. Thus, care must be taken in interpreting U-series ages of corals that are reported without consideration of taphonomy.
NASA Astrophysics Data System (ADS)
Godsey, S. E.; Kirchner, J. W.
2008-12-01
The mean residence time - the average time that it takes rainfall to reach the stream - is a basic parameter used to characterize catchment processes. Heterogeneities in these processes lead to a distribution of travel times around the mean residence time. By examining this travel time distribution, we can better predict catchment response to contamination events. A catchment system with shorter residence times or narrower distributions will respond quickly to contamination events, whereas systems with longer residence times or longer-tailed distributions will respond more slowly to those same contamination events. The travel time distribution of a catchment is typically inferred from time series of passive tracers (e.g., water isotopes or chloride) in precipitation and streamflow. Variations in the tracer concentration in streamflow are usually damped compared to those in precipitation, because precipitation inputs from different storms (with different tracer signatures) are mixed within the catchment. Mathematically, this mixing process is represented by the convolution of the travel time distribution and the precipitation tracer inputs to generate the stream tracer outputs. Because convolution in the time domain is equivalent to multiplication in the frequency domain, it is relatively straightforward to estimate the parameters of the travel time distribution in either domain. In the time domain, the parameters describing the travel time distribution are typically estimated by maximizing the goodness of fit between the modeled and measured tracer outputs. In the frequency domain, the travel time distribution parameters can be estimated by fitting a power-law curve to the ratio of precipitation spectral power to stream spectral power. Differences between the methods of parameter estimation in the time and frequency domain mean that these two methods may respond differently to variations in data quality, record length and sampling frequency. 
Here we evaluate how well these two methods of travel time parameter estimation respond to different sources of uncertainty and compare the methods to one another. We do this by generating synthetic tracer input time series of different lengths and convolving these with specified travel-time distributions to generate synthetic output time series. We then sample both the input and output time series at various sampling intervals and corrupt the time series with realistic error structures. Using these 'corrupted' time series, we infer the apparent travel time distribution and compare it to the known distribution that was used to generate the synthetic data in the first place. This analysis allows us to quantify how different record lengths, sampling intervals, and error structures in the tracer measurements affect the apparent mean residence time and the apparent shape of the travel time distribution.
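The convolution step described above can be sketched directly (the exponential travel-time distribution, its mean, and daily sampling are illustrative assumptions): the stream tracer series is the rainfall tracer input convolved with the travel-time distribution, and the mixing visibly damps the input variance.

```python
import math
import random

random.seed(4)

# Synthetic precipitation tracer input (e.g. storm-to-storm isotope variation).
n_days = 365
rain_tracer = [random.gauss(0.0, 1.0) for _ in range(n_days)]

# Exponential travel-time distribution, discretised daily, mean tau days.
tau = 20.0
kernel = [math.exp(-t / tau) / tau for t in range(120)]
s = sum(kernel)
kernel = [k / s for k in kernel]  # normalise the truncated kernel

# Stream tracer output = convolution of input with the travel-time
# distribution; mixing over ~tau days damps the input variability.
stream = [sum(kernel[k] * rain_tracer[t - k] for k in range(min(t + 1, len(kernel))))
          for t in range(n_days)]

var_in = sum(v * v for v in rain_tracer) / n_days
var_out = sum(v * v for v in stream) / n_days
print(f"input variance {var_in:.2f}, damped output variance {var_out:.2f}")
```

Fitting the kernel's parameters to observed input/output pairs, in either the time or frequency domain, is exactly the estimation problem the abstract compares.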
The statistical analysis of circadian phase and amplitude in constant-routine core-temperature data
NASA Technical Reports Server (NTRS)
Brown, E. N.; Czeisler, C. A.
1992-01-01
Accurate estimation of the phases and amplitude of the endogenous circadian pacemaker from constant-routine core-temperature series is crucial for making inferences about the properties of the human biological clock from data collected under this protocol. This paper presents a set of statistical methods based on a harmonic-regression-plus-correlated-noise model for estimating the phases and the amplitude of the endogenous circadian pacemaker from constant-routine core-temperature data. The methods include a Bayesian Monte Carlo procedure for computing the uncertainty in these circadian functions. We illustrate the techniques with a detailed study of a single subject's core-temperature series and describe their relationship to other statistical methods for circadian data analysis. In our laboratory, these methods have been successfully used to analyze more than 300 constant routines and provide a highly reliable means of extracting phase and amplitude information from core-temperature data.
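A minimal version of harmonic regression for circadian phase and amplitude (evenly sampled synthetic data with illustrative parameter values; the correlated-noise component and Bayesian Monte Carlo uncertainty machinery of the paper are omitted):

```python
import math
import random

random.seed(5)

# Synthetic core-temperature series: a 24 h sinusoid plus noise, hourly samples.
period, n = 24.0, 240  # ten full periods
amp_true, phase_true, mean_true = 0.4, 2.0, 37.0
y = [mean_true + amp_true * math.cos(2 * math.pi * t / period - phase_true)
     + random.gauss(0, 0.1) for t in range(n)]

# Harmonic regression: with even sampling over whole periods the cosine and
# sine regressors are orthogonal, so the coefficients are simple projections.
a = (2.0 / n) * sum(y[t] * math.cos(2 * math.pi * t / period) for t in range(n))
b = (2.0 / n) * sum(y[t] * math.sin(2 * math.pi * t / period) for t in range(n))
amp_hat = math.hypot(a, b)            # amplitude of the fitted harmonic
phase_hat = math.atan2(b, a)          # phase in radians
print(f"amplitude {amp_hat:.3f} (true {amp_true}), phase {phase_hat:.3f} (true {phase_true})")
```

Since y = A cos(wt - phi) expands to A cos(phi) cos(wt) + A sin(phi) sin(wt), the projections a and b recover A cos(phi) and A sin(phi) directly.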
Aspinall, Richard
2004-08-01
This paper develops an approach to modelling land use change that links model selection and multi-model inference with empirical models and GIS. Land use change is frequently studied, and understanding gained, through a process of modelling that is an empirical analysis of documented changes in land cover or land use patterns. The approach here is based on analysis and comparison of multiple models of land use patterns using model selection and multi-model inference. The approach is illustrated with a case study of rural housing as it has developed for part of Gallatin County, Montana, USA. A GIS contains the location of rural housing on a yearly basis from 1860 to 2000. The database also documents a variety of environmental and socio-economic conditions. A general model of settlement development describes the evolution of drivers of land use change and their impacts in the region. This model is used to develop a series of different models reflecting drivers of change at different periods in the history of the study area. These period specific models represent a series of multiple working hypotheses describing (a) the effects of spatial variables as a representation of social, economic and environmental drivers of land use change, and (b) temporal changes in the effects of the spatial variables as the drivers of change evolve over time. Logistic regression is used to calibrate and interpret these models and the models are then compared and evaluated with model selection techniques. Results show that different models are 'best' for the different periods. The different models for different periods demonstrate that models are not invariant over time which presents challenges for validation and testing of empirical models. 
The research demonstrates (i) model selection as a mechanism for choosing among many plausible models that describe land cover or land use patterns, (ii) inference from a set of models rather than from a single model, (iii) that models can be developed based on hypothesised relationships based on consideration of underlying and proximate causes of change, and (iv) that models are not invariant over time.
Kipiński, Lech; König, Reinhard; Sielużycki, Cezary; Kordecki, Wojciech
2011-10-01
Stationarity is a crucial yet rarely questioned assumption in the analysis of time series of magneto- (MEG) or electroencephalography (EEG). One key drawback of the commonly used tests for stationarity of encephalographic time series is the fact that conclusions on stationarity are only indirectly inferred either from the Gaussianity (e.g. the Shapiro-Wilk test or Kolmogorov-Smirnov test) or the randomness of the time series and the absence of trend using very simple time-series models (e.g. the sign and trend tests by Bendat and Piersol). We present a novel approach to the analysis of the stationarity of MEG and EEG time series by applying modern statistical methods which were specifically developed in econometrics to verify the hypothesis that a time series is stationary. We report our findings of the application of three different tests of stationarity--the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test for trend or mean stationarity, the Phillips-Perron (PP) test for the presence of a unit root and the White test for homoscedasticity--on an illustrative set of MEG data. For five stimulation sessions, we found, already for short epochs of 250 and 500 ms duration, that, although the majority of the studied epochs of single MEG trials were usually mean-stationary (KPSS test and PP test), they were classified as nonstationary due to their heteroscedasticity (White test). We also observed that the presence of external auditory stimulation did not significantly affect the findings regarding the stationarity of the data. We conclude that the combination of these tests allows a refined analysis of the stationarity of MEG and EEG time series.
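The level-stationarity KPSS statistic is simple enough to sketch directly: cumulate the demeaned series and normalize the summed squared partial sums by a Newey-West estimate of the long-run variance. The lag choice here and the 5% critical value (0.463 for level stationarity) follow standard tables; this is a bare-bones illustration of one of the three tests, not the full battery used in the paper:

```python
import numpy as np

def kpss_level(y, lags=4):
    """KPSS statistic for the null of level stationarity, with a
    Bartlett-weighted Newey-West long-run variance estimate.
    Large values argue against stationarity."""
    y = np.asarray(y, dtype=float)
    n = y.size
    e = y - y.mean()                       # residuals from a constant mean
    s = np.cumsum(e)                       # partial sums of residuals
    lrv = e @ e / n                        # long-run variance, lag-0 term
    for k in range(1, lags + 1):
        w = 1.0 - k / (lags + 1.0)         # Bartlett kernel weight
        lrv += 2.0 * w * (e[:-k] @ e[k:]) / n
    return (s @ s) / (n ** 2 * lrv)

rng = np.random.default_rng(2)
white_noise = rng.normal(size=500)              # stationary series
random_walk = np.cumsum(rng.normal(size=500))   # unit root, nonstationary

stat_wn = kpss_level(white_noise)   # typically small under stationarity
stat_rw = kpss_level(random_walk)   # large, exceeding the 5% value 0.463
```

Production analyses would use a vetted implementation (e.g. the KPSS routine in an econometrics library) together with unit-root and heteroscedasticity tests, as the paper does.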
Challenges to validity in single-group interrupted time series analysis.
Linden, Ariel
2017-04-01
Single-group interrupted time series analysis (ITSA) is a popular evaluation methodology in which a single unit of observation is studied; the outcome variable is serially ordered as a time series, and the intervention is expected to "interrupt" the level and/or trend of the time series, subsequent to its introduction. The most common threat to validity is history-the possibility that some other event caused the observed effect in the time series. Although history limits the ability to draw causal inferences from single ITSA models, it can be controlled for by using a comparable control group to serve as the counterfactual. Time series data from 2 natural experiments (effect of Florida's 2000 repeal of its motorcycle helmet law on motorcycle fatalities and California's 1988 Proposition 99 to reduce cigarette sales) are used to illustrate how history biases single-group ITSA results, as opposed to when that group's results are contrasted to those of a comparable control group. In the first example, an external event occurring at the same time as the helmet repeal appeared to be the cause of a rise in motorcycle deaths, but was only revealed when Florida was contrasted with comparable control states. Conversely, in the second example, a decreasing trend in cigarette sales prior to the intervention raised questions about a treatment effect attributed to Proposition 99, but was reinforced when California was contrasted with comparable control states. Results of single-group ITSA should be considered preliminary, and interpreted with caution, until a more robust study design can be implemented. © 2016 John Wiley & Sons, Ltd.
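The single-group ITSA model itself is just a segmented regression with a level-change and a trend-change term. A minimal sketch on simulated data (the effect sizes and intervention timing below are invented for illustration, not taken from the helmet or Proposition 99 examples):

```python
import numpy as np

rng = np.random.default_rng(3)
n, t0 = 120, 60                            # monthly series, intervention at t0
t = np.arange(n)
d = (t >= t0).astype(float)                # post-intervention indicator

# Simulated outcome: baseline trend, then a level jump and a slope change
y = 10.0 + 0.05 * t + 5.0 * d + 0.30 * (t - t0) * d \
    + rng.normal(scale=1.0, size=n)

# ITSA design: intercept, time, level change, post-intervention trend change
X = np.column_stack([np.ones(n), t, d, (t - t0) * d])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
level_change, trend_change = beta[2], beta[3]
```

The paper's point is that this model cannot distinguish the intervention from a co-occurring historical event; adding a comparable control series and interacting the segmented terms with a treatment indicator is what supplies the counterfactual.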
Modeling of nutation-precession: Very long baseline interferometry results
NASA Astrophysics Data System (ADS)
Herring, T. A.; Mathews, P. M.; Buffett, B. A.
2002-04-01
Analysis of over 20 years of very long baseline interferometry data (VLBI) yields estimates of the coefficients of the nutation series with standard deviations ranging from 5 microseconds of arc (μas) for the terms with periods <400 days to 38 μas for the longest-period terms. The largest deviations between the VLBI estimates of the amplitudes of terms in the nutation series and the theoretical values from the Mathews-Herring-Buffett (MHB2000) nutation series are 56 +/- 38 μas (associated with two of the 18.6 year nutations). The amplitudes of nutational terms with periods <400 days deviate from the MHB2000 nutation series values at the level of their standard deviations. The estimated correction to the IAU-1976 precession constant is -2.997 +/- 0.008 mas yr-1 when the coefficients of the MHB2000 nutation series are held fixed and is consistent with that inferred from the MHB2000 nutation theory. The secular change in the obliquity of the ecliptic is estimated to be -0.252 +/- 0.003 mas yr-1. When the coefficients of the largest-amplitude terms in the nutation series are estimated, the precession constant correction and obliquity rate are estimated to be -2.960 +/- 0.030 and -0.237 +/- 0.012 mas yr-1. Significant variations in the freely excited retrograde free core nutation mode are observed over the 20 years. During this time the amplitude has decreased from ~300 +/- 50 μas in the mid-1980s to nearly zero by the year 2000. There is evidence that the amplitude of the mode is now increasing again.
NASA Astrophysics Data System (ADS)
Masiokas, M. H.; Villalba, R.; Christie, D. A.; Betman, E.; Luckman, B. H.; Le Quesne, C.; Prieto, M. R.; Mauget, S.
2012-03-01
The Andean snowpack is the main source of freshwater and arguably the single most important natural resource for the populated, semi-arid regions of central Chile and central-western Argentina. However, apart from recent analyses of instrumental snowpack data, very little is known about the long-term variability of this key natural resource. Here we present two complementary, annually-resolved reconstructions of winter snow accumulation in the southern Andes between 30°-37°S. The reconstructions cover the past 850 years and were developed using simple regression models based on snowpack proxies with different inherent limitations. Rainfall data from central Chile (very strongly correlated with snow accumulation values in the adjacent mountains) were used to extend a regional 1951-2010 snowpack record back to AD 1866. Subsequently, snow accumulation variations since AD 1150 were inferred from precipitation-sensitive tree-ring width series. The reconstructed snowpack values were validated with independent historical and instrumental information. An innovative time series analysis approach allowed the identification of the onset, duration and statistical significance of the main intra- to multi-decadal patterns in the reconstructions and indicates that variations observed in the last 60 years are not particularly anomalous when assessed in a multi-century context. In addition to providing new information on past variations for a highly relevant hydroclimatic variable in the southern Andes, the snowpack reconstructions can also be used to improve the understanding and modeling of related, larger-scale atmospheric features such as ENSO and the PDO.
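The regression-based reconstruction step can be sketched as a split-sample calibration/verification exercise: calibrate a linear model of snowpack on a proxy over one subperiod, predict the withheld subperiod, and check skill with the reduction-of-error (RE) statistic that is standard in this kind of work. Everything below is simulated for illustration; it does not use the actual tree-ring or snowpack records:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60                                      # years of overlap (illustrative)
proxy = rng.normal(size=n)                  # tree-ring width index
snow = 1.0 + 0.8 * proxy + 0.4 * rng.normal(size=n)  # instrumental snowpack

cal, ver = slice(0, 30), slice(30, 60)      # split-sample periods

# Calibrate a simple linear regression on the first half
A = np.column_stack([np.ones(30), proxy[cal]])
b = np.linalg.lstsq(A, snow[cal], rcond=None)[0]

# Verify on the withheld half with the reduction-of-error statistic:
# RE = 1 - SSE(model) / SSE(calibration-period mean as the predictor)
pred = b[0] + b[1] * proxy[ver]
sse_model = np.sum((snow[ver] - pred) ** 2)
sse_mean = np.sum((snow[ver] - snow[cal].mean()) ** 2)
re = 1.0 - sse_model / sse_mean             # RE > 0 indicates predictive skill
```

A positive RE means the calibrated model beats simply carrying the calibration-period mean forward, which is the usual minimum bar before extending a reconstruction beyond the instrumental era.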
Zeri, Marcelo; Sá, Leonardo D A; Manzi, Antônio O; Araújo, Alessandro C; Aguiar, Renata G; von Randow, Celso; Sampaio, Gilvan; Cardoso, Fernando L; Nobre, Carlos A
2014-01-01
The carbon and water cycles for a southwestern Amazonian forest site were investigated using the longest time series of fluxes of CO2 and water vapor ever reported for this site. The period from 2004 to 2010 included two severe droughts (2005 and 2010) and a flooding year (2009). The effects of such climate extremes were detected in annual sums of fluxes as well as in other components of the carbon and water cycles, such as gross primary production and water use efficiency. Gap-filling and flux-partitioning were applied in order to fill gaps due to missing data, and error analysis made it possible to infer the uncertainty in the carbon balance. Overall, the site was found to have a net carbon uptake of ≈5 t C ha(-1) year(-1), but the effects of the drought of 2005 were still noticed in 2006, when the climate disturbance caused the site to become a net source of carbon to the atmosphere. Different regions of the Amazon forest might respond differently to climate extremes due to differences in dry season length, annual precipitation, species compositions, albedo and soil type. Longer time series of fluxes measured over several locations are required to better characterize the effects of climate anomalies on the carbon and water balances for the whole Amazon region. Such valuable datasets can also be used to calibrate biogeochemical models and inform future scenarios of the Amazon forest carbon balance under the influence of climate change.
Linden, Ariel; Adams, John L
2011-12-01
Often, when conducting programme evaluations or studying the effects of policy changes, researchers may only have access to aggregated time series data, presented as observations spanning both the pre- and post-intervention periods. The most basic analytic model using these data requires only a single group and models the intervention effect using repeated measurements of the dependent variable. This model controls for regression to the mean and is likely to detect a treatment effect if it is sufficiently large. However, many potential sources of bias still remain. Adding one or more control groups to this model could strengthen causal inference if the groups are comparable on pre-intervention covariates and level and trend of the dependent variable. If this condition is not met, the validity of the study findings could be called into question. In this paper we describe a propensity score-based weighted regression model, which overcomes these limitations by weighting the control groups to represent the average outcome that the treatment group would have exhibited in the absence of the intervention. We illustrate this technique studying cigarette sales in California before and after the passage of Proposition 99 in California in 1989. While our results were similar to those of the Synthetic Control method, the weighting approach has the advantage of being technically less complicated, rooted in regression techniques familiar to most researchers, and easy to implement using any basic statistical software; it can accommodate any number of treatment units and allows for greater flexibility in the choice of treatment effect estimators. © 2010 Blackwell Publishing Ltd.
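The weighting idea can be sketched with simulated data: controls receive weights p/(1 - p) from their propensity scores so that, after weighting, they represent the outcome the treated group would have exhibited without the intervention. For brevity the sketch treats the propensity model as known rather than estimating it, and all numbers are invented; the paper's application fits the propensity model to pre-intervention covariates and series:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
x = rng.normal(size=n)                      # pre-intervention covariate
p = 1.0 / (1.0 + np.exp(-x))                # propensity score (assumed known)
treated = rng.random(n) < p                 # treatment depends on x

true_effect = 3.0
y = 2.0 * x + true_effect * treated + rng.normal(size=n)

# ATT weights: 1 for treated units, p/(1-p) for controls
w = np.where(treated, 1.0, p / (1.0 - p))

treated_mean = y[treated].mean()
weighted_control_mean = np.average(y[~treated], weights=w[~treated])
att_hat = treated_mean - weighted_control_mean   # effect on the treated
```

An unweighted comparison of means here would be badly biased (treated units have systematically higher x); the weighting removes that imbalance without requiring the more elaborate Synthetic Control machinery.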
Translating research in elder care: an introduction to a study protocol series
Estabrooks, Carole A; Hutchinson, Alison M; Squires, Janet E; Birdsell, Judy; Cummings, Greta G; Degner, Lesley; Morgan, Debra; Norton, Peter G
2009-01-01
Background The knowledge translation field is undermined by two interrelated gaps – underdevelopment of the science and limited use of research in health services and health systems decision making. The importance of context in theory development and successful translation of knowledge has been identified in past research. Additionally, examination of knowledge translation in the long-term care (LTC) sector has been seriously neglected, despite the fact that aging is increasingly identified as a priority area in health and health services research. Aims The aims of this study are: to build knowledge translation theory about the role of organizational context in influencing knowledge use in LTC settings and among regulated and unregulated caregivers, to pilot knowledge translation interventions, and to contribute to enhanced use of new knowledge in LTC. Design This is a multi-level and longitudinal program of research comprising two main interrelated projects and a series of pilot studies. An integrated mixed method design will be used, including sequential and simultaneous phases to enable the projects to complement and inform one another. Inferences drawn from the quantitative and qualitative analyses will be merged to create meta-inferences. Outcomes Outcomes will include contributions to (knowledge translation) theory development, progress toward resolution of major conceptual issues in the field, progress toward resolution of methodological problems in the field, and advances in the design of effective knowledge translation strategies. Importantly, a better understanding of the contextual influences on knowledge use in LTC will contribute to improving outcomes for residents and providers in LTC settings. PMID:19664285
2014-01-01
Background Network inference of gene expression data is an important challenge in systems biology. Novel algorithms may provide more detailed gene regulatory networks (GRN) for complex, chronic inflammatory diseases such as rheumatoid arthritis (RA), in which activated synovial fibroblasts (SFBs) play a major role. Since the detailed mechanisms underlying this activation are still unclear, simultaneous investigation of multi-stimuli activation of SFBs offers the possibility to elucidate the regulatory effects of multiple mediators and to gain new insights into disease pathogenesis. Methods A GRN was therefore inferred from RA-SFBs treated with 4 different stimuli (IL-1 β, TNF- α, TGF- β, and PDGF-D). Data from time series microarray experiments (0, 1, 2, 4, 12 h; Affymetrix HG-U133 Plus 2.0) were batch-corrected by applying ‘ComBat’, analyzed for differentially expressed genes over time with ‘Limma’, and used for the inference of a robust GRN with NetGenerator V2.0, a heuristic ordinary differential equation-based method with soft integration of prior knowledge. Results Using all genes differentially expressed over time in RA-SFBs for any stimulus, and selecting the genes belonging to the most significant gene ontology (GO) term, i.e., ‘cartilage development’, a dynamic, robust, moderately complex multi-stimuli GRN was generated with 24 genes and 57 edges in total, 31 of which were gene-to-gene edges. Prior literature-based knowledge derived from Pathway Studio or manual searches was reflected in the final network by 25/57 confirmed edges (44%). The model contained known network motifs crucial for dynamic cellular behavior, e.g., cross-talk among pathways, positive feed-back loops, and positive feed-forward motifs (including suppression of the transcriptional repressor OSR2 by all 4 stimuli). Conclusion A multi-stimuli GRN highly concordant with literature data was successfully generated by network inference from the gene expression of stimulated RA-SFBs.
The GRN showed high reliability, since 10 predicted edges were independently validated by literature findings post network inference. The selected GO term ‘cartilage development’ contained a number of differentiation markers, growth factors, and transcription factors with potential relevance for RA. Finally, the model provided new insight into the response of RA-SFBs to multiple stimuli implicated in the pathogenesis of RA, in particular to the ‘novel’ potent growth factor PDGF-D. PMID:24989895
Krafty, Robert T; Rosen, Ori; Stoffer, David S; Buysse, Daniel J; Hall, Martica H
2017-01-01
This article considers the problem of analyzing associations between power spectra of multiple time series and cross-sectional outcomes when data are observed from multiple subjects. The motivating application comes from sleep medicine, where researchers are able to non-invasively record physiological time series signals during sleep. The frequency patterns of these signals, which can be quantified through the power spectrum, contain interpretable information about biological processes. An important problem in sleep research is drawing connections between power spectra of time series signals and clinical characteristics; these connections are key to understanding biological pathways through which sleep affects, and can be treated to improve, health. Such analyses are challenging as they must overcome the complicated structure of a power spectrum from multiple time series as a complex positive-definite matrix-valued function. This article proposes a new approach to such analyses based on a tensor-product spline model of Cholesky components of outcome-dependent power spectra. The approach flexibly models power spectra as nonparametric functions of frequency and outcome while preserving geometric constraints. Formulated in a fully Bayesian framework, a Whittle likelihood based Markov chain Monte Carlo (MCMC) algorithm is developed for automated model fitting and for conducting inference on associations between outcomes and spectral measures. The method is used to analyze data from a study of sleep in older adults and uncovers new insights into how stress and arousal are connected to the amount of time one spends in bed.
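The spectral building block of such analyses is the periodogram, and the Whittle likelihood approximates the time-domain likelihood in terms of it. A minimal univariate sketch (the paper's actual model, a multivariate tensor-product spline on Cholesky components, goes far beyond this): for a white-noise spectral density f(w) = s2/(2*pi), maximizing the Whittle likelihood recovers s2_hat = mean(y^2), which the code checks numerically via Parseval's identity.

```python
import numpy as np

rng = np.random.default_rng(6)
y = rng.normal(scale=1.5, size=1024)       # zero-mean white-noise series

n = y.size
# Periodogram at the Fourier frequencies: I(w_j) = |FFT(y)_j|^2 / (2*pi*n)
I = np.abs(np.fft.fft(y)) ** 2 / (2 * np.pi * n)

def whittle_ll(s2):
    """Whittle log-likelihood for a white-noise spectral density
    f(w) = s2 / (2*pi):  ll(s2) = -sum_j [log f(w_j) + I(w_j)/f(w_j)]."""
    f = s2 / (2 * np.pi)
    return -np.sum(np.log(f) + I / f)

# Closed-form maximizer: s2_hat = 2*pi * mean(I) = mean(y^2) by Parseval
s2_hat = 2 * np.pi * I.mean()
```

In the paper this device is what lets MCMC operate on spectral parameters directly in the frequency domain rather than through an expensive time-domain Gaussian likelihood.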
Can reliable values of Young's modulus be deduced from Fisher's (1971) spinning lens measurements?
Burd, H J; Wilde, G S; Judge, S J
2006-04-01
The current textbook view of the causes of presbyopia rests very largely on a series of experiments reported by R.F. Fisher some three decades ago, and in particular on the values of lens Young's modulus inferred from the deformation caused by spinning excised lenses about their optical axis (Fisher 1971). We studied the extent to which inferred values of Young's modulus are influenced by assumptions inherent in the mathematical procedures used by Fisher to interpret the test and we investigated several alternative interpretation methods. The results suggest that modelling assumptions inherent in Fisher's original method may have led to systematic errors in the determination of the Young's modulus of the cortex and nucleus. Fisher's conclusion that the cortex is stiffer than the nucleus, particularly in middle age, may be an artefact associated with these systematic errors. Moreover, none of the models we explored are able to account for Fisher's claim that the removal of the capsule has only a modest effect on the deformations induced in the spinning lens.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sultan, M.; Sturchio, N.; Hassan, F. A.
1997-01-01
An Atlantic source of precipitation can be inferred from stable isotopic data (H and O) for fossil groundwaters and uranium-series-dated carbonate spring deposits from oases in the Western Desert of Egypt. In the context of available stable isotopic data for fossil groundwaters throughout North Africa, the observed isotopic depletions (δD −72 to −81‰; δ¹⁸O −10.6 to −11.5‰) of fossil (≥32,000 yr B.P.) groundwaters from the Nubian aquifer are best explained by progressive condensation of water vapor from paleowesterly wet oceanic air masses that traveled across North Africa and operated at least as far back as 450,000 yr before the present. The values of δ¹⁸O (17.1 to 25.9‰) for 45,000- to >450,000-yr-old tufas and vein-filling calcite deposits from the Kharga and Farafra Oases are consistent with deposition from groundwaters having oxygen isotopic compositions similar to those of fossil groundwaters sampled recently at these locations.
Adaptive optimal training of animal behavior
NASA Astrophysics Data System (ADS)
Bak, Ji Hyun; Choi, Jung Yoon; Akrami, Athena; Witten, Ilana; Pillow, Jonathan
Neuroscience experiments often require training animals to perform tasks designed to elicit various sensory, cognitive, and motor behaviors. Training typically involves a series of gradual adjustments of stimulus conditions and rewards in order to bring about learning. However, training protocols are usually hand-designed, and often require weeks or months to achieve a desired level of task performance. Here we combine ideas from reinforcement learning and adaptive optimal experimental design to formulate methods for efficient training of animal behavior. Our work addresses two intriguing problems at once: first, it seeks to infer the learning rules underlying an animal's behavioral changes during training; second, it seeks to exploit these rules to select stimuli that will maximize the rate of learning toward a desired objective. We develop and test these methods using data collected from rats during training on a two-interval sensory discrimination task. We show that we can accurately infer the parameters of a learning algorithm that describes how the animal's internal model of the task evolves over the course of training. We also demonstrate by simulation that our method can provide a substantial speedup over standard training methods.
NASA Astrophysics Data System (ADS)
Biggerstaff, Michael I.; Zounes, Zackery; Addison Alford, A.; Carrie, Gordon D.; Pilkey, John T.; Uman, Martin A.; Jordan, Douglas M.
2017-08-01
A series of vertical cross sections taken through a small mesoscale convective system observed over Florida by the dual-polarimetric SMART radar were combined with VHF radiation source locations from a lightning mapping array (LMA) to examine the lightning channel propagation paths relative to the radar-observed ice alignment signatures associated with regions of negative specific differential phase.
Naima: a Python package for inference of particle distribution properties from nonthermal spectra
NASA Astrophysics Data System (ADS)
Zabalza, V.
2015-07-01
The ultimate goal of the observation of nonthermal emission from astrophysical sources is to understand the underlying particle acceleration and evolution processes, and few tools are publicly available to infer the particle distribution properties from the observed photon spectra from X-ray to VHE gamma rays. Here I present naima, an open source Python package that provides models for nonthermal radiative emission from homogeneous distributions of relativistic electrons and protons. Contributions from synchrotron, inverse Compton, nonthermal bremsstrahlung, and neutral-pion decay can be computed for a series of functional shapes of the particle energy distributions, with the possibility of using user-defined particle distribution functions. In addition, naima provides a set of functions that allow these models to be used to fit observed nonthermal spectra through an MCMC procedure, obtaining probability distribution functions for the particle distribution parameters. Here I present the models and methods available in naima and an example of their application to the understanding of a galactic nonthermal source. naima's documentation, including how to install the package, is available at http://naima.readthedocs.org.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schad, A.; Timmer, J.; Roth, M.
2011-06-20
Measurements from tracers and local helioseismology indicate the existence of a meridional flow in the Sun with a strength on the order of 15 m s⁻¹ near the solar surface. Different attempts have been made to obtain information on the flow profile at depths up to 20 Mm below the solar surface. We propose a method using global helioseismic Doppler measurements with the prospect of inferring the meridional flow profile at greater depths. Our approach is based on the perturbation of the p-mode eigenfunctions of a solar model due to the presence of a flow. The distortion of the oscillation eigenfunctions is manifested in the mixing of p-modes, which may be measured from global solar oscillation time series. As a new helioseismic measurement quantity, we propose amplitude ratios between oscillations in the Fourier domain. We relate this quantity to the meridional flow and unify the concepts presented here into an inversion procedure to infer the meridional flow from global solar oscillations.
X-ray power and yield measurements at the refurbished Z machine
Jones, M. C.; Ampleford, D. J.; Cuneo, M. E.; ...
2014-08-04
Advancements have been made in the diagnostic techniques used to measure the total radiated x-ray yield and power from z-pinch loads at the Z Machine with high accuracy. The Z-accelerator is capable of outputting 2 MJ and 330 TW of x-ray yield and power, and accurately measuring these quantities is imperative. We describe work over the past several years, including the development of new diagnostics, improvements to existing diagnostics, and implementation of automated data analysis routines. A set of experiments was conducted on the Z machine where the load and machine configuration were held constant. During this shot series, it was observed that the total z-pinch x-ray emission power determined from the two common techniques for inferring the x-ray power, the Kimfol-filtered x-ray diode diagnostic and the Total Power and Energy diagnostic, gave 450 TW and 327 TW respectively. Our analysis shows the latter to be the more accurate interpretation. More broadly, the comparison demonstrates the necessity of considering spectral response and field of view when inferring x-ray powers from z-pinch sources.
Temporal patterns of phytoplankton abundance in the North Atlantic
NASA Technical Reports Server (NTRS)
Campbell, Janet W.
1989-01-01
A time series of CZCS images is being developed to study phytoplankton distribution patterns in the North Atlantic. The goal of this study is to observe temporal variability in phytoplankton pigments and other organic particulates, and to infer from these patterns the potential flux of biogenic materials from the euphotic layer to the deep ocean. Early results of this project are presented in this paper. Specifically, the satellite data used were 13 monthly composited images of CZCS data for the North Atlantic from January 1979 to January 1980. Results are presented for seasonal patterns along the 20 deg W meridian.
Permanent draft genomes of the three Rhodopirellula baltica strains SH28, SWK14 and WH47.
Richter, Michael; Richter-Heitmann, Tim; Klindworth, Anna; Wegner, Carl-Eric; Frank, Carsten S; Harder, Jens; Glöckner, Frank Oliver
2014-02-01
The genomes of three Rhodopirellula baltica strains were sequenced as permanent drafts to complement the full genome sequence of the type strain R. baltica SH1(T). The isolates are part of a larger study to infer the biogeography of Rhodopirellula species in European marine waters, as well as to amend the genus description of R. baltica. This genomics resource article is the first of a series of five publications reporting in total eight new permanent draft genomes of Rhodopirellula species. Copyright © 2013 Elsevier B.V. All rights reserved.
Modeling time-series data from microbial communities.
Ridenhour, Benjamin J; Brooker, Sarah L; Williams, Janet E; Van Leuven, James T; Miller, Aaron W; Dearing, M Denise; Remien, Christopher H
2017-11-01
As sequencing technologies have advanced, the amount of information regarding the composition of bacterial communities from various environments (for example, skin or soil) has grown exponentially. To date, most work has focused on cataloging taxa present in samples and determining whether the distribution of taxa shifts with exogenous covariates. However, important questions regarding how taxa interact with each other and their environment remain open thus preventing in-depth ecological understanding of microbiomes. Time-series data from 16S rDNA amplicon sequencing are becoming more common within microbial ecology, but methods to infer ecological interactions from these longitudinal data are limited. We address this gap by presenting a method of analysis using Poisson regression fit with an elastic-net penalty that (1) takes advantage of the fact that the data are time series; (2) constrains estimates to allow for the possibility of many more interactions than data; and (3) is scalable enough to handle data consisting of thousands of taxa. We test the method on gut microbiome data from white-throated woodrats (Neotoma albigula) that were fed varying amounts of the plant secondary compound oxalate over a period of 22 days to estimate interactions between OTUs and their environment.
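The core estimation idea, Poisson regression with an elastic-net penalty, can be sketched with a small proximal-gradient loop: a gradient step on the Poisson negative log-likelihood plus the ridge term, followed by soft-thresholding for the lasso term. This toy version on simulated counts only shows the mechanics; the paper's method additionally exploits the time-series structure of the data and scales to thousands of taxa, and the simulated coefficients below are invented:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 400, 6
X = rng.normal(size=(n, p))                              # "taxa" covariates
beta_true = np.array([1.0, -0.8, 0.0, 0.0, 0.0, 0.0])    # sparse interactions
y = rng.poisson(np.exp(X @ beta_true))                   # count responses

alpha, l1_ratio, lr = 0.05, 0.5, 0.1                     # penalty and step size
beta = np.zeros(p)
for _ in range(3000):
    mu = np.exp(X @ beta)                                # Poisson mean
    # Gradient of the negative log-likelihood plus the ridge part
    grad = X.T @ (mu - y) / n + alpha * (1 - l1_ratio) * beta
    beta -= lr * grad
    # Proximal (soft-threshold) step for the L1 part of the penalty
    beta = np.sign(beta) * np.maximum(np.abs(beta) - lr * alpha * l1_ratio, 0.0)
```

The soft-threshold step is what zeroes out weak coefficients, allowing far more candidate interactions than observations while keeping the fitted network sparse, which is the property the method relies on.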
Hu, Weiming; Tian, Guodong; Kang, Yongxin; Yuan, Chunfeng; Maybank, Stephen
2017-09-25
In this paper, a new nonparametric Bayesian model called the dual sticky hierarchical Dirichlet process hidden Markov model (HDP-HMM) is proposed for mining activities from a collection of time series data such as trajectories. All the time series data are clustered. Each cluster of time series data, corresponding to a motion pattern, is modeled by an HMM. Our model postulates a set of HMMs that share a common set of states (topics in an analogy with topic models for document processing), but have unique transition distributions. For the application to motion trajectory modeling, topics correspond to motion activities. The learnt topics are clustered into atomic activities which are assigned predicates. We propose a Bayesian inference method to decompose a given trajectory into a sequence of atomic activities. On combining the learnt sources and sinks, semantic motion regions, and the learnt sequence of atomic activities, the action represented by the trajectory can be described in natural language in as automatic a way as possible. The effectiveness of our dual sticky HDP-HMM is validated on several trajectory datasets. The effectiveness of the natural language descriptions for motions is demonstrated on the vehicle trajectories extracted from a traffic scene.
Simulating maar-diatreme volcanic systems in bench-scale experiments
NASA Astrophysics Data System (ADS)
Andrews, R. G.; White, J. D. L.; Dürig, T.; Zimanowski, B.
2015-12-01
Maar-diatreme eruptions are incompletely understood, and explanations for the processes involved in them have been debated for decades. This study extends bench-scale analogue experiments previously conducted on maar-diatreme systems and attempts to scale the results up to both field-scale experimentation and natural volcanic systems in order to produce a reconstructive toolkit for maar volcanoes. Via multiple mechanisms, these experimental runs produced complex deposits that match many features seen in natural maar-diatreme deposits. The runs include deeper single blasts, series of descending discrete blasts, and series of ascending blasts. This study indicates that debris-jet inception and diatreme formation involve multiple types of granular fountains within diatreme deposits produced under varying initial conditions. The individual energies of blasts in multiple-blast series cannot be inferred from the final deposits. The depositional record of blast sequences can, however, be ascertained from the proportion of fallback sedimentation versus maar ejecta rim material, the final crater size, and the degree of overturning or slumping of accessory strata. Quantitatively, deeper blasts partition energy roughly equally between crater excavation and mass movement of juvenile material, whereas shallower blasts expend a much greater proportion of energy in crater excavation.
NASA Astrophysics Data System (ADS)
Wang, Wen-Chuan; Chau, Kwok-Wing; Cheng, Chun-Tian; Qiu, Lin
2009-08-01
Developing a hydrological forecasting model based on past records is crucial to effective hydropower reservoir management and scheduling. Traditionally, time series analysis and modeling is used to build mathematical models for generating hydrologic records in hydrology and water resources. Artificial intelligence (AI), as a branch of computer science, is capable of analyzing long-series and large-scale hydrological data, and applying AI techniques to hydrological forecasting modeling has become an active research front. In this paper, autoregressive moving-average (ARMA) models, artificial neural network (ANN) approaches, adaptive neural-based fuzzy inference system (ANFIS) techniques, genetic programming (GP) models and the support vector machine (SVM) method are examined using long-term observations of monthly river flow discharges. Four standard quantitative statistical performance evaluation measures, the coefficient of correlation (R), the Nash-Sutcliffe efficiency coefficient (E), the root mean squared error (RMSE), and the mean absolute percentage error (MAPE), are employed to evaluate the performances of the various models developed. Two case study river sites are also provided to illustrate their respective performances. The results indicate that the best performance can be obtained by ANFIS, GP and SVM, in terms of different evaluation criteria, during the training and validation phases.
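The four evaluation measures named above are standard and easy to state precisely. A minimal sketch follows (expressing MAPE in percent is an assumption about the paper's convention; note that R, E and MAPE are undefined for constant or zero-valued observation series):

```python
import math

def metrics(obs, sim):
    """Coefficient of correlation (R), Nash-Sutcliffe efficiency (E),
    RMSE, and MAPE (in percent) for an observed vs. simulated series."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    ss = math.sqrt(sum((s - ms) ** 2 for s in sim))
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return {
        "R": cov / (so * ss),
        "E": 1.0 - sse / sum((o - mo) ** 2 for o in obs),
        "RMSE": math.sqrt(sse / n),
        "MAPE": 100.0 / n * sum(abs((o - s) / o) for o, s in zip(obs, sim)),
    }
```

A perfect simulation gives R = 1, E = 1, RMSE = 0 and MAPE = 0; E can go negative when the model predicts worse than the observed mean.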
NASA Technical Reports Server (NTRS)
Green, J. W.; Knoll, A. H.; Golubic, S.; Swett, K.
1987-01-01
Populations of Polybessurus bipartitus Fairchild ex Green et al., a large morphologically distinctive microfossil, occur in silicified carbonates of the Upper Proterozoic (700-800 Ma) Limestone-Dolomite "Series," central East Greenland. Large populations of well-preserved individuals permit reconstruction of P. bipartitus as a coccoidal unicell that "jetted" upward from the sediment by the highly unidirectional secretion of extracellular mucopolysaccharide envelopes. Reproduction by baeocyte formation is inferred on the basis of clustered envelope stalks produced by small cells. Sedimentological evidence indicates that P. bipartitus formed surficial crusts locally within a shallow peritidal carbonate platform. Among living microorganisms a close morphological, reproductive, and behavioral counterpart to Polybessurus is provided by populations of an as yet undescribed cyanobacterium found in coastal Bahamian environments similar to those in which the Proterozoic fossils occur. In general morphology and "jetting" behavior, this population resembles species of the genus Cyanostylon, Geitler (1925), but reproduces via baeocyte formation. Polybessurus is but one of the more than two dozen taxa in the richly fossiliferous biota of the Limestone-Dolomite "Series." This distinctive population, along with co-occurring filamentous cyanobacteria and other microfossils, contributes to an increasingly refined picture of ecological heterogeneity in late Proterozoic oceans.
Assessing dynamics, spatial scale, and uncertainty in task-related brain network analyses
Stephen, Emily P.; Lepage, Kyle Q.; Eden, Uri T.; Brunner, Peter; Schalk, Gerwin; Brumberg, Jonathan S.; Guenther, Frank H.; Kramer, Mark A.
2014-01-01
The brain is a complex network of interconnected elements, whose interactions evolve dynamically in time to cooperatively perform specific functions. A common technique to probe these interactions involves multi-sensor recordings of brain activity during a repeated task. Many techniques exist to characterize the resulting task-related activity, including establishing functional networks, which represent the statistical associations between brain areas. Although functional network inference is commonly employed to analyze neural time series data, techniques to assess the uncertainty—both in the functional network edges and the corresponding aggregate measures of network topology—are lacking. To address this, we describe a statistically principled approach for computing uncertainty in functional networks and aggregate network measures in task-related data. The approach is based on a resampling procedure that utilizes the trial structure common in experimental recordings. We show in simulations that this approach successfully identifies functional networks and associated measures of confidence emergent during a task in a variety of scenarios, including dynamically evolving networks. In addition, we describe a principled technique for establishing functional networks based on predetermined regions of interest using canonical correlation. Doing so provides additional robustness to the functional network inference. Finally, we illustrate the use of these methods on example invasive brain voltage recordings collected during an overt speech task. The general strategy described here—appropriate for static and dynamic network inference and different statistical measures of coupling—permits the evaluation of confidence in network measures in a variety of settings common to neuroscience. PMID:24678295
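The core of the resampling procedure, bootstrapping over the trial structure to attach a confidence interval to a functional-network edge, can be sketched simply. Everything below is an illustrative assumption: correlation stands in for whatever coupling statistic is used, and the data, trial counts, and percentile method are invented.

```python
import math, random

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def edge_ci(trials_a, trials_b, n_boot=500, alpha=0.05, seed=0):
    """Bootstrap over trials: resample trial indices with replacement, recompute
    the trial-averaged correlation between two sensors, and report a percentile
    interval for that edge weight."""
    rng = random.Random(seed)
    n = len(trials_a)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(sum(pearson(trials_a[i], trials_b[i]) for i in idx) / n)
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# synthetic experiment: 20 trials of two sensors, the second a noisy copy of the first
rng = random.Random(42)
trials_a = [[rng.gauss(0, 1) for _ in range(30)] for _ in range(20)]
trials_b = [[x + 0.3 * rng.gauss(0, 1) for x in trial] for trial in trials_a]
lo, hi = edge_ci(trials_a, trials_b)
```

An edge whose interval excludes zero (or a null threshold) would be retained in the functional network; the same resampling can be pushed through any aggregate topology measure.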
Optimism bias leads to inconclusive results - an empirical study
Djulbegovic, Benjamin; Kumar, Ambuj; Magazin, Anja; Schroen, Anneke T.; Soares, Heloisa; Hozo, Iztok; Clarke, Mike; Sargent, Daniel; Schell, Michael J.
2010-01-01
Objective: Optimism bias refers to unwarranted belief in the efficacy of new therapies. We assessed the impact of optimism bias on the proportion of trials that did not answer their research question successfully, and explored whether poor accrual or optimism bias is responsible for inconclusive results. Study Design: Systematic review. Setting: Retrospective analysis of a consecutive series of phase III randomized controlled trials (RCTs) performed under the aegis of National Cancer Institute Cooperative groups. Results: 359 trials (374 comparisons) enrolling 150,232 patients were analyzed. 70% (262/374) of the trials generated conclusive results according to the statistical criteria. Investigators made definitive statements related to the treatment preference in 73% (273/374) of studies. Investigators' judgments and statistical inferences were concordant in 75% (279/374) of trials. Investigators consistently overestimated their expected treatment effects, but to a significantly larger extent for inconclusive trials. The median ratio of expected over observed hazard ratio or odds ratio was 1.34 (range 0.19-15.40) in conclusive trials compared to 1.86 (range 1.09-12.00) in inconclusive studies (p<0.0001). Only 17% of the trials had treatment effects that matched original researchers' expectations. Conclusion: Formal statistical inference is sufficient to answer the research question in 75% of RCTs. The answers to the other 25% depend mostly on subjective judgments, which at times are in conflict with statistical inference. Optimism bias significantly contributes to inconclusive results. PMID:21163620
NASA Astrophysics Data System (ADS)
Liew, P. M.; Lee, C. Y.; Kuo, C. M.
2006-10-01
The East Asian monsoon Holocene optimum has been debated with respect to both its duration and whether it represents a maximum in temperature or in precipitation. In this study we show Holocene climate variability inferred from a forest reconstruction of a subalpine pollen sequence from peat bog deposits in central Taiwan, based on modern analogues of various altitudinal biomes in the region. A warmer interval occurred between 8 and 4 ka BP (calibrated 14C years), when subtropical forests were more extensive. The Holocene thermal optimum is represented by an altitudinal tropical forest at 6.1-5.9 ka BP and 6.9 ka BP; only the latter was accompanied by wet conditions, indicating a decoupling of thermal and precipitation mechanisms in the middle Holocene. Abrupt and relatively severe cold phases, shown by biome changes, occurred at about 11.2-11.0, 7.5, 7.2, 7.1, 5.2, 5.0 and 4.9 ka BP. A spectral analysis of pollen from a relatively cold-adapted taxon, Salix, reveals that the time series is dominated by a 1500 yr periodicity similar to the cold cycle reported in the marine records of the Indian and western Pacific Oceans. The cold-warm conditions inferred from the change of forests show a close relationship to solar activity, as indicated by comparison with the Be-10 production rate.
Emad, Amin; Milenkovic, Olgica
2014-01-01
We introduce a novel algorithm for inference of causal gene interactions, termed CaSPIAN (Causal Subspace Pursuit for Inference and Analysis of Networks), which is based on coupling compressive sensing and Granger causality techniques. The core of the approach is to discover sparse linear dependencies between shifted time series of gene expressions using a sequential list-version of the subspace pursuit reconstruction algorithm and to estimate the direction of gene interactions via Granger-type elimination. The method is conceptually simple and computationally efficient, and it allows for dealing with noisy measurements. Its performance as a stand-alone platform without biological side-information was tested on simulated networks, on the synthetic IRMA network in Saccharomyces cerevisiae, and on data pertaining to the human HeLa cell network and the SOS network in E. coli. The results produced by CaSPIAN are compared to the results of several related algorithms, demonstrating significant improvements in inference accuracy of documented interactions. These findings highlight the importance of Granger causality techniques for reducing the number of false-positives, as well as the influence of noise and sampling period on the accuracy of the estimates. In addition, the performance of the method was tested in conjunction with biological side information of the form of sparse “scaffold networks”, to which new edges were added using available RNA-seq or microarray data. These biological priors aid in increasing the sensitivity and precision of the algorithm in the small sample regime. PMID:24622336
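A drastically simplified stand-in for the Granger-causality step can be sketched as follows. CaSPIAN itself couples subspace pursuit with Granger-type elimination; the sketch below substitutes an ordinary least-squares fit with a single lag, and the data, lag order, and decision thresholds are all assumptions.

```python
import random

def gauss_solve(M, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    M = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rss(X, y):
    """Residual sum of squares of an OLS fit via the normal equations."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = gauss_solve(XtX, Xty)
    return sum((yi - sum(b * xi for b, xi in zip(beta, r))) ** 2
               for r, yi in zip(X, y))

def granger_ratio(cause, effect):
    """RSS(full)/RSS(restricted) for predicting `effect` from its own lag,
    with vs. without one lag of `cause`; values well below 1 suggest
    Granger-causal influence of `cause` on `effect`."""
    Xr = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    Xf = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    y = effect[1:]
    return rss(Xf, y) / rss(Xr, y)

# synthetic pair of "expression" series: x drives y with one time-step delay
random.seed(1)
x = [random.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, 200)]
```

On this toy pair the ratio should be small in the causal direction and near 1 in the reverse direction, which mirrors the role Granger-type elimination plays in pruning false-positive edges.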
An expert system shell for inferring vegetation characteristics
NASA Technical Reports Server (NTRS)
Harrison, P. Ann; Harrison, Patrick R.
1993-01-01
The NASA VEGetation Workbench (VEG) is a knowledge based system that infers vegetation characteristics from reflectance data. VEG is described in detail in several references. The first generation version of VEG was extended. In the first year of this contract, an interface to a file of unknown cover type data was constructed. An interface that allowed the results of VEG to be written to a file was also implemented. A learning system that learned class descriptions from a data base of historical cover type data and then used the learned class descriptions to classify an unknown sample was built. This system had an interface that integrated it into the rest of VEG. The VEG subgoal PROPORTION.GROUND.COVER was completed and a number of additional techniques that inferred the proportion ground cover of a sample were implemented. This work was previously described. The work carried out in the second year of the contract is described. The historical cover type database was removed from VEG and stored as a series of flat files that are external to VEG. An interface to the files was provided. The framework and interface for two new VEG subgoals that estimate the atmospheric effect on reflectance data were built. A new interface that allows the scientist to add techniques to VEG without assistance from the developer was designed and implemented. A prototype Help System that allows the user to get more information about each screen in the VEG interface was also added to VEG.
Inferring relationships between pairs of individuals from locus heterozygosities
Presciuttini, Silvano; Toni, Chiara; Tempestini, Elena; Verdiani, Simonetta; Casarino, Lucia; Spinetti, Isabella; Stefano, Francesco De; Domenici, Ranieri; Bailey-Wilson, Joan E
2002-01-01
Background The traditional exact method for inferring relationships between individuals from genetic data is not easily applicable in all situations that may be encountered in several fields of applied genetics. This study describes an approach that gives affordable results and is easily applicable; it is based on the probabilities that two individuals share 0, 1 or both alleles at a locus identical by state. Results We show that these probabilities (zi) depend on locus heterozygosity (H), and are scarcely affected by variation of the distribution of allele frequencies. This allows us to obtain empirical curves relating zi's to H for a series of common relationships, so that the likelihood ratio of a pair of relationships between any two individuals, given their genotypes at a locus, is a function of a single parameter, H. Application to large samples of mother-child and full-sib pairs shows that the statistical power of this method to infer the correct relationship is not much lower than the exact method. Analysis of a large database of STR data proves that locus heterozygosity does not vary significantly among Caucasian populations, apart from special cases, so that the likelihood ratio of the more common relationships between pairs of individuals may be obtained by looking at tabulated zi values. Conclusions A simple method is provided, which may be used by any scientist with the help of a calculator or a spreadsheet to compute the likelihood ratios of common alternative relationships between pairs of individuals. PMID:12441003
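The tabulated-z_i usage described in the conclusions can be sketched directly. The z_i values below are placeholders chosen for illustration only; in the paper they are empirical functions of locus heterozygosity H, which is not reproduced here.

```python
# Hypothetical z_i values: the probability that a pair with the stated relationship
# shares 0, 1 or 2 alleles identical by state at one locus. These numbers are
# invented for illustration, not the paper's fitted curves (which depend on H).
Z = {
    "unrelated":    [0.30, 0.50, 0.20],
    "full-sibs":    [0.10, 0.45, 0.45],
    "parent-child": [0.00, 0.55, 0.45],
}

def likelihood_ratio(ibs_counts, hypothesis, null="unrelated"):
    """Likelihood ratio of `hypothesis` vs. `null` given the per-locus number of
    alleles shared identical by state (0, 1 or 2), multiplying z ratios over loci."""
    lr = 1.0
    for s in ibs_counts:
        lr *= Z[hypothesis][s] / Z[null][s]
    return lr
```

A pair sharing at least one allele at every locus accumulates support for full sibship, while a single locus with zero shared alleles makes the parent-child likelihood ratio exactly zero (an exclusion), matching the spreadsheet-level computation the authors describe.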
Welding Penetration Control of Fixed Pipe in TIG Welding Using Fuzzy Inference System
NASA Astrophysics Data System (ADS)
Baskoro, Ario Sunar; Kabutomori, Masashi; Suga, Yasuo
This paper presents a study on welding penetration control of a fixed pipe in Tungsten Inert Gas (TIG) welding using a fuzzy inference system. Penetration control is essential to producing quality welds with a specified geometry. For pipe welding using constant arc current and welding speed, the bead width becomes wider as the circumferential welding of small-diameter pipes progresses. When welding a pipe in a fixed position, excessive arc current causes burn-through of the metal, whereas insufficient arc current produces an imperfect weld. To avoid these errors and obtain a uniform weld bead over the entire circumference of the pipe, the welding conditions should be controlled as the welding proceeds. This research studies the intelligent welding process of aluminum alloy pipe 6063S-T5 in fixed position using an AC welding machine. The monitoring system used a charge-coupled device (CCD) camera to monitor the backside image of the molten pool. The captured image was processed to recognize the edge of the molten pool by an image processing algorithm. A simulation of welding control using a fuzzy inference system was constructed to simulate the welding control process. The simulation result shows that the fuzzy controller was suitable for controlling the welding speed and appropriate for implementation in the welding system. A series of experiments was conducted to evaluate the performance of the fuzzy controller. The experimental results show the effectiveness of the control system, which is confirmed by sound welds.
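The fuzzy-inference step can be illustrated with a minimal single-input controller. The membership functions, rule consequents, sign conventions, and units below are invented for illustration; the paper's controller operates on molten-pool image features and its actual rule base is not reproduced here.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c, peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# fuzzy sets over the bead-width error (measured width minus target, in mm)
ERROR_SETS = {"narrow": (-2.0, -1.0, 0.0), "ok": (-1.0, 0.0, 1.0), "wide": (0.0, 1.0, 2.0)}

# rule consequents: singleton welding-speed corrections in mm/s (illustrative values;
# a wider bead means too much heat input, so the torch should move faster)
RULES = {"narrow": -0.5, "ok": 0.0, "wide": +0.5}

def speed_correction(width_error):
    """Fire all rules and defuzzify by the weighted average of the singletons
    (a Sugeno-style simplification of the usual Mamdani centroid)."""
    num = den = 0.0
    for name, (a, b, c) in ERROR_SETS.items():
        w = tri(width_error, a, b, c)  # degree to which this rule fires
        num += w * RULES[name]
        den += w
    return num / den if den else 0.0
```

Between the set peaks the output interpolates smoothly, which is the practical appeal of fuzzy control for this kind of continuously drifting process.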
Tools for Understanding Identity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Creese, Sadie; Gibson-Robinson, Thomas; Goldsmith, Michael
Identity attribution and enrichment is critical to many aspects of law-enforcement and intelligence gathering; this identity typically spans a number of domains in the natural world, such as biographic information (factual information, e.g. names, addresses), biometric information (e.g. fingerprints) and psychological information. In addition to these natural-world projections of identity, identity elements are projected in the cyber-world. Conversely, undesirable elements may use similar techniques to target individuals for spear-phishing attacks (or worse), and potential targets or their organizations may want to determine how to minimize the attack surface exposed. Our research has been exploring the construction of a mathematical model for identity that supports such holistic identities. The model captures the ways in which an identity is constructed through a combination of data elements (e.g. a username on a forum, an address, a telephone number). Some of these elements may allow new characteristics to be inferred, hence enriching the holistic view of the identity. An example use-case would be the inference of real names from usernames; the ‘path’ created by inferring new elements of identity is highlighted in the ‘critical information’ panel. Individual attribution exercises can be understood as paths through a number of elements. Intuitively, the entire realizable ‘capability’ can be modeled as a directed graph, where the elements are nodes and the inferences are represented by links connecting one or more antecedents with a conclusion. The model can be operationalized with two levels of tool support described in this paper; the first is a working prototype, while the second is expected to reach prototype by July 2013. Understanding the Model: The tool allows a user to easily determine, given a particular set of inferences and attributes, which elements or inferences are of most value to an investigator (or an attacker).
The tool is also able to take into account the difficulty of the inferences, allowing the user to consider different scenarios depending on the perceived resources of the attacker, or to prioritize lines of investigation. It also has a number of interesting visualizations that are designed to aid the user in understanding the model. The tool works by considering the inferences as a graph and runs various graph-theoretic algorithms, with some novel adaptations, in order to deduce various properties. Using the Model: To help investigators exploit the model to perform identity attribution, we have developed the Identity Map visualization. For a user-provided set of known starting elements and a set of desired target elements for a given identity, the Identity Map generates investigative workflows as paths through the model. Each path consists of a series of elements and inferences between them that connect the input and output elements. Each path also has an associated confidence level that estimates the reliability of the resulting attribution. Identity Map can help investigators understand the possible ways to make an identification decision and guide them toward the data-collection or analysis steps required to reach that decision.
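The path-with-confidence idea can be sketched as a graph search. The elements, edges, and confidence values below are invented for illustration (the actual model supports multi-antecedent inferences, which a plain directed graph does not capture); maximizing a product of per-edge confidences is one plausible reading of the path confidence described above.

```python
import heapq

# hypothetical inference capability: source element -> [(derived element, confidence)]
INFERENCES = {
    "username": [("real_name", 0.6), ("email", 0.7)],
    "email": [("real_name", 0.8), ("phone", 0.3)],
    "real_name": [("address", 0.5)],
    "phone": [],
    "address": [],
}

def best_path(start, target):
    """Highest-confidence inference chain from start to target, treating confidence
    as multiplicative. Since every confidence is <= 1, products never increase along
    a path, so this is Dijkstra's algorithm on -log(confidence)."""
    best = {start: (1.0, [start])}
    heap = [(-1.0, start)]
    while heap:
        negconf, node = heapq.heappop(heap)
        conf, path = best[node]
        if -negconf < conf:  # stale heap entry; a better route was found already
            continue
        if node == target:
            return conf, path
        for nxt, c in INFERENCES.get(node, []):
            nc = conf * c
            if nxt not in best or nc > best[nxt][0]:
                best[nxt] = (nc, path + [nxt])
                heapq.heappush(heap, (-nc, nxt))
    return 0.0, []
```

On this toy capability, the direct username-to-real-name inference (0.6) beats the two-step route via email (0.7 x 0.8 = 0.56), which is exactly the kind of trade-off an investigator would want surfaced.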
NASA Technical Reports Server (NTRS)
Cox, C.; Au, A.; Klosko, S.; Chao, B.; Smith, David E. (Technical Monitor)
2001-01-01
The upcoming GRACE mission promises to open a window on details of the global mass budget that will have remarkable clarity, but it will not directly answer the question of what the state of the Earth's mass budget is over the critical last quarter of the 20th century. To address that problem we must draw upon existing technologies such as SLR, DORIS, and GPS, and climate modeling runs in order to improve our understanding. Analysis of long-period geopotential changes based on SLR and DORIS tracking has shown that addition of post 1996 satellite tracking data has a significant impact on the recovered zonal rates and long-period tides. Interannual effects such as those causing the post 1996 anomalies must be better characterized before refined estimates of the decadal period changes in the geopotential can be derived from the historical database of satellite tracking. A possible cause of this anomaly is variations in ocean mass distribution, perhaps associated with the recent large El Nino/La Nina. In this study, a low-degree spherical harmonic gravity time series derived from satellite tracking is compared with a TOPEX/POSEIDON-derived sea surface height time series. Corrections for atmospheric mass effects, continental hydrology, snowfall accumulation, and ocean steric model predictions will be considered.
2011-01-01
Background: Network inference methods reconstruct mathematical models of molecular or genetic networks directly from experimental data sets. We have previously reported a mathematical method which is exclusively data-driven, does not involve any heuristic decisions within the reconstruction process, and delivers all possible alternative minimal networks in terms of simple place/transition Petri nets that are consistent with a given discrete time series data set. Results: We fundamentally extended the previously published algorithm to consider catalysis and inhibition of the reactions that occur in the underlying network. The results of the reconstruction algorithm are encoded in the form of an extended Petri net involving control arcs. This allows the consideration of processes involving mass flow and/or regulatory interactions. As a non-trivial test case, the phosphate regulatory network of enterobacteria was reconstructed using in silico-generated time-series data sets on wild-type and in silico mutants. Conclusions: The new exact algorithm reconstructs extended Petri nets from time series data sets by finding all alternative minimal networks that are consistent with the data. It suggested alternative molecular mechanisms for certain reactions in the network. The algorithm is useful to combine data from wild-type and mutant cells and may potentially integrate physiological, biochemical, pharmacological, and genetic data in the form of a single model. PMID:21762503
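The data-driven starting point, reading candidate transitions of a place/transition net directly off a discrete time series, can be illustrated crudely: each difference between consecutive state vectors is a candidate transition. The sketch below only extracts single-step difference vectors; it deliberately ignores the paper's exhaustive enumeration of minimal alternatives, catalysis, and control arcs.

```python
def candidate_transitions(states):
    """states: list of tuples giving the token count on each place at consecutive
    time points. Returns the set of nonzero difference vectors; each is a candidate
    transition whose negative entries consume tokens and positive entries produce them."""
    transitions = set()
    for prev, cur in zip(states, states[1:]):
        diff = tuple(c - p for p, c in zip(prev, cur))
        if any(diff):
            transitions.add(diff)
    return transitions

# toy discrete time series over three places (A, B, C): reaction A -> B fires twice,
# then reaction B -> C fires twice
series = [(2, 0, 0), (1, 1, 0), (0, 2, 0), (0, 1, 1), (0, 0, 2)]
```

A real reconstruction must additionally decide whether a single observed difference is one transition or a combination of several firing together, which is where the enumeration of all minimal consistent networks comes in.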
Li, Shuying; Zhuang, Jun; Shen, Shifei
2017-07-01
In recent years, various types of terrorist attacks have occurred, causing worldwide catastrophes. According to the Global Terrorism Database (GTD), among all attack tactics, bombing attacks happened most frequently, followed by armed assaults. In this article, a model for analyzing and forecasting the conditional probability of bombing attacks (CPBA) based on time-series methods is developed. In addition, intervention analysis is used to analyze the sudden increase in the time-series process. The results show that the CPBA increased dramatically at the end of 2011. During that time, the CPBA increased by 16.0% in a two-month period to reach the peak value, but still remains 9.0% greater than the predicted level after the temporary effect gradually decays. By contrast, no significant fluctuation can be found in the conditional probability process of armed assault. It can be inferred that events such as America's troop withdrawal from Afghanistan and Iraq could have led to the increase of the CPBA in Afghanistan, Iraq, and Pakistan. The integrated time-series and intervention model is used to forecast the monthly CPBA in 2014 and through 2064. The average relative error compared with the real data in 2014 is 3.5%. The model is also applied to the total number of attacks recorded by the GTD between 2004 and 2014. © 2016 Society for Risk Analysis.
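The intervention pattern described above, a jump that decays toward a permanently elevated level, can be illustrated with a simple "step plus geometrically decaying pulse" fit. This is a moment-style sketch on synthetic numbers, not the article's full time-series intervention model, and the series values are invented.

```python
def fit_intervention(series, t0, tail=10):
    """Fit the intervention model
        y_t = mu                         for t <  t0
        y_t = mu + s + w * d**(t - t0)   for t >= t0
    where mu is the pre-intervention level, s the permanent shift, and (w, d)
    a geometrically decaying pulse. Simple moment-style estimates."""
    pre, post = series[:t0], series[t0:]
    mu = sum(pre) / len(pre)
    s = sum(post[-tail:]) / tail - mu          # long-run shift from the series tail
    trans = [y - mu - s for y in post]         # transient part, roughly w * d**k
    d = (sum(a * b for a, b in zip(trans[1:], trans[:-1]))
         / sum(a * a for a in trans[:-1]))     # lag-1 ratio estimate of the decay d
    w = trans[0]
    return mu, s, w, d

# synthetic monthly series mimicking the abstract's shape: a jump at t0 = 20 that
# decays toward a level 9 percentage points above the 0.20 baseline (numbers invented)
series = [0.20] * 20 + [0.20 + 0.09 + 0.07 * 0.7 ** k for k in range(30)]
mu, s, w, d = fit_intervention(series, 20)
```

On noisy real data these moment estimates would be replaced by maximum likelihood within an ARIMA-type intervention model, but the decomposition into baseline, permanent shift, and decaying transient is the same.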
NASA Astrophysics Data System (ADS)
Kretzschmar, Ann; Tych, Wlodek; Beven, Keith; Chappell, Nick
2017-04-01
Flooding is the most widely occurring natural disaster, affecting thousands of lives and businesses worldwide each year, and the size and frequency of flood events are predicted to increase with climate change. The main input variable for models used in flood prediction is rainfall. Estimating the rainfall input is often based on a sparse network of raingauges, which may or may not be representative of the salient rainfall characteristics responsible for generating storm hydrographs. A method based on Reverse Hydrology (Kretzschmar et al 2014 Environ Modell Softw) has been developed and is being tested using the intensively instrumented Brue catchment (Southwest England) to explore the spatiotemporal structure of the rainfall field (using 23 rain gauges over the 135.2 km2 basin). We compare how well the rainfall measured at individual gauges, or averaged over the basin, represents the rainfall inferred from the streamflow signal. How important is it to get the detail of the spatiotemporal rainfall structure right? Rainfall is transformed by catchment processes as it moves to streams, so exact duplication of the structure may not be necessary. 'True' rainfall estimated using 23 gauges / 135.2 km2 is likely to be a good estimate of the overall catchment rainfall; however, the integration process 'smears' the rainfall patterns in time, i.e. it reduces the number of rain events and lengthens them as they travel across the catchment. This may have little impact on the simulation of stream hydrographs when events are extensive across the catchment (e.g., frontal rainfall events) but may be significant for high-intensity, localised convective events. The Reverse Hydrology approach uses the streamflow record to infer a rainfall sequence with a lower time-resolution than the original input time series. The inferred rainfall series is, however, able to simulate streamflow as well as the observed, high-resolution rainfall does (Kretzschmar et al 2015 Hydrol Res).
Most gauged catchments in the UK of a similar size would only have data available for 1 to 3 raingauges. The high density of the Brue raingauge network allows a good estimate of the 'True' catchment rainfall to be made and compared with data from an individual raingauge as if that were the only data available. In addition, the rainfall from each raingauge is compared with rainfall inferred from streamflow using data from the selected individual raingauge, and also inferred from the full catchment network. The stochastic structure of the rainfall from all of these datasets is compared using a combination of traditional statistical measures, i.e., the first 4 moments of rainfall totals and their residuals; the number, length and distribution of wet and dry periods; rainfall intensity characteristics; and their ability to generate the observed stream hydrograph. Reverse Hydrology, which utilises information present in both the input rainfall and the output hydrograph, has provided a method of investigating the quality of the information each gauge adds to the catchment-average (Kretzschmar et al 2016 Procedia Eng.). Further, it has been used to ascertain how important reproducing the detailed rainfall structure really is when used for flow prediction.
Yang, Yi; Maxwell, Andrew; Zhang, Xiaowei; Wang, Nan; Perkins, Edward J; Zhang, Chaoyang; Gong, Ping
2013-01-01
Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) approach to connect pathway perturbation with toxicity threshold setting. Our DNs approach consists of 6 steps: time-series gene expression data collection, identification of altered genes, gene interaction network reconstruction, differential edge inference, mapping of genes with differential edges to pathways, and establishment of causal relationships between chemical concentration and perturbed pathways. A one-sample Gaussian process model and a linear regression model were used to identify genes that exhibited significant profile changes across an entire time course and between treatments, respectively. Interaction networks of differentially expressed (DE) genes were reconstructed for different treatments using a state space model and then compared to infer differential edges/interactions. DE genes possessing differential edges were mapped to biological pathways in databases such as KEGG. Using the DNs approach, we analyzed a time-series Escherichia coli live cell gene expression dataset consisting of 4 treatments (control, 10, 100, 1000 mg/L naphthenic acids, NAs) and 18 time points. Through comparison of reconstructed networks and construction of differential networks, 80 genes were identified as DE genes with a significant number of differential edges, and 22 KEGG pathways were altered in a concentration-dependent manner. Some of these pathways were perturbed to a degree as high as 70% even at the lowest exposure concentration, implying a high sensitivity of our DNs approach.
Findings from this proof-of-concept study suggest that our approach has a great potential in providing a novel and sensitive tool for threshold setting in chemical risk assessment. In future work, we plan to analyze more time-series datasets with a full spectrum of concentrations and sufficient replications per treatment. The pathway alteration-derived thresholds will also be compared with those derived from apical endpoints such as cell growth rate.
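The differential-edge step of the DNs pipeline can be illustrated with a minimal sketch. The gene names and the edge-set representation below are hypothetical; the study reconstructs the treatment networks with a state space model before comparing them.

```python
# Hypothetical sketch: given interaction networks reconstructed per
# treatment (represented here as sets of undirected edges), edges
# present in one network but absent from the other are flagged as
# differential, and the genes touching them are collected.

def differential_edges(control_edges, treatment_edges):
    """Return edges gained or lost relative to the control network."""
    gained = treatment_edges - control_edges
    lost = control_edges - treatment_edges
    return gained | lost

def genes_with_differential_edges(control_edges, treatment_edges):
    """Collect genes that participate in at least one differential edge."""
    diff = differential_edges(control_edges, treatment_edges)
    return {gene for edge in diff for gene in edge}

# Toy example with invented gene names:
control = {frozenset({"geneA", "geneB"}), frozenset({"geneB", "geneC"})}
treated = {frozenset({"geneA", "geneB"}), frozenset({"geneA", "geneC"})}
print(sorted(genes_with_differential_edges(control, treated)))
# → ['geneA', 'geneB', 'geneC']
```

In the study, genes selected this way are then mapped to KEGG pathways to connect edge changes with pathway perturbation.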
Causal Analysis of Self-tracked Time Series Data Using a Counterfactual Framework for N-of-1 Trials.
Daza, Eric J
2018-02-01
Many of an individual's historically recorded personal measurements vary over time, thereby forming a time series (e.g., wearable-device data, self-tracked fitness or nutrition measurements, regularly monitored clinical events or chronic conditions). Statistical analyses of such n-of-1 (i.e., single-subject) observational studies (N1OSs) can be used to discover possible cause-effect relationships to then self-test in an n-of-1 randomized trial (N1RT). However, a principled way of determining how and when to interpret an N1OS association as a causal effect (e.g., as if randomization had occurred) is needed. Our goal in this paper is to help bridge the methodological gap between risk-factor discovery and N1RT testing by introducing a basic counterfactual framework for N1OS design and personalized causal analysis. We introduce and characterize what we call the average period treatment effect (APTE), i.e., the estimand of interest in an N1RT, and build an analytical framework around it that can accommodate autocorrelation and time trends in the outcome, effect carryover from previous treatment periods, and slow onset or decay of the effect. The APTE is loosely defined as a contrast (e.g., difference, ratio) of averages of potential outcomes the individual can theoretically experience under different treatment levels during a given treatment period. To illustrate the utility of our framework for APTE discovery and estimation, two common causal inference methods are specified within the N1OS context. We then apply the framework and methods to search for estimable and interpretable APTEs using six years of the author's self-tracked weight and exercise data, and report both the preliminary findings and the challenges we faced in conducting N1OS causal discovery. Causal analysis of an individual's time series data can be facilitated by an N1RT counterfactual framework.
However, for inference to be valid, the veracity of certain key assumptions must be assessed critically, and the hypothesized causal models must be interpretable and meaningful.
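As a rough illustration of the APTE idea, a naive contrast of period averages can be computed from a self-tracked series. This sketch deliberately ignores autocorrelation, carryover, and time trends, the complications the authors' framework is designed to handle, and the data and variable names are invented.

```python
import numpy as np

# Naive APTE-style estimate: the mean outcome over treated periods
# minus the mean outcome over untreated periods. This is only the
# "contrast of averages" intuition, not the paper's full estimator.

def apte_difference(outcome, treatment):
    """Mean outcome under treatment minus mean outcome without it."""
    outcome = np.asarray(outcome, dtype=float)
    treatment = np.asarray(treatment, dtype=bool)
    return outcome[treatment].mean() - outcome[~treatment].mean()

# e.g. invented daily weight changes on exercise vs. rest days
weights = [0.1, -0.2, 0.0, -0.3, 0.2, -0.1]
exercised = [False, True, False, True, False, True]
print(round(apte_difference(weights, exercised), 3))  # → -0.3
```

A real N1OS analysis would additionally model the outcome's temporal structure before interpreting such a contrast causally.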
Pollitz, Fred
2015-01-01
I reexamine the lower crust and mantle relaxation following two large events in the Mojave Desert: the 1992 M7.3 Landers and 1999 M7.1 Hector Mine, California, earthquakes. Time series from continuous GPS sites out to 300 km from the ruptures are used to constrain models of postseismic relaxation. Crustal motions in the Mojave Desert region are elevated above background for several years following each event. To account for broadscale relaxation of the lower crust and mantle, the Burgers body model is employed, involving transient and steady state viscosities. Joint afterslip/postseismic relaxation modeling of the GPS time series up to one decade following the Hector Mine earthquake reveals a significant rheological contrast between a northwest trending “southwest domain” (that envelopes the San Andreas fault system and western Mojave Desert) and an adjacent “northeast domain” (that envelopes the Landers and Hector Mine rupture areas in the central Mojave Desert). The steady state viscosity of the northeast domain mantle asthenosphere is inferred to be ∼4 times greater than that of the southwest domain. This pattern is counter to that expected for regional heat flow, which is higher in the northeast domain, but it is explicable by means of a nonlinear rheology that includes dependence on both strain rate and water concentration. I infer that the southwest domain mantle has a relatively low steady state viscosity because of its high strain rate and water content. The relatively low mantle water content of the northeast domain is interpreted to result from the continual extraction of water through igneous and volcanic activity over the past ∼20 Myr.
Ludington, Steve; Plumlee, Geoff; Caine, Jonathan S.; Bove, Dana; Holloway, JoAnn; Livo, Eric
2005-01-01
Introduction: This report is one in a series that presents results of an interdisciplinary U.S. Geological Survey (USGS) study of ground-water quality in the lower Red River watershed prior to open-pit and underground molybdenite mining at Molycorp's Questa mine. The stretch of the Red River watershed that extends from just upstream of the town of Red River, N. Mex., to just above the town of Questa includes several mineralized areas in addition to the one mined by Molycorp. Natural erosion and weathering of pyrite-rich rocks in the mineralized areas has created a series of erosional scars along this stretch of the Red River that contribute acidic waters, as well as mineralized alluvial material and sediments, to the river. The overall goal of the USGS study is to infer the premining ground-water quality at the Molycorp mine site. An integrated geologic, hydrologic, and geochemical model for ground water in the mineralized, but unmined, Straight Creek drainage (a tributary of the Red River) is being used as an analog for the geologic, geochemical, and hydrologic conditions that influenced ground-water quality and quantity in the Red River drainage prior to mining. This report provides an overall geologic framework for the Red River watershed between Red River and Questa, in northern New Mexico, and summarizes key geologic, mineralogic, structural and other characteristics of various mineralized areas (and their associated erosional scars and debris fans) that likely influence ground- and surface-water quality and hydrology. The premining nature of the Sulphur Gulch and Goat Hill Gulch scars on the Molycorp mine site can be inferred through geologic comparisons with other unmined scars in the Red River drainage.
Asteroseismology: Data Analysis Methods and Interpretation for Space and Ground-based Facilities
NASA Astrophysics Data System (ADS)
Campante, T. L.
2012-06-01
This dissertation has been submitted to the Faculdade de Ciências da Universidade do Porto in partial fulfillment of the requirements for the PhD degree in Astronomy. The scientific results presented herein follow from the research activity performed under the supervision of Dr. Mário João Monteiro at the Centro de Astrofísica da Universidade do Porto and Dr. Hans Kjeldsen at the Institut for Fysik og Astronomi, Aarhus Universitet. The dissertation is composed of three chapters and a list of appendices. Chapter 1 serves as an unpretentious and rather general introduction to the field of asteroseismology of solar-like stars. It starts with a historical account of the field of asteroseismology followed by a general review of the basic physics and properties of stellar pulsations. Emphasis is then naturally placed on the stochastic excitation of stellar oscillations and on the potential of asteroseismic inference. The chapter closes with a discussion about observational techniques and the observational status of the field. Chapter 2 is devoted to the subject of data analysis in asteroseismology. This is an extensive subject, therefore, a compilation is presented of the relevant data analysis methods and techniques employed contemporarily in asteroseismology of solar-like stars. Special attention has been drawn to the subject of statistical inference both from the competing Bayesian and frequentist perspectives. The chapter ends with a description of the implementation of a pipeline for mode parameter analysis of Kepler data. In the course of these two first chapters, reference is made to a series of articles led by the author (or otherwise having greatly benefited from his contribution) that can be found in Appendices A to E. Chapter 3 then goes on to present a series of additional published results.
NASA Astrophysics Data System (ADS)
Ortlieb, Luc; Zazo, Cari; Goy, José Luis; Hillaire-Marcel, Claude; Ghaleb, Bassam; Cournoyer, Louise
The Nazca-South American plate boundary is a subduction zone where a relatively complex pattern of vertical deformation can be inferred from the study of emerged marine terraces. Along the coasts of southern Peru and northern Chile, the vertical distribution of remnants of Pleistocene terraces suggests that a crustal, large scale uplift motion is combined with more regional/local tectonic processes. In northern Chile, the area of Hornitos (23°S) offers a remarkable sequence of well-defined marine terraces that may be dated through U-series and aminostratigraphic studies on mollusc shells. The unusual preservation of the landforms and of the shell material, which enabled the age determination of the deposits, is largely due to the lengthy history of extreme aridity in this area. The exceptional record of late Middle Pleistocene to Late Pleistocene high seastands is also favoured by the slight warping of two distinct fault blocks that have enhanced the morphostratigraphic relationships between the distinct coastal units. Detailed geomorphological, sedimentological and chronostratigraphic studies of the Hornitos area led to the identification, with reasonable confidence, of the depositional remnants of sea-level maxima coeval with the Oxygen Isotope Substages 5c, 5e, 7 (probably two episodes) and the isotope stage 9 (series of beach ridges). The coastal plain, at the foot of the major Coastal Escarpment of northern Chile, appears to have been uplifted at a mean rate of 240 mm/ky in the course of the last 330 ky. From the elevation of the older terraces and late Pliocene shorelines, it can be inferred that these steady vertical motions were much more rapid than during the Early Pleistocene.
Multi-locus analysis of genomic time series data from experimental evolution.
Terhorst, Jonathan; Schlötterer, Christian; Song, Yun S
2015-04-01
Genomic time series data generated by evolve-and-resequence (E&R) experiments offer a powerful window into the mechanisms that drive evolution. However, standard population genetic inference procedures do not account for sampling serially over time, and new methods are needed to make full use of modern experimental evolution data. To address this problem, we develop a Gaussian process approximation to the multi-locus Wright-Fisher process with selection over a time course of tens of generations. The mean and covariance structure of the Gaussian process are obtained by computing the corresponding moments in discrete-time Wright-Fisher models conditioned on the presence of a linked selected site. This enables our method to account for the effects of linkage and selection, both along the genome and across sampled time points, in an approximate but principled manner. We first use simulated data to demonstrate the power of our method to correctly detect, locate and estimate the fitness of a selected allele from among several linked sites. We study how this power changes for different values of selection strength, initial haplotypic diversity, population size, sampling frequency, experimental duration, number of replicates, and sequencing coverage depth. In addition to providing quantitative estimates of selection parameters from experimental evolution data, our model can be used by practitioners to design E&R experiments with requisite power. We also explore how our likelihood-based approach can be used to infer other model parameters, including effective population size and recombination rate. Then, we apply our method to analyze genome-wide data from a real E&R experiment designed to study the adaptation of D. melanogaster to a new laboratory environment with alternating cold and hot temperatures.
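The discrete-time Wright-Fisher dynamics that the Gaussian process approximates can be simulated directly. The sketch below uses illustrative parameter values (population size, selection coefficient, experiment length), not those of the study.

```python
import numpy as np

# Minimal discrete-time Wright-Fisher simulation with genic selection:
# each generation, selection deterministically shifts the expected
# allele frequency, then binomial sampling of 2N gametes adds drift.

def wright_fisher(n_pop=1000, s=0.05, p0=0.1, generations=60, rng=None):
    """Simulate one allele-frequency trajectory; return it as an array."""
    rng = np.random.default_rng(rng)
    p, traj = p0, [p0]
    for _ in range(generations):
        # deterministic selection step for a selected allele
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # stochastic drift step: binomial sampling of 2N gametes
        p = rng.binomial(2 * n_pop, p_sel) / (2 * n_pop)
        traj.append(p)
    return np.array(traj)

traj = wright_fisher(rng=0)
print(traj[0], traj[-1] > traj[0])  # the selected allele tends to rise
```

An E&R dataset corresponds to observing such trajectories (with linkage across many sites) only at a few sampled generations, which is the serial-sampling structure the authors' method accounts for.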
Age model for a continuous, ca 250-ka Quaternary lacustrine record from Bear Lake, Utah-Idaho
Colman, Steven M.; Kaufman, D.S.; Bright, Jordon; Heil, C.; King, J.W.; Dean, W.E.; Rosenbaum, J.G.; Forester, R.M.; Bischoff, J.L.; Perkins, Marie; McGeehin, J.P.
2006-01-01
The Quaternary sediments sampled by continuous 120-m-long drill cores from Bear Lake (Utah-Idaho) comprise one of the longest lacustrine sequences recovered from an extant lake. The cores serve as a good case study for the construction of an age model for sequences that extend beyond the range of radiocarbon dating. From a variety of potential age indicators, we selected a combination of radiocarbon ages, one magnetic excursion (correlated to a standard sequence), and a single Uranium-series age to develop an initial data set. The reliability of the excursion and U-series data require consideration of their position with respect to sediments of inferred interglacial character, but not direct correlation with other paleoclimate records. Data omitted from the age model include amino acid age estimates, which have a large amount of scatter, and tephrochronology correlations, which have relatively large uncertainties. Because the initial data set was restricted to the upper half of the BL00-1 core, we inferred additional ages by direct correlation to the independently dated paleoclimate record from Devils Hole. We developed an age model for the entire core using statistical methods that consider both the uncertainties of the original data and that of the curve-fitting process, with a combination of our initial data set and the climate correlations as control points. This age model represents our best estimate of the chronology of deposition in Bear Lake. Because the age model contains assumptions about the correlation of Bear Lake to other climate records, the model cannot be used to address some paleoclimate questions, such as phase relationships with other areas.
NASA Astrophysics Data System (ADS)
Lutsch, E.; Conway, S. A.; Strong, K.; Jones, D. B. A.; Drummond, J. R.; Ortega, I.; Hannigan, J. W.; Makarova, M.; Notholt, J.; Blumenstock, T.; Sussmann, R.; Mahieu, E.; Kasai, Y.; Clerbaux, C.
2017-12-01
We present a multi-year time series of the total columns of carbon monoxide (CO), hydrogen cyanide (HCN) and ethane (C2H6) obtained by Fourier Transform Infrared (FTIR) spectrometer measurements at nine sites. Six are high-latitude sites: Eureka, Nunavut; Ny Alesund, Norway; Thule, Greenland; Kiruna, Sweden; Poker Flat, Alaska; and St. Petersburg, Russia. Three are mid-latitude sites: Zugspitze, Germany; Jungfraujoch, Switzerland; and Toronto, Ontario. For each site, the inter-annual trends and seasonal variabilities of the CO total column time series are accounted for, allowing ambient concentrations to be determined. Enhancements above ambient levels are then used to identify possible wildfire pollution events. Since the abundance of each trace gas species emitted in a wildfire event is specific to the type of vegetation burned and the burning phase, correlations of CO to the other long-lived wildfire tracers HCN and C2H6 allow for further confirmation of the detection of wildfire pollution. Back-trajectories from HYSPLIT and FLEXPART, as well as fire detections from the Moderate Resolution Imaging Spectroradiometer (MODIS), allow the source regions of the detected enhancements to be determined, while satellite observations of CO from the Measurement of Pollution in the Troposphere (MOPITT) and Infrared Atmospheric Sounding Interferometer (IASI) instruments can be used to track the transport of the smoke plume. Differences in travel times between sites allow the ageing of biomass burning plumes to be determined, providing a means to infer the physical and chemical processes affecting the loss of each species during transport. Comparisons of ground-based FTIR measurements to GEOS-Chem chemical transport model results are used to investigate these processes, evaluate wildfire emission inventories and infer the influence of wildfire emissions on the Arctic.
NASA Astrophysics Data System (ADS)
Mullan, Donal; Chen, Jie; Zhang, Xunchang John
2016-02-01
Statistical downscaling (SD) methods have become a popular, low-cost and accessible means of bridging the gap between the coarse spatial resolution at which climate models output climate scenarios and the finer spatial scale at which impact modellers require these scenarios, with various SD techniques used for a wide range of applications across the world. This paper compares the Generator for Point Climate Change (GPCC) model and the Statistical DownScaling Model (SDSM), two contrasting SD methods, in terms of their ability to generate precipitation series under non-stationary conditions across ten contrasting global climates. The mean, maximum and a selection of distribution statistics, as well as the cumulative frequencies of dry and wet spells for four different temporal resolutions, were compared between the models and the observed series for a validation period. Results indicate that both methods can generate daily precipitation series that generally closely mirror observed series for a wide range of non-stationary climates. However, GPCC tends to overestimate higher precipitation amounts, whilst SDSM tends to underestimate these. This implies that GPCC is more likely to overestimate the effects of precipitation on a given impact sector, whilst SDSM is likely to underestimate the effects. GPCC performs better than SDSM in reproducing wet and dry day frequency, which is a key advantage for many impact sectors. Overall, the mixed performance of the two methods illustrates the importance of users performing a thorough validation in order to determine the influence of simulated precipitation on their chosen impact sector.
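The kinds of validation statistics compared above, moments of the precipitation series plus wet and dry spell behaviour, can be sketched as follows. The 0.1 mm wet-day threshold and the toy daily series are assumptions for illustration, not values from the paper.

```python
import numpy as np

# Summary statistics for validating a generated precipitation series:
# simple moments plus the lengths of consecutive wet and dry spells.

def spell_lengths(is_wet):
    """Lengths of consecutive runs of wet (True) and dry (False) days."""
    runs, count = {"wet": [], "dry": []}, 1
    for prev, cur in zip(is_wet, is_wet[1:]):
        if cur == prev:
            count += 1
        else:
            runs["wet" if prev else "dry"].append(count)
            count = 1
    runs["wet" if is_wet[-1] else "dry"].append(count)
    return runs

precip = np.array([0.0, 2.3, 5.1, 0.0, 0.0, 0.4, 0.0, 1.2])  # mm/day
wet = precip >= 0.1          # assumed wet-day threshold
print(precip.mean(), precip.max())
print(spell_lengths(list(wet)))
```

In a validation exercise these statistics would be computed for both the downscaled and the observed series and compared at several temporal resolutions.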
Xiong, Jie; Zhou, Tong
2012-01-01
An important problem in systems biology is to reconstruct gene regulatory networks (GRNs) from experimental data and other a priori information. The DREAM project offers several types of experimental data, such as knockout data, knockdown data, and time series data. Among them, multifactorial perturbation data are easier and less expensive to obtain than other types of experimental data and are thus more common in practice. In this article, a new algorithm is presented for the inference of GRNs using the DREAM4 multifactorial perturbation data. The GRN inference problem among [Formula: see text] genes is decomposed into [Formula: see text] different regression problems. In each of the regression problems, the expression level of a target gene is predicted solely from the expression level of a potential regulation gene. For different potential regulation genes, different weights for a specific target gene are constructed by using the sum of squared residuals and the Pearson correlation coefficient. These weights are then normalized to reflect differences in the effort of regulating distinct genes. By appropriately choosing the parameters of the power law, we construct a 0-1 integer programming problem. By solving this problem, direct regulation genes for an arbitrary gene can be estimated, and the normalized weight of a gene is modified on the basis of the estimation results about the existence of direct regulations to it. These normalized and modified weights are used in ranking the possibility of the existence of a corresponding direct regulation. Computation results with the DREAM4 In Silico Size 100 Multifactorial subchallenge show that the estimation performance of the suggested algorithm can outperform even that of the best team. Using the real data provided by the DREAM5 Network Inference Challenge, its estimation performance ranks third.
Furthermore, the high precision of the obtained most reliable predictions shows the suggested algorithm may be helpful in guiding biological experiment designs.
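The per-pair weighting step, a single-regulator regression scored by its sum of squared residuals and the Pearson correlation coefficient, might be sketched as below. The combination rule |r| / SSR is a placeholder for illustration, not the article's formula, and all data are simulated.

```python
import numpy as np

# Score a (regulator, target) pair: fit a one-regulator linear model,
# then combine the sum of squared residuals (SSR) with the Pearson
# correlation. A strong direct regulation yields low SSR and high |r|.

def pair_weight(regulator, target):
    x, y = np.asarray(regulator, float), np.asarray(target, float)
    slope, intercept = np.polyfit(x, y, 1)       # single-regulator fit
    ssr = np.sum((y - (slope * x + intercept)) ** 2)
    r = np.corrcoef(x, y)[0, 1]
    return abs(r) / (ssr + 1e-12)                # placeholder combination

rng = np.random.default_rng(1)
x = rng.normal(size=50)                          # candidate regulator
direct = 2 * x + rng.normal(scale=0.1, size=50)  # strongly regulated target
unrelated = rng.normal(size=50)                  # unregulated target
print(pair_weight(x, direct) > pair_weight(x, unrelated))  # → True
```

The algorithm then normalizes such weights across regulators of each target before building its 0-1 integer program.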
Snowden, Jonathan M; Tilden, Ellen L; Odden, Michelle C
2018-06-08
In this article, we conclude our 3-part series by focusing on several concepts that have proven useful for formulating causal questions and inferring causal effects. The process of causal inference is of key importance for physiologic childbirth science, so each concept is grounded in content related to women at low risk for perinatal complications. A prerequisite to causal inference is determining that the question of interest is causal rather than descriptive or predictive. Another critical step in defining a high-impact causal question is assessing the state of existing research for evidence of causality. We introduce 2 causal frameworks that are useful for this undertaking, Hill's causal considerations and the sufficient-component cause model. We then provide 3 steps to aid perinatal researchers in inferring causal effects in a given study. First, the researcher should formulate a rigorous and clear causal question. We introduce an example of epidural analgesia and labor progression to demonstrate this process, including the central role of temporality. Next, the researcher should assess the suitability of the given data set to answer this causal question. In randomized controlled trials, data are collected with the express purpose of answering the causal question. Investigators using observational data should also ensure that their chosen causal question is answerable with the available data. Finally, investigators should design an analysis plan that targets the causal question of interest. Some data structures (e.g., time-dependent confounding by labor progress when estimating the effect of epidural analgesia on postpartum hemorrhage) require specific analytical tools to control for bias and estimate causal effects. The assumptions of consistency, exchangeability, and positivity may be especially useful in carrying out these steps.
Drawing on appropriate causal concepts and considering relevant assumptions strengthens our confidence that research has reduced the likelihood of alternative explanations (e.g., bias, chance) and estimated a causal effect. © 2018 by the American College of Nurse-Midwives.
NASA Astrophysics Data System (ADS)
Harp, A.; Valentine, G.
2016-12-01
Mafic eruptions along the flanks of stratovolcanoes pose significant hazards to life and property due to the uncertainty linked to new vent locations and their potentially close proximity to inhabited areas. Flank eruptions are often fed by radial dikes with magma supplied either laterally from the central conduit or vertically from a deeper storage location. The highly eroded Oligocene-age Summer Coon stratovolcano, Colorado, reveals over 700 mafic dikes surrounding a series of intrusive stocks (the inferred conduit). The exposure provides an opportunity to study radial dike propagation directions and their relationship with the conduit in the lower portions of a volcanic edifice. Detailed geologic mapping and a geophysical survey revealed that little or no direct connection exists between the mafic radial dikes and the inferred conduit at the current level of exposure. Oriented samples collected from the chilled margins of 29 mafic dikes were analyzed for flow fabrics and emplacement directions. Among them, 20 dikes show flow angles greater than 30 degrees from horizontal, and a single dike had flow fabrics oriented at approximately 20 degrees. Of the dikes with steeper fabrics, nine were emplaced up and toward the volcano's center at 30-75 degrees from horizontal, and 11 were emplaced up and away from the volcano's center at 35-60 degrees. The two groups of dikes likely responded to the stress field within the edifice: the steepest-emplaced dikes had relatively high magma overpressure and were focused toward the volcano's summit, while dikes with lower overpressures propagated out toward the flanks. At Summer Coon, the lack of connection between mafic dikes and the inferred conduit and the presence of only one sub-horizontally emplaced dike imply that the stresses within the lower edifice impeded lateral dike nucleation and propagation while promoting and influencing the emplacement direction of upward propagating dikes.
Smith, Justin D.; Borckardt, Jeffrey J.; Nash, Michael R.
2013-01-01
The case-based time-series design is a viable methodology for treatment outcome research. However, the literature has not fully addressed the problem of missing observations with such autocorrelated data streams. Mainly, to what extent do missing observations compromise inference when observations are not independent? Do the available missing data replacement procedures preserve inferential integrity? Does the extent of autocorrelation matter? We use Monte Carlo simulation modeling of a single-subject intervention study to address these questions. We find power sensitivity to be within acceptable limits across four proportions of missing observations (10%, 20%, 30%, and 40%) when missing data are replaced using the Expectation-Maximization Algorithm, more commonly known as the EM Procedure (Dempster, Laird, & Rubin, 1977). This applies to data streams with lag-1 autocorrelation estimates under 0.80. As autocorrelation estimates approach 0.80, the replacement procedure yields an unacceptable power profile. The implications of these findings and directions for future research are discussed. PMID:22697454
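The simulation setup can be sketched as follows: a lag-1 autocorrelated data stream with a fraction of observations removed at random. The imputation shown here is a simple AR(1) conditional-mean fill, a much-simplified stand-in for the EM Procedure the study actually evaluates; all parameters are illustrative.

```python
import numpy as np

# Simulate an AR(1) stream, delete ~20% of observations at random,
# and fill the gaps with the conditional mean given the previous value.

rng = np.random.default_rng(42)
phi, n = 0.6, 200                       # lag-1 autocorrelation, length
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

miss = rng.random(n) < 0.2              # ~20% missing completely at random
y_obs = np.where(miss, np.nan, y)

filled = y_obs.copy()
if np.isnan(filled[0]):
    filled[0] = 0.0                     # fall back to the process mean
for t in range(1, n):
    if np.isnan(filled[t]):
        filled[t] = phi * filled[t - 1]  # conditional mean given the past

print(np.isnan(filled).any())  # → False
```

A power-sensitivity study like the one described would repeat this for many replications and missingness proportions, running the intervention test on each completed stream.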
NASA Technical Reports Server (NTRS)
Hailperin, Max
1993-01-01
This thesis provides design and analysis of techniques for global load balancing on ensemble architectures running soft-real-time object-oriented applications with statistically periodic loads. It focuses on estimating the instantaneous average load over all the processing elements. The major contribution is the use of explicit stochastic process models for both the loading and the averaging itself. These models are exploited via statistical time-series analysis and Bayesian inference to provide improved average load estimates, and thus to facilitate global load balancing. This thesis explains the distributed algorithms used and provides some optimality results. It also describes the algorithms' implementation and gives performance results from simulation. These results show that our techniques allow more accurate estimation of the global system loading, resulting in fewer object migrations than local methods. Our method is shown to provide superior performance, relative not only to static load-balancing schemes but also to many adaptive methods.
Hopke, P K; Liu, C; Rubin, D B
2001-03-01
Many chemical and environmental data sets are complicated by the existence of fully missing values or censored values known to lie below detection thresholds. For example, week-long samples of airborne particulate matter were obtained at Alert, NWT, Canada, between 1980 and 1991, where some of the concentrations of 24 particulate constituents were coarsened in the sense of being either fully missing or below detection limits. To facilitate scientific analysis, it is appealing to create complete data by filling in missing values so that standard complete-data methods can be applied. We briefly review commonly used strategies for handling missing values and focus on the multiple-imputation approach, which generally leads to valid inferences when faced with missing data. Three statistical models are developed for multiply imputing the missing values of airborne particulate matter. We expect that these models are useful for creating multiple imputations in a variety of incomplete multivariate time series data sets.
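A minimal sketch of the multiple-imputation idea: values below a detection limit are replaced by random draws from an assumed concentration model truncated to the censored range, and the fill is repeated to produce several completed datasets. The lognormal model and its parameters are illustrative assumptions, not the paper's three fitted models.

```python
import numpy as np

# Multiply impute below-detection-limit values (marked NaN) by drawing
# from a lognormal truncated to [0, limit) via rejection sampling.

def impute_censored(values, limit, m=5, rng=None):
    """Return m completed copies of `values`; NaN marks below-limit."""
    rng = np.random.default_rng(rng)
    completed = []
    for _ in range(m):
        filled = np.array(values, dtype=float)
        for i in np.flatnonzero(np.isnan(filled)):
            draw = rng.lognormal(mean=0.0, sigma=1.0)
            while draw >= limit:        # crude truncation by rejection
                draw = rng.lognormal(mean=0.0, sigma=1.0)
            filled[i] = draw
        completed.append(filled)
    return completed

data = [2.4, np.nan, 1.1, np.nan, 3.7]  # NaN = below detection limit
sets = impute_censored(data, limit=0.5, m=3, rng=0)
print(len(sets), all(s[1] < 0.5 for s in sets))  # → 3 True
```

Standard complete-data analyses are then run on each imputed set, and the results combined (e.g., by Rubin's rules) so that the imputation uncertainty propagates into the final inference.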
Selection, adaptation, and predictive information in changing environments
NASA Astrophysics Data System (ADS)
Feltgen, Quentin; Nemenman, Ilya
2014-03-01
Adaptation by means of natural selection is a key concept in evolutionary biology. Individuals better matched to the surrounding environment outcompete the others. This increases the fraction of the better adapted individuals in the population, and hence increases its collective fitness. Adaptation is also prominent on the physiological scale in neuroscience and cell biology. There each individual infers properties of the environment and changes to become individually better, improving the overall population as well. Traditionally, these two notions of adaption have been considered distinct. Here we argue that both types of adaptation result in the same population growth in a broad class of analytically tractable population dynamics models in temporally changing environments. In particular, both types of adaptation lead to subextensive corrections to the population growth rates. These corrections are nearly universal and are equal to the predictive information in the environment time series, which is also the characterization of the time series complexity. This work has been supported by the James S. McDonnell Foundation.
Inferring the interplay between network structure and market effects in Bitcoin
NASA Astrophysics Data System (ADS)
Kondor, Dániel; Csabai, István; Szüle, János; Pósfai, Márton; Vattay, Gábor
2014-12-01
A main focus in economics research is understanding the time series of prices of goods and assets. While statistical models using only the properties of the time series itself have been successful in many aspects, we expect to gain a better understanding of the phenomena involved if we can model the underlying system of interacting agents. In this article, we consider the history of Bitcoin, a novel digital currency system, for which the complete list of transactions is available for analysis. Using this dataset, we reconstruct the transaction network between users and analyze changes in the structure of the subgraph induced by the most active users. Our approach is based on the unsupervised identification of important features of the time variation of the network. Applying the widely used method of Principal Component Analysis to the matrix constructed from snapshots of the network at different times, we are able to show how structural changes in the network accompany significant changes in the exchange price of bitcoins.
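The PCA step described above can be sketched as follows: each network snapshot is flattened into one row of a matrix, the rows are mean-centred across time, and an SVD yields components and per-snapshot scores. The names and the SVD route are assumptions; the article's exact pipeline may differ.

```python
import numpy as np

def snapshot_pca(snapshots):
    """PCA on a stack of network snapshots.

    snapshots : array of shape (T, n, n), the adjacency matrix at T times.
    Returns (explained_variance_ratio, components, scores), where
    scores[t] locates snapshot t in component space, so structural
    change over time shows up as movement along the leading components.
    """
    T = snapshots.shape[0]
    X = snapshots.reshape(T, -1).astype(float)
    X -= X.mean(axis=0)              # centre each edge weight across time
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    var = s ** 2 / (T - 1)
    return var / var.sum(), Vt, U * s
```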
Signatures of ecological processes in microbial community time series.
Faust, Karoline; Bauchinger, Franziska; Laroche, Béatrice; de Buyl, Sophie; Lahti, Leo; Washburne, Alex D; Gonze, Didier; Widder, Stefanie
2018-06-28
Growth rates, interactions between community members, stochasticity, and immigration are important drivers of microbial community dynamics. In sequencing data analysis, such as network construction and community model parameterization, we make implicit assumptions about the nature of these drivers and thereby restrict model outcome. Despite apparent risk of methodological bias, the validity of the assumptions is rarely tested, as comprehensive procedures are lacking. Here, we propose a classification scheme to determine the processes that gave rise to the observed time series and to enable better model selection. We implemented a three-step classification scheme in R that first determines whether dependence between successive time steps (temporal structure) is present in the time series and then assesses with a recently developed neutrality test whether interactions between species are required for the dynamics. If the first and second tests confirm the presence of temporal structure and interactions, then parameters for interaction models are estimated. To quantify the importance of temporal structure, we compute the noise-type profile of the community, which ranges from black in case of strong dependency to white in the absence of any dependency. We applied this scheme to simulated time series generated with the Dirichlet-multinomial (DM) distribution, Hubbell's neutral model, the generalized Lotka-Volterra model and its discrete variant (the Ricker model), and a self-organized instability model, as well as to human stool microbiota time series. The noise-type profiles for all but DM data clearly indicated distinctive structures. The neutrality test correctly classified all but DM and neutral time series as non-neutral. The procedure reliably identified time series for which interaction inference was suitable. 
Both tests were required, as we demonstrated that all structured time series, including those generated with the neutral model, achieved a moderate to high goodness of fit to the Ricker model. We present a fast and robust scheme to classify community structure and to assess the prevalence of interactions directly from microbial time series data. The procedure not only serves to determine ecological drivers of microbial dynamics, but also to guide selection of appropriate community models for prediction and follow-up analysis.
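A crude proxy for the noise-type profile is the log-log slope of the periodogram: near 0 for white noise and increasingly negative as temporal structure strengthens toward brown/black noise. The sketch below is a minimal stand-in for the scheme's first step, not the authors' R implementation.

```python
import numpy as np

def spectral_slope(x):
    """Log-log slope of the periodogram of a time series.

    ~0 indicates white noise (no dependence between time steps);
    strongly negative slopes indicate brown-to-black noise, i.e.
    strong dependency on previous time steps.
    """
    x = np.asarray(x, float) - np.mean(x)
    f = np.fft.rfftfreq(len(x))[1:]          # drop the zero frequency
    p = np.abs(np.fft.rfft(x))[1:] ** 2      # periodogram ordinates
    slope, _ = np.polyfit(np.log(f), np.log(p), 1)
    return slope
```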
High speed magnetized flows in the quiet Sun
NASA Astrophysics Data System (ADS)
Quintero Noda, C.; Borrero, J. M.; Orozco Suárez, D.; Ruiz Cobo, B.
2014-09-01
Context. We analyzed spectropolarimetric data recorded with Hinode/SP in quiet-Sun regions located at the disk center. We found single-lobed Stokes V profiles showing highly blue- and red-shifted signals. Oftentimes both types of events appear to be related to each other. Aims: We aim to set constraints on the nature and physical causes of these highly Doppler-shifted signals, as well as to study their spatial distribution, spectropolarimetric properties, size, and rate of occurrence. Also, we plan to retrieve the variation of the physical parameters with optical depth through the photosphere. Methods: We have examined the spatial and polarimetric properties of these events using a variety of data from the Hinode spacecraft. We have also inferred the atmospheric stratification of the physical parameters by means of the inversion of the observed Stokes profiles employing the Stokes Inversion based on Response functions (SIR) code. Finally, we analyzed their evolution using a time series from the same instrument. Results: Blue-shifted events tend to appear over bright regions at the edge of granules, while red-shifted events are seen predominantly over dark regions on intergranular lanes. Large linear polarization signals can be seen in the region that connects them. The magnetic structure inferred from the time series revealed that the structure corresponds to an Ω-loop, with one footpoint always over the edge of a granule and the other inside an intergranular lane. The physical parameters obtained from the inversions of the observed Stokes profiles in both events show a temperature increase with respect to the Harvard-Smithsonian reference atmosphere at log τ500 ∈ (-1, -3) and a strong magnetic field, B ≥ 1 kG, at the bottom of the atmosphere that quickly decreases upward until vanishing at log τ500 ≈ -2. In the blue-shifted events, the LOS velocities change from upflows at the bottom to downflows at the top of the atmosphere.
Red-shifted events display the opposite velocity stratification. The change of sign in LOS velocity happens at the same optical depth in which the magnetic field becomes zero. Conclusions: The physical mechanism that best explains the inferred magnetic field configuration and flow motions is a siphon flow along an arched magnetic flux tube. Further investigation is required, however, as the expected features of a siphon flow cannot be unequivocally identified.
Determination of hydrogen/deuterium ratio with neutron measurements on MAST
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klimek, I., E-mail: iwona.klimek@physics.uu.se; Cecconello, M.; Ericsson, G.
2014-11-15
On MAST, compressional Alfvén eigenmodes can be destabilized by the presence of a sufficiently large population of energetic particles in the plasma. This dependence was studied in a series of very similar discharges in which increasing amounts of hydrogen were puffed into a deuterium plasma. A simple method to estimate the isotopic ratio n_H/n_D using neutron emission measurements is described here. The inferred isotopic ratio ranged from 0.0 to 0.6, and no experimental indication of changes in the radial profile of n_H/n_D was observed. These findings are confirmed by TRANSP/NUBEAM simulations of the neutron emission.
Exploring bird aerodynamics using radio-controlled models.
Hoey, Robert G
2010-12-01
A series of radio-controlled glider models was constructed by duplicating the aerodynamic shape of soaring birds (raven, turkey vulture, seagull and pelican). Controlled tests were conducted to determine the level of longitudinal and lateral-directional static stability, and to identify the characteristics that allowed flight without a vertical tail. The use of tail-tilt for controlling small bank-angle changes, as observed in soaring birds, was verified. Subsequent tests using wing-tip ailerons indicated that birds use a three-dimensional flow pattern around the wing tip (wing tip vortices) to control adverse yaw and to create a small amount of forward thrust in gliding flight.
Permanent draft genomes of the two Rhodopirellula europaea strains 6C and SH398.
Richter-Heitmann, Tim; Richter, Michael; Klindworth, Anna; Wegner, Carl-Eric; Frank, Carsten S; Glöckner, Frank Oliver; Harder, Jens
2014-02-01
The genomes of two Rhodopirellula europaea strains were sequenced as permanent drafts to study the genomic diversity within this genus, especially in comparison with the closed genome of the type strain Rhodopirellula baltica SH1(T). The isolates are part of a larger study to infer the biogeography of Rhodopirellula species in European marine waters, as well as to amend the genus description of R. baltica. This genomics resource article is the second of a series of five publications describing a total of eight new permanent draft genomes of Rhodopirellula species.
Measuring the uncertainty of coupling
NASA Astrophysics Data System (ADS)
Zhao, Xiaojun; Shang, Pengjian
2015-06-01
A new information-theoretic measure, called coupling entropy, is proposed here to detect the causal links in complex systems by taking into account the inner composition alignment of temporal structure. It is a permutation-based asymmetric association measure to infer the uncertainty of coupling between two time series. The coupling entropy is found to be effective in the analysis of Hénon maps, where different noises are added to test its accuracy and sensitivity. The coupling entropy is also applied to analyze the relationship between unemployment rate and CPI change in the U.S., where the CPI change turns out to be the driving variable while the unemployment rate is the responding one.
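In the spirit of a permutation-based asymmetric association measure, the sketch below encodes both series as ordinal patterns and compares the conditional entropy of each series' next pattern given the other's current one; an asymmetry suggests which series is the driver. The specific definition here is illustrative only and is not the authors' coupling entropy.

```python
import numpy as np
from collections import Counter

def patterns(x, order=3):
    """Map a series to its sequence of ordinal (permutation) patterns."""
    return [tuple(np.argsort(x[i:i + order])) for i in range(len(x) - order + 1)]

def cond_entropy(a, b):
    """H(b | a) in nats for two aligned symbol sequences."""
    joint, marg, n = Counter(zip(a, b)), Counter(a), len(a)
    return -sum(c / n * np.log((c / n) / (marg[k[0]] / n))
                for k, c in joint.items())

def coupling_asymmetry(x, y, order=3):
    """Positive when x's ordinal patterns predict y's next pattern better
    than the reverse, suggesting x drives y.  Illustrative permutation-
    based asymmetry measure, not the authors' exact definition."""
    px, py = patterns(x, order), patterns(y, order)
    m = min(len(px), len(py)) - 1
    h_y_given_x = cond_entropy(px[:m], py[1:m + 1])
    h_x_given_y = cond_entropy(py[:m], px[1:m + 1])
    return h_x_given_y - h_y_given_x
```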
Counterfactual Reasoning in Non-psychotic First-Degree Relatives of People with Schizophrenia
Albacete, Auria; Contreras, Fernando; Bosque, Clara; Gilabert, Ester; Albiach, Ángela; Menchón, José M.; Crespo-Facorro, Benedicto; Ayesa-Arriola, Rosa
2016-01-01
Counterfactual thinking (CFT) is a type of conditional reasoning that enables the generation of mental simulations of alternatives to past factual events. Previous research has found this cognitive feature to be disrupted in schizophrenia (Hooker et al., 2000; Contreras et al., 2016). At the same time, the study of cognitive deficits in unaffected relatives of people with schizophrenia has significantly increased, supporting its potential endophenotypic role in this disorder. Using an exploratory approach, the current study examined CFT for the first time in a sample of non-psychotic first-degree relatives of schizophrenia patients (N = 43), in comparison with schizophrenia patients (N = 54) and healthy controls (N = 44). A series of tests that assessed the “causal order effect” in CFT and the ability to generate counterfactual thoughts and counterfactually derive inferences using the Counterfactual Inference Test was completed. Associations with variables of basic and social cognition, levels of schizotypy and psychotic-like experiences in addition to clinical and socio-demographic characteristics were also explored. Findings showed that first-degree relatives generated a lower number of counterfactual thoughts than controls, and were more adept at counterfactually deriving inferences, specifically in the scenarios related to regret and to judgments of avoidance in an unusual situation. No other significant results were found. These preliminary findings suggest that non-psychotic first-degree relatives of schizophrenia patients show a subtle disruption of global counterfactual thinking compared with what is normally expected in the general population. Due to the potential impact of such deficits, new treatments targeting CFT improvement might be considered in future management strategies. PMID:27242583
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nessi, G. T. von; Hole, M. J.; Svensson, J.
2012-01-15
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint-European tokamak [Svensson and Werner, Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++ [Appel et al., "A unified approach to equilibrium reconstruction," Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
fastBMA: scalable network inference and transitive reduction.
Hung, Ling-Hong; Shi, Kaiyuan; Wu, Migao; Young, William Chad; Raftery, Adrian E; Yeung, Ka Yee
2017-10-01
Inferring genetic networks from genome-wide expression data is extremely demanding computationally. We have developed fastBMA, a distributed, parallel, and scalable implementation of Bayesian model averaging (BMA) for this purpose. fastBMA also includes a computationally efficient module for eliminating redundant indirect edges in the network by mapping the transitive reduction to an easily solved shortest-path problem. We evaluated the performance of fastBMA on synthetic data and experimental genome-wide time series yeast and human datasets. When using a single CPU core, fastBMA is up to 100 times faster than the next fastest method, LASSO, with increased accuracy. It is a memory-efficient, parallel, and distributed application that scales to human genome-wide expression data. A 10 000-gene regulation network can be obtained in a matter of hours using a 32-core cloud cluster (2 nodes of 16 cores). fastBMA is a significant improvement over its predecessor ScanBMA. It is more accurate and orders of magnitude faster than other fast network inference methods such as the one based on LASSO. The improved scalability allows it to calculate networks from genome scale data in a reasonable time frame. The transitive reduction method can improve accuracy in denser networks. fastBMA is available as code (M.I.T. license) from GitHub (https://github.com/lhhunghimself/fastBMA), as part of the updated networkBMA Bioconductor package (https://www.bioconductor.org/packages/release/bioc/html/networkBMA.html) and as ready-to-deploy Docker images (https://hub.docker.com/r/biodepot/fastbma/).
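The reduction-as-shortest-path idea can be sketched as follows: mapping an edge confidence p to an additive cost -log p turns "strongest indirect chain" into a shortest-path query, and a direct edge is dropped when some indirect route is at least as strong. This is a simplified Floyd-Warshall illustration of the concept, not fastBMA's optimized implementation.

```python
import math

def transitive_reduction(edges, n):
    """Remove directed edges explained by a stronger indirect path.

    edges : dict {(u, v): p} with edge confidence p in (0, 1].
    Confidences map to additive costs via -log(p), so the strongest
    chain between two nodes is a shortest path.
    """
    INF = math.inf
    d = [[INF] * n for _ in range(n)]
    for (u, v), p in edges.items():
        d[u][v] = min(d[u][v], -math.log(p))
    # Floyd-Warshall all-pairs shortest paths over the cost graph
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    kept = {}
    for (u, v), p in edges.items():
        direct = -math.log(p)
        indirect = min((d[u][k] + d[k][v] for k in range(n) if k not in (u, v)),
                       default=INF)
        if indirect > direct:       # no indirect chain is as strong
            kept[(u, v)] = p
    return kept
```

Note the simplification: the all-pairs distances may themselves reuse the direct edge under test, which a production implementation would guard against.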
Symptomatic Remission and Counterfactual Reasoning in Schizophrenia.
Albacete, Auria; Contreras, Fernando; Bosque, Clara; Gilabert, Ester; Albiach, Ángela; Menchón, José M
2016-01-01
Counterfactual thinking (CFT) is a type of conditional reasoning involving mental representations of alternatives to past factual events that previous preliminary research has suggested to be impaired in schizophrenia. However, despite the potential impact of these deficits on the functional outcome of these patients, studies examining the role of CFT in this disorder are still few in number. The present study aimed to extend previous results by evaluating CFT in the largest sample to date of schizophrenia patients in symptomatic remission and healthy controls. The relationship with symptomatology, illness duration, and sociodemographic characteristics was also explored. Methods: Seventy-eight schizophrenia patients and 84 healthy controls completed a series of tests that examined the generation of counterfactual thoughts, the influence of the "causal order effect," and the ability to counterfactually derive inferences by using the Counterfactual Inference Test. Results: Compared with controls, patients generated fewer counterfactual thoughts when faced with a simulated scenario. This deficit was negatively related to scores on all dimensions of the Positive and Negative Syndrome Scale (PANSS), as well as to longer illness duration. The results also showed that schizophrenia patients deviated significantly from the normative pattern when generating inferences from CFT. Conclusions: These findings reveal CFT impairment to be present in schizophrenia even when patients are in symptomatic remission. However, symptomatology and illness duration may have a negative influence on these patients' ability to generate counterfactual thoughts. The results might support the relevance of targeting CFT in future treatment approaches, although further research is needed to better describe the relationship between CFT and both symptomatology and functional outcome.
Greifeneder, Rainer; Zelt, Sarah; Seele, Tim; Bottenberg, Konstantin; Alt, Alexander
2012-09-01
Handwriting legibility systematically biases evaluations in that highly legible handwriting results in more positive evaluations than less legible handwriting. Because performance assessments in educational contexts are not only based on computerized or multiple choice tests but often include the evaluation of handwritten work samples, understanding the causes of this bias is critical. This research was designed to replicate and extend the legibility bias in two tightly controlled experiments and to explore whether gender-based inferences contribute to its occurrence. A total of 132 students from a German university participated in one pre-test and two independent experiments. Participants were asked to read and evaluate several handwritten essays varying in content quality. Each essay was presented to some participants in highly legible handwriting and to other participants in less legible handwriting. In addition, the assignment of legibility to participant group was reversed from essay to essay, resulting in a mixed-factor design. The legibility bias was replicated in both experiments. Results suggest that gender-based inferences do not account for its occurrence. Rather it appears that fluency from legibility exerts a biasing impact on evaluations of content and author abilities. The legibility bias was shown to be genuine and strong. By refuting a series of alternative explanations, this research contributes to a better understanding of what underlies the legibility bias. The present research may inform those who grade about what to focus on, and thus help them better allocate cognitive resources when trying to reduce this important source of error.
Two-condition within-participant statistical mediation analysis: A path-analytic framework.
Montoya, Amanda K; Hayes, Andrew F
2017-03-01
Researchers interested in testing mediation often use designs where participants are measured on a dependent variable Y and a mediator M in both of 2 different circumstances. The dominant approach to assessing mediation in such a design, proposed by Judd, Kenny, and McClelland (2001), relies on a series of hypothesis tests about components of the mediation model and is not based on an estimate of or formal inference about the indirect effect. In this article we recast Judd et al.'s approach in the path-analytic framework that is now commonly used in between-participant mediation analysis. By so doing, it is apparent how to estimate the indirect effect of a within-participant manipulation on some outcome through a mediator as the product of paths of influence. This path-analytic approach eliminates the need for discrete hypothesis tests about components of the model to support a claim of mediation, as Judd et al.'s method requires, because it relies only on an inference about the product of paths (the indirect effect). We generalize methods of inference for the indirect effect widely used in between-participant designs to this within-participant version of mediation analysis, including bootstrap confidence intervals and Monte Carlo confidence intervals. Using this path-analytic approach, we extend the method to models with multiple mediators operating in parallel and serially and discuss the comparison of indirect effects in these more complex models. We offer macros and code for SPSS, SAS, and Mplus that conduct these analyses.
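A minimal sketch of the path-analytic estimate and its percentile bootstrap CI, under a simplified model in which the outcome difference is regressed only on the centred mediator difference (the full path-analytic model also adjusts for the participant's mediator average; all names here are illustrative):

```python
import numpy as np

def indirect_effect(m1, m2, y1, y2):
    """Point estimate of the indirect effect in a two-condition
    within-participant design: a = mean mediator difference,
    b = slope of the outcome difference on the centred mediator
    difference.  Simplified sketch of the path-analytic approach."""
    mdiff, ydiff = m2 - m1, y2 - y1
    a = mdiff.mean()
    x = mdiff - mdiff.mean()
    b = (x * (ydiff - ydiff.mean())).sum() / (x * x).sum()
    return a * b

def bootstrap_ci(m1, m2, y1, y2, n_boot=2000, alpha=0.05, rng=None):
    """Percentile bootstrap CI for the indirect effect, resampling
    participants (rows) with replacement."""
    rng = np.random.default_rng(rng)
    n = len(m1)
    est = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)
        est[i] = indirect_effect(m1[idx], m2[idx], y1[idx], y2[idx])
    lo, hi = np.quantile(est, [alpha / 2, 1 - alpha / 2])
    return indirect_effect(m1, m2, y1, y2), (lo, hi)
```

An interval excluding zero supports a claim of mediation without the component-wise hypothesis tests the older approach requires.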
A recurrent self-organizing neural fuzzy inference network.
Juang, C F; Lin, C T
1999-01-01
A recurrent self-organizing neural fuzzy inference network (RSONFIN) is proposed in this paper. The RSONFIN is inherently a recurrent multilayered connectionist network for realizing the basic elements and functions of dynamic fuzzy inference, and may be considered to be constructed from a series of dynamic fuzzy rules. The temporal relations embedded in the network are built by adding some feedback connections representing the memory elements to a feedforward neural fuzzy network. Each weight as well as node in the RSONFIN has its own meaning and represents a special element in a fuzzy rule. There are no hidden nodes (i.e., no membership functions and fuzzy rules) initially in the RSONFIN. They are created on-line via concurrent structure identification (the construction of dynamic fuzzy if-then rules) and parameter identification (the tuning of the free parameters of membership functions). The structure learning together with the parameter learning forms a fast learning algorithm for building a small, yet powerful, dynamic neural fuzzy network. Two major characteristics of the RSONFIN can thus be seen: 1) the recurrent property of the RSONFIN makes it suitable for dealing with temporal problems and 2) no predetermination, like the number of hidden nodes, must be given, since the RSONFIN can find its optimal structure and parameters automatically and quickly. Moreover, to reduce the number of fuzzy rules generated, a flexible input partition method, the aligned clustering-based algorithm, is proposed. Various simulations on temporal problems are done and performance comparisons with some existing recurrent networks are also made. Efficiency of the RSONFIN is verified from these results.
Halliday, David M; Senik, Mohd Harizal; Stevenson, Carl W; Mason, Rob
2016-08-01
The ability to infer network structure from multivariate neuronal signals is central to computational neuroscience. Directed network analyses typically use parametric approaches based on auto-regressive (AR) models, where networks are constructed from estimates of AR model parameters. However, the validity of using low order AR models for neurophysiological signals has been questioned. A recent article introduced a non-parametric approach to estimate directionality in bivariate data; non-parametric approaches are free from concerns over model validity. We extend the non-parametric framework to include measures of directed conditional independence, using scalar measures that decompose the overall partial correlation coefficient summatively by direction, and a set of functions that decompose the partial coherence summatively by direction. A time domain partial correlation function allows both time and frequency views of the data to be constructed. The conditional independence estimates are conditioned on a single predictor. The framework is applied to simulated cortical neuron networks and mixtures of Gaussian time series data with known interactions. It is applied to experimental data consisting of local field potential recordings from bilateral hippocampus in anaesthetised rats. The framework offers a novel non-parametric alternative for estimating directed interactions in multivariate neuronal recordings, with increased flexibility in dealing with both spike train and time series data.
Conditional adaptive Bayesian spectral analysis of nonstationary biomedical time series.
Bruce, Scott A; Hall, Martica H; Buysse, Daniel J; Krafty, Robert T
2018-03-01
Many studies of biomedical time series signals aim to measure the association between frequency-domain properties of time series and clinical and behavioral covariates. However, the time-varying dynamics of these associations are largely ignored due to a lack of methods that can assess the changing nature of the relationship through time. This article introduces a method for the simultaneous and automatic analysis of the association between the time-varying power spectrum and covariates, which we refer to as conditional adaptive Bayesian spectrum analysis (CABS). The procedure adaptively partitions the grid of time and covariate values into an unknown number of approximately stationary blocks and nonparametrically estimates local spectra within blocks through penalized splines. CABS is formulated in a fully Bayesian framework, in which the number and locations of partition points are random, and fit using reversible jump Markov chain Monte Carlo techniques. Estimation and inference averaged over the distribution of partitions allows for the accurate analysis of spectra with both smooth and abrupt changes. The proposed methodology is used to analyze the association between the time-varying spectrum of heart rate variability and self-reported sleep quality in a study of older adults serving as the primary caregiver for their ill spouse.
Volatility modeling for IDR exchange rate through APARCH model with student-t distribution
NASA Astrophysics Data System (ADS)
Nugroho, Didit Budi; Susanto, Bambang
2017-08-01
The aim of this study is to empirically investigate the performance of the APARCH(1,1) volatility model with the Student-t error distribution on five foreign currency selling rates to the Indonesian rupiah (IDR): the Swiss franc (CHF), the Euro (EUR), the British pound (GBP), the Japanese yen (JPY), and the US dollar (USD). Six years of daily closing rates over the period January 2010 to December 2016, a total of 1722 observations, have been analysed. Bayesian inference, using the efficient independence chain Metropolis-Hastings and adaptive random walk Metropolis methods within a Markov chain Monte Carlo (MCMC) scheme, has been applied to estimate the model parameters. According to the DIC criterion, this study has found that the APARCH(1,1) model under the Student-t distribution fits better than the model under the normal distribution for every observed rate return series. The 95% highest posterior density intervals supported the APARCH models for modelling the IDR/JPY and IDR/USD volatilities. In particular, the IDR/JPY and IDR/USD data have significant negative and positive leverage effects in the rate returns, respectively. Meanwhile, the optimal power coefficient of volatility has been found to be statistically different from 2 for all rate return series except the IDR/EUR series.
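The APARCH(1,1) recursion underlying the study can be filtered directly from a return series. The sketch below assumes the standard specification sigma_t^delta = omega + alpha*(|e_{t-1}| - gamma*e_{t-1})^delta + beta*sigma_{t-1}^delta; the parameter values used in any example are placeholders, not estimates from the paper's MCMC fit.

```python
import numpy as np

def aparch_volatility(returns, omega, alpha, gamma, beta, delta):
    """Filter conditional volatilities from an APARCH(1,1) model:

        sigma_t^delta = omega + alpha*(|e_{t-1}| - gamma*e_{t-1})**delta
                        + beta*sigma_{t-1}^delta

    gamma > 0 produces a leverage effect: negative shocks raise
    volatility more than positive shocks of the same size.
    """
    s_delta = np.empty(len(returns))
    s_delta[0] = omega / (1 - beta)          # crude initialisation
    for t in range(1, len(returns)):
        e = returns[t - 1]
        s_delta[t] = (omega + alpha * (abs(e) - gamma * e) ** delta
                      + beta * s_delta[t - 1])
    return s_delta ** (1 / delta)            # back-transform to sigma_t
```

With delta fixed at 2 and gamma at 0 the recursion collapses to a standard GARCH(1,1), which is why the paper tests whether the estimated power coefficient differs from 2.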
cDREM: inferring dynamic combinatorial gene regulation.
Wise, Aaron; Bar-Joseph, Ziv
2015-04-01
Genes are often combinatorially regulated by multiple transcription factors (TFs). Such combinatorial regulation plays an important role in development and facilitates the ability of cells to respond to different stresses. While a number of approaches have utilized sequence and ChIP-based datasets to study combinatorial regulation, these have often ignored the combinatorial logic and the dynamics associated with such regulation. Here we present cDREM, a new method for reconstructing dynamic models of combinatorial regulation. cDREM integrates time series gene expression data with (static) protein interaction data. The method is based on a hidden Markov model and utilizes the sparse group Lasso to identify small subsets of combinatorially active TFs, their time of activation, and the logical function they implement. We tested cDREM on yeast and human data sets. Using yeast we show that the predicted combinatorial sets agree with other high-throughput genomic datasets and improve upon prior methods developed to infer combinatorial regulation. Applying cDREM to study the human response to flu, we were able to identify several combinatorial TF sets, some of which were known to regulate immune response while others represent novel combinations of important TFs.
Tanaka, Shingo; Oguchi, Mineki; Sakagami, Masamichi
2016-11-01
To behave appropriately in a complex and uncertain world, the brain makes use of several distinct learning systems. One such system is called the "model-free process", via which conditioning allows the association between a stimulus or response and a given reward to be learned. Another system is called the "model-based process". Via this process, the state transition between a stimulus and a response is learned so that the brain is able to plan actions prior to their execution. Several studies have tried to relate the difference between model-based and model-free processes to the difference in functions of the lateral prefrontal cortex (LPFC) and the striatum. Here, we describe a series of studies that demonstrate the ability of LPFC neurons to categorize visual stimuli by their associated behavioral responses and to generate abstract information. If LPFC neurons utilize abstract code to associate a stimulus with a reward, they should be able to infer similar relationships between other stimuli of the same category and their rewards without direct experience of these stimulus-reward contingencies. We propose that this ability of LPFC neurons to utilize abstract information can contribute to the model-based learning process.
A Bayesian estimation of a stochastic predator-prey model of economic fluctuations
NASA Astrophysics Data System (ADS)
Dibeh, Ghassan; Luchinsky, Dmitry G.; Luchinskaya, Daria D.; Smelyanskiy, Vadim N.
2007-06-01
In this paper, we develop a Bayesian framework for the empirical estimation of the parameters of one of the best known nonlinear models of the business cycle: The Marx-inspired model of a growth cycle introduced by R. M. Goodwin. The model predicts a series of closed cycles representing the dynamics of labor's share and the employment rate in the capitalist economy. The Bayesian framework is used to empirically estimate a modified Goodwin model. The original model is extended in two ways. First, we allow for exogenous periodic variations of the otherwise steady growth rates of the labor force and productivity per worker. Second, we allow for stochastic variations of those parameters. The resultant modified Goodwin model is a stochastic predator-prey model with periodic forcing. The model is then estimated using a newly developed Bayesian estimation method on data sets representing growth cycles in France and Italy during the years 1960-2005. Results show that inference of the parameters of the stochastic Goodwin model can be achieved. The comparison of the dynamics of the Goodwin model with the inferred values of parameters demonstrates quantitative agreement with the growth cycle empirical data.
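The modified Goodwin model described above is a stochastic predator-prey system with periodic forcing. A minimal Euler-Maruyama sketch of that model class follows; it uses a generic forced Lotka-Volterra form rather than the paper's exact Goodwin equations, and every parameter value is an illustrative assumption.

```python
import numpy as np

def simulate_forced_pp(T=100.0, dt=0.01, a0=1.0, eps=0.2, w=2*np.pi/10,
                       b=1.0, c=0.5, d=0.5, noise=0.02, seed=1):
    """Euler-Maruyama integration of a stochastic predator-prey system
    whose prey growth rate is periodically forced: a(t) = a0*(1 + eps*sin(w t)).
    Noise enters multiplicatively, mimicking stochastic parameter variation."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    x, y = np.empty(n), np.empty(n)   # x: prey-like variable, y: predator-like
    x[0], y[0] = 1.0, 0.5
    for k in range(1, n):
        t = k * dt
        a = a0 * (1 + eps * np.sin(w * t))          # periodic forcing
        dW1, dW2 = rng.normal(0.0, np.sqrt(dt), 2)  # Wiener increments
        x[k] = x[k-1] + x[k-1] * (a - b * y[k-1]) * dt + noise * x[k-1] * dW1
        y[k] = y[k-1] + y[k-1] * (c * x[k-1] - d) * dt + noise * y[k-1] * dW2
    return x, y

x, y = simulate_forced_pp()
```

In a Bayesian workflow like the one the abstract describes, trajectories generated this way would serve as the forward model inside the likelihood when inferring the parameters from observed cycle data.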
Cai, Chen; Stewart, David J; Reid, Jonathan P; Zhang, Yun-hong; Ohm, Peter; Dutcher, Cari S; Clegg, Simon L
2015-01-29
Measurements of the hygroscopic response of aerosol and the particle-to-gas partitioning of semivolatile organic compounds are crucial for providing more accurate descriptions of the compositional and size distributions of atmospheric aerosol. Concurrent measurements of particle size and composition (inferred from refractive index) are reported here using optical tweezers to isolate and probe individual aerosol droplets over extended timeframes. The measurements are shown to allow accurate retrievals of component vapor pressures and hygroscopic response through examining correlated variations in size and composition for binary droplets containing water and a single organic component. Measurements are reported for a homologous series of dicarboxylic acids, maleic acid, citric acid, glycerol, or 1,2,6-hexanetriol. An assessment of the inherent uncertainties in such measurements when measuring only particle size is provided to confirm the value of such a correlational approach. We also show that the method of molar refraction provides an accurate characterization of the compositional dependence of the refractive index of the solutions. In this method, the density of the pure liquid solute is the largest uncertainty and must be either known or inferred from subsaturated measurements with an error of <±2.5% to discriminate between different thermodynamic treatments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, J.D.L.; Robinson, P.T.
The largely Eocene Clarno Formation consists of andesitic volcaniclastic rocks interstratified with clayey paludal sediments and lava flows, and cut locally by irregular hypabyssal stocks, dikes and sills. Lateral lithofacies variations are pronounced, and intrusive and extrusive volcanic rocks appear haphazardly emplaced throughout the formation. A range of sedimentary environments is represented, including near-vent flow and breccia accumulations, bouldery high-gradient braided streams, and relatively low-gradient sandy-tuff braidplains associated with paludal deposits. The authors infer that the coarse-grained volcaniclastic rocks of the Clarno Formation accumulated largely in volcanic flank and apron settings. The stratigraphy of the formation indicates that it was formed in sedimentary lowlands into which many small volcanoes erupted; only a few, scattered remnants of large central vent volcanoes are known. The absence of systematic variation across the unit's large outcrop belt argues against the derivation of the succession from a line of volcanoes beyond the reaches of the present outcrop. The authors infer that the arc was composed of small to medium-sized volcanoes arranged non-systematically over a broad area. The sedimentary succession most probably accumulated in a series of shallow intra-arc depressions formed by crustal stretching and diffuse block rotation driven by oblique subduction during the Eocene.
Arenas, Ailan F; Salcedo, Gladys E; Gomez-Marin, Jorge E
2017-01-01
Pathogen-host protein-protein interaction systems examine the interactions between the protein repertoires of 2 distinct organisms. Some of these pathogen proteins interact with the host protein system and may manipulate it for their own advantage. In this work, we designed an R script by concatenating 2 functions called rowDM and rowCVmed to infer pathogen-host interaction using previously reported microarray data, including host gene enrichment analysis and the crossing of interspecific domain-domain interactions. We applied this script to the Toxoplasma-host system to describe pathogen survival mechanisms from human, mouse, and Toxoplasma Gene Expression Omnibus series. Our outcomes were consistent with previously reported microarray analyses, but we found other important proteins that could contribute to Toxoplasma pathogenesis. We observed that Toxoplasma ROP38 is the most differentially expressed protein among Toxoplasma strains. Enrichment analysis and KEGG mapping indicated that the human retinal genes most affected by Toxoplasma infection are those related to antiapoptotic mechanisms. We suggest that the proteins PIK3R1, PRKCA, PRKCG, PRKCB, HRAS, and c-JUN could be possible substrates for the differentially expressed Toxoplasma kinase ROP38. Likewise, we propose that Toxoplasma causes overexpression of apoptosis-suppressing human genes. PMID:29317802
Effects of Storage Time on Glycolysis in Donated Human Blood Units
Qi, Zhen; Roback, John D.; Voit, Eberhard O.
2017-01-01
Background: Donated blood is typically stored before transfusions. During storage, the metabolism of red blood cells changes, possibly causing storage lesions. The changes are storage time dependent and exhibit donor-specific variations. It is necessary to uncover and characterize the responsible molecular mechanisms accounting for such biochemical changes, qualitatively and quantitatively; Study Design and Methods: Based on the integration of metabolic time series data, kinetic models, and a stoichiometric model of the glycolytic pathway, a customized inference method was developed and used to quantify the dynamic changes in glycolytic fluxes during the storage of donated blood units. The method provides a proof of principle for the feasibility of inferences regarding flux characteristics from metabolomics data; Results: Several glycolytic reaction steps change substantially during storage time and vary among different fluxes and donors. The quantification of these storage time effects, which are possibly irreversible, allows for predictions of the transfusion outcome of individual blood units; Conclusion: The improved mechanistic understanding of blood storage, obtained from this computational study, may aid the identification of blood units that age quickly or more slowly during storage, and may ultimately improve transfusion management in clinics. PMID:28353627
Moya, Cristina
2013-07-01
Ethnic categories uniquely structure human social worlds. People readily form stereotypes about these, and other social categories, but it is unclear whether certain dimensions are privileged for making predictions about strangers when information is limited. If humans have been living in culturally-structured groups for much of their evolutionary history, we might expect them to have adaptations for prioritizing ethno-linguistic cues as a basis for making predictions about others. We provide a strong test of this possibility through a series of studies in a field context along the Quechua-Aymara linguistic boundary in the Peruvian Altiplano where the language boundary is not particularly socially meaningful. We find evidence of such psychological priors among children and adults at this site by showing that their age, and the social categories' novelty affect participants' reliance on ethno-linguistic inductive inferences (i.e. one-to-many predictions). Studies 1-3 show that participants make more ethno-linguistic inferences when the social categories are more removed from their real-world context. Additionally, in Study 4 when the category is marked with acoustic cues of language use, young children rely heavily on ethno-linguistic predictions, even though adults do not.
An associative model of adaptive inference for learning word-referent mappings.
Kachergis, George; Yu, Chen; Shiffrin, Richard M
2012-04-01
People can learn word-referent pairs over a short series of individually ambiguous situations containing multiple words and referents (Yu & Smith, 2007, Cognition 106: 1558-1568). Cross-situational statistical learning relies on the repeated co-occurrence of words with their intended referents, but simple co-occurrence counts cannot explain the findings. Mutual exclusivity (ME: an assumption of one-to-one mappings) can reduce ambiguity by leveraging prior experience to restrict the number of word-referent pairings considered but can also block learning of non-one-to-one mappings. The present study first trained learners on one-to-one mappings with varying numbers of repetitions. In late training, a new set of word-referent pairs were introduced alongside pretrained pairs; each pretrained pair consistently appeared with a new pair. Results indicate that (1) learners quickly infer new pairs in late training on the basis of their knowledge of pretrained pairs, exhibiting ME; and (2) learners also adaptively relax the ME bias and learn two-to-two mappings involving both pretrained and new words and objects. We present an associative model that accounts for both results using competing familiarity and uncertainty biases.
[Concepts of rational taxonomy].
Pavlinov, I Ia
2011-01-01
Problems related to the development of concepts of rational taxonomy and rational classifications (taxonomic systems) in biology are discussed. Rational taxonomy is based on the assumption that the key characteristic of rationality is deductive inference of certain partial judgments about the reality under study from other judgments taken as more general and a priori true. Accordingly, two forms of rationality are discriminated: ontological and epistemological. The former implies inference of classification properties from general (essential) properties of the reality being investigated. The latter implies inference of the partial rules of judgments about classifications from more general (formal) rules. The following principal concepts of ontologically rational biological taxonomy are considered: the "crystallographic" approach; inference of the orderliness of organismal diversity from general laws of Nature; inference of the above orderliness from the orderliness of ontogenetic development programs; concepts based on the notion of natural kind and Cassirer's series theory; concepts based on the systemic concept; and concepts based on the idea of periodic systems. Various concepts of ontologically rational taxonomy can be generalized by an idea of causal taxonomy, according to which any biologically sound classification is founded on a contentwise model of biological diversity that includes explicit indication of the general causes responsible for that diversity. It is asserted that each category of general causation and its respective background model may serve as a basis for a particular ontologically rational taxonomy as a distinctive research program. Concepts of epistemologically rational taxonomy and classifications (taxonomic systems) can be interpreted in terms of application of certain epistemological criteria of substantiation of the scientific status of taxonomy in general and of taxonomic systems in particular.
These concepts include: consideration of taxonomy consistency from the standpoint of inductive and hypothetico-deductive argumentation schemes and such fundamental criteria of classifications naturalness as their prognostic capabilities; foundation of a theory of "general taxonomy" as a "general logic", including elements of the axiomatic method. The latter concept constitutes a core of the program of general classiology; it is inconsistent due to absence of anything like "general logic". It is asserted that elaboration of a theory of taxonomy as a biological discipline based on the formal principles of epistemological rationality is not feasible. Instead, it is to be elaborated as ontologically rational one based on biologically sound metatheories about biological diversity causes.
Fine-scale population dynamics in a marine fish species inferred from dynamic state-space models.
Rogers, Lauren A; Storvik, Geir O; Knutsen, Halvor; Olsen, Esben M; Stenseth, Nils C
2017-07-01
Identifying the spatial scale of population structuring is critical for the conservation of natural populations and for drawing accurate ecological inferences. However, population studies often use spatially aggregated data to draw inferences about population trends and drivers, potentially masking ecologically relevant population sub-structure and dynamics. The goals of this study were to investigate how population dynamics models with and without spatial structure affect inferences on population trends and the identification of intrinsic drivers of population dynamics (e.g. density dependence). Specifically, we developed dynamic, age-structured, state-space models to test different hypotheses regarding the spatial structure of a population complex of coastal Atlantic cod (Gadus morhua). Data were from a 93-year survey of juvenile (age 0 and 1) cod sampled along >200 km of the Norwegian Skagerrak coast. We compared two models: one which assumes all sampled cod belong to one larger population, and a second which assumes that each fjord contains a unique population with locally determined dynamics. Using the best supported model, we then reconstructed the historical spatial and temporal dynamics of Skagerrak coastal cod. Cross-validation showed that the spatially structured model with local dynamics had better predictive ability. Furthermore, posterior predictive checks showed that a model which assumes one homogeneous population failed to capture the spatial correlation pattern present in the survey data. The spatially structured model indicated that population trends differed markedly among fjords, as did estimates of population parameters including density-dependent survival. Recent biomass was estimated to be at a near-record low all along the coast, but the finer scale model indicated that the decline occurred at different times in different regions. 
Warm temperatures were associated with poor recruitment, but local changes in habitat and fishing pressure may have played a role in driving local dynamics. More generally, we demonstrated how state-space models can be used to test evidence for population spatial structure based on survey time-series data. Our study shows the importance of considering spatially structured dynamics, as the inferences from such an approach can lead to a different ecological understanding of the drivers of population declines, and fundamentally different management actions to restore populations. © 2017 The Authors. Journal of Animal Ecology published by John Wiley & Sons Ltd on behalf of British Ecological Society.
Investigation of a long time series of CO2 from a tall tower using WRF-SPA
NASA Astrophysics Data System (ADS)
Smallman, Luke; Williams, Mathew; Moncrieff, John B.
2013-04-01
Atmospheric observations from tall towers are an important source of information about CO2 exchange at the regional scale. Here, we have used a forward-running model, WRF-SPA, to generate a time series of CO2 at a tall tower for comparison with observations from Scotland over multiple years (2006-2008). We use this comparison to infer the strength and distribution of sources and sinks of carbon and ecosystem process information at the seasonal scale. The specific aim of this research is to combine a high-resolution (6 km) forward-running meteorological model (WRF) with a modified version of a mechanistic ecosystem model (SPA). SPA provides surface fluxes calculated from coupled energy, hydrological and carbon cycles. This closely coupled representation of the biosphere provides realistic surface exchanges to drive mixing within the planetary boundary layer. The combined model is used to investigate the sources and sinks of CO2 and to explore which land surfaces contribute to a time series of hourly observations of atmospheric CO2 at a tall tower, Angus, Scotland. In addition to comparing the modelled CO2 time series to observations, modelled ecosystem-specific (i.e. forest, cropland, grassland) CO2 tracers (e.g., assimilation and respiration) have been compared to the modelled land surface assimilation to investigate how representative tall tower observations are of land surface processes. The WRF-SPA modelled CO2 time series compares well with observations (R2 = 0.67, rmse = 3.4 ppm, bias = 0.58 ppm). Through comparison of model-observation residuals, we have found evidence that non-cropped components of agricultural land (e.g., hedgerows and forest patches) likely make a significant and observable contribution to the regional carbon balance.
Dishman, J Donald; Weber, Kenneth A; Corbin, Roger L; Burke, Jeanmarie R
2012-09-30
The purpose of this research was to characterize unique neurophysiologic events following a high velocity, low amplitude (HVLA) spinal manipulation (SM) procedure. Descriptive time series analysis techniques of time plots, outlier detection and autocorrelation functions were applied to time series of tibial nerve H-reflexes that were evoked at 10-s intervals from 100 s before the event until 100 s after three distinct events: L5-S1 HVLA SM, a L5-S1 joint pre-loading procedure, or the control condition. Sixty-six subjects were randomly assigned to the three procedures, i.e., 22 time series per group. If the detection of outliers and correlograms revealed a pattern of non-randomness that was time-locked only to a single, specific event in the normalized time series, then an experimental effect would be inferred beyond the inherent variability of H-reflex responses. Tibial nerve F-wave responses were included to determine if any new information about central nervous system function following a HVLA SM procedure could be ascertained. Time series analyses of H(max)/M(max) ratios, pre-post L5-S1 HVLA SM, substantiated the hypothesis that the specific aspects of the manipulative thrust lead to a greater attenuation of the H(max)/M(max) ratio as compared to the non-specific aspects related to the postural perturbation and joint pre-loading. The attenuation of the H(max)/M(max) ratio following the HVLA SM procedure was reliable and may hold promise as a translational tool to measure the consistency and accuracy of protocol implementation involving SM in clinical trials research. F-wave responses were not sensitive to mechanical perturbations of the lumbar spine. Copyright © 2012 Elsevier B.V. All rights reserved.
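The correlogram screening described above (flagging non-random structure in a short evoked-response series) can be sketched with a sample autocorrelation function and an approximate white-noise band. This is a generic illustration of the technique, not the study's analysis pipeline; the band formula 1.96/sqrt(N) is the standard large-sample approximation.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation function of a 1-D series, lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.dot(x, x)  # lag-0 sum of squares; normalizes r[0] to 1
    return np.array([np.dot(x[:n - k], x[k:]) / denom for k in range(max_lag + 1)])

def nonrandom_lags(x, max_lag=10):
    """Lags whose autocorrelation exceeds the approximate 95% white-noise band,
    i.e. candidate evidence of non-random (time-locked) structure."""
    r = acf(x, max_lag)
    band = 1.96 / np.sqrt(len(x))
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > band]
```

For a purely random H-reflex series the list should usually be empty; a trend or a step change time-locked to the manipulation would populate it at low lags.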
Spatial Representativeness of Surface-Measured Variations of Downward Solar Radiation
NASA Astrophysics Data System (ADS)
Schwarz, M.; Folini, D.; Hakuba, M. Z.; Wild, M.
2017-12-01
When using time series of ground-based surface solar radiation (SSR) measurements in combination with gridded data, the spatial and temporal representativeness of the point observations must be considered. We use SSR data from surface observations and high-resolution (0.05°) satellite-derived data to infer the spatiotemporal representativeness of observations for monthly and longer time scales in Europe. The correlation analysis shows that the squared correlation coefficients (R2) between SSR time series decrease linearly with increasing distance between the surface observations. For deseasonalized monthly mean time series, R2 ranges from 0.85 for distances up to 25 km between the stations to 0.25 at distances of 500 km. A decorrelation length (i.e., the e-folding distance of R2) on the order of 400 km (with a spread of 100-600 km) was found. R2 from correlations between point observations and colocated grid box area means determined from satellite data was found to be 0.80 for a 1° grid. To quantify the error that arises when using a point observation as a surrogate for the area mean SSR of larger surroundings, we calculated a spatial sampling error (SSE) for a 1° grid of 8 (3) W/m2 for monthly (annual) time series. The SSE based on a 1° grid, therefore, is of the same magnitude as the measurement uncertainty. The analysis generally reveals that monthly mean (or longer temporally aggregated) point observations of SSR capture the larger-scale variability well. This finding shows that comparing time series of SSR measurements with gridded data is feasible for those time scales.
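The decorrelation length defined above (the e-folding distance of R2 with station separation) can be estimated from paired R2-versus-distance values by fitting a line through log(R2). A minimal sketch, using synthetic data with an assumed L = 400 km consistent with the abstract's order of magnitude:

```python
import numpy as np

def decorrelation_length(dist_km, r2):
    """Estimate the e-folding distance L of R^2(d) ~ exp(-d/L)
    via a least-squares line through log(R^2) versus distance."""
    d = np.asarray(dist_km, dtype=float)
    y = np.log(np.asarray(r2, dtype=float))
    slope, _intercept = np.polyfit(d, y, 1)
    return -1.0 / slope

# synthetic check: exact exponential decay with L = 400 km
d = np.array([25.0, 100.0, 200.0, 300.0, 500.0])
r2 = np.exp(-d / 400.0)
L = decorrelation_length(d, r2)
```

With real station pairs the R2 values are noisy, so the fit would typically be weighted by the number of pairs in each distance bin.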
MacDonald, Cameron W; Whitman, Julie M; Cleland, Joshua A; Smith, Marcia; Hoeksma, Hugo L
2006-08-01
Case series describing the outcomes of individual patients with hip osteoarthritis treated with manual physical therapy and exercise. Seven patients referred to physical therapy with hip osteoarthritis and/or hip pain were included in this case series. All patients were treated with manual physical therapy followed by exercises to maximize strength and range of motion. Six of 7 patients completed a Harris Hip Score at initial examination and discharge from physical therapy, and 1 patient completed a Global Rating of Change Scale at discharge. Three males and 4 females with a median age of 62 years (range, 52-80 years) and median duration of symptoms of 9 months (range, 2-60 months) participated in this case series. The median number of physical therapy sessions attended was 5 (range, 4-12). The median increase in total passive range of motion of the hip was 82 degrees (range, 70 degrees-86 degrees). The median improvement on the Harris Hip Score was 25 points (range, 15-38 points). The single patient who completed the Global Rating of Change Scale at discharge reported being "a great deal better." Numeric pain rating scores decreased by a mean of 5 points (range, 2-7 points) on a 0-to-10-point scale. All patients exhibited reductions in pain and increases in passive range of motion, as well as a clinically meaningful improvement in function. Although we cannot infer a cause and effect relationship from a case series, the outcomes with these patients are similar to others reported in the literature that have demonstrated superior clinical outcomes associated with manual physical therapy and exercise for hip osteoarthritis compared to exercise alone.
Decadal-Scale Crustal Deformation Transients in Japan Prior to the March 11, 2011 Tohoku Earthquake
NASA Astrophysics Data System (ADS)
Mavrommatis, A. P.; Segall, P.; Miyazaki, S.; Owen, S. E.; Moore, A. W.
2012-12-01
Excluding postseismic transients and slow-slip events, interseismic deformation is generally believed to accumulate linearly in time. We test this assumption using data from Japan's GPS Earth Observation Network System (GEONET), which provides high-precision time series spanning over 10 years. Here we report regional signals of decadal transients that in some cases appear to be unrelated to any known source of deformation. We analyze GPS position time series processed independently, using the BERNESE and GIPSY-PPP software, provided by the Geospatial Information Authority of Japan (GSI) and a collaborative effort of Jet Propulsion Laboratory (JPL) and Dr. Mark Simons (Caltech), respectively. We use time series from 891 GEONET stations, spanning an average of ~14 years prior to the Mw 9.0 March 11, 2011 Tohoku earthquake. We assume a time series model that includes a linear term representing constant velocity, as well as a quadratic term representing constant acceleration. Postseismic transients, where observed, are modeled by A log(1 + t/tc). We also model seasonal terms and antenna offsets, and solve for the best-fitting parameters using standard nonlinear least squares. Uncertainties in model parameters are determined by linear propagation of errors. Noise parameters are inferred from time series that lack obvious transients using maximum-likelihood estimation and assuming a combination of power-law and white noise. Resulting velocity uncertainties are on the order of 1.0 to 1.5 mm/yr. Excluding stations with high misfit to the time series model, our results reveal several spatially coherent patterns of statistically significant (at as much as 5σ) apparent crustal acceleration in various regions of Japan. The signal exhibits similar patterns in both the GSI and JPL solutions and is not coherent across the entire network, which indicates that the pattern is not a reference frame artifact. 
We interpret most of the accelerations to represent transient deformation due to known sources, including slow-slip events (e.g., the post-2000 Tokai event) or postseismic transients due to large earthquakes prior to 1996 (e.g., the M 7.7 1993 Hokkaido-Nansei-Oki and M 7.7 1994 Sanriku-Oki earthquakes). Viscoelastic modeling will be required to confirm the influence of past earthquakes on the acceleration field. In addition to these signals, we find spatially coherent accelerations in the Tohoku and Kyushu regions. Specifically, we observe generally southward acceleration extending for ~400 km near the west coast of Tohoku, east-southeastward acceleration covering ~200 km along the southeast coast of Tohoku, and west-northwestward acceleration spanning ~100 km across the south coast of Kyushu. Interestingly, the eastward acceleration field in Tohoku is spatially correlated with the extent of the March 11, 2011 Mw 9.0 rupture area. We note that the inferred acceleration is present prior to the sequence of M 7+ earthquakes beginning in 2003, and that short-term transients following these events have been accounted for in the analysis. A possible, although non-unique, cause of the acceleration is increased slip rate on the Japan Trench. However, such widespread changes would not be predicted by standard earthquake nucleation models.
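The time series model described above (constant velocity plus a constant-acceleration quadratic term and seasonal terms) is linear in its coefficients once the postseismic log terms and offsets are fixed, so it can be fit by ordinary least squares. A simplified sketch follows, with t in years and the postseismic and antenna-offset terms omitted; the synthetic data and its coefficients are illustrative, not GEONET values.

```python
import numpy as np

def fit_position_model(t, y):
    """Least-squares fit of a GPS position time series model with intercept,
    constant velocity, constant acceleration, and annual + semiannual terms.
    (Postseismic A*log(1 + t/tc) terms and antenna offsets omitted for brevity.)"""
    G = np.column_stack([
        np.ones_like(t),                          # intercept
        t,                                        # velocity (constant rate)
        0.5 * t**2,                               # constant acceleration
        np.sin(2 * np.pi * t), np.cos(2 * np.pi * t),  # annual cycle
        np.sin(4 * np.pi * t), np.cos(4 * np.pi * t),  # semiannual cycle
    ])
    m, *_ = np.linalg.lstsq(G, y, rcond=None)
    return m  # [offset, velocity, acceleration, 4 seasonal coefficients]

# synthetic ~14-year series: velocity 3 mm/yr, acceleration 0.1 mm/yr^2,
# annual amplitude 1.5 mm (all values hypothetical)
t = np.linspace(0.0, 14.0, 500)
y = 2.0 + 3.0 * t + 0.5 * 0.1 * t**2 + 1.5 * np.sin(2 * np.pi * t)
m = fit_position_model(t, y)
```

In practice the significance of the recovered acceleration term depends on the assumed noise model; with power-law plus white noise, as in the study, the plain least-squares uncertainties would understate the true errors, which is why the authors use maximum-likelihood noise estimation.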
NASA Astrophysics Data System (ADS)
Wang, Yaying; Zeng, Lingsen; Asimow, Paul D.; Gao, Li-E.; Ma, Chi; Antoshechkina, Paula M.; Guo, Chunli; Hou, Kejun; Tang, Suohan
2018-01-01
The Dala diabase intrusion, at the southeastern margin of the Yardoi gneiss dome, is located within the outcrop area of the 132 Ma Comei Large Igneous Province (LIP), the result of initial activity of the Kerguelen plume. We present new zircon U-Pb geochronology results to show that the Dala diabase was emplaced at 132 Ma and geochemical data (whole-rock element and Sr-Nd isotope ratios, zircon Hf isotopes and Fe-Ti oxide mineral chemistry) to confirm that the Dala diabase intrusion is part of the Comei LIP. The Dala diabase can be divided into a high-Mg/low-Ti series and a low-Mg/high-Ti series. The high-Mg/low-Ti series represents more primitive mafic magma compositions that we demonstrate are parental to the low-Mg/high-Ti series. Fractionation of olivine and clinopyroxene, followed by plagioclase within the low-Mg series, led to systematic changes in concentrations of mantle-compatible elements (Cr, Co, Ni, and V), REEs, HFSEs, and major elements such as Ti and P. Some Dala samples from the low-Mg/high-Ti series contain large ilmenite clusters and show extreme enrichment of Ti with elevated Ti/Y ratios, likely due to settling and accumulation of ilmenite during magma chamber evolution. However, most samples from throughout the Comei LIP follow the Ti-evolution trend of the typical liquid line of descent (LLD) of primary OIB compositions, showing strong evidence that Ti contents are controlled by differentiation processes. In many other localities, however, primitive magmas are absent and observed Ti contents of evolved magmas cannot be quantitatively related to source processes. Careful examination of the petrogenetic relationship between co-existing low-Ti and high-Ti mafic rocks is essential for using observed rock chemistry to infer source composition, location, and degree of melting.
NASA Astrophysics Data System (ADS)
Barraza Bernadas, V.; Grings, F.; Roitberg, E.; Perna, P.; Karszenbaum, H.
2017-12-01
The Dry Chaco region (DCF) has the highest absolute deforestation rates of all Argentinian forests. The most recent report indicates a current deforestation rate of 200,000 Ha year-1. In order to better monitor this process, DCF was chosen to implement an early warning program for illegal deforestation. Although the area is intensively studied using medium resolution imagery (Landsat), the products obtained are updated yearly and are therefore unsuited for an early warning program. In this paper, we evaluated the performance of an online Bayesian change-point detection algorithm for MODIS Enhanced Vegetation Index (EVI) and Land Surface Temperature (LST) datasets. The goal was to monitor the abrupt changes in vegetation dynamics associated with deforestation events. We tested this model by simulating 16-day EVI and 8-day LST time series with varying amounts of seasonality and noise, varying series lengths, and abrupt changes of different magnitudes. The model was then tested on real satellite time series available through Google Earth Engine, over a pilot area in DCF, where deforestation was common in the 2004-2016 period. A comparison with yearly benchmark products based on Landsat images is also presented (REDAF dataset). The results show the advantages of using an automatic model to detect changepoints in the time series over using visual inspection techniques alone. Simulating time series with varying amounts of seasonality and noise, and adding abrupt changes at different times and magnitudes, revealed that the model is robust against noise and is not influenced by changes in the amplitude of the seasonal component. Furthermore, the results compared favorably with the REDAF dataset (near 65% agreement). These results show the potential of combining LST and EVI to identify deforestation events.
This work is being developed within the frame of the national Forest Law for the protection and sustainable development of Native Forest in Argentina in agreement with international legislation (REDD+).
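A minimal changepoint sketch on a simulated 16-day EVI series with an abrupt deforestation-like drop. This is a simple offline least-squares split, not the online Bayesian algorithm evaluated in the paper, and all magnitudes are assumed:

```python
import numpy as np

def simulate_evi(n=230, change_at=150, drop=0.25, seed=1):
    # ~23 sixteen-day composites per year: seasonal cycle + noise + abrupt drop
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    evi = 0.45 + 0.15 * np.sin(2 * np.pi * t / 23) + rng.normal(0.0, 0.03, n)
    evi[change_at:] -= drop
    return evi

def detect_change(series, margin=10):
    # choose the split minimizing the total within-segment sum of squares
    n = len(series)
    sse = [((series[:k] - series[:k].mean()) ** 2).sum()
           + ((series[k:] - series[k:].mean()) ** 2).sum()
           for k in range(margin, n - margin)]
    return margin + int(np.argmin(sse))

k_hat = detect_change(simulate_evi())
```

Even this crude split recovers the break because the drop is large relative to the seasonal amplitude; the online Bayesian version additionally yields a posterior over changepoint locations as data arrive.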
Takahashi, Daiki; Teramine, Tsutomu; Sakaguchi, Shota; Setoguchi, Hiroaki
2018-01-25
Clines, the gradual variation in measurable traits along a geographical axis, play a major role in evolution and can contribute to our understanding of the relative roles of selective and neutral processes in trait variation. Using genetic and morphological analyses, the relative contributions of neutral and non-neutral processes were explored to infer the evolutionary history of species of the series Sakawanum (genus Asarum), which shows significant clinal variation in calyx lobe length. A total of 27 populations covering the natural geographical distribution of the series Sakawanum were sampled. Six nuclear microsatellite markers were used to investigate genetic structure and genetic diversity. The lengths of calyx lobes of multiple populations were measured to quantify their geographical and taxonomic differentiation. To detect the potential impact of selective pressure, morphological differentiation was compared with genetic differentiation (QCT-FST comparison). Average calyx lobe length of A. minamitanianum was 124.11 mm, while that of A. costatum was 13.80 mm. Though calyx lobe lengths changed gradually along the geographical axis within the series, they were significantly differentiated among the taxa. Genetic differentiation between taxa was low (FST = 0.099), but a significant geographical structure along the morphological cline was detected. Except for one taxon pair, pairwise QCT values were significantly higher than the neutral genetic measures of FST and G'ST. Divergent selection may have driven the calyx lobe length variation in series Sakawanum taxa, although the underlying mechanism is still not clear. The low genetic differentiation indicates recent divergence and/or gene flow between geographically close taxa. These neutral processes would also affect the clinal variation in calyx lobe lengths. Overall, this study highlights the roles of population history and divergent selection in shaping the current cline of a flower trait in the series Sakawanum.
Estimating trends in atmospheric water vapor and temperature time series over Germany
NASA Astrophysics Data System (ADS)
Alshawaf, Fadwa; Balidakis, Kyriakos; Dick, Galina; Heise, Stefan; Wickert, Jens
2017-08-01
Ground-based GNSS (Global Navigation Satellite System) has been used effectively as a meteorological observing system since the 1990s. Recently, scientists have used GNSS time series of precipitable water vapor (PWV) for climate research. In this work, we compare the temporal trends estimated from GNSS time series with those estimated from European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA-Interim) data and meteorological measurements. We aim to evaluate climate evolution in Germany by monitoring different atmospheric variables such as temperature and PWV. PWV time series were obtained by three methods: (1) estimated from ground-based GNSS observations using the method of precise point positioning, (2) inferred from ERA-Interim reanalysis data, and (3) determined based on daily in situ measurements of temperature and relative humidity. The other relevant atmospheric parameters are available from surface measurements of meteorological stations or derived from ERA-Interim. The trends are estimated using two methods: the first applies least squares to deseasonalized time series and the second uses the Theil-Sen estimator. The trends estimated at 113 GNSS sites, with 10 to 19 years of temporal coverage, vary between -1.5 and 2.3 mm decade-1 with standard deviations below 0.25 mm decade-1. These results were validated by estimating the trends from ERA-Interim data over the same time windows, which show similar values. The estimated trends depend on the length and the variations of the time series. Therefore, to give a mean value of the PWV trend over Germany, we estimated the trends using ERA-Interim data spanning from 1991 to 2016 (26 years) at 227 synoptic stations over Germany. The ERA-Interim data show positive PWV trends of 0.33 ± 0.06 mm decade-1 with standard errors below 0.03 mm decade-1.
The increment in PWV varies between 4.5 and 6.5 % per degree Celsius rise in temperature, which is comparable to the theoretical rate of the Clausius-Clapeyron equation.
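The two trend estimators mentioned above can be sketched on a synthetic monthly PWV series; the trend, seasonal amplitude, and noise level below are assumed values, not the German station data:

```python
import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(2)
years = np.arange(0, 19, 1 / 12)                # 19 years of monthly PWV values
pwv = (15.0 + 0.2 * years                       # assumed trend: 0.2 mm/yr
       + 3.0 * np.sin(2 * np.pi * years)        # annual cycle
       + rng.normal(0.0, 0.5, years.size))      # noise

# deseasonalize by removing the mean annual cycle (monthly climatology)
climatology = pwv.reshape(-1, 12).mean(axis=0)
deseason = pwv - np.tile(climatology, 19)

slope_ols = np.polyfit(years, deseason, 1)[0]   # least squares on deseasonalized series
slope_ts = theilslopes(deseason, years)[0]      # robust Theil-Sen estimator
```

The Theil-Sen slope (the median of pairwise slopes) is far less sensitive to outliers than ordinary least squares, which is why both are reported as a cross-check.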
Umari, P; Petrenko, O; Taioli, S; De Souza, M M
2012-05-14
Electronic band gaps for optically allowed transitions are calculated for a series of semiconducting single-walled zig-zag carbon nanotubes of increasing diameter within the many-body perturbation theory GW method. The dependence of the evaluated gaps with respect to tube diameters is then compared with those found from previous experimental data for optical gaps combined with theoretical estimations of exciton binding energies. We find that our GW gaps confirm the behavior inferred from experiment. The relationship between the electronic gap and the diameter extrapolated from the GW values is also in excellent agreement with a direct measurement recently performed through scanning tunneling spectroscopy.
Theoretical results which strengthen the hypothesis of electroweak bioenantioselection
NASA Astrophysics Data System (ADS)
Zanasi, R.; Lazzeretti, P.; Ligabue, A.; Soncini, A.
1999-03-01
It is shown via a large series of numerical tests on two fundamental organic molecules, the L-α-amino acid L-valine and the sugar precursor hydrated D-glyceraldehyde, that the ab initio calculation of the parity-violating energy shift, at the random-phase approximation level of accuracy, provides results that are about one order of magnitude larger than those obtained by means of less accurate methods employed previously. These findings would make more plausible the hypothesis of electroweak selection of natural enantiomers via the Kondepudi-Nelson scenario, or could imply that the Salam phase-transition temperature is higher than previously inferred; accordingly, the hypothesis of a terrestrial origin of life would become more realistic.
Quantifying the risk of extreme aviation accidents
NASA Astrophysics Data System (ADS)
Das, Kumer Pial; Dey, Asim Kumer
2016-12-01
Air travel is considered a safe means of transportation, but when aviation accidents do occur they often result in fatalities. Fortunately, the most extreme accidents occur rarely. However, 2014 was the deadliest year in the past decade, with 111 plane crashes; the worst four caused 298, 239, 162 and 116 deaths. In this study, we assess the risk of catastrophic aviation accidents by studying historical aviation accidents. Applying a generalized Pareto model, we predict the maximum fatalities from a future aviation accident. The fitted model is compared with some of its competitors. The uncertainty in the inferences is quantified using simulated aviation accident series, generated by bootstrap resampling and Monte Carlo simulations.
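A minimal sketch of the generalized Pareto fit with bootstrap uncertainty, using synthetic exceedances rather than the historical accident record (the threshold and parameter values are assumed):

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(3)
# hypothetical exceedances of accident fatalities above a high threshold
exceedances = genpareto.rvs(c=0.3, scale=40.0, size=400, random_state=rng)

# maximum-likelihood GPD fit with the location fixed at zero
shape, _, scale = genpareto.fit(exceedances, floc=0)

# bootstrap resampling to quantify uncertainty in the shape parameter
boot = [genpareto.fit(rng.choice(exceedances, exceedances.size), floc=0)[0]
        for _ in range(200)]
lo, hi = np.percentile(boot, [2.5, 97.5])
```

A positive fitted shape parameter indicates a heavy upper tail, i.e., a non-negligible probability of accidents far deadlier than any yet observed.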
Spatial and temporal variations in lagoon and coastal processes of the southern Brazilian coast
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Herz, R.
1980-01-01
From information gathered over a long period by the SKYLAB and LANDSAT orbital platforms, it was possible to establish a method for the systematic study of the dynamical regime of lagoon and marine surface waters on the coastal plain of Rio Grande do Sul. The series of multispectral images, analyzed by visual and automatic techniques, revealed spatial and temporal variations reflected in the optical properties of waters carrying different loads of suspended material. The identified patterns offer a synoptic picture of phenomena of great amplitude, from which circulation trends can be inferred by correlating the atmospheric and hydrologic variables measured simultaneously with the overflights of the orbital vehicles.
The basis function approach for modeling autocorrelation in ecological data
Hefley, Trevor J.; Broms, Kristin M.; Brost, Brian M.; Buderman, Frances E.; Kay, Shannon L.; Scharf, Henry; Tipton, John; Williams, Perry J.; Hooten, Mevin B.
2017-01-01
Analyzing ecological data often requires modeling the autocorrelation created by spatial and temporal processes. Many seemingly disparate statistical methods used to account for autocorrelation can be expressed as regression models that include basis functions. Basis functions also enable ecologists to modify a wide range of existing ecological models in order to account for autocorrelation, which can improve inference and predictive accuracy. Furthermore, understanding the properties of basis functions is essential for evaluating the fit of spatial or time-series models, detecting a hidden form of collinearity, and analyzing large data sets. We present important concepts and properties related to basis functions and illustrate several tools and techniques ecologists can use when modeling autocorrelation in ecological data.
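As a minimal illustration of the basis-function idea, a Fourier basis can absorb a periodic temporal signal inside an ordinary regression; the coefficients and noise level below are assumed:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 4.0, 200)   # 4 years of observations

# data with an annual cycle: temporal structure for the basis to absorb
y = (2.0 + 1.5 * np.sin(2 * np.pi * t) + 0.8 * np.cos(2 * np.pi * t)
     + rng.normal(0.0, 0.3, t.size))

# regression design matrix: intercept plus sine/cosine basis functions
X = np.column_stack([np.ones_like(t),
                     np.sin(2 * np.pi * t),
                     np.cos(2 * np.pi * t)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
```

Spline, wavelet, or spatial basis expansions work the same way: extra columns in the design matrix soak up the autocorrelated structure so the remaining coefficients can be interpreted and used for prediction.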
Salicylate-induced changes in auditory thresholds of adolescent and adult rats.
Brennan, J F; Brown, C A; Jastreboff, P J
1996-01-01
Shifts in auditory intensity thresholds after salicylate administration were examined in postweanling and adult pigmented rats at frequencies ranging from 1 to 35 kHz. A total of 132 subjects from both age levels were tested under two-way or one-way active avoidance paradigms. Estimated thresholds were inferred from behavioral responses to presentations of descending and ascending series of intensities at each test frequency. Reliable threshold estimates were obtained under both avoidance conditioning methods. Compared to controls, subjects at both age levels showed threshold shifts at selected higher frequencies after salicylate injection, and the extent of the shifts was related to salicylate dose level.
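The descending/ascending intensity series correspond to the classic method of limits; a minimal sketch with hypothetical levels and responses:

```python
import numpy as np

def limits_threshold(levels_desc, heard_desc, levels_asc, heard_asc):
    # transition level in each run, averaged (classic method of limits)
    t_desc = levels_desc[np.argmin(heard_desc)]  # first 'not heard' going down
    t_asc = levels_asc[np.argmax(heard_asc)]     # first 'heard' going up
    return (t_desc + t_asc) / 2

# hypothetical dB levels and binary heard/not-heard responses for one frequency
levels_desc = np.array([80, 70, 60, 50, 40, 30])
heard_desc = np.array([1, 1, 1, 1, 0, 0])
levels_asc = levels_desc[::-1]
heard_asc = np.array([0, 0, 1, 1, 1, 1])

thr = limits_threshold(levels_desc, heard_desc, levels_asc, heard_asc)
```

Averaging the descending and ascending transition points cancels the habituation and anticipation biases that each run direction introduces on its own.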
Soares, André E R; Schrago, Carlos G
2015-01-07
Although taxon sampling is commonly considered an important issue in phylogenetic inference, it is rarely considered in the Bayesian estimation of divergence times. In fact, the studies conducted to date have presented ambiguous results, and the relevance of taxon sampling for molecular dating remains unclear. In this study, we developed a series of simulations that, after six hundred Bayesian molecular dating analyses, allowed us to evaluate the impact of taxon sampling on chronological estimates under three scenarios of among-lineage rate heterogeneity. The first scenario allowed us to examine the influence of the number of terminals on age estimates based on a strict molecular clock. The second scenario imposed an extreme example of lineage-specific rate variation, and the third scenario permitted extensive rate variation distributed along the branches. We also analyzed empirical data on selected mitochondrial genomes of mammals. Our results showed that in the strict molecular-clock scenario (Case I), taxon sampling had a minor impact on the accuracy of the time estimates, although the precision of the estimates was greater with an increased number of terminals. The effect was similar in the scenario based on rate variation distributed among the branches (Case III). Only under intensive rate variation among lineages (Case II) did taxon sampling result in biased estimates. The results of an empirical analysis corroborated the simulation findings. We demonstrate that taxon sampling affected divergence time inference but that its impact was significant only if the rates deviated from those derived for the strict molecular clock. Increased taxon sampling improved the precision and accuracy of the divergence time estimates, but the impact on precision is more relevant. On average, biased estimates were obtained only if lineage rate variation was pronounced.
Pattern and forcing of Northern Hemisphere glacier variations during the last millennium
NASA Astrophysics Data System (ADS)
Porter, Stephen C.
1986-07-01
Time series depicting mountain glacier fluctuations in the Alps display generally similar patterns over the last two centuries, as do chronologies of glacier variations for the same interval from elsewhere in the Northern Hemisphere. Episodes of glacier advance consistently are associated with intervals of high average volcanic aerosol production, as inferred from acidity variations in a Greenland ice core. Advances occur whenever acidity levels rise sharply from background values to reach concentrations ≥1.2 μequiv H+/kg above background. A phase lag of about 10-15 yr, equivalent to reported response lags of Alpine glacier termini, separates the beginning of acidity increases from the beginning of subsequent ice advances. A similar relationship, but based on limited and less-reliable historical data and on lichenometric ages, is found for the preceding two centuries. Calibrated radiocarbon dates related to advances of non-calving and non-surging glaciers during the earlier part of the Little Ice Age display a comparably consistent pattern. An interval of reduced acidity values between about 1090 and 1230 A.D. correlates with a time of inferred glacier contraction during the Medieval Optimum. The observed close relation between Northern Hemisphere glacier fluctuations and variations in Greenland ice-core acidity suggests that sulfur-rich aerosols generated by volcanic eruptions are a primary forcing mechanism of glacier fluctuations, and therefore of climate, on a decadal scale. The amount of surface cooling attributable to individual large eruptions or to episodes of eruptions is similar to the probable average temperature reduction during culminations of Little Ice Age glacier advances (ca. 0.5°-1.2°C), as inferred from depression of equilibrium-line altitudes.
What time is it? Deep learning approaches for circadian rhythms.
Agostinelli, Forest; Ceglia, Nicholas; Shahbaba, Babak; Sassone-Corsi, Paolo; Baldi, Pierre
2016-06-15
Circadian rhythms date back to the origins of life, are found in virtually every species and every cell, and play fundamental roles in functions ranging from metabolism to cognition. Modern high-throughput technologies allow the measurement of concentrations of transcripts, metabolites and other species along the circadian cycle, creating novel computational challenges and opportunities, including the problems of inferring whether a given species oscillates in circadian fashion or not, and inferring the time at which a set of measurements was taken. We first curate several large synthetic and biological time series datasets containing labels for both periodic and aperiodic signals. We then use deep learning methods to develop and train BIO_CYCLE, a system to robustly estimate which signals are periodic in high-throughput circadian experiments, producing estimates of amplitudes, periods, and phases, as well as several statistical significance measures. Using the curated data, BIO_CYCLE is compared to other approaches and shown to achieve state-of-the-art performance across multiple metrics. We then use deep learning methods to develop and train BIO_CLOCK to robustly estimate the time at which a particular single-time-point transcriptomic experiment was carried out. In most cases, BIO_CLOCK can reliably predict time, within approximately 1 h, using the expression levels of only a small number of core clock genes. BIO_CLOCK is shown to work reasonably well across tissue types, and often with only small degradation across conditions. BIO_CLOCK is used to annotate most mouse experiments found in the GEO database with an inferred time stamp. All data and software are publicly available on the CircadiOmics web portal: circadiomics.igb.uci.edu/. Contact: fagostin@uci.edu or pfbaldi@uci.edu. Supplementary data are available at Bioinformatics online.
DeMars, Craig A; Auger-Méthé, Marie; Schlägel, Ulrike E; Boutin, Stan
2013-01-01
Analyses of animal movement data have primarily focused on understanding patterns of space use and the behavioural processes driving them. Here, we analyzed animal movement data to infer components of individual fitness, specifically parturition and neonate survival. We predicted that parturition and neonate loss events could be identified by sudden and marked changes in female movement patterns. Using GPS radio-telemetry data from female woodland caribou (Rangifer tarandus caribou), we developed and tested two novel movement-based methods for inferring parturition and neonate survival. The first method estimated movement thresholds indicative of parturition and neonate loss from population-level data then applied these thresholds in a moving-window analysis on individual time-series data. The second method used an individual-based approach that discriminated among three a priori models representing the movement patterns of non-parturient females, females with surviving offspring, and females losing offspring. The models assumed that step lengths (the distance between successive GPS locations) were exponentially distributed and that abrupt changes in the scale parameter of the exponential distribution were indicative of parturition and offspring loss. Both methods predicted parturition with near certainty (>97% accuracy) and produced appropriate predictions of parturition dates. Prediction of neonate survival was affected by data quality for both methods; however, when using high quality data (i.e., with few missing GPS locations), the individual-based method performed better, predicting neonate survival status with an accuracy rate of 87%. Understanding ungulate population dynamics often requires estimates of parturition and neonate survival rates. With GPS radio-collars increasingly being used in research and management of ungulates, our movement-based methods represent a viable approach for estimating rates of both parameters. PMID:24324866
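The individual-based method assumes exponentially distributed step lengths whose scale parameter changes abruptly at parturition. A minimal maximum-likelihood changepoint sketch on synthetic steps (the scales and segment lengths are assumed, not caribou data):

```python
import numpy as np

def exp_changepoint(steps, margin=5):
    # ML changepoint for exponential step lengths: the scale shifts at index k.
    # With MLE scale s = segment mean, the profile log-likelihood of a segment
    # of m steps is -m*log(s) - m.
    n = len(steps)
    cum = np.cumsum(steps)
    best_k, best_ll = None, -np.inf
    for k in range(margin, n - margin):
        s1 = cum[k - 1] / k
        s2 = (cum[-1] - cum[k - 1]) / (n - k)
        ll = -k * np.log(s1) - k - (n - k) * np.log(s2) - (n - k)
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k

rng = np.random.default_rng(5)
# long steps before calving, much shorter steps after (assumed scales, in m)
steps = np.concatenate([rng.exponential(500.0, 120), rng.exponential(60.0, 80)])
k_hat = exp_changepoint(steps)
```

Because the two scales differ by nearly an order of magnitude, the likelihood peaks sharply at the true break, which mirrors why the paper's method predicts parturition with near certainty.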
Winkler, Manuela; Escobar García, Pedro; Gattringer, Andreas; Sonnleitner, Michaela; Hülber, Karl; Schönswetter, Peter; Schneeweiss, Gerald M
2017-09-01
Despite its evolutionary and ecological relevance, the mode of polyploid origin has been notoriously difficult to reconstruct from molecular data. Here, we present a method to identify the putative parents of polyploids and thus to infer the mode of their origin (auto- vs. allopolyploidy) from Amplified Fragment Length Polymorphism (AFLP) data. To this end, we use Cohen's d of distances between in silico polyploids, generated within a priori defined scenarios of origin from a priori delimited putative parental entities (e.g. taxa, genetic lineages), and natural polyploids. Simulations show that the discriminatory power of the proposed method increases mainly with increasing divergence between the lower-ploid putative ancestors and less so with increasing delay of polyploidization relative to the time of divergence. We apply the new method to the Senecio carniolicus aggregate, distributed in the European Alps and comprising two diploid, one tetraploid and one hexaploid species. In the eastern part of its distribution, the S. carniolicus aggregate was inferred to comprise an autopolyploid series, whereas for western populations of the tetraploid species, an allopolyploid origin involving the two diploid species was the most likely scenario. Although this suggests that the tetraploid species has two independent origins, other evidence (ribotype distribution, morphology) is consistent with the hypothesis of an autopolyploid origin with subsequent introgression by the second diploid species. Altogether, identifying the best among alternative scenarios using Cohen's d can be straightforward, but particular scenarios, such as allopolyploid origin vs. autopolyploid origin with subsequent introgression, remain difficult to distinguish.
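Cohen's d between distance distributions can be sketched as follows; the two distance samples are hypothetical stand-ins for distances between natural and in silico polyploids under two competing scenarios:

```python
import numpy as np

def cohens_d(a, b):
    # Cohen's d: standardized mean difference with pooled standard deviation
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                     / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled

rng = np.random.default_rng(9)
# hypothetical distances of natural polyploids to simulated polyploids
d_auto = rng.normal(0.12, 0.02, 200)   # autopolyploid scenario fits well
d_allo = rng.normal(0.20, 0.02, 200)   # allopolyploid scenario fits poorly
d_obs = cohens_d(d_allo, d_auto)
```

A large positive d here would favor the autopolyploid scenario, since its simulated polyploids sit closer to the natural ones.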
New insights into soil temperature time series modeling: linear or nonlinear?
NASA Astrophysics Data System (ADS)
Bonakdari, Hossein; Moeeni, Hamid; Ebtehaj, Isa; Zeynoddin, Mohammad; Mahoammadian, Abdolmajid; Gharabaghi, Bahram
2018-03-01
Soil temperature (ST) is an important dynamic parameter whose prediction is a major research topic in various fields, including agriculture, because ST plays a critical role in hydrological processes at the soil surface. In this study, a new linear methodology is proposed based on stochastic methods for modeling daily soil temperature (DST). With this approach, the ST series components are determined to carry out modeling and spectral analysis. The results of this process are compared with two linear methods based on seasonal standardization and seasonal differencing in terms of four DST series. The series used in this study were measured at two stations, Champaign and Springfield, at depths of 10 and 20 cm. The results indicate that in all ST series reviewed, the periodic term is the most robust among all components. According to a comparison of the three methods applied to analyze the various series components, spectral analysis combined with stochastic methods outperformed the seasonal standardization and seasonal differencing methods. In addition to comparing the proposed methodology with linear methods, the ST modeling results were compared with two nonlinear methods in two forms: considering hydrological variables (HV) as input variables, and DST modeling as a time series. In a previous study at the same sites, Kim and Singh (Theor Appl Climatol 118:465-479, 2014) applied the popular Multilayer Perceptron (MLP) neural network and Adaptive Neuro-Fuzzy Inference System (ANFIS) nonlinear methods, considering HV as input variables. The comparison indicates that the relative error in estimating DST with the proposed methodology was about 6%, while with MLP and ANFIS it was over 15%. Moreover, the MLP and ANFIS models were also employed for DST time series modeling.
Due to these models' relatively inferior performance compared with the proposed methodology, two hybrid models were implemented: the weights of the MLP and the membership functions of the ANFIS were optimized with the particle swarm optimization (PSO) algorithm in conjunction with the wavelet transform (Wavelet-MLP and Wavelet-ANFIS). A comparison of the proposed methodology with the individual and hybrid nonlinear models in predicting DST time series shows that it attains the lowest Akaike Information Criterion (AIC) value, an index that accounts for model simplicity and accuracy simultaneously, at different depths and stations. The methodology presented in this study can thus serve as an excellent alternative to the complex nonlinear methods normally employed to examine DST.
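The two linear preprocessing schemes compared above, seasonal differencing and seasonal standardization, can be sketched on a synthetic daily ST series (the cycle amplitude and noise are assumed):

```python
import numpy as np

rng = np.random.default_rng(6)
days = np.arange(3 * 365)                       # three years of daily values
st = 12.0 + 10.0 * np.sin(2 * np.pi * days / 365) + rng.normal(0.0, 1.5, days.size)

# seasonal differencing: remove the annual cycle via lag-365 differences
diffed = st[365:] - st[:-365]

# seasonal standardization: subtract day-of-year mean, divide by day-of-year std
doy = days % 365
clim_mean = np.array([st[doy == day].mean() for day in range(365)])
clim_std = np.array([st[doy == day].std(ddof=0) for day in range(365)])
standardized = (st - clim_mean[doy]) / clim_std[doy]
```

Both transforms leave a stationary residual suitable for fitting a stochastic (e.g., ARMA) model; differencing shortens the series by one seasonal period, while standardization requires enough years to estimate a reliable day-of-year climatology.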
On the influence of solar activity on the mid-latitude sporadic E layer
NASA Astrophysics Data System (ADS)
Pezzopane, Michael; Pignalberi, Alessio; Pietrella, Marco
2015-09-01
To investigate the influence of solar cycle variability on the sporadic E layer (Es), hourly measurements of the critical frequency of the Es ordinary mode of propagation, foEs, and of the blanketing frequency of the Es layer, fbEs, recorded from January 1976 to December 2009 at the Rome (Italy) ionospheric station (41.8° N, 12.5° E), were examined. The results are: (1) a high positive correlation between the F10.7 solar index and foEs as well as between F10.7 and fbEs, both for the whole data set and for each solar cycle separately, the correlation between F10.7 and fbEs being much higher than the one between F10.7 and foEs; (2) a decreasing long-term trend of the F10.7, foEs and fbEs time series, with foEs decreasing more rapidly than F10.7 and fbEs; (3) clear and statistically significant peaks at 11 years in the foEs and fbEs time series, inferred from Lomb-Scargle periodograms.
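Periodic signals like the 11-year solar-cycle peaks reported above are commonly found with a Lomb-Scargle periodogram; a sketch recovering an 11-year cycle from synthetic, irregularly sampled data (amplitudes and sampling are assumed, not the Rome ionosonde record):

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(7)
t = np.sort(rng.uniform(0.0, 34.0, 400))    # 34 yr of irregular sampling times
y = 3.0 + 1.0 * np.sin(2 * np.pi * t / 11.0) + rng.normal(0.0, 0.5, t.size)

periods = np.linspace(2.0, 20.0, 500)       # candidate periods in years
power = lombscargle(t, y - y.mean(), 2 * np.pi / periods)  # angular frequencies
best_period = periods[np.argmax(power)]
```

Unlike the FFT, the Lomb-Scargle estimator does not require evenly spaced samples, which suits geophysical records with data gaps.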
Jacob, Robin; Somers, Marie-Andree; Zhu, Pei; Bloom, Howard
2016-06-01
In this article, we examine whether a well-executed comparative interrupted time series (CITS) design can produce valid inferences about the effectiveness of a school-level intervention. This article also explores the trade-off between bias reduction and precision loss across different methods of selecting comparison groups for the CITS design and assesses whether choosing matched comparison schools based only on preintervention test scores is sufficient to produce internally valid impact estimates. We conduct a validation study of the CITS design based on the federal Reading First program as implemented in one state using results from a regression discontinuity design as a causal benchmark. Our results contribute to the growing base of evidence regarding the validity of nonexperimental designs. We demonstrate that the CITS design can, in our example, produce internally valid estimates of program impacts when multiple years of preintervention outcome data (test scores in the present case) are available and when a set of reasonable criteria are used to select comparison organizations (schools in the present case). © The Author(s) 2016.
Nonlinear degradation of a visible-light communication link: A Volterra-series approach
NASA Astrophysics Data System (ADS)
Kamalakis, Thomas; Dede, Georgia
2018-06-01
Visible light communications can be used to provide illumination and data communication at the same time. In this paper, a reverse-engineering approach is presented for assessing the impact of nonlinear signal distortion in visible light communication links. The approach is based on the Volterra series expansion and has the advantage of accurately accounting for memory effects, in contrast to the static nonlinear models that are popular in the literature. Volterra kernels describe the end-to-end system response and can be inferred from measurements. Consequently, this approach does not rely on any particular physical models and assumptions regarding the individual link components. We provide the necessary framework for estimating the nonlinear distortion on the symbol estimates of a discrete multitone modulated link. Various design aspects such as waveform clipping and predistortion are also incorporated in the analysis. Using this framework, the nonlinear signal-to-interference ratio is calculated for the system at hand. It is shown that at high signal amplitudes, the nonlinear signal-to-interference ratio can be less than 25 dB.
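The core object is a truncated Volterra expansion, here second order with memory length M: y[n] = Σ_k h1[k] x[n-k] + Σ_{k1,k2} h2[k1,k2] x[n-k1] x[n-k2]. The sketch below evaluates the linear and quadratic components separately so a signal-to-interference ratio can be formed; the kernels and drive signal are invented toy values, not measured VLC responses.

```python
import numpy as np

def volterra2(x, h1, h2):
    """Second-order Volterra filter with memory.
    h1: length-M linear kernel; h2: MxM quadratic kernel.
    Returns the linear and quadratic output components separately."""
    M, N = len(h1), len(x)
    xp = np.concatenate([np.zeros(M - 1), x])            # zero pre-padding
    # Delay-line matrix: row n holds [x[n], x[n-1], ..., x[n-M+1]]
    X = np.stack([xp[n:n + M][::-1] for n in range(N)])
    y_lin = X @ h1
    y_quad = np.einsum('ni,ij,nj->n', X, h2, X)          # x_n^T h2 x_n
    return y_lin, y_quad

rng = np.random.default_rng(2)
x = rng.normal(0, 0.1, 4000)                             # toy drive signal
h1 = np.array([1.0, 0.3, 0.1])                           # assumed linear response
h2 = 0.05 * np.outer([1.0, 0.2, 0.0], [1.0, 0.2, 0.0])   # weak quadratic kernel

y_lin, y_quad = volterra2(x, h1, h2)
nsir_db = 10 * np.log10(np.mean(y_lin**2) / np.mean(y_quad**2))
print(round(float(nsir_db), 1))  # nonlinear signal-to-interference ratio, dB
```

Raising the drive amplitude grows the quadratic term faster than the linear one, which is why the ratio degrades at high signal amplitudes as the abstract reports.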
NASA Technical Reports Server (NTRS)
Hailperin, M.
1993-01-01
This thesis provides design and analysis of techniques for global load balancing on ensemble architectures running soft-real-time object-oriented applications with statistically periodic loads. It focuses on estimating the instantaneous average load over all the processing elements. The major contribution is the use of explicit stochastic process models for both the loading and the averaging itself. These models are exploited via statistical time-series analysis and Bayesian inference to provide improved average load estimates, and thus to facilitate global load balancing. This thesis explains the distributed algorithms used and provides some optimality results. It also describes the algorithms' implementation and gives performance results from simulation. These results show that the author's techniques allow more accurate estimation of the global system loading, resulting in fewer object migrations than local methods. The author's method is shown to provide superior performance, relative not only to static load-balancing schemes but also to many adaptive load-balancing methods. Results from a preliminary analysis of another system and from simulation with a synthetic load provide some evidence of more general applicability.
Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP ( http://www.jasp-stats.org ), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
False Dichotomies and Health Policy Research Designs: Randomized Trials Are Not Always the Answer.
Soumerai, Stephen B; Ceccarelli, Rachel; Koppel, Ross
2017-02-01
Some medical scientists argue that only data from randomized controlled trials (RCTs) are trustworthy. They claim data from natural experiments and administrative data sets are always spurious and cannot be used to evaluate health policies and other population-wide phenomena in the real world. While many acknowledge biases caused by poor study designs, in this article we argue that several valid designs using administrative data can produce strong findings, particularly the interrupted time series (ITS) design. Many policy studies neither permit nor require an RCT for cause-and-effect inference. Framing our arguments using Campbell and Stanley's classic research design monograph, we show that several "quasi-experimental" designs, especially interrupted time series (ITS), can estimate valid effects (or non-effects) of health interventions and policies as diverse as public insurance coverage, speed limits, hospital safety programs, drug abuse regulation and withdrawal of drugs from the market. We further note the recent rapid uptake of ITS and argue for expanded training in quasi-experimental designs in medical and graduate schools and in post-doctoral curricula.
Kumar, Pankaj; Tsujimura, Maki; Nakano, Takanori; Minoru, Tokumasu
2013-04-01
Considering the current poor understanding of the seawater-freshwater (SW-FW) interaction pattern at the dynamic hydro-geological boundary of coastal aquifers, this work studies the tidal effect on groundwater quality using chemical tracers combined with environmental isotopes. In situ measurements of electrical conductivity and groundwater level, along with laboratory measurements of hydro-chemical species, were compared with tidal level data measured by the Hydrographic and Oceanographic Department, Saijo City, Japan, for time series analysis. Results show that diurnal tides have a significant effect on groundwater level as well as on its chemical characteristics; however, the magnitude of the effect differs among aquifers. Various scatter diagrams were plotted in order to infer the mechanisms responsible for water quality change with tidal phase, and results show that cation exchange, selective movement and local SW-FW mixing were likely the main processes responsible for water quality changes. It was also found that the geological structure of the aquifers is the most important factor affecting the intensity of the tidal effect on water quality.
Smith, Justin D; Borckardt, Jeffrey J; Nash, Michael R
2012-09-01
The case-based time-series design is a viable methodology for treatment outcome research. However, the literature has not fully addressed the problem of missing observations in such autocorrelated data streams. Specifically, to what extent do missing observations compromise inference when observations are not independent? Do the available missing-data replacement procedures preserve inferential integrity? Does the extent of autocorrelation matter? We use Monte Carlo simulation modeling of a single-subject intervention study to address these questions. We find power sensitivity to be within acceptable limits across four proportions of missing observations (10%, 20%, 30%, and 40%) when missing data are replaced using the Expectation-Maximization (EM) algorithm (Dempster, Laird, & Rubin, 1977). This applies to data streams with lag-1 autocorrelation estimates under 0.80. As autocorrelation estimates approach 0.80, the replacement procedure yields an unacceptable power profile. The implications of these findings and directions for future research are discussed. Copyright © 2011. Published by Elsevier Ltd.
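A minimal EM-flavoured sketch of the idea for an AR(1) stream: missing points are replaced by their conditional means given their neighbours (E-step), and the lag-1 autocorrelation is re-estimated from the completed series (M-step). This is a toy reduction of the full EM procedure, with isolated-gap handling only; the series length, true autocorrelation of 0.6 and 20% missingness are invented for illustration.

```python
import numpy as np

def impute_ar1(x, miss, n_iter=50):
    """Iteratively impute missing values in an AR(1) series.
    E-step: replace each interior missing x[t] by its conditional
    mean phi * (x[t-1] + x[t+1]) / (1 + phi**2).
    M-step: re-estimate phi as the lag-1 autocorrelation."""
    y = x.copy()
    y[miss] = x[~miss].mean()                 # crude initial fill
    for _ in range(n_iter):
        z = y - y.mean()
        phi = (z[:-1] @ z[1:]) / (z @ z)      # lag-1 autocorrelation
        for t in np.flatnonzero(miss):
            if 0 < t < len(y) - 1:
                y[t] = phi * (y[t - 1] + y[t + 1]) / (1 + phi**2)
    return y, phi

rng = np.random.default_rng(3)
n, true_phi = 500, 0.6
e = rng.normal(0, 1, n)
x = np.empty(n)
x[0] = e[0]
for t in range(1, n):                         # simulate an AR(1) stream
    x[t] = true_phi * x[t - 1] + e[t]
miss = rng.random(n) < 0.2                    # 20% missing at random
y, phi_hat = impute_ar1(x, miss)
print(round(float(phi_hat), 2))               # estimate of the AR coefficient
```

As in the simulation study, one would repeat this over many replicates and autocorrelation levels to profile power, rather than judging a single realization.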
Individualized Theory of Mind (iToM): When Memory Modulates Empathy
Ciaramelli, Elisa; Bernardi, Francesco; Moscovitch, Morris
2013-01-01
Functional neuroimaging studies have noted that brain regions supporting theory of mind (ToM) overlap remarkably with those underlying episodic memory, suggesting a link between the two processes. The present study shows that memory for others' past experiences significantly modulates our appraisal of, and reaction to, what is currently happening to them. Participants read the life stories of two characters; one had experienced a long series of love-related failures, the other a long series of work-related failures. In a later faux pas recognition task, participants reported more empathy for the character unlucky in love in love-related faux pas scenarios, and for the character unlucky at work in work-related faux pas scenarios. The memory-based modulation of empathy correlated with the number of details remembered from the characters' life stories. These results suggest that individuals use memory for other people's past experiences to simulate how they feel in similar situations they are currently facing. The integration of ToM and memory processes allows adjusting mental state inferences to fit unique social targets, constructing an individualized ToM. PMID:23378839
Scale-free avalanches in the multifractal random walk
NASA Astrophysics Data System (ADS)
Bartolozzi, M.
2007-06-01
Avalanches, or avalanche-like events, are often observed in the dynamical behaviour of many complex systems, spanning from solar flares to the Earth's crust dynamics and from traffic flows to financial markets. Self-organized criticality (SOC) is one of the most popular theories able to explain this intermittent charge/discharge behaviour. Despite a large amount of theoretical work, empirical tests for SOC are still in their infancy. In the present paper we address the common problem of revealing SOC from a simple time series without having much information about the underlying system. As a working example we use a modified version of the multifractal random walk, originally proposed as a model for stock market dynamics. The study reveals, despite the lack of the typical ingredients of SOC, an avalanche-like dynamics similar to that of many physical systems. While, on the one hand, the results confirm the relevance of cascade models in representing turbulent-like phenomena, on the other, they also raise questions about the current reliability of SOC inference from time series analysis.
Results on SSH neural network forecasting in the Mediterranean Sea
NASA Astrophysics Data System (ADS)
Rixen, Michel; Beckers, Jean-Marie; Alvarez, Alberto; Tintore, Joaquim
2002-01-01
Nowadays, satellites are the only monitoring systems that cover almost continuously all possible ocean areas, and they are now an essential part of operational oceanography. A novel approach based on artificial intelligence (AI) concepts exploits past time series of satellite images to infer near-future ocean conditions at the surface by means of neural networks and genetic algorithms. The size of the AI problem is drastically reduced by splitting the spatio-temporal variability contained in the remote sensing data using empirical orthogonal function (EOF) decomposition. The problem of forecasting the dynamics of a 2D surface field can thus be reduced by selecting the most relevant empirical modes, and non-linear time series predictors are then applied to the amplitudes only. In the present case study, we use altimetric maps of the Mediterranean Sea, combining TOPEX-POSEIDON and ERS-1/2 data for the period 1992 to 1997. The learning procedure is applied to each mode individually. The final forecast is then reconstructed from the EOFs and the forecasted amplitudes, and compared to the real observed field for validation of the method.
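The dimensionality-reduction step can be sketched with a plain SVD: the EOFs are the right singular vectors of the centred space-time matrix, and forecasting is done on the modal amplitudes only. In the sketch below a simple AR(1) predictor stands in for the paper's neural/genetic predictors, and the two-mode synthetic "SSH" field is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
nt, nx = 120, 50                       # 120 time steps, 50 grid points (toy sizes)
t = np.arange(nt)
# Synthetic field: two coherent spatial modes plus noise
pat1 = np.sin(np.linspace(0, np.pi, nx))
pat2 = np.cos(np.linspace(0, 3 * np.pi, nx))
amp1 = np.sin(2 * np.pi * t / 12)
amp2 = 0.5 * np.cos(2 * np.pi * t / 6)
field = np.outer(amp1, pat1) + np.outer(amp2, pat2) + 0.05 * rng.normal(size=(nt, nx))

anom = field - field.mean(axis=0)      # remove the temporal mean
U, S, Vt = np.linalg.svd(anom, full_matrices=False)
k = 2                                  # retain the two leading EOFs
pcs = U[:, :k] * S[:k]                 # temporal amplitudes (principal components)
eofs = Vt[:k]                          # spatial patterns (EOFs)

# One-step forecast of each amplitude with an AR(1) predictor
# (a stand-in for the neural network / genetic algorithm predictors)
pred = np.empty(k)
for j in range(k):
    a = pcs[:, j]
    phi = (a[:-1] @ a[1:]) / (a[:-1] @ a[:-1])
    pred[j] = phi * a[-1]
forecast = field.mean(axis=0) + pred @ eofs   # reconstructed forecast field
print(forecast.shape)                         # (50,)
```

Truncating to the leading modes is what makes the per-mode learning problem tractable; the noise modes carried by the trailing singular values are simply discarded.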
Cavity Heating Experiments Supporting Shuttle Columbia Accident Investigation
NASA Technical Reports Server (NTRS)
Everhart, Joel L.; Berger, Karen T.; Bey, Kim S.; Merski, N. Ronald; Wood, William A.
2011-01-01
The two-color thermographic phosphor method has been used to map the local heating augmentation of scaled idealized cavities at conditions simulating the windward surface of the Shuttle Orbiter Columbia during flight STS-107. Two experiments initiated in support of the Columbia Accident Investigation were conducted in the Langley 20-Inch Mach 6 Tunnel. Generally, the first test series evaluated open (length-to-depth less than 10) rectangular cavity geometries proposed as possible damage scenarios resulting from foam and ice impact during launch at several discrete locations on the vehicle windward surface, though some closed (length-to-depth greater than 13) geometries were briefly examined. The second test series was designed to parametrically evaluate heating augmentation in closed rectangular cavities. The tests were conducted under laminar cavity entry conditions over a range of local boundary layer edge-flow parameters typical of re-entry. Cavity design parameters were developed using laminar computational predictions, while the experimental boundary layer state conditions were inferred from the heating measurements. An analysis of the aeroheating caused by cavities allowed exclusion of non-breeching damage from the possible loss scenarios being considered during the investigation.
Individualistic and Time-Varying Tree-Ring Growth to Climate Sensitivity
Carrer, Marco
2011-01-01
The development of dendrochronological time series in order to analyze climate-growth relationships usually involves first a rigorous selection of trees and then the computation of the mean tree-growth measurement series. This study suggests a change in the perspective, passing from an analysis of climate-growth relationships that typically focuses on the mean response of a species to investigating the whole range of individual responses among sample trees. Results highlight that this new approach, tested on a larch and stone pine tree-ring dataset, outperforms, in terms of information obtained, the classical one, with significant improvements regarding the strength, distribution and time-variability of the individual tree-ring growth response to climate. Moreover, a significant change over time of the tree sensitivity to climatic variability has been detected. Accordingly, the best-responder trees at any one time may not always have been the best-responders and may not continue to be so. With minor adjustments to current dendroecological protocol and adopting an individualistic approach, we can improve the quality and reliability of the ecological inferences derived from the climate-growth relationships. PMID:21829523
Variations in Stratospheric Inorganic Chlorine Between 1991 and 2006
NASA Technical Reports Server (NTRS)
Lary, D. J.; Waugh, D. W.; Douglass, A. R.; Stolarski, R. S.; Newman, P. A.; Mussa, H.
2007-01-01
So how quickly will the ozone hole recover? This depends on how quickly the inorganic chlorine content (Cly) of the atmosphere declines. The ozone hole forms over the Antarctic each southern spring (September and October). The extremely small ozone amounts in the ozone hole are there because of chemical reactions of ozone with chlorine. This chlorine originates largely from industrially produced chlorofluorocarbon (CFC) compounds. An international agreement, the Montreal Protocol, is drastically reducing the amount of chlorine-containing compounds that we are releasing into the atmosphere. To be able to attribute changes in stratospheric ozone to changes in chlorine we need to know the distribution of atmospheric chlorine. However, due to a lack of continuous observations of all the key chlorine gases, producing a continuous time series of stratospheric chlorine has not been achieved to date. We have for the first time devised a technique to construct a 17-year time series of stratospheric chlorine that uses the long time series of HCl observations made by several spaceborne instruments together with a neural network. The neural networks allow us both to inter-calibrate the various HCl instruments and to infer the total amount of atmospheric chlorine from HCl. These new estimates of Cly provide a much-needed critical test for current global models, which predict significant differences in both Cly and ozone recovery. These models exhibit differences in their projected recovery times, and our chlorine time series will help separate the good from the bad among these projections.
Bayesian inference of selection in a heterogeneous environment from genetic time-series data.
Gompert, Zachariah
2016-01-01
Evolutionary geneticists have sought to characterize the causes and molecular targets of selection in natural populations for many years. Although this research programme has been somewhat successful, most statistical methods employed were designed to detect consistent, weak to moderate selection. In contrast, phenotypic studies in nature show that selection varies in time and that individual bouts of selection can be strong. Measurements of the genomic consequences of such fluctuating selection could help test and refine hypotheses concerning the causes of ecological specialization and the maintenance of genetic variation in populations. Herein, I propose a Bayesian nonhomogeneous hidden Markov model to estimate effective population sizes and quantify variable selection in heterogeneous environments from genetic time-series data. The model is described and then evaluated using a series of simulated data, including cases where selection occurs on a trait with a simple or polygenic molecular basis. The proposed method accurately distinguished neutral loci from non-neutral loci under strong selection, but not from those under weak selection. Selection coefficients were accurately estimated when selection was constant or when the fitness values of genotypes varied linearly with the environment, but these estimates were less accurate when fitness was polygenic or the relationship between the environment and the fitness of genotypes was nonlinear. Past studies of temporal evolutionary dynamics in laboratory populations have been remarkably successful. The proposed method makes similar analyses of genetic time-series data from natural populations more feasible and thereby could help answer fundamental questions about the causes and consequences of evolution in the wild. © 2015 John Wiley & Sons Ltd.
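As a point of contrast with the full hidden Markov machinery, the constant-selection case is simple: under the haploid recursion p' = p(1+s)/(1+ps), the log-odds of the allele frequency grow linearly in time at rate log(1+s), so s can be read off a regression on logit frequencies. The deterministic sketch below (no drift and no sampling noise, unlike the genetic time-series data the method targets) uses an invented s = 0.1 and starting frequency 0.05.

```python
import numpy as np

def estimate_s(p):
    """Estimate a constant selection coefficient from allele
    frequencies p[0..T] under the haploid model p' = p(1+s)/(1+ps),
    for which logit(p) grows linearly in time with slope log(1+s)."""
    t = np.arange(len(p))
    logit = np.log(p / (1 - p))
    slope = np.polyfit(t, logit, 1)[0]
    return np.exp(slope) - 1

# Deterministic trajectory with s = 0.1, starting at p0 = 0.05
s, p = 0.1, [0.05]
for _ in range(40):
    p.append(p[-1] * (1 + s) / (1 + p[-1] * s))
print(round(float(estimate_s(np.array(p))), 3))  # 0.1
```

The Bayesian model described above generalizes this in two directions at once: it lets s vary with the environment over time, and it accounts for genetic drift and sampling noise that make the simple regression unreliable in real data.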
Alcohol Content in the 'Hyper-Reality' MTV Show 'Geordie Shore'.
Lowe, Eden; Britton, John; Cranwell, Jo
2018-05-01
To quantify the occurrence of alcohol content, including alcohol branding, in the popular primetime UK reality TV show 'Geordie Shore' Series 11, a 1-min interval coding content analysis of alcohol content was carried out on the entire DVD of Series 11 (10 episodes). Occurrences of alcohol use, implied use, other alcohol references/paraphernalia and branding were recorded. All categories of alcohol were present in all episodes. 'Any alcohol' content occurred in 78% of all coding intervals (ACIs), 'actual alcohol use' in 30%, 'inferred alcohol use' in 72%, and all 'other' alcohol references in 59%. Brand appearances occurred in 23% of ACIs. The most frequently observed alcohol brand was Smirnoff, which appeared in 43% of all brand appearances. Episodes categorized as suitable for viewing by adolescents below the legal drinking age of 18 years accounted for 61% of all brand appearances. Alcohol content, including branding, is highly prevalent in the UK reality TV show 'Geordie Shore' Series 11. Two-thirds of all alcohol branding occurred in episodes age-rated by the British Board of Film Classification (BBFC) as suitable for viewers aged 15 years. The organizations Ofcom, the Advertising Standards Authority (ASA) and the Portman Group should implement more effective policies to reduce adolescent exposure to on-screen drinking. The drinks industry should consider demanding the withdrawal of their brands from the show. Alcohol content, including branding, is highly prevalent in the MTV reality TV show 'Geordie Shore' Series 11. Current alcohol regulation is failing to protect young viewers from exposure to such content.
Marcisz, Katarzyna; Fournier, Bertrand; Gilbert, Daniel; Lamentowicz, Mariusz; Mitchell, Edward A D
2014-05-01
Peatland testate amoebae (TA) are well-established bioindicators for depth to water table (DWT), but the effects of hydrological changes on TA communities have never been tested experimentally. We tested this in a field experiment by placing Sphagnum carpets (15 cm diameter) collected in hummock, lawn and pool microsites (origin) at three local conditions (dry, moist and wet) using trenches dug in a peatland. One series of samples was seeded with a microorganism extract from all microsites. TA communities were analysed at T0 (8-2008), T1 (5-2009) and T2 (8-2009). We analysed the data using conditional inference trees, principal response curves (PRC) and DWT inferred from TA communities using a transfer function of the kind used for paleoecological reconstruction. Density declined from T0 to T1 and then increased sharply by T2. Species richness, Simpson diversity and Simpson evenness were lower at T2 than at T0 and T1. Seeded communities had higher species richness in pool samples at T0. Pool samples tended to have higher density and lower species richness, Simpson diversity and Simpson evenness than hummock and/or lawn samples until T1. In the PRC, the effect of origin was significant at T0 and T1, but the effect faded away by T2. The seeding effect was strongest at T1 and had vanished by T2. The local-condition effect was strong at T1 but not in line with the wetness gradient; it started to reflect the gradient by T2. Likewise, TA-inferred DWT started to match the experimental conditions by T2, but more so in hummock and lawn samples than in pool samples. This study confirmed that TA respond to hydrological changes over a 1-year period. However, the sensitivity of TA to hydrological fluctuations, and thus the accuracy of inferred DWT changes, was habitat specific, pool TA communities being the least responsive to environmental changes. Lawns and hummocks may thus be better suited than pools for paleoecological reconstructions.
This, however, contrasts with the higher prediction error and species' tolerance for DWT with increasing dryness observed in transfer function models.
Pillow, Bradford H
2002-01-01
Two experiments investigated kindergarten through fourth-grade children's and adults' (N = 128) ability to (1) evaluate the certainty of deductive inferences, inductive inferences, and guesses; and (2) explain the origins of inferential knowledge. When judging their own cognitive state, children in first grade and older rated deductive inferences as more certain than guesses; but when judging another person's knowledge, children did not distinguish valid inferences from invalid inferences and guesses until fourth grade. By third grade, children differentiated their own deductive inferences from inductive inferences and guesses, but only adults both differentiated deductive inferences from inductive inferences and differentiated inductive inferences from guesses. Children's recognition of their own inferences may contribute to the development of knowledge about cognitive processes, scientific reasoning, and a constructivist epistemology.
NASA Astrophysics Data System (ADS)
Dutrieux, Loïc P.; Jakovac, Catarina C.; Latifah, Siti H.; Kooistra, Lammert
2016-05-01
We developed a method to reconstruct land use history from Landsat image time-series. The method uses a breakpoint detection framework derived from the econometrics field and applicable to time-series regression models. The Breaks For Additive Season and Trend (BFAST) framework is used for defining the time-series regression models, which may contain trend and phenology terms, hence appropriately modelling vegetation intra- and inter-annual dynamics. All available Landsat data are used for a selected study area, and the time-series are partitioned into segments delimited by breakpoints. Segments can be associated to land use regimes, while the breakpoints then correspond to shifts in land use regimes. In order to further characterize these shifts, we classified the unlabelled breakpoints returned by the algorithm into their corresponding processes. We used a Random Forest classifier, trained from a set of visually interpreted time-series profiles, to infer the processes and assign labels to the breakpoints. The whole approach was applied to quantifying the number of cultivation cycles in a swidden agriculture system in Brazil (state of Amazonas). The number and frequency of cultivation cycles are of particular ecological relevance in these systems, since they largely affect the capacity of the forest to regenerate after land abandonment. We applied the method to a Landsat time-series of the Normalized Difference Moisture Index (NDMI) spanning the 1984-2015 period and derived from it the number of cultivation cycles during that period at the individual field scale. Agricultural field boundaries used to apply the method were derived using a multi-temporal segmentation approach. We validated the number of cultivation cycles predicted by the method against in-situ information collected from farmer interviews, resulting in a Normalized Root Mean Squared Error (NRMSE) of 0.25. Overall the method performed well, producing maps with coherent spatial patterns.
We identified various sources of error in the approach, including low data availability in the 90s and sub-object mixture of land uses. We conclude that the method holds great promise for land use history mapping in the tropics and beyond.
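The segmentation idea underlying the approach can be reduced to its simplest form: scan candidate break positions and keep the one minimising the within-segment squared error. BFAST additionally fits trend and harmonic (phenology) terms per segment and tests break significance; the piecewise-constant sketch below, run on an invented NDMI-like series with a clearing event at t = 60, shows only the core search.

```python
import numpy as np

def best_breakpoint(y, min_seg=5):
    """Return the index that splits y into two segments minimising
    the total squared error around per-segment means (a one-break
    simplification of the BFAST segmentation idea)."""
    n = len(y)
    best, best_sse = None, np.inf
    for b in range(min_seg, n - min_seg):
        sse = (((y[:b] - y[:b].mean())**2).sum()
               + ((y[b:] - y[b:].mean())**2).sum())
        if sse < best_sse:
            best, best_sse = b, sse
    return best

rng = np.random.default_rng(5)
# Invented NDMI-like series: stable vegetation, then clearing at t = 60
y = np.concatenate([0.6 + 0.02 * rng.normal(size=60),
                    0.2 + 0.02 * rng.normal(size=40)])
bp = best_breakpoint(y)
print(bp)   # near 60
```

Applied recursively, such a search yields multiple breakpoints, each of which can then be handed to a classifier (Random Forest in the study) to label the process behind the shift.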
NASA Astrophysics Data System (ADS)
Dutrieux, L.; Jakovac, C. C.; Siti, L. H.; Kooistra, L.
2015-12-01
We developed a method to reconstruct land use history from Landsat image time-series. The method uses a breakpoint detection framework derived from the econometrics field and applicable to time-series regression models. The BFAST framework is used for defining the time-series regression models, which may contain trend and phenology terms, hence appropriately modelling vegetation intra- and inter-annual dynamics. All available Landsat data are used, and the time-series are partitioned into segments delimited by breakpoints. Segments can be associated to land use regimes, while the breakpoints then correspond to shifts in regimes. To further characterize these shifts, we classified the unlabelled breakpoints returned by the algorithm into their corresponding processes. We used a Random Forest classifier, trained from a set of visually interpreted time-series profiles, to infer the processes and assign labels to the breakpoints. The whole approach was applied to quantifying the number of cultivation cycles in a swidden agriculture system in Brazil. The number and frequency of cultivation cycles are of particular ecological relevance in these systems, since they largely affect the capacity of the forest to regenerate after abandonment. We applied the method to a Landsat time-series of the Normalized Difference Moisture Index (NDMI) spanning the 1984-2015 period and derived from it the number of cultivation cycles during that period at the individual field scale. Agricultural field boundaries used to apply the method were derived using a multi-temporal segmentation. We validated the number of cultivation cycles predicted against in-situ information collected from farmer interviews, resulting in a Normalized RMSE of 0.25. Overall the method performed well, producing maps with coherent patterns. We identified various sources of error in the approach, including low data availability in the 90s and sub-object mixture of land uses.
We conclude that the method holds great promise for land use history mapping in the tropics and beyond. Spatial and temporal patterns were further analysed from an ecological perspective in a follow-up study. Results show that changes in land use patterns, such as land use intensification and reduced agricultural expansion, reflect the socio-economic transformations that occurred in the region.
Chen, Chi-Kan
2017-07-26
The identification of genetic regulatory networks (GRNs) provides insights into complex cellular processes. A class of recurrent neural networks (RNNs) captures the dynamics of GRNs. Algorithms combining the RNN and machine learning schemes have been proposed to reconstruct small-scale GRNs using gene expression time series. We present new GRN reconstruction methods based on neural networks. The RNN is extended to a class of recurrent multilayer perceptrons (RMLPs) with latent nodes. Our methods contain two steps: an edge rank assignment step and a network construction step. The former assigns ranks to all possible edges by a recursive procedure based on the estimated weights of wires of the RNN/RMLP (RE_RNN / RE_RMLP), and the latter constructs a network consisting of top-ranked edges under which the optimized RNN simulates the gene expression time series. Particle swarm optimization (PSO) is applied to optimize the parameters of the RNNs and RMLPs in a two-step algorithm. The proposed RE_RNN-RNN and RE_RMLP-RNN algorithms are tested on synthetic and experimental gene expression time series of small GRNs of about 10 genes. The experimental time series are from studies of yeast cell cycle regulated genes and E. coli DNA repair genes. The unstable estimation of an RNN using experimental time series with limited data points can lead to fairly arbitrary predicted GRNs. Our methods incorporate the RNN and RMLP into a two-step structure learning procedure. Results show that RE_RMLP, which uses an RMLP with a suitable number of latent nodes to reduce the parameter dimension, often yields more accurate edge ranks than RE_RNN, which uses the regularized RNN, on short simulated time series. By combining, through a weighted majority voting rule, the networks derived by RE_RMLP-RNN with different numbers of latent nodes in step one to infer the GRN, the method performs consistently and outperforms published algorithms for GRN reconstruction on most benchmark time series.
The framework of two-step algorithms can potentially incorporate with different nonlinear differential equation models to reconstruct the GRN.
Quantifying the Uncertainty in Discharge Data Using Hydraulic Knowledge and Uncertain Gaugings
NASA Astrophysics Data System (ADS)
Renard, B.; Le Coz, J.; Bonnifait, L.; Branger, F.; Le Boursicaud, R.; Horner, I.; Mansanarez, V.; Lang, M.
2014-12-01
River discharge is a crucial variable for hydrology: as the output variable of most hydrologic models, it is used for sensitivity analyses, model structure identification, parameter estimation, data assimilation, prediction, etc. A major difficulty stems from the fact that river discharge is not measured continuously. Instead, discharge time series used by hydrologists are usually based on simple stage-discharge relations (rating curves) calibrated using a set of direct stage-discharge measurements (gaugings). In this presentation, we describe a Bayesian approach for building such hydrometric rating curves, estimating the associated uncertainty and propagating this uncertainty to discharge time series. The three main steps of this approach are described: (1) Hydraulic analysis: identification of the hydraulic controls that govern the stage-discharge relation, identification of the rating curve equation and specification of prior distributions for the rating curve parameters; (2) Rating curve estimation: Bayesian inference of the rating curve parameters, accounting for the individual uncertainties of available gaugings, which often differ according to the discharge measurement procedure and the flow conditions; (3) Uncertainty propagation: quantification of the uncertainty in discharge time series, accounting for both the rating curve uncertainties and the uncertainty of recorded stage values. In addition, we also discuss current research activities, including the treatment of non-univocal stage-discharge relationships (e.g. due to hydraulic hysteresis, vegetation growth, sudden change of the geometry of the section, etc.).
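Step (2) of such an approach can be sketched in a few lines, assuming a simple power-law rating curve Q = a(h − h0)^c with a known cease-to-flow offset h0. The gaugings, priors, and random-walk Metropolis sampler below are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical gaugings: stage h (m), discharge Q (m^3/s), gauging std (m^3/s)
h = np.array([0.6, 0.9, 1.3, 1.8, 2.4, 3.1])
a_true, c_true, h0 = 12.0, 1.7, 0.2
Q = a_true * (h - h0) ** c_true * rng.normal(1.0, 0.05, h.size)
sigma = 0.05 * Q  # individual gauging uncertainties

def log_post(theta):
    la, c = theta
    # weakly informative hydraulic prior: exponent c near 5/3,
    # as for a wide rectangular channel controlled by friction
    lp = -0.5 * ((c - 5 / 3) / 0.5) ** 2 - 0.5 * ((la - np.log(10.0)) / 2.0) ** 2
    Qhat = np.exp(la) * (h - h0) ** c
    return lp - 0.5 * np.sum(((Q - Qhat) / sigma) ** 2)

# random-walk Metropolis over (log a, c)
theta = np.array([np.log(10.0), 1.6])
lp = log_post(theta)
samples = []
for i in range(20000):
    prop = theta + rng.normal(0, 0.03, 2)
    lpp = log_post(prop)
    if np.log(rng.random()) < lpp - lp:
        theta, lp = prop, lpp
    if i >= 5000:  # discard burn-in
        samples.append(theta.copy())
samples = np.array(samples)

a_post = np.exp(samples[:, 0].mean())
c_post = samples[:, 1].mean()
print(a_post, c_post)  # posterior means near the true a and c
```

Step (3), uncertainty propagation, would then push each posterior draw of (a, c) through the recorded stage series to obtain an ensemble of discharge time series.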
Characterizing the impact of model error in hydrologic time series recovery inverse problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hansen, Scott K.; He, Jiachuan; Vesselinov, Velimir V.
Hydrologic models are commonly over-smoothed relative to reality, owing to computational limitations and to the difficulty of obtaining accurate high-resolution information. When used in an inversion context, such models may introduce systematic biases which cannot be encapsulated by an unbiased “observation noise” term of the type assumed by standard regularization theory and typical Bayesian formulations. Despite its importance, model error is difficult to encapsulate systematically and is often neglected. In this paper, model error is considered for an important class of inverse problems that includes interpretation of hydraulic transients and contaminant source history inference: reconstruction of a time series that has been convolved against a transfer function (i.e., impulse response) that is only approximately known. Using established harmonic theory along with two results established here regarding triangular Toeplitz matrices, upper and lower error bounds are derived for the effect of systematic model error on time series recovery for both well-determined and over-determined inverse problems. It is seen that use of additional measurement locations does not improve expected performance in the face of model error. A Monte Carlo study of a realistic hydraulic reconstruction problem is presented, and the lower error bound is seen to be informative about expected behavior. Finally, a possible diagnostic criterion for blind transfer function characterization is uncovered.
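The setting can be reproduced in a few lines: a causal impulse response yields a lower-triangular Toeplitz forward matrix, and recovery with a slightly wrong kernel leaves a systematic error floor that no noise term accounts for. The kernels, source history, and Tikhonov regularization below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

n = 60
t = np.arange(n)

def toeplitz_from_kernel(g, n):
    # lower-triangular Toeplitz matrix implementing causal convolution:
    # (A x)_i = sum_j g[i - j] x_j
    A = np.zeros((n, n))
    for i in range(n):
        A[i, : i + 1] = g[: i + 1][::-1]
    return A

# true transfer function and a slightly mis-specified model of it
g_true = np.exp(-t / 8.0); g_true /= g_true.sum()
g_model = np.exp(-t / 10.0); g_model /= g_model.sum()  # systematic model error

x_true = np.exp(-0.5 * ((t - 25) / 5.0) ** 2)  # source history to recover
y = toeplitz_from_kernel(g_true, n) @ x_true   # observed (noise-free) data

# Tikhonov-regularized recovery using the approximate transfer function
A = toeplitz_from_kernel(g_model, n)
lam = 1e-3
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(err)  # residual error driven by model error, not observation noise
```

Even with noise-free data, the relative error stays well above zero, which is the qualitative behavior the paper's bounds quantify.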
Pedoinformatics Approach to Soil Text Analytics
NASA Astrophysics Data System (ADS)
Furey, J.; Seiter, J.; Davis, A.
2017-12-01
The several extant schema for the classification of soils rely on differing criteria, but the major soil science taxonomies, including the United States Department of Agriculture (USDA) and the international harmonized World Reference Base for Soil Resources systems, are based principally on inferred pedogenic properties. These taxonomies largely result from compiled individual observations of soil morphologies within soil profiles, and the vast majority of this pedologic information is contained in qualitative text descriptions. We present text mining analyses of hundreds of gigabytes of parsed text and other data in the digitally available USDA soil taxonomy documentation, the Soil Survey Geographic (SSURGO) database, and the National Cooperative Soil Survey (NCSS) soil characterization database. These analyses implemented iPython calls to Gensim modules for topic modelling, with latent semantic indexing completed down to the lowest taxon level (soil series) paragraphs. Via a custom extension of the Natural Language Toolkit (NLTK), approximately one percent of the USDA soil series descriptions were used to train a classifier for the remainder of the documents, essentially by treating soil science words as comprising a novel language. While location-specific descriptors at the soil series level are amenable to geomatics methods, unsupervised clustering of the occurrence of other soil science words did not closely follow the usual hierarchy of soil taxa. We present preliminary phrasal analyses that may account for some of these effects.
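The pipeline above uses Gensim and NLTK; as a dependency-free sketch of the latent semantic indexing step, the toy example below builds a term-document matrix from invented soil-description fragments (the snippets are hypothetical, not NCSS text) and embeds documents via truncated SVD.

```python
import numpy as np

# toy stand-ins for soil series description paragraphs (hypothetical text)
docs = [
    "fine loamy mixed mesic typic hapludalfs",
    "fine loamy mixed mesic aquic hapludalfs",
    "sandy skeletal mixed frigid typic cryorthents",
    "sandy mixed frigid lithic cryorthents",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# term-document count matrix
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# latent semantic indexing: truncated SVD, keep k latent "topics"
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_topics = (np.diag(s[:k]) @ Vt[:k]).T  # documents in topic space

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

sim_01 = cos(doc_topics[0], doc_topics[1])  # two hapludalfs descriptions
sim_02 = cos(doc_topics[0], doc_topics[2])  # hapludalfs vs cryorthents
print(sim_01, sim_02)
```

In the latent space, the two hapludalfs descriptions sit closer to each other than to the cryorthents descriptions, which is the clustering behavior the study probes at scale.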
Using diurnal temperature signals to infer vertical groundwater-surface water exchange
Irvine, Dylan J.; Briggs, Martin A.; Lautz, Laura K.; Gordon, Ryan P.; McKenzie, Jeffrey M.; Cartwright, Ian
2017-01-01
Heat is a powerful tracer to quantify fluid exchange between surface water and groundwater. Temperature time series can be used to estimate pore water fluid flux, and techniques can be employed to extend these estimates to produce detailed plan-view flux maps. Key advantages of heat tracing include cost-effective sensors and ease of data collection and interpretation, without the need for expensive and time-consuming laboratory analyses or induced tracers. While the collection of temperature data in saturated sediments is relatively straightforward, several factors influence the reliability of flux estimates that are based on time series analysis (diurnal signals) of recorded temperatures. Sensor resolution and deployment are particularly important in obtaining robust flux estimates in upwelling conditions. Also, processing temperature time series data involves a sequence of complex steps, including filtering temperature signals, selection of appropriate thermal parameters, and selection of the optimal analytical solution for modeling. This review provides a synthesis of heat tracing using diurnal temperature oscillations, including details on optimal sensor selection and deployment, data processing, model parameterization, and an overview of computing tools available. Recent advances in diurnal temperature methods also provide the opportunity to determine local saturated thermal diffusivity, which can improve the accuracy of fluid flux modeling and sensor spacing, which is related to streambed scour and deposition. These parameters can also be used to determine the reliability of flux estimates from the use of heat as a tracer.
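The first processing step, isolating the diurnal signal from each sensor record, can be sketched as a least-squares sinusoid fit at the diurnal frequency. The synthetic sensor records and their damping below are invented; the resulting amplitude ratio and phase lag are the observables that the analytical flux solutions then consume.

```python
import numpy as np

rng = np.random.default_rng(3)

# synthetic streambed temperatures: 5-minute sampling over 4 days
P = 86400.0                      # diurnal period (s)
ts = np.arange(0, 4 * P, 300.0)
omega = 2 * np.pi / P

# the diurnal signal is damped and lagged with depth into the sediment
T_shallow = 15 + 3.0 * np.sin(omega * ts) + rng.normal(0, 0.05, ts.size)
T_deep = 15 + 1.2 * np.sin(omega * ts - 0.9) + rng.normal(0, 0.05, ts.size)

def diurnal_amp_phase(T, ts, omega):
    # least-squares fit of T ~ m + a*sin(wt) + b*cos(wt)
    G = np.column_stack([np.ones_like(ts), np.sin(omega * ts), np.cos(omega * ts)])
    m, a, b = np.linalg.lstsq(G, T, rcond=None)[0]
    return np.hypot(a, b), np.arctan2(b, a)

A_s, ph_s = diurnal_amp_phase(T_shallow, ts, omega)
A_d, ph_d = diurnal_amp_phase(T_deep, ts, omega)

Ar = A_d / A_s     # amplitude ratio used by analytical flux solutions
dphi = ph_d - ph_s  # phase lag, the second usable observable
print(Ar)
```

An analytical solution (amplitude-ratio or phase-lag based) would then map Ar and dphi, together with sensor spacing and thermal parameters, to a vertical fluid flux.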
Interspecific competition in plants: how well do current methods answer fundamental questions?
Connolly, J; Wayne, P; Bazzaz, F A
2001-02-01
Accurately quantifying and interpreting the processes and outcomes of competition among plants is essential for evaluating theories of plant community organization and evolution. We argue that many current experimental approaches to quantifying competitive interactions introduce size bias, which may significantly impact the quantitative and qualitative conclusions drawn from studies. Size bias generally arises when estimates of competitive ability are erroneously influenced by the initial size of competing individuals. We employ a series of quantitative thought experiments to demonstrate the potential for size bias in analysis of four traditional experimental designs (pairwise, replacement series, additive series, and response surfaces) either when only final measurements are available or when both initial and final measurements are collected. We distinguish three questions relevant to describing competitive interactions: Which species dominates? Which species gains? and How do species affect each other? The choice of experimental design and measurements greatly influences the scope of inference permitted. Conditions under which the latter two questions can give biased information are tabulated. We outline a new approach to characterizing competition that avoids size bias and that improves the concordance between research question and experimental design. The implications of the choice of size metrics used to quantify both the initial state and the responses of elements in interspecific mixtures are discussed. The relevance of size bias in competition studies with organisms other than plants is also discussed.
Evaluation of very long baseline interferometry atmospheric modeling improvements
NASA Technical Reports Server (NTRS)
Macmillan, D. S.; Ma, C.
1994-01-01
We determine the improvement in baseline length precision and accuracy using new atmospheric delay mapping functions and MTT by analyzing the NASA Crustal Dynamics Project research and development (R&D) experiments and the International Radio Interferometric Surveying (IRIS) A experiments. These mapping functions reduce baseline length scatter by about 20% below that using the CfA2.2 dry and Chao wet mapping functions. With the newer mapping functions, average station vertical scatter inferred from observed length precision (given by length repeatabilities) is 11.4 mm for the 1987-1990 monthly R&D series of experiments and 5.6 mm for the 3-week-long extended research and development experiment (ERDE) series. The inferred monthly R&D station vertical scatter is reduced by 2 mm, or by 7 mm in a root-sum-square (RSS) sense. Length repeatabilities are optimum when observations below a 7-8 deg elevation cutoff are removed from the geodetic solution. Analyses of IRIS-A data from 1984 through 1991 and the monthly R&D experiments both yielded a nonatmospheric unmodeled station vertical error of about 8 mm. In addition, analysis of the IRIS-A experiments revealed systematic effects in the evolution of some baseline length measurements. The length rate of change has an apparent acceleration, and the length evolution has a quasi-annual signature. We show that the origin of these effects is unlikely to be related to atmospheric modeling errors. Rates of change of the transatlantic Westford-Wettzell and Richmond-Wettzell baseline lengths calculated from 1988 through 1991 agree with the NUVEL-1 plate motion model (Argus and Gordon, 1991) to within 1 mm/yr. Short-term (less than 90 days) variations of IRIS-A baseline length measurements contribute more than 90% of the observed scatter about a best fit line, and this short-term scatter has large variations on an annual time scale.
Inferring the nature of anthropogenic threats from long-term abundance records.
Shoemaker, Kevin T; Akçakaya, H Resit
2015-02-01
Diagnosing the processes that threaten species persistence is critical for recovery planning and risk forecasting. Dominant threats are typically inferred by experts on the basis of a patchwork of informal methods. Transparent, quantitative diagnostic tools would contribute much-needed consistency, objectivity, and rigor to the process of diagnosing anthropogenic threats. Long-term census records, available for an increasingly large and diverse set of taxa, may exhibit characteristic signatures of specific threatening processes and thereby provide information for threat diagnosis. We developed a flexible Bayesian framework for diagnosing threats on the basis of long-term census records and diverse ancillary sources of information. We tested this framework with simulated data from artificial populations subjected to varying degrees of exploitation and habitat loss and several real-world abundance time series for which threatening processes are relatively well understood: southern bluefin tuna (Thunnus maccoyii) and Atlantic cod (Gadus morhua) (exploitation) and Red Grouse (Lagopus lagopus scotica) and Eurasian Skylark (Alauda arvensis) (habitat loss). Our method correctly identified the process driving population decline for over 90% of time series simulated under moderate to severe threat scenarios. Successful identification of threats approached 100% for severe exploitation and habitat loss scenarios. Our method identified threats less successfully when threatening processes were weak and when populations were simultaneously affected by multiple threats. Our method selected the presumed true threat model for all real-world case studies, although results were somewhat ambiguous in the case of the Eurasian Skylark.
In the latter case, incorporation of an ancillary source of information (records of land-use change) increased the weight assigned to the presumed true model from 70% to 92%, illustrating the value of the proposed framework in bringing diverse sources of information into a common rigorous framework. Ultimately, our framework may greatly assist conservation organizations in documenting threatening processes and planning species recovery. © 2014 Society for Conservation Biology.
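The core idea, that different threatening processes leave different signatures in an abundance record, can be illustrated with a stripped-down model-selection exercise: two candidate decline mechanisms fit to a simulated record and compared via AIC weights. The paper uses a full Bayesian framework; AIC, grid-search fitting, and all parameter values below are simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# synthetic abundance record: growth eroded by a constant offtake H
T, N0, lam, H = 30, 1000.0, 1.02, 60.0
N = [N0]
for _ in range(T - 1):
    N.append(max(lam * N[-1] - H, 1.0))
obs = np.array(N) * rng.normal(1.0, 0.03, T)  # observation noise

def aic(sse, k, n):
    return n * np.log(sse / n) + 2 * k

# candidate 1: proportional (habitat-loss-like) decline, N_t = n0 * r^t
best1 = np.inf
for r in np.linspace(0.9, 1.0, 401):
    for n0 in np.linspace(800, 1200, 81):
        sse = np.sum((obs - n0 * r ** np.arange(T)) ** 2)
        best1 = min(best1, sse)

# candidate 2: growth with constant offtake (exploitation), N_{t+1} = la*N_t - h
best2 = np.inf
for la in np.linspace(0.95, 1.1, 61):
    for hh in np.linspace(0.0, 120.0, 61):
        pred = [obs[0]]
        for _ in range(T - 1):
            pred.append(max(la * pred[-1] - hh, 1.0))
        sse = np.sum((obs - np.array(pred)) ** 2)
        best2 = min(best2, sse)

a1, a2 = aic(best1, 2, T), aic(best2, 2, T)
d1, d2 = a1 - min(a1, a2), a2 - min(a1, a2)
w2 = np.exp(-0.5 * d2) / (np.exp(-0.5 * d1) + np.exp(-0.5 * d2))
print(w2)  # weight assigned to the exploitation model
```

Because the record was generated by the offtake mechanism, the exploitation model receives most of the weight; the paper's Bayesian machinery plays the same role with full posterior model probabilities and ancillary data.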
NASA Astrophysics Data System (ADS)
Barandun, Martina; Huss, Matthias; Usubaliev, Ryskul; Azisov, Erlan; Berthier, Etienne; Kääb, Andreas; Bolch, Tobias; Hoelzle, Martin
2018-06-01
Glacier surface mass balance observations in the Tien Shan and Pamir are relatively sparse and often discontinuous. Nevertheless, glaciers are one of the most important components of the high-mountain cryosphere in the region as they strongly influence water availability in the arid, continental and intensely populated downstream areas. This study provides reliable and continuous surface mass balance series for selected glaciers located in the Tien Shan and Pamir-Alay. By cross-validating the results of three independent methods, we reconstructed the mass balance of the three benchmark glaciers, Abramov, Golubin and Glacier no. 354 for the past 2 decades. By applying different approaches, it was possible to compensate for the limitations and shortcomings of each individual method. This study proposes the use of transient snow line observations throughout the melt season obtained from satellite optical imagery and terrestrial automatic cameras. By combining modelling with remotely acquired information on summer snow depletion, it was possible to infer glacier mass changes for unmeasured years. The model is initialized with daily temperature and precipitation data collected at automatic weather stations in the vicinity of the glacier or with adjusted data from climate reanalysis products. Multi-annual mass changes based on high-resolution digital elevation models and in situ glaciological surveys were used to validate the results for the investigated glaciers. Substantial surface mass loss was confirmed for the three studied glaciers by all three methods, ranging from -0.30 ± 0.19 to -0.41 ± 0.33 m w.e. yr⁻¹ over the 2004-2016 period. Our results indicate that integration of snow line observations into mass balance modelling significantly narrows the uncertainty ranges of the estimates. Hence, this highlights the potential of the methodology for application to unmonitored glaciers at larger scales for which no direct measurements are available.
NASA Astrophysics Data System (ADS)
Clarke, A. B.; Stephens, S.; Teasdale, R.; Sparks, R. S. J.; Diller, K.
2007-04-01
A series of 88 Vulcanian explosions occurred at the Soufrière Hills volcano, Montserrat, between August and October, 1997. Conduit conditions conducive to creating these and other Vulcanian explosions were explored via analysis of eruptive products and one-dimensional numerical modeling of magma ascent through a cylindrical conduit. The number densities and textures of plagioclase microlites were documented for twenty-three samples from the events. The natural samples all show very high number densities of microlites, and > 50% by number of microlites have areas < 20 μm². Pre-explosion conduit conditions and decompression history have been inferred from these data by comparison with experimental decompressions of similar groundmass compositions. Our comparisons suggest quench pressures < 30 MPa (origin depths < 2 km) and multiple rapid decompressions of > 13.75 MPa each during ascent from chamber to surface. Values are consistent with field studies of the same events and statistical analysis of explosion time-series data. The microlite volume number density trend with depth reveals an apparent transition from growth-dominated crystallization to nucleation-dominated crystallization at pressures of ~ 7 MPa and lower. A concurrent sharp increase in bulk density marks the onset of significant open-system degassing, apparently due to a large increase in system permeability above ~ 70% vesicularity. This open-system degassing results in a dense plug which eventually seals the conduit and forms conditions favorable to Vulcanian explosions. The corresponding inferred depth of overpressure at 250-700 m, near the base of the dense plug, is consistent with depth to center of pressure estimated from deformation measurements. Here we also illustrate that one-dimensional models representing ascent of a degassing, crystal-rich magma are broadly consistent with conduit profiles constructed via our petrologic analysis.
The comparison between models and petrologic data suggests that the dense conduit plug forms as a result of high overpressure and open-system degassing through conduit walls.
Stochasticity of convection in Giga-LES data
NASA Astrophysics Data System (ADS)
De La Chevrotière, Michèle; Khouider, Boualem; Majda, Andrew J.
2016-09-01
The poor representation of tropical convection in general circulation models (GCMs) is believed to be responsible for much of the uncertainty in the predictions of weather and climate in the tropics. The stochastic multicloud model (SMCM) was recently developed by Khouider et al. (Commun Math Sci 8(1):187-216, 2010) to represent the missing variability in GCMs due to unresolved features of organized tropical convection. The SMCM is based on three cloud types (congestus, deep and stratiform), and transitions between these cloud types are formalized in terms of probability rules that are functions of the large-scale environment convective state and a set of seven arbitrary cloud timescale parameters. Here, a statistical inference method based on the Bayesian paradigm is applied to estimate these key cloud timescales from the Giga-LES dataset, a 24-h large-eddy simulation (LES) of deep tropical convection (Khairoutdinov et al. in J Adv Model Earth Syst 1(12), 2009) over a domain comparable to a GCM gridbox. A sequential learning strategy is used where the Giga-LES domain is partitioned into a few subdomains, and atmospheric time series obtained on each subdomain are used to train the Bayesian procedure incrementally. Convergence of the marginal posterior densities for all seven parameters is demonstrated for two different grid partitions, and sensitivity tests to other model parameters are also presented. A single column model simulation using the SMCM parameterization with the Giga-LES inferred parameters reproduces many important statistical features of the Giga-LES run, without any further tuning. In particular it exhibits intermittent dynamical behavior in both the stochastic cloud fractions and the large scale dynamics, with periods of dry phases followed by a coherent sequence of congestus, deep, and stratiform convection, varying on timescales of a few hours consistent with the Giga-LES time series. 
The chaotic variations of the cloud area fractions were captured fairly well both qualitatively and quantitatively demonstrating the stochastic nature of convection in the Giga-LES simulation.
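The flavor of such probability rules can be conveyed with a toy single-site Markov chain over cloud types, simulated with the Gillespie algorithm. The transition-rate functions and constants below are invented placeholders, not the SMCM's actual rules conditioned on the large-scale convective state.

```python
import numpy as np

rng = np.random.default_rng(5)

# states: 0 = clear sky, 1 = congestus, 2 = deep, 3 = stratiform
# illustrative transition rates (1/h); the real SMCM conditions these on
# large-scale predictors and seven calibrated cloud timescales
def rates(state, cape, dryness):
    R = np.zeros(4)
    if state == 0:
        R[1] = cape * dryness          # clear -> congestus
        R[2] = cape * (1.0 - dryness)  # clear -> deep
    elif state == 1:
        R[2] = cape                    # congestus -> deep
        R[0] = 0.5                     # congestus decay
    elif state == 2:
        R[3] = 1.0                     # deep -> stratiform
        R[0] = 0.2                     # deep decay
    else:
        R[0] = 0.5                     # stratiform decay
    return R

# Gillespie (stochastic simulation) of the single-site chain
state, t, t_end = 0, 0.0, 500.0
occupancy = np.zeros(4)
while t < t_end:
    R = rates(state, cape=0.8, dryness=0.4)
    total = R.sum()
    dt = rng.exponential(1.0 / total)      # waiting time to next transition
    occupancy[state] += min(dt, t_end - t)  # clip the final dwell time
    t += dt
    state = rng.choice(4, p=R / total)      # pick the transition

frac = occupancy / occupancy.sum()
print(frac)  # long-run fraction of time in each cloud state
```

In the SMCM a lattice of such sites per GCM gridbox yields the stochastic cloud area fractions, and the Bayesian procedure described above infers the timescales hidden in the rate functions.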
Chen, Hua; Chen, Kun
2013-01-01
The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. Both exact distributions of coalescence times and ancestral lineage numbers are expressed as the sum of alternating series, and the terms in the series become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size n, denote by Tm the mth coalescent time, when m + 1 lineages coalesce into m lineages, and An(t) the number of ancestral lineages at time t back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, An(t), and the coalescence times, Tm, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, N(t). At the very early stage of the coalescent, when t → 0, the number of coalesced lineages n − An(t) follows a Poisson distribution, and as m → n, n(n−1)Tm/2N(0) follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference. PMID:23666939
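The early-time Poisson behavior quoted above is easy to check by direct simulation of the constant-size Kingman coalescent. The sketch below works in coalescent time units of N(0) generations, in which the waiting time while k lineages remain is exponential with rate k(k − 1)/2; the sample size and time point are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 200          # sample size
t_small = 1e-4   # early time, in coalescent units

def lineages_at(t, n, rng):
    # constant-size Kingman coalescent: with k lineages, the waiting time
    # to the next coalescence is Exp(k*(k-1)/2)
    k, elapsed = n, 0.0
    while k > 1:
        elapsed += rng.exponential(2.0 / (k * (k - 1)))
        if elapsed > t:
            break
        k -= 1
    return k

# number of coalesced lineages n - A_n(t) at a very early time
coalesced = np.array([n - lineages_at(t_small, n, rng) for _ in range(4000)])

# as t -> 0, n - A_n(t) is approximately Poisson with mean C(n,2)*t
mean_theory = n * (n - 1) / 2 * t_small
print(coalesced.mean(), mean_theory)
```

The simulated mean count tracks the Poisson mean closely, matching the stated t → 0 limit; for varying population size N(t) the rates would be rescaled by the size function.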
Reverse engineering gene regulatory networks from measurement with missing values.
Ogundijo, Oyetunji E; Elmas, Abdulkadir; Wang, Xiaodong
2016-12-01
Gene expression time series data are usually in the form of high-dimensional arrays. Unfortunately, the data may sometimes contain missing values: for either the expression values of some genes at some time points or the entire expression values of a single time point or some sets of consecutive time points. This significantly affects the performance of many algorithms for gene expression analysis that take as input the complete matrix of gene expression measurements. For instance, previous works have shown that gene regulatory interactions can be estimated from the complete matrix of gene expression measurements. Yet, to date, few algorithms have been proposed for the inference of gene regulatory networks from gene expression data with missing values. We describe a nonlinear dynamic stochastic model for the evolution of gene expression. The model captures the structural, dynamical, and nonlinear natures of the underlying biomolecular systems. We present point-based Gaussian approximation (PBGA) filters for joint state and parameter estimation of the system with one-step or two-step missing measurements. The PBGA filters use Gaussian approximation and various quadrature rules, such as the unscented transform (UT), the third-degree cubature rule and the central difference rule for computing the related posteriors. The proposed algorithm is evaluated with satisfying results for synthetic networks, in silico networks released as a part of the DREAM project, and the real biological network, the in vivo reverse engineering and modeling assessment (IRMA) network of yeast Saccharomyces cerevisiae. PBGA filters are proposed to elucidate the underlying gene regulatory network (GRN) from time series gene expression data that contain missing values. In our state-space model, we propose a measurement model that incorporates the effect of the missing data points into the sequential algorithm.
This approach produces better inference of the model parameters and hence more accurate prediction of the underlying GRN than conventional Gaussian approximation (GA) filters that ignore the missing data points.
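The point-based Gaussian approximation idea rests on propagating a Gaussian through a nonlinearity via a small set of deterministic points. A minimal sketch of one such rule, the unscented transform, is given below; the scaling parameters and the toy Hill-type nonlinearity are illustrative choices, and the paper's filters also employ cubature and central-difference rules.

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1e-1, beta=2.0, kappa=0.0):
    # standard scaled 2n+1 sigma-point rule
    n = mean.size
    lam = alpha ** 2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)
    pts = [mean] + [mean + S[:, i] for i in range(n)] + [mean - S[:, i] for i in range(n)]
    pts = np.array(pts)
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha ** 2 + beta)
    ys = np.array([f(p) for p in pts])   # push sigma points through f
    m = wm @ ys                          # transformed mean
    d = ys - m
    P = (wc[:, None] * d).T @ d          # transformed covariance
    return m, P

# propagate a Gaussian state through a nonlinearity, as a PBGA-style filter
# does in its predict step (toy Hill-type regulation in the first component,
# a linear decay in the second)
f = lambda x: np.array([x[0] ** 2 / (1.0 + x[0] ** 2), 0.5 * x[1]])
m, P = unscented_transform(np.array([1.0, 2.0]), np.eye(2) * 0.01, f)
print(m, P)
```

For the linear second component the transform is exact (mean 1.0, variance 0.25 × 0.01), while the nonlinear first component picks up the small higher-order corrections a plain linearization would miss.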