ERIC Educational Resources Information Center
Ngan, Chun-Kit
2013-01-01
Making decisions over multivariate time series is an important topic which has gained significant interest in the past decade. A time series is a sequence of data points which are measured and ordered over uniform time intervals. A multivariate time series is a set of multiple, related time series in a particular domain in which domain experts…
Drunk driving detection based on classification of multivariate time series.
Li, Zhenlong; Jin, Xue; Zhao, Xiaohua
2015-09-01
This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
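The feature-extraction step described in this abstract (piecewise linear representation, bottom-up segmentation, then slope and time interval per segment) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the squared-error merge cost, and the `max_error` threshold are all assumptions, and the sketch handles one channel at a time (it would be applied separately to, say, lateral position and steering angle).

```python
import numpy as np

def fit_error(t, y):
    """Sum of squared residuals of a least-squares line through (t, y)."""
    if len(t) < 3:
        return 0.0  # one or two points are always fitted exactly
    coeffs = np.polyfit(t, y, 1)
    return float(np.sum((np.polyval(coeffs, t) - y) ** 2))

def bottom_up_segments(y, max_error):
    """Bottom-up segmentation (hypothetical helper).

    Starts from the finest segmentation (pairs of points) and repeatedly
    merges the adjacent pair of segments whose merged line fit is cheapest,
    stopping once the cheapest merge would exceed max_error.
    Returns a list of (start, end) index pairs, end exclusive.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    bounds = list(range(0, len(y), 2)) + [len(y)]
    segs = [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
    while len(segs) > 1:
        costs = [fit_error(t[a:d], y[a:d])
                 for (a, _), (_, d) in zip(segs[:-1], segs[1:])]
        k = int(np.argmin(costs))
        if costs[k] > max_error:
            break
        segs[k:k + 2] = [(segs[k][0], segs[k + 1][1])]
    return segs

def segment_features(y, segs):
    """Return (slope, duration) per segment; slope joins the segment end points."""
    y = np.asarray(y, dtype=float)
    feats = []
    for a, b in segs:
        slope = float((y[b - 1] - y[a]) / (b - 1 - a)) if b - a > 1 else 0.0
        feats.append((slope, b - a))
    return feats
```

The resulting (slope, duration) pairs would then be assembled into a feature vector for a classifier such as a support vector machine; how the abstract's authors did that assembly is not stated here.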
Multivariate multiscale entropy of financial markets
NASA Astrophysics Data System (ADS)
Lu, Yunfan; Wang, Jun
2017-11-01
In the current effort to quantify the dynamical properties of complex phenomena in financial market systems, multivariate financial time series have attracted wide attention. In this work, considering the shortcomings and limitations of univariate multiscale entropy in analyzing multivariate time series, the multivariate multiscale sample entropy (MMSE), which can evaluate the complexity in multiple data channels over different timescales, is applied to quantify the complexity of financial markets. Its effectiveness and advantages are verified by numerical simulations with two well-known synthetic noise signals. For the first time, the complexity of four generated trivariate return series for each stock trading hour in China stock markets is quantified thanks to the interdisciplinary application of this method. We find that the complexity of the trivariate return series in each hour shows a significant decreasing trend as the stock trading time progresses. Further, the shuffled multivariate return series and the absolute multivariate return series are also analyzed. As another new attempt, the complexity of global stock markets (Asia, Europe and America) is quantified by analyzing their multivariate returns. Finally, we utilize the multivariate multiscale entropy to assess the relative complexity of normalized multivariate return volatility series with different degrees.
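The "different timescales" in multiscale entropy analyses are usually produced by coarse-graining: averaging non-overlapping windows of the series before computing entropy at each scale. The sketch below shows only that coarse-graining step for a multichannel array; the function name and the (time, channels) layout are assumptions, and the multivariate sample entropy itself is not implemented here.

```python
import numpy as np

def coarse_grain(x, scale):
    """Average non-overlapping windows of length `scale` along the time axis.

    x has shape (T,) or (T, channels); trailing samples that do not fill
    a complete window are dropped.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0] // scale
    return x[:n * scale].reshape(n, scale, *x.shape[1:]).mean(axis=1)
```

An entropy estimate would then be computed on `coarse_grain(x, s)` for each scale `s` and plotted against `s`.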
Multivariate Time Series Decomposition into Oscillation Components.
Matsuda, Takeru; Komaki, Fumiyasu
2017-08-01
Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop Gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north-south sunspot number data, the proposed method reveals an interesting phase relationship.
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
ERIC Educational Resources Information Center
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Sornborger, Andrew T; Lauderdale, James D
2016-11-01
Neural data analysis has increasingly incorporated causal information to study circuit connectivity. Dimensional reduction forms the basis of most analyses of large multivariate time series. Here, we present a new, multitaper-based decomposition for stochastic, multivariate time series that acts on the covariance of the time series at all lags, C(τ), as opposed to standard methods that decompose the time series, X(t), using only information at zero-lag. In both simulated and neural imaging examples, we demonstrate that methods that neglect the full causal structure may be discarding important dynamical information in a time series.
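To make the distinction between zero-lag and all-lag information concrete, the sketch below estimates the lagged covariance C(τ) of a multivariate series X(t) directly from data. It illustrates only the quantity the abstract refers to, not the multitaper-based decomposition; the function name, the (T, p) array layout, and the normalization by the number of overlapping samples are assumptions.

```python
import numpy as np

def lagged_covariance(X, max_lag):
    """Estimate C(tau) for tau = 0..max_lag.

    X has shape (T, p). The result has shape (max_lag + 1, p, p), with
    C[tau][i, j] approximating E[x_i(t + tau) * x_j(t)] after removing
    each channel's mean.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    T = Xc.shape[0]
    return np.stack([Xc[tau:].T @ Xc[:T - tau] / (T - tau)
                     for tau in range(max_lag + 1)])
```

A method that uses only `C[0]` cannot tell which channel leads the other, whereas the asymmetry of `C[1]` (and higher lags) carries that directional information.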
Multivariate time series clustering on geophysical data recorded at Mt. Etna from 1996 to 2003
NASA Astrophysics Data System (ADS)
Di Salvo, Roberto; Montalto, Placido; Nunnari, Giuseppe; Neri, Marco; Puglisi, Giuseppe
2013-02-01
Time series clustering is an important task in data analysis issues in order to extract implicit, previously unknown, and potentially useful information from a large collection of data. Finding useful similar trends in multivariate time series represents a challenge in several areas including geophysics environment research. While traditional time series analysis methods deal only with univariate time series, multivariate time series analysis is a more suitable approach in the field of research where different kinds of data are available. Moreover, the conventional time series clustering techniques do not provide desired results for geophysical datasets due to the huge amount of data whose sampling rate is different according to the nature of signal. In this paper, a novel approach concerning geophysical multivariate time series clustering is proposed using dynamic time series segmentation and Self Organizing Maps techniques. This method allows finding coupling among trends of different geophysical data recorded from monitoring networks at Mt. Etna spanning from 1996 to 2003, when the transition from summit eruptions to flank eruptions occurred. This information can be used to carry out a more careful evaluation of the state of volcano and to define potential hazard assessment at Mt. Etna.
Time Series Model Identification by Estimating Information.
1982-11-01
principle, Applications of Statistics, P. R. Krishnaiah, ed., North-Holland: Amsterdam, 27-41. Anderson, T. W. (1971). The Statistical Analysis of Time Series...E. (1969). Multiple Time Series Modeling, Multivariate Analysis II, edited by P. Krishnaiah, Academic Press: New York, 389-409. Parzen, E. (1981...Newton, H. J. (1980). Multiple Time Series Modeling, II Multivariate Analysis - V, edited by P. Krishnaiah, North Holland: Amsterdam, 181-197. Shibata, R
Rotation in the Dynamic Factor Modeling of Multivariate Stationary Time Series.
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
2001-01-01
Proposes a special rotation procedure for the exploratory dynamic factor model for stationary multivariate time series. The rotation procedure applies separately to each univariate component series of a q-variate latent factor series and transforms such a component, initially represented as white noise, into a univariate moving-average.…
A Sandwich-Type Standard Error Estimator of SEM Models with Multivariate Time Series
ERIC Educational Resources Information Center
Zhang, Guangjian; Chow, Sy-Miin; Ong, Anthony D.
2011-01-01
Structural equation models are increasingly used as a modeling tool for multivariate time series data in the social and behavioral sciences. Standard error estimators of SEM models, originally developed for independent data, require modifications to accommodate the fact that time series data are inherently dependent. In this article, we extend a…
Multivariate time series analysis of neuroscience data: some challenges and opportunities.
Pourahmadi, Mohsen; Noorbaloochi, Siamak
2016-04-01
Neuroimaging data may be viewed as high-dimensional multivariate time series, and analyzed using techniques from regression analysis, time series analysis and spatiotemporal analysis. We discuss issues related to data quality, model specification, estimation, interpretation, dimensionality and causality. Some recent research areas addressing aspects of some recurring challenges are introduced. Copyright © 2015 Elsevier Ltd. All rights reserved.
Multivariate stochastic analysis for Monthly hydrological time series at Cuyahoga River Basin
NASA Astrophysics Data System (ADS)
Zhang, L.
2011-12-01
Copulas have become a powerful statistical and stochastic methodology for multivariate analysis in environmental and water resources engineering. In recent years, the popular one-parameter Archimedean copulas (e.g. the Gumbel-Hougaard, Cook-Johnson and Frank copulas) and the meta-elliptical copulas (e.g. the Gaussian and Student-t copulas) have been applied in multivariate hydrological analyses, e.g. multivariate rainfall (rainfall intensity, duration and depth), flood (peak discharge, duration and volume), and drought analyses (drought length, mean and minimum SPI values, and drought mean areal extent). Copulas have also been applied in flood frequency analysis at the confluences of river systems by taking into account the dependence among upstream gauge stations rather than by using the hydrological routing technique. In most of the studies above, the annual time series have been treated as stationary signals and assumed to be independent identically distributed (i.i.d.) random variables. In reality, hydrological time series, especially daily and monthly series, cannot be considered i.i.d. random variables because of the periodicity present in the data structure. The stationarity assumption is also in question because of climate change and Land Use and Land Cover (LULC) change in recent years. To this end, it is necessary to re-evaluate the classic approach to the study of hydrological time series by relaxing the stationarity assumption through a nonstationary approach. As to the dependence structure of hydrological time series, the assumption that all variables follow the same type of univariate distribution also needs to be relaxed by adopting copula theory. In this paper, the univariate monthly hydrological time series will be studied through the nonstationary time series analysis approach. The dependence structure of the multivariate monthly hydrological time series will be studied through copula theory. For parameter estimation, maximum likelihood estimation (MLE) will be applied. To illustrate the method, the univariate time series model and the dependence structure will be determined and tested using the monthly discharge time series of the Cuyahoga River Basin.
Network structure of multivariate time series.
Lacasa, Lucas; Nicosia, Vincenzo; Latora, Vito
2015-10-21
Our understanding of a variety of phenomena in physics, biology and economics crucially depends on the analysis of multivariate time series. While a wide range of tools and techniques for time series analysis already exist, the increasing availability of massive data structures calls for new approaches for multidimensional signal processing. We present here a non-parametric method to analyse multivariate time series, based on the mapping of a multidimensional time series into a multilayer network, which allows one to extract information on a high dimensional dynamical system through the analysis of the structure of the associated multiplex network. The method is simple to implement, general, scalable, does not require ad hoc phase space partitioning, and is thus suitable for the analysis of large, heterogeneous and non-stationary time series. We show that simple structural descriptors of the associated multiplex networks allow one to extract and quantify nontrivial properties of coupled chaotic maps, including the transition between different dynamical phases and the onset of various types of synchronization. As a concrete example we then study financial time series, showing that a multiplex network analysis can efficiently discriminate crises from periods of financial stability, where standard methods based on time-series symbolization often fail.
Visualizing frequent patterns in large multivariate time series
NASA Astrophysics Data System (ADS)
Hao, M.; Marwah, M.; Janetzko, H.; Sharma, R.; Keim, D. A.; Dayal, U.; Patnaik, D.; Ramakrishnan, N.
2011-01-01
The detection of previously unknown, frequently occurring patterns in time series, often called motifs, has been recognized as an important task. However, it is difficult to discover and visualize these motifs as their numbers increase, especially in large multivariate time series. To find frequent motifs, we use several temporal data mining and event encoding techniques to cluster and convert a multivariate time series to a sequence of events. Then we quantify the efficiency of the discovered motifs by linking them with a performance metric. To visualize frequent patterns in a large time series with potentially hundreds of nested motifs on a single display, we introduce three novel visual analytics methods: (1) motif layout, using colored rectangles for visualizing the occurrences and hierarchical relationships of motifs in a multivariate time series, (2) motif distortion, for enlarging or shrinking motifs as appropriate for easy analysis and (3) motif merging, to combine a number of identical adjacent motif instances without cluttering the display. Analysts can interactively optimize the degree of distortion and merging to get the best possible view. A specific motif (e.g., the most efficient or least efficient motif) can be quickly detected from a large time series for further investigation. We have applied these methods to two real-world data sets: data center cooling and oil well production. The results provide important new insights into the recurring patterns.
A multivariate time series approach to modeling and forecasting demand in the emergency department.
Jones, Spencer S; Evans, R Scott; Allen, Todd L; Thomas, Alun; Haug, Peter J; Welch, Shari J; Snow, Gregory L
2009-02-01
The goals of this investigation were to study the temporal relationships between the demands for key resources in the emergency department (ED) and the inpatient hospital, and to develop multivariate forecasting models. Hourly data were collected from three diverse hospitals for the year 2006. Descriptive analysis and model fitting were carried out using graphical and multivariate time series methods. Multivariate models were compared to a univariate benchmark model in terms of their ability to provide out-of-sample forecasts of ED census and the demands for diagnostic resources. Descriptive analyses revealed little temporal interaction between the demand for inpatient resources and the demand for ED resources at the facilities considered. Multivariate models provided more accurate forecasts of ED census and of the demands for diagnostic resources. Our results suggest that multivariate time series models can be used to reliably forecast ED patient census; however, forecasts of the demands for diagnostic resources were not sufficiently reliable to be useful in the clinical setting.
Multivariable nonlinear analysis of foreign exchange rates
NASA Astrophysics Data System (ADS)
Suzuki, Tomoya; Ikeguchi, Tohru; Suzuki, Masuo
2003-05-01
We analyze multivariable time series of foreign exchange rates: the price movements that have often been analyzed, together with dealing time intervals and spreads between bid and ask prices. Treating dealing time intervals as event timings, like neuronal firings, we use raster plots (RPs) and peri-stimulus time histograms (PSTHs), which are popular methods in neurophysiology. By introducing special processing to obtain RPs and PSTHs for exchange rate time series, we discover that there is dynamical interaction among the three variables. We also find that using multiple variables improves prediction accuracy.
Reconstructing multi-mode networks from multivariate time series
NASA Astrophysics Data System (ADS)
Gao, Zhong-Ke; Yang, Yu-Xuan; Dang, Wei-Dong; Cai, Qing; Wang, Zhen; Marwan, Norbert; Boccaletti, Stefano; Kurths, Jürgen
2017-09-01
Unveiling the dynamics hidden in multivariate time series is a task of the utmost importance in a broad variety of areas in physics. We here propose a method that leads to the construction of a novel functional network, a multi-mode weighted graph combined with an empirical mode decomposition, and to the realization of multi-information fusion of multivariate time series. The method is illustrated in a couple of successful applications (a multi-phase flow and an epileptic electro-encephalogram), which demonstrate its power in revealing the dynamical behaviors underlying the transitions between different flow patterns and in differentiating brain states of seizure and non-seizure.
Multiple Indicator Stationary Time Series Models.
ERIC Educational Resources Information Center
Sivo, Stephen A.
2001-01-01
Discusses the propriety and practical advantages of specifying multivariate time series models in the context of structural equation modeling for time series and longitudinal panel data. For time series data, the multiple indicator model specification improves on classical time series analysis. For panel data, the multiple indicator model…
A Method for Comparing Multivariate Time Series with Different Dimensions
Tapinos, Avraam; Mendes, Pedro
2013-01-01
In many situations it is desirable to compare dynamical systems based on their behavior. Similarity of behavior often implies similarity of internal mechanisms or dependency on common extrinsic factors. While there are widely used methods for comparing univariate time series, most dynamical systems are characterized by multivariate time series. Yet, comparison of multivariate time series has been limited to cases where they share a common dimensionality. A semi-metric is a distance function that has the properties of non-negativity, symmetry and reflexivity, but not sub-additivity. Here we develop a semi-metric – SMETS – that can be used for comparing groups of time series that may have different dimensions. To demonstrate its utility, the method is applied to dynamic models of biochemical networks and to portfolios of shares. The former is an example of a case where the dependencies between system variables are known, while in the latter the system is treated (and behaves) as a black box. PMID:23393554
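The semi-metric properties named in this abstract can be written out explicitly. The formulation below is a standard reading of those terms rather than the paper's own notation: the symbol d and the reading of reflexivity as d(x, x) = 0 are assumptions.

```latex
% Properties required of a semi-metric d on a set of objects x, y, z
\begin{align*}
  d(x, y) &\ge 0            && \text{(non-negativity)} \\
  d(x, y) &= d(y, x)        && \text{(symmetry)} \\
  d(x, x) &= 0              && \text{(reflexivity)} \\
\intertext{Not required (sub-additivity, i.e. the triangle inequality):}
  d(x, z) &\le d(x, y) + d(y, z)
\end{align*}
```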
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
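The "3.7 times" figure in this abstract is consistent with the two reported accuracies, reading misclassification rate as one minus accuracy (an assumption about how the authors computed it):

```latex
\begin{align*}
  \text{error}_{\text{reference}} &= 1 - 0.78 = 0.22 \\
  \text{error}_{\text{best}}      &= 1 - 0.94 = 0.06 \\
  \frac{\text{error}_{\text{reference}}}{\text{error}_{\text{best}}}
      &= \frac{0.22}{0.06} \approx 3.7
\end{align*}
```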
Constructing networks from a dynamical system perspective for multivariate nonlinear time series.
Nakamura, Tomomichi; Tanizawa, Toshihiro; Small, Michael
2016-03-01
We describe a method for constructing networks for multivariate nonlinear time series. We approach the interaction between the various scalar time series from a deterministic dynamical system perspective and provide a generic and algorithmic test for whether the interaction between two measured time series is statistically significant. The method can be applied even when the data exhibit no obvious qualitative similarity: a situation in which the naive method utilizing the cross correlation function directly cannot correctly identify connectivity. To establish the connectivity between nodes we apply the previously proposed small-shuffle surrogate (SSS) method, which can investigate whether there are correlation structures in short-term variabilities (irregular fluctuations) between two data sets from the viewpoint of deterministic dynamical systems. The procedure to construct networks based on this idea is composed of three steps: (i) each time series is considered as a basic node of a network, (ii) the SSS method is applied to verify the connectivity between each pair of time series taken from the whole multivariate time series, and (iii) the pair of nodes is connected with an undirected edge when the null hypothesis cannot be rejected. The network constructed by the proposed method indicates the intrinsic (essential) connectivity of the elements included in the system or the underlying (assumed) system. The method is demonstrated for numerical data sets generated by known systems and applied to several experimental time series.
A time-series approach to dynamical systems from classical and quantum worlds
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fossion, Ruben
2014-01-08
This contribution discusses some recent applications of time-series analysis in Random Matrix Theory (RMT), and applications of RMT in the statistical analysis of eigenspectra of correlation matrices of multivariate time series.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2015-09-14
This package contains statistical routines for extracting features from multivariate time-series data which can then be used for subsequent multivariate statistical analysis to identify patterns and anomalous behavior. It calculates local linear or quadratic regression model fits to moving windows for each series and then summarizes the model coefficients across user-defined time intervals for each series. These methods are domain agnostic, but they have been successfully applied to a variety of domains, including commercial aviation and electric power grid data.
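The two-stage procedure described here (fit a local linear or quadratic model in each moving window, then summarize the coefficients over longer intervals) might look roughly like the sketch below for a single series. This is not the package's API: the function names, the window stepping of one sample, and the choice of mean and standard deviation as summaries are all assumptions.

```python
import numpy as np

def window_fit_coeffs(y, window, degree=1):
    """Fit a polynomial of the given degree to every moving window of y.

    Returns an array of shape (len(y) - window + 1, degree + 1); each row
    holds the coefficients, highest power first, with time measured from
    the start of that window.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(window, dtype=float)
    return np.array([np.polyfit(t, y[i:i + window], degree)
                     for i in range(len(y) - window + 1)])

def summarize(coeffs, interval):
    """Mean and standard deviation of the coefficients over consecutive
    blocks of `interval` windows (incomplete trailing blocks are dropped)."""
    n = coeffs.shape[0] // interval
    blocks = coeffs[:n * interval].reshape(n, interval, -1)
    return blocks.mean(axis=1), blocks.std(axis=1)
```

For a multivariate data set, the same two functions would be applied to each series in turn and the summaries concatenated into one feature vector per interval.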
Normalization methods in time series of platelet function assays
Van Poucke, Sven; Zhang, Zhongheng; Roest, Mark; Vukicevic, Milan; Beran, Maud; Lauwereins, Bart; Zheng, Ming-Hua; Henskens, Yvonne; Lancé, Marcus; Marcus, Abraham
2016-01-01
Platelet function can be quantitatively assessed by specific assays such as light-transmission aggregometry, multiple-electrode aggregometry measuring the response to adenosine diphosphate (ADP), arachidonic acid, collagen, and thrombin-receptor activating peptide, and viscoelastic tests such as rotational thromboelastometry (ROTEM). The task of extracting meaningful statistical and clinical information from high-dimensional data spaces in temporal multivariate clinical data represented in multivariate time series is complex. Building insightful visualizations for multivariate time series demands adequate usage of normalization techniques. In this article, various methods for data normalization (z-transformation, range transformation, proportion transformation, and interquartile range) are presented and visualized, discussing the most suited approach for platelet function data series. Normalization was calculated per assay (test) for all time points and per time point for all tests. Interquartile range, range transformation, and z-transformation demonstrated the correlation as calculated by the Spearman correlation test, when normalized per assay (test) for all time points. When normalizing per time point for all tests, no correlation could be abstracted from the charts as was the case when using all data as 1 dataset for normalization. PMID:27428217
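The four normalization methods named in this abstract, and the difference between normalizing per assay and per time point, can be sketched as below. The exact formulas used in the article are not given here, so these are common textbook definitions and should be read as assumptions, particularly the interquartile-range variant (median-centered, scaled by Q3 minus Q1). The data layout (rows are assays, columns are time points) is also assumed.

```python
import numpy as np

def z_transform(x, axis):
    """Subtract the mean and divide by the standard deviation along axis."""
    return (x - x.mean(axis=axis, keepdims=True)) / x.std(axis=axis, keepdims=True)

def range_transform(x, axis):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    return (x - lo) / (hi - lo)

def proportion_transform(x, axis):
    """Express each value as a fraction of the total along axis."""
    return x / x.sum(axis=axis, keepdims=True)

def iqr_transform(x, axis):
    """Center on the median and scale by the interquartile range (assumed form)."""
    q1, med, q3 = np.percentile(x, [25, 50, 75], axis=axis, keepdims=True)
    return (x - med) / (q3 - q1)

# Rows are assays, columns are time points (hypothetical values).
data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [10.0, 20.0, 30.0, 40.0]])
per_assay = range_transform(data, axis=1)       # each assay over all time points
per_time_point = range_transform(data, axis=0)  # each time point over all assays
```

With `axis=1`, assays measured on very different scales become directly comparable over time; with `axis=0`, each time point is rescaled across assays, which removes the temporal trend within an assay.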
Piecewise multivariate modelling of sequential metabolic profiling data.
Rantalainen, Mattias; Cloarec, Olivier; Ebbels, Timothy M D; Lundstedt, Torbjörn; Nicholson, Jeremy K; Holmes, Elaine; Trygg, Johan
2008-02-19
Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
USDA-ARS's Scientific Manuscript database
Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...
Mining Recent Temporal Patterns for Event Detection in Multivariate Time Series Data
Batal, Iyad; Fradkin, Dmitriy; Harrison, James; Moerchen, Fabian; Hauskrecht, Milos
2015-01-01
Improving the performance of classifiers using pattern mining techniques has been an active topic of data mining research. In this work we introduce the recent temporal pattern mining framework for finding predictive patterns for monitoring and event detection problems in complex multivariate time series data. This framework first converts time series into time-interval sequences of temporal abstractions. It then constructs more complex temporal patterns backwards in time using temporal operators. We apply our framework to health care data of 13,558 diabetic patients and show its benefits by efficiently finding useful patterns for detecting and diagnosing adverse medical conditions that are associated with diabetes. PMID:25937993
Demanuele, Charmaine; Bähner, Florian; Plichta, Michael M; Kirsch, Peter; Tost, Heike; Meyer-Lindenberg, Andreas; Durstewitz, Daniel
2015-01-01
Multivariate pattern analysis can reveal new information from neuroimaging data to illuminate human cognition and its disturbances. Here, we develop a methodological approach, based on multivariate statistical/machine learning and time series analysis, to discern cognitive processing stages from functional magnetic resonance imaging (fMRI) blood oxygenation level dependent (BOLD) time series. We apply this method to data recorded from a group of healthy adults whilst performing a virtual reality version of the delayed win-shift radial arm maze (RAM) task. This task has been frequently used to study working memory and decision making in rodents. Using linear classifiers and multivariate test statistics in conjunction with time series bootstraps, we show that different cognitive stages of the task, as defined by the experimenter, namely, the encoding/retrieval, choice, reward and delay stages, can be statistically discriminated from the BOLD time series in brain areas relevant for decision making and working memory. Discrimination of these task stages was significantly reduced during poor behavioral performance in dorsolateral prefrontal cortex (DLPFC), but not in the primary visual cortex (V1). Experimenter-defined dissection of time series into class labels based on task structure was confirmed by an unsupervised, bottom-up approach based on Hidden Markov Models. Furthermore, we show that different groupings of recorded time points into cognitive event classes can be used to test hypotheses about the specific cognitive role of a given brain region during task execution. We found that whilst the DLPFC strongly differentiated between task stages associated with different memory loads, but not between different visual-spatial aspects, the reverse was true for V1. 
Our methodology illustrates how different aspects of cognitive information processing during one and the same task can be separated and attributed to specific brain regions based on information contained in multivariate patterns of voxel activity.
Causality networks from multivariate time series and application to epilepsy.
Siggiridou, Elsa; Koutlis, Christos; Tsimpiris, Alkiviadis; Kimiskidis, Vasilios K; Kugiumtzis, Dimitris
2015-08-01
Granger causality and variants of this concept allow the study of complex dynamical systems as networks constructed from multivariate time series. In this work, a large number of Granger causality measures used to form causality networks from multivariate time series are assessed. For this, realizations on high dimensional coupled dynamical systems are considered and the performance of the Granger causality measures is evaluated, seeking for the measures that form networks closest to the true network of the dynamical system. In particular, the comparison focuses on Granger causality measures that reduce the state space dimension when many variables are observed. Further, the linear and nonlinear Granger causality measures of dimension reduction are compared to a standard Granger causality measure on electroencephalographic (EEG) recordings containing episodes of epileptiform discharges.
The Fourier decomposition method for nonlinear and non-stationary time series analysis.
Singh, Pushpendra; Joshi, Shiv Dutt; Patney, Rakesh Kumar; Saha, Kaushik
2017-03-01
For many decades, there has been a general perception in the literature that Fourier methods are not suitable for the analysis of nonlinear and non-stationary data. In this paper, we propose a novel and adaptive Fourier decomposition method (FDM), based on the Fourier theory, and demonstrate its efficacy for the analysis of nonlinear and non-stationary time series. The proposed FDM decomposes any data into a small number of 'Fourier intrinsic band functions' (FIBFs). The FDM presents a generalized Fourier expansion with variable amplitudes and variable frequencies of a time series by the Fourier method itself. We propose an idea of zero-phase filter bank-based multivariate FDM (MFDM), for the analysis of multivariate nonlinear and non-stationary time series, using the FDM. We also present an algorithm to obtain cut-off frequencies for MFDM. The proposed MFDM generates a finite number of band-limited multivariate FIBFs (MFIBFs). The MFDM preserves some intrinsic physical properties of the multivariate data, such as scale alignment, trend and instantaneous frequency. The proposed methods provide a time-frequency-energy (TFE) distribution that reveals the intrinsic structure of the data. Numerical computations and simulations have been carried out and comparison is made with the empirical mode decomposition algorithms.
Liu, Siwei; Molenaar, Peter C M
2014-12-01
This article introduces iVAR, an R program for imputing missing data in multivariate time series on the basis of vector autoregressive (VAR) models. We conducted a simulation study to compare iVAR with three methods for handling missing data: listwise deletion, imputation with sample means and variances, and multiple imputation ignoring time dependency. The results showed that iVAR produces better estimates for the cross-lagged coefficients than do the other three methods. We demonstrate the use of iVAR with an empirical example of time series electrodermal activity data and discuss the advantages and limitations of the program.
Advanced spectral methods for climatic time series
Ghil, M.; Allen, M.R.; Dettinger, M.D.; Ide, K.; Kondrashov, D.; Mann, M.E.; Robertson, A.W.; Saunders, A.; Tian, Y.; Varadi, F.; Yiou, P.
2002-01-01
The analysis of univariate or multivariate time series provides crucial information to describe, understand, and predict climatic variability. The discovery and implementation of a number of novel methods for extracting useful information from time series has recently revitalized this classical field of study. Considerable progress has also been made in interpreting the information so obtained in terms of dynamical systems theory. In this review we describe the connections between time series analysis and nonlinear dynamics, discuss signal-to-noise enhancement, and present some of the novel methods for spectral analysis. The various steps, as well as the advantages and disadvantages of these methods, are illustrated by their application to an important climatic time series, the Southern Oscillation Index. This index captures major features of interannual climate variability and is used extensively in its prediction. Regional and global sea surface temperature data sets are used to illustrate multivariate spectral methods. Open questions and further prospects conclude the review.
Multivariate exploration of non-intrusive load monitoring via spatiotemporal pattern network
Liu, Chao; Akintayo, Adedotun; Jiang, Zhanhong; ...
2017-12-18
Non-intrusive load monitoring (NILM) of electrical demand for the purpose of identifying load components has thus far mostly been studied using univariate data, e.g., using only whole building electricity consumption time series to identify a certain type of end-use such as lighting load. However, using additional variables in the form of multivariate time series data may provide more information in terms of extracting distinguishable features in the context of energy disaggregation. In this work, a novel probabilistic graphical modeling approach, namely the spatiotemporal pattern network (STPN) is proposed for energy disaggregation using multivariate time-series data. The STPN framework is shown to be capable of handling diverse types of multivariate time-series to improve the energy disaggregation performance. The technique outperforms the state of the art factorial hidden Markov models (FHMM) and combinatorial optimization (CO) techniques in multiple real-life test cases. Furthermore, based on two homes' aggregate electric consumption data, a similarity metric is defined for the energy disaggregation of one home using a trained model based on the other home (i.e., out-of-sample case). The proposed similarity metric allows us to enhance scalability via learning supervised models for a few homes and deploying such models to many other similar but unmodeled homes with significantly high disaggregation accuracy.
The Fourier decomposition method for nonlinear and non-stationary time series analysis
Joshi, Shiv Dutt; Patney, Rakesh Kumar; Saha, Kaushik
2017-01-01
For many decades, there has been a general perception in the literature that Fourier methods are not suitable for the analysis of nonlinear and non-stationary data. In this paper, we propose a novel and adaptive Fourier decomposition method (FDM), based on the Fourier theory, and demonstrate its efficacy for the analysis of nonlinear and non-stationary time series. The proposed FDM decomposes any data into a small number of ‘Fourier intrinsic band functions’ (FIBFs). The FDM presents a generalized Fourier expansion with variable amplitudes and variable frequencies of a time series by the Fourier method itself. We propose an idea of zero-phase filter bank-based multivariate FDM (MFDM), for the analysis of multivariate nonlinear and non-stationary time series, using the FDM. We also present an algorithm to obtain cut-off frequencies for MFDM. The proposed MFDM generates a finite number of band-limited multivariate FIBFs (MFIBFs). The MFDM preserves some intrinsic physical properties of the multivariate data, such as scale alignment, trend and instantaneous frequency. The proposed methods provide a time–frequency–energy (TFE) distribution that reveals the intrinsic structure of the data. Numerical computations and simulations have been carried out and comparison is made with the empirical mode decomposition algorithms. PMID:28413352
Multivariate exploration of non-intrusive load monitoring via spatiotemporal pattern network
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Chao; Akintayo, Adedotun; Jiang, Zhanhong
Non-intrusive load monitoring (NILM) of electrical demand for the purpose of identifying load components has thus far mostly been studied using univariate data, e.g., using only whole building electricity consumption time series to identify a certain type of end-use such as lighting load. However, using additional variables in the form of multivariate time series data may provide more information in terms of extracting distinguishable features in the context of energy disaggregation. In this work, a novel probabilistic graphical modeling approach, namely the spatiotemporal pattern network (STPN) is proposed for energy disaggregation using multivariate time-series data. The STPN framework is shown to be capable of handling diverse types of multivariate time-series to improve the energy disaggregation performance. The technique outperforms the state of the art factorial hidden Markov models (FHMM) and combinatorial optimization (CO) techniques in multiple real-life test cases. Furthermore, based on two homes' aggregate electric consumption data, a similarity metric is defined for the energy disaggregation of one home using a trained model based on the other home (i.e., out-of-sample case). The proposed similarity metric allows us to enhance scalability via learning supervised models for a few homes and deploying such models to many other similar but unmodeled homes with significantly high disaggregation accuracy.
Kernel canonical-correlation Granger causality for multiple time series
NASA Astrophysics Data System (ADS)
Wu, Guorong; Duan, Xujun; Liao, Wei; Gao, Qing; Chen, Huafu
2011-04-01
Canonical-correlation analysis as a multivariate statistical technique has been applied to multivariate Granger causality analysis to infer information flow in complex systems. It shows unique appeal and great superiority over the traditional vector autoregressive method, due to the simplified procedure that detects causal interaction between multiple time series, and the avoidance of potential model estimation problems. However, it is limited to the linear case. Here, we extend the framework of canonical correlation to include the estimation of multivariate nonlinear Granger causality for drawing inference about directed interaction. Its feasibility and effectiveness are verified on simulated data.
Fontes, Cristiano Hora; Budman, Hector
2017-11-01
A clustering problem involving multivariate time series (MTS) requires the selection of similarity metrics. This paper shows the limitations of the PCA similarity factor (SPCA) as a single metric in nonlinear problems where there are differences in magnitude of the same process variables due to expected changes in operation conditions. A novel method for clustering MTS based on a combination between SPCA and the average-based Euclidean distance (AED) within a fuzzy clustering approach is proposed. Case studies involving either simulated or real industrial data collected from a large scale gas turbine are used to illustrate that the hybrid approach enhances the ability to recognize normal and fault operating patterns. This paper also proposes an oversampling procedure to create synthetic multivariate time series that can be useful in commonly occurring situations involving unbalanced data sets. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Faes, Luca; Nollo, Giandomenico; Porta, Alberto
2012-03-01
The complexity of the short-term cardiovascular control prompts for the introduction of multivariate (MV) nonlinear time series analysis methods to assess directional interactions reflecting the underlying regulatory mechanisms. This study introduces a new approach for the detection of nonlinear Granger causality in MV time series, based on embedding the series by a sequential, non-uniform procedure, and on estimating the information flow from one series to another by means of the corrected conditional entropy. The approach is validated on short realizations of linear stochastic and nonlinear deterministic processes, and then evaluated on heart period, systolic arterial pressure and respiration variability series measured from healthy humans in the resting supine position and in the upright position after head-up tilt. Copyright © 2011 Elsevier Ltd. All rights reserved.
Analyzing developmental processes on an individual level using nonstationary time series modeling.
Molenaar, Peter C M; Sinclair, Katerina O; Rovine, Michael J; Ram, Nilam; Corneal, Sherry E
2009-01-01
Individuals change over time, often in complex ways. Generally, studies of change over time have combined individuals into groups for analysis, which is inappropriate in most, if not all, studies of development. The authors explain how to identify appropriate levels of analysis (individual vs. group) and demonstrate how to estimate changes in developmental processes over time using a multivariate nonstationary time series model. They apply this model to describe the changing relationships between a biological son and father and a stepson and stepfather at the individual level. The authors also explain how to use an extended Kalman filter with iteration and smoothing estimator to capture how dynamics change over time. Finally, they suggest further applications of the multivariate nonstationary time series model and detail the next steps in the development of statistical models used to analyze individual-level data.
Clustering Multivariate Time Series Using Hidden Markov Models
Ghassempour, Shima; Girosi, Federico; Maeder, Anthony
2014-01-01
In this paper we describe an algorithm for clustering multivariate time series with variables taking both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs), where we first map each trajectory into an HMM, then define a suitable distance between HMMs and finally proceed to cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals of age 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab and may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables using tools that do not require advanced statistical knowledge, and therefore are accessible to a wide range of researchers. PMID:24662996
Some Recent Developments on Complex Multivariate Distributions
ERIC Educational Resources Information Center
Krishnaiah, P. R.
1976-01-01
In this paper, the author gives a review of the literature on complex multivariate distributions. Some new results on these distributions are also given. Finally, the author discusses the applications of the complex multivariate distributions in the area of the inference on multiple time series. (Author)
Recurrent Neural Networks for Multivariate Time Series with Missing Values.
Che, Zhengping; Purushotham, Sanjay; Cho, Kyunghyun; Sontag, David; Liu, Yan
2018-04-17
Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
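As a rough illustration of the two missing-pattern representations named in this abstract (masking and time interval), the sketch below derives both for a single variable. The function name `masking_and_intervals`, the use of `None` for a missing value, and the exact recurrence for the interval are assumptions made for illustration; the abstract itself does not spell them out.

```python
from typing import List, Optional, Tuple


def masking_and_intervals(
    values: List[Optional[float]], timestamps: List[float]
) -> Tuple[List[int], List[float]]:
    """Return (mask, delta) for one variable of a time series.

    mask[t]  is 1 if the value at step t was observed, else 0.
    delta[t] is the time elapsed since the variable was last observed,
             measured at step t (0 at the first step). This recurrence is
             an assumed, commonly used definition.
    """
    mask = [0 if v is None else 1 for v in values]
    delta = [0.0] * len(values)
    for t in range(1, len(values)):
        gap = timestamps[t] - timestamps[t - 1]
        # If the previous step was observed, the interval restarts at the gap;
        # otherwise the gap accumulates on top of the previous interval.
        delta[t] = gap if mask[t - 1] == 1 else gap + delta[t - 1]
    return mask, delta


mask, delta = masking_and_intervals([1.0, None, None, 4.0], [0, 1, 3, 4])
```

Both lists have the same length as the input, so they can be supplied to a model alongside the (imputed) values.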
NASA Astrophysics Data System (ADS)
Chorozoglou, D.; Kugiumtzis, D.; Papadimitriou, E.
2018-06-01
The seismic hazard assessment in the area of Greece is attempted by studying the earthquake network structure, such as small-world and random. In this network, a node represents a seismic zone in the study area and a connection between two nodes is given by the correlation of the seismic activity of two zones. To investigate the network structure, and particularly the small-world property, the earthquake correlation network is compared with randomized ones. Simulations on multivariate time series of different length and number of variables show that for the construction of randomized networks the method randomizing the time series performs better than methods randomizing directly the original network connections. Based on the appropriate randomization method, the network approach is applied to time series of earthquakes that occurred between main shocks in the territory of Greece spanning the period 1999-2015. The characterization of networks on sliding time windows revealed that small-world structure emerges in the last time interval, shortly before the main shock.
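A minimal sketch of how a correlation network of the kind described in this abstract could be formed: each named series is a node, and two nodes are connected when the absolute Pearson correlation of their series reaches a threshold. The thresholding rule, the function names and the toy data are assumptions for illustration only; the abstract does not state how correlations are turned into connections.

```python
import math
from typing import Dict, List, Set, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_network(
    series: Dict[str, List[float]], threshold: float
) -> Set[Tuple[str, str]]:
    """Return undirected edges (a, b), a < b, with |correlation| >= threshold."""
    names = sorted(series)
    edges = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(series[a], series[b])) >= threshold:
                edges.add((a, b))
    return edges


# Hypothetical activity series for three zones.
zones = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8.5], "C": [1, -1, 1, -1]}
edges = correlation_network(zones, threshold=0.8)
```

A randomized comparison network in the spirit of the abstract could then be obtained by shuffling each series before calling `correlation_network`, rather than by rewiring the edges directly.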
2009-12-18
cannot be detected with univariate techniques, but require multivariate analysis instead (Kamitani and Tong [2005]). Two other time series analysis ... learning for time series analysis. The historical record of DBNs can be traced back to Dean and Kanazawa [1988] and Dean and Wellman [1991], with ... Keywords: Hidden Process Models, probabilistic time series modeling, functional Magnetic Resonance Imaging
Zhang, Fang; Wagner, Anita K; Soumerai, Stephen B; Ross-Degnan, Dennis
2009-02-01
Interrupted time series (ITS) is a strong quasi-experimental research design, which is increasingly applied to estimate the effects of health services and policy interventions. We describe and illustrate two methods for estimating confidence intervals (CIs) around absolute and relative changes in outcomes calculated from segmented regression parameter estimates. We used multivariate delta and bootstrapping methods (BMs) to construct CIs around relative changes in level and trend, and around absolute changes in outcome based on segmented linear regression analyses of time series data corrected for autocorrelated errors. Using previously published time series data, we estimated CIs around the effect of prescription alerts for interacting medications with warfarin on the rate of prescriptions per 10,000 warfarin users per month. Both the multivariate delta method (MDM) and the BM produced similar results. BM is preferred for calculating CIs of relative changes in outcomes of time series studies, because it does not require large sample sizes when parameter estimates are obtained correctly from the model. Caution is needed when sample size is small.
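To make the quantities in this abstract concrete, the sketch below computes the absolute and relative change in outcome at a given time from segmented regression coefficients. It assumes the common parameterization y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t, where t0 is the intervention time and post_t is 1 from t0 onward; the function name and the example coefficients are hypothetical, and the confidence intervals discussed in the abstract (delta method or bootstrap) are not computed here.

```python
from typing import Tuple


def its_changes(
    b0: float, b1: float, b2: float, b3: float, t0: float, t: float
) -> Tuple[float, float]:
    """Absolute and relative change at time t >= t0 under the assumed model.

    The counterfactual is the pre-intervention line extended to time t;
    the absolute change is the fitted value minus the counterfactual, and
    the relative change is the absolute change divided by the counterfactual.
    """
    if t < t0:
        raise ValueError("t must be at or after the intervention time t0")
    counterfactual = b0 + b1 * t
    absolute = b2 + b3 * (t - t0)
    relative = absolute / counterfactual
    return absolute, relative


# Hypothetical coefficients: level drop of 2 and slope change of -0.5 at t0 = 5.
absolute, relative = its_changes(b0=10, b1=1, b2=-2, b3=-0.5, t0=5, t=9)
```

Because the relative change is a ratio of functions of the estimated coefficients, its interval cannot be read directly from the regression output, which is why methods such as those compared in the abstract are needed.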
Liu, Zitao; Hauskrecht, Milos
2017-11-01
Building an accurate predictive model of clinical time series for a patient is critical for understanding the patient's condition, its dynamics, and optimal patient management. Unfortunately, this process is not straightforward. First, patient-specific variations are typically large and population-based models derived or learned from many different patients are often unable to support accurate predictions for each individual patient. Moreover, time series observed for one patient at any point in time may be too short and insufficient to learn a high-quality patient-specific model just from the patient's own data. To address these problems we propose, develop and experiment with a new adaptive forecasting framework for building multivariate clinical time series models for a patient and for supporting patient-specific predictions. The framework relies on the adaptive model switching approach that at any point in time selects the most promising time series model out of the pool of many possible models, and consequently, combines advantages of the population, patient-specific and short-term individualized predictive models. We demonstrate that the adaptive model switching framework is a very promising approach to support personalized time series prediction, and that it is able to outperform predictions based on pure population and patient-specific models, as well as other patient-specific model adaptation strategies.
Large-scale Granger causality analysis on resting-state functional MRI
NASA Astrophysics Data System (ADS)
D'Souza, Adora M.; Abidin, Anas Zainul; Leistritz, Lutz; Wismüller, Axel
2016-03-01
We demonstrate an approach to measure the information flow between each pair of time series in resting-state functional MRI (fMRI) data of the human brain and subsequently recover its underlying network structure. By integrating dimensionality reduction into predictive time series modeling, large-scale Granger Causality (lsGC) analysis method can reveal directed information flow suggestive of causal influence at an individual voxel level, unlike other multivariate approaches. This method quantifies the influence each voxel time series has on every other voxel time series in a multivariate sense and hence contains information about the underlying dynamics of the whole system, which can be used to reveal functionally connected networks within the brain. To identify such networks, we perform non-metric network clustering, such as accomplished by the Louvain method. We demonstrate the effectiveness of our approach to recover the motor and visual cortex from resting state human brain fMRI data and compare it with the network recovered from a visuomotor stimulation experiment, where the similarity is measured by the Dice Coefficient (DC). The best DC obtained was 0.59 implying a strong agreement between the two networks. In addition, we thoroughly study the effect of dimensionality reduction in lsGC analysis on network recovery. We conclude that our approach is capable of detecting causal influence between time series in a multivariate sense, which can be used to segment functionally connected networks in the resting-state fMRI.
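The Dice Coefficient used in this abstract to compare two recovered networks can be sketched as follows, treating each network as a set of voxel indices. The function name and the convention for two empty sets are assumptions for illustration.

```python
from typing import Iterable


def dice_coefficient(a: Iterable[int], b: Iterable[int]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) of two index sets.

    Returns 1.0 when both sets are empty (a convention assumed here).
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


# Hypothetical voxel indices for two recovered networks.
overlap = dice_coefficient({1, 2, 3}, {2, 3, 4})
```

The value ranges from 0 (no shared voxels) to 1 (identical sets).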
Moran, John L; Solomon, Patricia J
2011-02-01
Time series analysis has seen limited application in the biomedical literature. The utility of conventional and advanced time series estimators was explored for intensive care unit (ICU) outcome series. Monthly mean time series, 1993-2006, for hospital mortality, severity-of-illness score (APACHE III), ventilation fraction and patient type (medical and surgical), were generated from the Australia and New Zealand Intensive Care Society adult patient database. Analyses encompassed geographical seasonal mortality patterns, series structural time changes, mortality series volatility using autoregressive moving average and Generalized Autoregressive Conditional Heteroscedasticity models in which predicted variances are updated adaptively, and bivariate and multivariate (vector error correction models) cointegrating relationships between series. The mortality series exhibited marked seasonality, declining mortality trend and substantial autocorrelation beyond 24 lags. Mortality increased in winter months (July-August); the medical series featured annual cycling, whereas the surgical demonstrated long and short (3-4 months) cycling. Series structural breaks were apparent in January 1995 and December 2002. The covariance stationary first-differenced mortality series was consistent with a seasonal autoregressive moving average process; the observed conditional-variance volatility (1993-1995) and residual Autoregressive Conditional Heteroscedasticity effects entailed a Generalized Autoregressive Conditional Heteroscedasticity model, preferred by information criterion and mean model forecast performance. Bivariate cointegration, indicating long-term equilibrium relationships, was established between mortality and severity-of-illness scores at the database level and for categories of ICUs. Multivariate cointegration was demonstrated for {log APACHE III score, log ICU length of stay, ICU mortality and ventilation fraction}.
A system approach to understanding series time-dependence may be established using conventional and advanced econometric time series estimators. © 2010 Blackwell Publishing Ltd.
van Mierlo, Pieter; Lie, Octavian; Staljanssens, Willeke; Coito, Ana; Vulliémoz, Serge
2018-04-26
We investigated the influence of processing steps in the estimation of multivariate directed functional connectivity during seizures recorded with intracranial EEG (iEEG) on seizure-onset zone (SOZ) localization. We studied the effect of (i) the number of nodes, (ii) time-series normalization, (iii) the choice of multivariate time-varying connectivity measure: Adaptive Directed Transfer Function (ADTF) or Adaptive Partial Directed Coherence (APDC) and (iv) graph theory measure: outdegree or shortest path length. First, simulations were performed to quantify the influence of the various processing steps on the accuracy of SOZ localization. Afterwards, the SOZ was estimated from a 113-electrode iEEG seizure recording and compared with the resection that rendered the patient seizure-free. The simulations revealed that ADTF is preferred over APDC to localize the SOZ from ictal iEEG recordings. Normalizing the time series before analysis resulted in an increase of 25-35% of correctly localized SOZ, while adding more nodes to the connectivity analysis led to a moderate decrease of 10%, when comparing 128 with 32 input nodes. The real-seizure connectivity estimates localized the SOZ inside the resection area using the ADTF coupled to outdegree or shortest path length. Our study showed that normalizing the time series is an important pre-processing step, while adding nodes to the analysis only marginally affected the SOZ localization. The study shows that directed multivariate Granger-based connectivity analysis is feasible with many input nodes (> 100) and that normalization of the time series before connectivity analysis is preferred.
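The abstract does not say which normalization was applied. As one plausible reading, the sketch below z-scores each channel (zero mean, unit variance) before any connectivity analysis; the choice of z-scoring, the function names and the handling of constant channels are all assumptions for illustration.

```python
from statistics import fmean, pstdev
from typing import List


def zscore(channel: List[float]) -> List[float]:
    """Subtract the channel mean and divide by its population std. deviation.

    A constant channel has zero deviation and is mapped to all zeros
    (an assumed convention to avoid division by zero).
    """
    mu = fmean(channel)
    sigma = pstdev(channel)
    if sigma == 0:
        return [0.0 for _ in channel]
    return [(x - mu) / sigma for x in channel]


def normalize_channels(recording: List[List[float]]) -> List[List[float]]:
    """Apply z-scoring independently to every channel of a recording."""
    return [zscore(channel) for channel in recording]


# Hypothetical two-channel recording with very different amplitudes.
normalized = normalize_channels([[1.0, 2.0, 3.0], [100.0, 300.0, 200.0]])
```

Per-channel scaling of this kind removes amplitude differences between electrodes, so that connectivity estimates are not dominated by the channels with the largest signal variance.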
Dynamic Factor Analysis of Nonstationary Multivariate Time Series.
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; And Others
1992-01-01
The dynamic factor model proposed by P. C. Molenaar (1985) is exhibited, and a dynamic nonstationary factor model (DNFM) is constructed with latent factor series that have time-varying mean functions. The use of a DNFM is illustrated using data from a television viewing habits study. (SLD)
FBST for Cointegration Problems
NASA Astrophysics Data System (ADS)
Diniz, M.; Pereira, C. A. B.; Stern, J. M.
2008-11-01
In order to estimate causal relations, time series econometrics has to be aware of spurious correlation, a problem first mentioned by Yule [21]. To solve the problem, one can work with differenced series or use multivariate models like VAR or VEC models. In this case, the analysed series are going to present a long run relation, i.e. a cointegration relation. Even though the Bayesian literature about inference on VAR/VEC models is quite advanced, Bauwens et al. [2] highlight that "the topic of selecting the cointegrating rank has not yet given very useful and convincing results." This paper presents the Full Bayesian Significance Test applied to cointegration rank selection tests in multivariate (VAR/VEC) time series models and shows how to implement it using data sets available in the literature and simulated data sets. A standard non-informative prior is assumed.
Multivariable Time Series Prediction for the Icing Process on Overhead Power Transmission Line
Li, Peng; Zhao, Na; Zhou, Donghua; Cao, Min; Li, Jingjie; Shi, Xinling
2014-01-01
The design of monitoring and predictive alarm systems is necessary for successful overhead power transmission line icing. Given the characteristics of complexity, nonlinearity, and fitfulness in the line icing process, a model based on a multivariable time series is presented here to predict the icing load of a transmission line. In this model, the time effects of micrometeorology parameters for the icing process have been analyzed. The phase-space reconstruction theory and machine learning method were then applied to establish the prediction model, which fully utilized the history of multivariable time series data in local monitoring systems to represent the mapping relationship between icing load and micrometeorology factors. Relevant to the characteristic of fitfulness in line icing, the simulations were carried out during the same icing process or a different process to test the model's prediction precision and robustness. According to the simulation results for the Tao-Luo-Xiong Transmission Line, this model demonstrates good prediction accuracy in different processes if the prediction length is less than two hours, and would be helpful for power grid departments when deciding to take action in advance to address potential icing disasters. PMID:25136653
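Phase-space reconstruction, as mentioned in this abstract, is commonly carried out by time-delay embedding. The sketch below shows the univariate case only; the function name, the embedding dimension `dim` and the delay `tau` are illustrative assumptions, and the abstract does not state how these were chosen or how the variables were combined.

```python
from typing import List, Sequence


def delay_embed(series: Sequence[float], dim: int, tau: int) -> List[List[float]]:
    """Return delay vectors [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]].

    Yields an empty list when the series is too short for one full vector.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    span = (dim - 1) * tau
    return [
        [series[i + k * tau] for k in range(dim)]
        for i in range(len(series) - span)
    ]


# Hypothetical icing-load readings embedded with dim=3, tau=1.
vectors = delay_embed([1, 2, 3, 4, 5], dim=3, tau=1)
```

For a multivariable series, one option is to embed each variable separately and concatenate the delay vectors that share the same time index; the resulting vectors can then serve as inputs to a learned predictor.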
Halliday, David M; Senik, Mohd Harizal; Stevenson, Carl W; Mason, Rob
2016-08-01
The ability to infer network structure from multivariate neuronal signals is central to computational neuroscience. Directed network analyses typically use parametric approaches based on auto-regressive (AR) models, where networks are constructed from estimates of AR model parameters. However, the validity of using low order AR models for neurophysiological signals has been questioned. A recent article introduced a non-parametric approach to estimate directionality in bivariate data; non-parametric approaches are free from concerns over model validity. We extend the non-parametric framework to include measures of directed conditional independence, using scalar measures that decompose the overall partial correlation coefficient summatively by direction, and a set of functions that decompose the partial coherence summatively by direction. A time domain partial correlation function allows both time and frequency views of the data to be constructed. The conditional independence estimates are conditioned on a single predictor. The framework is applied to simulated cortical neuron networks and mixtures of Gaussian time series data with known interactions. It is applied to experimental data consisting of local field potential recordings from bilateral hippocampus in anaesthetised rats. The framework offers a non-parametric approach to estimation of directed interactions in multivariate neuronal recordings, and increased flexibility in dealing with both spike train and time series data. The framework offers a novel alternative non-parametric approach to estimate directed interactions in multivariate neuronal recordings, and is applicable to spike train and time series data. Copyright © 2016 Elsevier B.V. All rights reserved.
A comparison between MS-VECM and MS-VECMX on economic time series data
NASA Astrophysics Data System (ADS)
Phoong, Seuk-Wai; Ismail, Mohd Tahir; Sek, Siok-Kun
2014-07-01
Multivariate Markov switching models are able to provide useful information in the study of structural change data, since the regime switching model can analyze time-varying data and capture the mean and variance in the series of dependence structure. This paper investigates the effects of the oil price and gold price on Malaysia, Singapore, Thailand and Indonesia stock market returns. Two forms of multivariate Markov switching models are used, namely the mean adjusted heteroskedasticity Markov Switching Vector Error Correction Model (MSMH-VECM) and the mean adjusted heteroskedasticity Markov Switching Vector Error Correction Model with exogenous variable (MSMH-VECMX). The reason for using these two models is to capture the transition probabilities of the data, since real financial time series data always exhibit nonlinear properties such as regime switching, cointegrating relations, and jumps or breaks over time. A comparison between these two models indicates that the MSMH-VECM model is able to fit the time series data better than the MSMH-VECMX model. In addition, it was found that the oil price and gold price affected the stock market changes in the four selected countries.
Moving Average Models with Bivariate Exponential and Geometric Distributions.
1985-03-01
ordinary time series and of point processes. Developments in Statistics, Vol. 1, P.R. Krishnaiah, ed. Academic Press, New York. [9] Esary, J.D. and ... valued and discrete-valued time series with ARMA correlation structure. Multivariate Analysis V, P.R. Krishnaiah, ed. North-Holland. 151-166. [28
Most analyses of daily time series epidemiology data relate mortality or morbidity counts to PM and other air pollutants by means of single-outcome regression models using multiple predictors, without taking into account the complex statistical structure of the predictor variable...
Time Series Model Identification by Estimating Information, Memory, and Quantiles.
1983-07-01
Standards, Sect. D, 68D, 937-951. Parzen, Emanuel (1969) "Multiple time series modeling" Multivariate Analysis - II, edited by P. Krishnaiah, Academic... Krishnaiah, North Holland: Amsterdam, 283-295. Parzen, Emanuel (1979) "Forecasting and Whitening Filter Estimation" TIMS Studies in the Management...principle. Applications of Statistics, P. R. Krishnaiah, ed. North Holland: Amsterdam, 27-41. Box, G. E. P. and Jenkins, G. M. (1970) Time Series Analysis
Vial, Flavie; Wei, Wei; Held, Leonhard
2016-12-20
In an era of ubiquitous electronic collection of animal health data, multivariate surveillance systems (which concurrently monitor several data streams) should have a greater probability of detecting disease events than univariate systems. However, despite their limitations, univariate aberration detection algorithms are used in most active syndromic surveillance (SyS) systems because of their ease of application and interpretation. On the other hand, a stochastic modelling-based approach to multivariate surveillance offers more flexibility, allowing for the retention of historical outbreaks, for overdispersion and for non-stationarity. While such methods are not new, they are yet to be applied to animal health surveillance data. We applied an example of such a stochastic model, Held and colleagues' two-component model, to two multivariate animal health datasets from Switzerland. In our first application, multivariate time series of the number of laboratory test requests were derived from Swiss animal diagnostic laboratories. We compared the performance of the two-component model to parallel monitoring using an improved Farrington algorithm and found that both methods yield a satisfactorily low false alarm rate. However, the calibration test of the two-component model on the one-step-ahead predictions proved satisfactory, making such an approach suitable for outbreak prediction. In our second application, the two-component model was applied to the multivariate time series of the number of cattle abortions and the number of test requests for bovine viral diarrhea (a disease that often results in abortions). We found a two-day lagged effect from the number of abortions to the number of test requests. We further compared the joint modelling and univariate modelling of the laboratory test request time series. The joint modelling approach showed evidence of superiority in terms of forecasting abilities.
Stochastic modelling approaches offer the potential to address more realistic surveillance scenarios through, for example, the inclusion of time series-specific parameters, or of covariates known to have an impact on syndrome counts. Nevertheless, many methodological challenges to multivariate surveillance of animal SyS data still remain. Deciding on the amount of corroboration among data streams that is required to escalate into an alert is not a trivial task given the sparse data on the events under consideration (e.g. disease outbreaks).
Detecting a currency’s dominance using multivariate time series analysis
NASA Astrophysics Data System (ADS)
Syahidah Yusoff, Nur; Sharif, Shamshuritawati
2017-09-01
A currency exchange rate is the price of one country’s currency in terms of another country’s currency. Four different prices can be obtained from daily trading activities: opening, closing, highest, and lowest. In the past, many studies have been carried out using the closing price only. However, these four prices are interrelated. Thus, a multivariate time series can provide more information than a univariate time series. The aim of this paper is therefore to compare the results of two different approaches, the mean vector and Escoufier’s RV coefficient, in constructing similarity matrices of 20 world currencies. Both matrices are then used in place of the correlation matrix required by the network topology. With the help of the degree centrality measure, we can detect the currency’s dominance in both networks. The pros and cons of both approaches are presented at the end of this paper.
Grootswagers, Tijl; Wardle, Susan G; Carlson, Thomas A
2017-04-01
Multivariate pattern analysis (MVPA) or brain decoding methods have become standard practice in analyzing fMRI data. Although decoding methods have been extensively applied in brain-computer interfaces, these methods have only recently been applied to time series neuroimaging data such as MEG and EEG to address experimental questions in cognitive neuroscience. In a tutorial style review, we describe a broad set of options to inform future time series decoding studies from a cognitive neuroscience perspective. Using example MEG data, we illustrate the effects that different options in the decoding analysis pipeline can have on experimental results where the aim is to "decode" different perceptual stimuli or cognitive states over time from dynamic brain activation patterns. We show that decisions made at both preprocessing (e.g., dimensionality reduction, subsampling, trial averaging) and decoding (e.g., classifier selection, cross-validation design) stages of the analysis can significantly affect the results. In addition to standard decoding, we describe extensions to MVPA for time-varying neuroimaging data including representational similarity analysis, temporal generalization, and the interpretation of classifier weight maps. Finally, we outline important caveats in the design and interpretation of time series decoding experiments.
Learning investment indicators through data extension
NASA Astrophysics Data System (ADS)
Dvořák, Marek
2017-07-01
Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. This method makes use of sliding windows to calculate several dozen new variables using simple statistical tools such as first and second moments as well as more complicated statistics, such as auto-regression coefficients and residual analysis, followed by an optional quadratic transformation that was further used for data extension. These were used as explanatory variables in a regularized logistic LASSO regression which tried to estimate the Buy-Sell Index (BSI) from real stock market data.
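As a rough illustration of the sliding-window extension described in this abstract, the sketch below turns a univariate price series into a small multivariate feature table. The function names, the window width, the sample prices, and the choice of three features (mean, variance, and a simple lag-1 autoregression estimate) are assumptions made for illustration only; they are not taken from the paper.

```python
import math
from statistics import mean, pvariance


def log_differences(prices):
    """Logarithmic differences log(p_t) - log(p_{t-1}) of a price series."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def lag1_autoregression(window):
    """Simple lag-1 autoregression estimate (0.0 if the window is constant)."""
    m = mean(window)
    num = sum((x - m) * (y - m) for x, y in zip(window, window[1:]))
    den = sum((x - m) ** 2 for x in window[:-1])
    return num / den if den > 0 else 0.0


def window_features(series, width):
    """One feature row (mean, variance, lag-1 estimate) per sliding window."""
    rows = []
    for start in range(len(series) - width + 1):
        w = series[start:start + width]
        rows.append((mean(w), pvariance(w), lag1_autoregression(w)))
    return rows


# Hypothetical prices, used only to exercise the functions.
prices = [100.0, 101.0, 99.5, 102.0, 103.5, 102.5, 104.0, 106.0]
returns = log_differences(prices)
features = window_features(returns, width=4)
```

Each row of `features` could then serve as one observation of explanatory variables for a downstream classifier; the regularized logistic regression step itself is not sketched here.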
Wang, Li; Wang, Xiaoyi; Jin, Xuebo; Xu, Jiping; Zhang, Huiyan; Yu, Jiabin; Sun, Qian; Gao, Chong; Wang, Lingbin
2017-03-01
The formation process of algae is described inaccurately and water blooms are predicted with a low precision by current methods. In this paper, chemical mechanism of algae growth is analyzed, and a correlation analysis of chlorophyll-a and algal density is conducted by chemical measurement. Taking into account the influence of multi-factors on algae growth and water blooms, the comprehensive prediction method combined with multivariate time series and intelligent model is put forward in this paper. Firstly, through the process of photosynthesis, the main factors that affect the reproduction of the algae are analyzed. A compensation prediction method of multivariate time series analysis based on neural network and Support Vector Machine has been put forward which is combined with Kernel Principal Component Analysis to deal with dimension reduction of the influence factors of blooms. Then, Genetic Algorithm is applied to improve the generalization ability of the BP network and Least Squares Support Vector Machine. Experimental results show that this method could better compensate the prediction model of multivariate time series analysis which is an effective way to improve the description accuracy of algae growth and prediction precision of water blooms.
NASA Astrophysics Data System (ADS)
Yan, Ying; Zhang, Shen; Tang, Jinjun; Wang, Xiaofei
2017-07-01
Discovering dynamic characteristics in traffic flow is a significant step in designing effective traffic management and control strategies for relieving congestion in urban cities. A new method based on complex network theory is proposed to study multivariate traffic flow time series. The data were collected from loop detectors on a freeway over one year. In order to construct a complex network from the original traffic flow, a weighted Frobenius norm is adopted to estimate the similarity between multivariate time series, and Principal Component Analysis is implemented to determine the weights. We discuss how to select the optimal critical threshold for networks at different hours in terms of the cumulative probability distribution of degree. Furthermore, two statistical properties of the networks, normalized network structure entropy and cumulative probability of degree, are utilized to explore hourly variation in traffic flow. The results demonstrate that these two statistical quantities express a pattern similar to that of the traffic flow parameters, with morning and evening peak hours. Accordingly, we detect three traffic states (trough, peak and transitional hours) according to the correlation between the two aforementioned properties. The resulting classification of states can represent hourly fluctuation in traffic flow, as shown by analyzing annual average hourly values of traffic volume, occupancy and speed in the corresponding hours.
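The similarity step mentioned in this abstract can be pictured with a small sketch. It assumes one plausible form of a weighted Frobenius norm, in which each variable (column) contributes its squared differences scaled by a weight; the exact definition and the PCA-derived weights used in the paper are not given in the abstract, so the weights and the sample values below are placeholders.

```python
import math


def weighted_frobenius_distance(a, b, weights):
    """sqrt(sum_j w_j * sum_t (a[t][j] - b[t][j])**2) for equally shaped series.

    `a` and `b` are lists of rows (time steps); each row holds one value
    per variable. `weights` holds one non-negative weight per variable.
    """
    if len(a) != len(b):
        raise ValueError("series must have the same number of time steps")
    total = 0.0
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(weights) or len(row_b) != len(weights):
            raise ValueError("each row needs one value per weight")
        for w, x, y in zip(weights, row_a, row_b):
            total += w * (x - y) ** 2
    return math.sqrt(total)


# Two hypothetical two-variable series with two time steps each.
series_a = [[1.0, 2.0], [3.0, 4.0]]
series_b = [[1.0, 0.0], [0.0, 4.0]]
distance = weighted_frobenius_distance(series_a, series_b, [0.5, 0.5])
```

A network could then be built by linking pairs of series whose distance falls below a chosen threshold, which is the role the critical threshold plays in the abstract.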
Data imputation analysis for Cosmic Rays time series
NASA Astrophysics Data System (ADS)
Fernandes, R. C.; Lucio, P. S.; Fernandez, J. H.
2017-05-01
Missing data in Galactic Cosmic Rays (GCR) time series are inevitable, since data are lost through mechanical and human failure, technical problems, and differing periods of operation of GCR stations. The aim of this study was to perform multiple dataset imputation in order to depict the observational dataset. The study used the monthly GCR time series of Climax (CLMX) and Roma (ROME) from 1960 to 2004 to simulate scenarios of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% and 90% of missing data compared to the observed ROME series, with 50 replicates. The CLMX station was then used as a proxy for allocation of these scenarios. Three different methods for monthly dataset imputation were selected: AMÉLIA II, which runs the bootstrap Expectation Maximization algorithm; MICE, which runs an algorithm via Multivariate Imputation by Chained Equations; and MTSDI, an Expectation Maximization algorithm-based method for imputation of missing values in multivariate normal time series. The synthetic time series were compared with the observed ROME series and evaluated using several skill measures such as RMSE, NRMSE, Agreement Index, R, R2, F-test and t-test. The results showed that for CLMX and ROME, the R2 and R statistics were equal to 0.98 and 0.96, respectively. It was observed that increases in the number of gaps degrade the quality of the time series. Data imputation was more efficient with the MTSDI method, with negligible errors and the best skill coefficients. The results suggest a limit of about 60% of missing data for imputation of monthly averages. It is noteworthy that the CLMX, ROME and KIEL stations present no missing data in the target period. This methodology allowed reconstructing 43 time series.
Visibility Graph Based Time Series Analysis.
Stephen, Mutua; Gu, Changgui; Yang, Huijie
2015-01-01
Network based time series analysis has made considerable achievements in recent years. By mapping mono/multivariate time series into networks, one can investigate both its microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks, consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as descriptions of the corresponding states, and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in the USA (S&P500 and Nasdaq) and artificial series generated by means of fractional Gaussian motions show that the method can provide rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks.
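The mapping of a series segment to a visibility graph can be sketched as follows. The code assumes the standard natural visibility criterion, under which two samples are linked when every sample between them lies strictly below the straight line joining them; whether the paper uses exactly this variant is not stated in the abstract, and the function name is illustrative.

```python
def visibility_graph(series):
    """Return the edge set {(i, j), i < j} of the natural visibility graph."""
    edges = set()
    n = len(series)
    for i in range(n):
        for j in range(i + 1, n):
            # i and j "see" each other if all intermediate samples lie
            # strictly below the line joining (i, series[i]) and (j, series[j]).
            visible = all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges


valley = visibility_graph([3.0, 1.0, 2.0])  # the middle sample does not block the view
peak = visibility_graph([1.0, 3.0, 1.0])    # the middle sample blocks the view
```

In the approach described above, one such graph would be computed per segment, and the graphs of successive segments would then be linked to form the temporal network.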
Faithfulness of Recurrence Plots: A Mathematical Proof
NASA Astrophysics Data System (ADS)
Hirata, Yoshito; Komuro, Motomasa; Horai, Shunsuke; Aihara, Kazuyuki
It is practically known that a recurrence plot, a two-dimensional visualization of time series data, can contain almost all information related to the underlying dynamics except for its spatial scale because we can recover a rough shape for the original time series from the recurrence plot even if the original time series is multivariate. We here provide a mathematical proof that the metric defined by a recurrence plot [Hirata et al., 2008] is equivalent to the Euclidean metric under mild conditions.
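For readers unfamiliar with the object under study, a recurrence plot can be computed in a few lines. The sketch below uses the common thresholded definition, marking a pair of time points as recurrent when their Euclidean distance is at most `eps`; the threshold and the sample points are arbitrary illustrative values, and the sketch does not attempt the reconstruction or proof discussed in the abstract.

```python
import math


def recurrence_plot(points, eps):
    """Binary matrix R with R[i][j] = 1 when ||points[i] - points[j]|| <= eps."""
    n = len(points)
    return [
        [1 if math.dist(points[i], points[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# A hypothetical bivariate series with three time points.
trajectory = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
plot = recurrence_plot(trajectory, eps=0.2)
```

The matrix is symmetric with ones on the diagonal, and it works unchanged for multivariate series because only pairwise distances enter.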
Exploring the Dynamics of Dyadic Interactions via Hierarchical Segmentation
ERIC Educational Resources Information Center
Hsieh, Fushing; Ferrer, Emilio; Chen, Shu-Chun; Chow, Sy-Miin
2010-01-01
In this article we present an exploratory tool for extracting systematic patterns from multivariate data. The technique, hierarchical segmentation (HS), can be used to group multivariate time series into segments with similar discrete-state recurrence patterns and it is not restricted by the stationarity assumption. We use a simulation study to…
Bayesian Estimation of Random Coefficient Dynamic Factor Models
ERIC Educational Resources Information Center
Song, Hairong; Ferrer, Emilio
2012-01-01
Dynamic factor models (DFMs) have typically been applied to multivariate time series data collected from a single unit of study, such as a single individual or dyad. The goal of DFMs application is to capture dynamics of multivariate systems. When multiple units are available, however, DFMs are not suited to capture variations in dynamics across…
Prediction of mortality rates using a model with stochastic parameters
NASA Astrophysics Data System (ADS)
Tan, Chon Sern; Pooi, Ah Hin
2016-10-01
Prediction of future mortality rates is crucial to insurance companies because they face longevity risks while providing retirement benefits to a population whose life expectancy is increasing. In the past literature, a time series model based on multivariate power-normal distribution has been applied on mortality data from the United States for the years 1933 till 2000 to forecast the future mortality rates for the years 2001 till 2010. In this paper, a more dynamic approach based on the multivariate time series will be proposed where the model uses stochastic parameters that vary with time. The resulting prediction intervals obtained using the model with stochastic parameters perform better because apart from having good ability in covering the observed future mortality rates, they also tend to have distinctly shorter interval lengths.
Visibility Graph Based Time Series Analysis
Stephen, Mutua; Gu, Changgui; Yang, Huijie
2015-01-01
Network based time series analysis has made considerable achievements in recent years. By mapping mono/multivariate time series into networks, one can investigate both its microscopic and macroscopic behaviors. However, most proposed approaches lead to the construction of static networks, consequently providing limited information on evolutionary behaviors. In the present paper we propose a method called visibility graph based time series analysis, in which series segments are mapped to visibility graphs as descriptions of the corresponding states, and the successively occurring states are linked. This procedure converts a time series to a temporal network and at the same time a network of networks. Findings from empirical records for stock markets in the USA (S&P500 and Nasdaq) and artificial series generated by means of fractional Gaussian motions show that the method can provide rich information benefiting short-term and long-term predictions. Theoretically, we propose a method to investigate time series from the viewpoint of network of networks. PMID:26571115
Modeling multivariate time series on manifolds with skew radial basis functions.
Jamshidi, Arta A; Kirby, Michael J
2011-01-01
We present an approach for constructing nonlinear empirical mappings from high-dimensional domains to multivariate ranges. We employ radial basis functions and skew radial basis functions for constructing a model using data that are potentially scattered or sparse. The algorithm progresses iteratively, adding a new function at each step to refine the model. The placement of the functions is driven by a statistical hypothesis test that accounts for correlation in the multivariate range variables. The test is applied on training and validation data and reveals nonstatistical or geometric structure when it fails. At each step, the added function is fit to data contained in a spatiotemporally defined local region to determine the parameters--in particular, the scale of the local model. The scale of the function is determined by the zero crossings of the autocorrelation function of the residuals. The model parameters and the number of basis functions are determined automatically from the given data, and there is no need to initialize any ad hoc parameters save for the selection of the skew radial basis functions. Compactly supported skew radial basis functions are employed to improve model accuracy, order, and convergence properties. The extension of the algorithm to higher-dimensional ranges produces reduced-order models by exploiting the existence of correlation in the range variable data. Structure is tested not just in a single time series but between all pairs of time series. We illustrate the new methodologies using several illustrative problems, including modeling data on manifolds and the prediction of chaotic time series.
Monograph on the use of the multivariate Gram Charlier series Type A
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hatayodom, T.; Heydt, G.
1978-01-01
The Gram-Charlier series is an infinite series expansion for a probability density function (pdf) in which the terms of the series are Hermite polynomials. There are several Gram-Charlier series; the best known is Type A. The Gram-Charlier series, Type A (GCA) exists for both univariate and multivariate random variables. This monograph introduces the multivariate GCA and illustrates its use through several examples. A brief bibliography and discussion of Hermite polynomials is also included. 9 figures, 2 tables.
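As a small worked example of the univariate case only, the sketch below evaluates the Type A series truncated after the skewness and excess-kurtosis terms, using the probabilists' Hermite polynomials He3(z) = z^3 - 3z and He4(z) = z^4 - 6z^2 + 3. The function name and the truncation order are choices made here for illustration; the multivariate form treated in the monograph is not reproduced.

```python
import math


def gram_charlier_a_pdf(x, mean, std, skewness, excess_kurtosis):
    """Truncated Gram-Charlier Type A density at x (univariate)."""
    z = (x - mean) / std
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    he3 = z ** 3 - 3.0 * z
    he4 = z ** 4 - 6.0 * z ** 2 + 3.0
    correction = 1.0 + skewness / 6.0 * he3 + excess_kurtosis / 24.0 * he4
    return phi * correction / std


normal_at_zero = gram_charlier_a_pdf(0.0, 0.0, 1.0, 0.0, 0.0)
heavy_tailed_at_zero = gram_charlier_a_pdf(0.0, 0.0, 1.0, 0.0, 1.0)
```

With zero skewness and zero excess kurtosis the expression reduces to the normal density. Note that a truncated expansion is not guaranteed to stay non-negative for large skewness or kurtosis values.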
Optimizing Functional Network Representation of Multivariate Time Series
NASA Astrophysics Data System (ADS)
Zanin, Massimiliano; Sousa, Pedro; Papo, David; Bajo, Ricardo; García-Prieto, Juan; Pozo, Francisco Del; Menasalvas, Ernestina; Boccaletti, Stefano
2012-09-01
By combining complex network theory and data mining techniques, we provide objective criteria for optimization of the functional network representation of generic multivariate time series. In particular, we propose a method for the principled selection of the threshold value for functional network reconstruction from raw data, and for proper identification of the network's indicators that unveil the most discriminative information on the system for classification purposes. We illustrate our method by analysing networks of functional brain activity of healthy subjects, and patients suffering from Mild Cognitive Impairment, an intermediate stage between the expected cognitive decline of normal aging and the more pronounced decline of dementia. We discuss extensions of the scope of the proposed methodology to network engineering purposes, and to other data mining tasks.
Optimizing Functional Network Representation of Multivariate Time Series
Zanin, Massimiliano; Sousa, Pedro; Papo, David; Bajo, Ricardo; García-Prieto, Juan; Pozo, Francisco del; Menasalvas, Ernestina; Boccaletti, Stefano
2012-01-01
By combining complex network theory and data mining techniques, we provide objective criteria for optimization of the functional network representation of generic multivariate time series. In particular, we propose a method for the principled selection of the threshold value for functional network reconstruction from raw data, and for proper identification of the network's indicators that unveil the most discriminative information on the system for classification purposes. We illustrate our method by analysing networks of functional brain activity of healthy subjects, and patients suffering from Mild Cognitive Impairment, an intermediate stage between the expected cognitive decline of normal aging and the more pronounced decline of dementia. We discuss extensions of the scope of the proposed methodology to network engineering purposes, and to other data mining tasks. PMID:22953051
Estimating the decomposition of predictive information in multivariate systems
NASA Astrophysics Data System (ADS)
Faes, Luca; Kugiumtzis, Dimitris; Nollo, Giandomenico; Jurysta, Fabrice; Marinazzo, Daniele
2015-03-01
In the study of complex systems from observed multivariate time series, insight into the evolution of one system may be under investigation, which can be explained by the information storage of the system and the information transfer from other interacting systems. We present a framework for the model-free estimation of information storage and information transfer computed as the terms composing the predictive information about the target of a multivariate dynamical process. The approach tackles the curse of dimensionality employing a nonuniform embedding scheme that selects progressively, among the past components of the multivariate process, only those that contribute most, in terms of conditional mutual information, to the present target process. Moreover, it computes all information-theoretic quantities using a nearest-neighbor technique designed to compensate the bias due to the different dimensionality of individual entropy terms. The resulting estimators of prediction entropy, storage entropy, transfer entropy, and partial transfer entropy are tested on simulations of coupled linear stochastic and nonlinear deterministic dynamic processes, demonstrating the superiority of the proposed approach over the traditional estimators based on uniform embedding. The framework is then applied to multivariate physiologic time series, resulting in physiologically well-interpretable information decompositions of cardiovascular and cardiorespiratory interactions during head-up tilt and of joint brain-heart dynamics during sleep.
Functional MRI and Multivariate Autoregressive Models
Rogers, Baxter P.; Katwal, Santosh B.; Morgan, Victoria L.; Asplund, Christopher L.; Gore, John C.
2010-01-01
Connectivity refers to the relationships that exist between different regions of the brain. In the context of functional magnetic resonance imaging (fMRI), it implies a quantifiable relationship between hemodynamic signals from different regions. One aspect of this relationship is the existence of small timing differences in the signals in different regions. Delays of 100 ms or less may be measured with fMRI, and these may reflect important aspects of the manner in which brain circuits respond as well as the overall functional organization of the brain. The multivariate autoregressive time series model has features to recommend it for measuring these delays, and is straightforward to apply to hemodynamic data. In this review, we describe the current usage of the multivariate autoregressive model for fMRI, discuss the issues that arise when it is applied to hemodynamic time series, and consider several extensions. Connectivity measures like Granger causality that are based on the autoregressive model do not always reflect true neuronal connectivity; however, we conclude that careful experimental design could make this methodology quite useful in extending the information obtainable using fMRI. PMID:20444566
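To make the multivariate autoregressive idea concrete, the sketch below fits a first-order model x_t = A x_{t-1} + e_t to a two-variable series by least squares. It is a deliberately minimal, assumption-laden example (two variables, lag one, no intercept, noise-free simulated data, hypothetical function names) and not the procedure of any particular fMRI study; the off-diagonal entries of A are the terms that would express a lagged influence of one region's signal on the other's.

```python
def fit_var1_bivariate(series):
    """Least-squares estimate of A in x_t = A x_{t-1} + e_t for 2-D data."""
    # Accumulate S10 = sum x_t x_{t-1}^T and S00 = sum x_{t-1} x_{t-1}^T.
    s10 = [[0.0, 0.0], [0.0, 0.0]]
    s00 = [[0.0, 0.0], [0.0, 0.0]]
    for prev, curr in zip(series, series[1:]):
        for r in range(2):
            for c in range(2):
                s10[r][c] += curr[r] * prev[c]
                s00[r][c] += prev[r] * prev[c]
    det = s00[0][0] * s00[1][1] - s00[0][1] * s00[1][0]
    if det == 0.0:
        raise ValueError("lagged observations are collinear; A is not identifiable")
    inv = [[s00[1][1] / det, -s00[0][1] / det],
           [-s00[1][0] / det, s00[0][0] / det]]
    # A = S10 * S00^{-1}
    return [[sum(s10[r][k] * inv[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def simulate_var1(a, x0, steps):
    """Noise-free VAR(1) trajectory, used here only to check the fit."""
    xs = [list(x0)]
    for _ in range(steps):
        p = xs[-1]
        xs.append([a[0][0] * p[0] + a[0][1] * p[1],
                   a[1][0] * p[0] + a[1][1] * p[1]])
    return xs


true_a = [[0.5, 0.2], [0.0, 0.8]]  # variable 2 drives variable 1, not vice versa
data = simulate_var1(true_a, (1.0, 1.0), steps=10)
estimate = fit_var1_bivariate(data)
```

On real hemodynamic data one would add noise handling, model-order selection and significance testing, which is where the caveats raised in the review apply.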
Hot spots of multivariate extreme anomalies in Earth observations
NASA Astrophysics Data System (ADS)
Flach, M.; Sippel, S.; Bodesheim, P.; Brenning, A.; Denzler, J.; Gans, F.; Guanche, Y.; Reichstein, M.; Rodner, E.; Mahecha, M. D.
2016-12-01
Anomalies in Earth observations might indicate data quality issues, extremes or the change of underlying processes within a highly multivariate system. Thus, considering the multivariate constellation of variables for extreme detection yields crucial additional information over conventional univariate approaches. We highlight areas in which multivariate extreme anomalies are more likely to occur, i.e. hot spots of extremes in global atmospheric Earth observations that impact the Biosphere. In addition, we present the year of the most unusual multivariate extreme between 2001 and 2013 and show that these coincide with well known high impact extremes. Technically speaking, we account for multivariate extremes by using three sophisticated algorithms adapted from computer science applications. Namely an ensemble of the k-nearest neighbours mean distance, a kernel density estimation and an approach based on recurrences is used. However, the impact of atmosphere extremes on the Biosphere might largely depend on what is considered to be normal, i.e. the shape of the mean seasonal cycle and its inter-annual variability. We identify regions with similar mean seasonality by means of dimensionality reduction in order to estimate in each region both the `normal' variance and robust thresholds for detecting the extremes. In addition, we account for challenges like heteroscedasticity in Northern latitudes. Apart from hot spot areas, those anomalies in the atmosphere time series are of particular interest, which can only be detected by a multivariate approach but not by a simple univariate approach. Such an anomalous constellation of atmosphere variables is of interest if it impacts the Biosphere. The multivariate constellation of such an anomalous part of a time series is shown in one case study indicating that multivariate anomaly detection can provide novel insights into Earth observations.
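One of the three detector families named in this abstract, the k-nearest-neighbours mean distance, is simple enough to sketch. The version below scores each multivariate observation by the mean Euclidean distance to its k nearest other observations; it is a plain brute-force illustration with made-up points, and it omits the ensemble, the seasonal normalisation and the thresholding described by the authors.

```python
import math


def knn_mean_distance_scores(points, k):
    """Anomaly score per point: mean distance to its k nearest other points."""
    if not 1 <= k <= len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Four clustered observations and one hypothetical multivariate extreme.
observations = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_mean_distance_scores(observations, k=2)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

Because the score depends on the joint position of all variables, it can flag a combination of values that looks unremarkable in each variable taken separately.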
Flamm, Christoph; Graef, Andreas; Pirker, Susanne; Baumgartner, Christoph; Deistler, Manfred
2013-01-01
Granger causality is a useful concept for studying causal relations in networks. However, numerical problems occur when applying the corresponding methodology to high-dimensional time series showing co-movement, e.g. EEG recordings or economic data. In order to deal with these shortcomings, we propose a novel method for the causal analysis of such multivariate time series based on Granger causality and factor models. We present the theoretical background, successfully assess our methodology with the help of simulated data and show a potential application in EEG analysis of epileptic seizures. PMID:23354014
Dynamic Factor Analysis Models with Time-Varying Parameters
ERIC Educational Resources Information Center
Chow, Sy-Miin; Zu, Jiyun; Shifren, Kim; Zhang, Guangjian
2011-01-01
Dynamic factor analysis models with time-varying parameters offer a valuable tool for evaluating multivariate time series data with time-varying dynamics and/or measurement properties. We use the Dynamic Model of Activation proposed by Zautra and colleagues (Zautra, Potter, & Reich, 1997) as a motivating example to construct a dynamic factor…
Delay differential analysis of time series.
Lainscsek, Claudia; Sejnowski, Terrence J
2015-03-01
Nonlinear dynamical system analysis based on embedding theory has been used for modeling and prediction, but it also has applications to signal detection and classification of time series. An embedding creates a multidimensional geometrical object from a single time series. Traditionally either delay or derivative embeddings have been used. The delay embedding is composed of delayed versions of the signal, and the derivative embedding is composed of successive derivatives of the signal. The delay embedding has been extended to nonuniform embeddings to take multiple timescales into account. Both embeddings provide information on the underlying dynamical system without having direct access to all the system variables. Delay differential analysis is based on functional embeddings, a combination of the derivative embedding with nonuniform delay embeddings. Small delay differential equation (DDE) models that best represent relevant dynamic features of time series data are selected from a pool of candidate models for detection or classification. We show that the properties of DDEs support spectral analysis in the time domain where nonlinear correlation functions are used to detect frequencies, frequency and phase couplings, and bispectra. These can be efficiently computed with short time windows and are robust to noise. For frequency analysis, this framework is a multivariate extension of discrete Fourier transform (DFT), and for higher-order spectra, it is a linear and multivariate alternative to multidimensional fast Fourier transform of multidimensional correlations. This method can be applied to short or sparse time series and can be extended to cross-trial and cross-channel spectra if multiple short data segments of the same experiment are available. 
Together, this time-domain toolbox provides higher temporal resolution, increased frequency and phase coupling information, and it allows an easy and straightforward implementation of higher-order spectra across time compared with frequency-based methods such as the DFT and cross-spectral analysis.
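The delay embedding that this abstract builds on can be sketched briefly. The function below forms, for each admissible time index t, the vector (x[t], x[t-d1], x[t-d2], ...) for a list of delays that need not be equally spaced, which corresponds to the nonuniform embedding mentioned above. The function name and the sample delays are illustrative; the derivative terms and the DDE model selection are not covered.

```python
def delay_embedding(series, delays):
    """Vectors (x[t], x[t - d1], x[t - d2], ...) for every valid index t."""
    if not delays or any(d <= 0 for d in delays):
        raise ValueError("delays must be positive integers")
    max_delay = max(delays)
    return [
        tuple([series[t]] + [series[t - d] for d in delays])
        for t in range(max_delay, len(series))
    ]


embedded = delay_embedding([0, 1, 2, 3, 4, 5], delays=(1, 3))
```

Each tuple is one point of the reconstructed multidimensional object, so a scalar recording of length N with maximum delay D yields N - D embedded points.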
Zhang, Yong; Zhong, Miner; Geng, Nana; Jiang, Yunjian
2017-01-01
The market demand for electric vehicles (EVs) has increased in recent years. Suitable models are necessary to understand and forecast EV sales. This study presents a singular spectrum analysis (SSA) as a univariate time-series model and vector autoregressive model (VAR) as a multivariate model. Empirical results suggest that SSA satisfactorily indicates the evolving trend and provides reasonable results. The VAR model, which comprised exogenous parameters related to the market on a monthly basis, can significantly improve the prediction accuracy. The EV sales in China, which are categorized into battery and plug-in EVs, are predicted in both short term (up to December 2017) and long term (up to 2020), as statistical proofs of the growth of the Chinese EV industry. PMID:28459872
Inferring phase equations from multivariate time series.
Tokuda, Isao T; Jain, Swati; Kiss, István Z; Hudson, John L
2007-08-10
An approach is presented for extracting phase equations from multivariate time series data recorded from a network of weakly coupled limit cycle oscillators. Our aim is to estimate important properties of the phase equations including natural frequencies and interaction functions between the oscillators. Our approach requires the measurement of an experimental observable of the oscillators; in contrast with previous methods it does not require measurements in isolated single or two-oscillator setups. This noninvasive technique can be advantageous in biological systems, where extraction of few oscillators may be a difficult task. The method is most efficient when data are taken from the nonsynchronized regime. Applicability to experimental systems is demonstrated by using a network of electrochemical oscillators; the obtained phase model is utilized to predict the synchronization diagram of the system.
Cabrieto, Jedelyn; Tuerlinckx, Francis; Kuppens, Peter; Hunyadi, Borbála; Ceulemans, Eva
2018-01-15
Detecting abrupt correlation changes in multivariate time series is crucial in many application fields such as signal processing, functional neuroimaging, climate studies, and financial analysis. To detect such changes, several promising correlation change tests exist, but they may suffer from severe loss of power when there is actually more than one change point underlying the data. To deal with this drawback, we propose a permutation based significance test for Kernel Change Point (KCP) detection on the running correlations. Given a requested number of change points K, KCP divides the time series into K + 1 phases by minimizing the within-phase variance. The new permutation test looks at how the average within-phase variance decreases when K increases and compares this to the results for permuted data. The results of an extensive simulation study and applications to several real data sets show that, depending on the setting, the new test performs either on par with or better than the state-of-the-art significance tests for detecting the presence of correlation changes, implying that its use can be generally recommended.
Ramseyer, Fabian; Kupper, Zeno; Caspar, Franz; Znoj, Hansjörg; Tschacher, Wolfgang
2014-10-01
Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective by its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA-models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a 2nd step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. TSPA allowed a prototypical process pattern to be identified, where patient's alliance and self-efficacy were linked by a temporal feedback-loop. Furthermore, therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy. PsycINFO Database Record (c) 2014 APA, all rights reserved.
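As a rough illustration of the vector auto-regression (VAR) that the abstract above names as the basis of TSPA, the sketch below fits a first-order VAR by ordinary least squares. It is not the authors' TSPA procedure: the function name `fit_var1`, the coefficient matrix `A_true`, and the simulated two-variable data are all assumptions made for this example.

```python
import numpy as np

def fit_var1(series):
    """Least-squares estimate of A in x_t = A x_{t-1} + e_t.

    `series` is a (T x k) array assumed to have zero mean.
    """
    past, present = series[:-1], series[1:]
    # lstsq solves past @ coef = present, so x_t = coef.T @ x_{t-1}
    coef, *_ = np.linalg.lstsq(past, present, rcond=None)
    return coef.T

# Hypothetical two-variable process: variable 2 feeds into variable 1
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.3],
                   [0.0, 0.4]])
T = 5000
x = np.zeros((T, 2))
for t in range(1, T):
    x[t] = A_true @ x[t - 1] + rng.standard_normal(2)

A_hat = fit_var1(x)  # off-diagonal entries estimate the cross-lagged effects
```

In a setting like the one described, one such matrix would be estimated per individual before any aggregation; the off-diagonal entries are the lagged links between variables.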
Havlicek, Martin; Jan, Jiri; Brazdil, Milan; Calhoun, Vince D.
2015-01-01
Increasing interest in understanding dynamic interactions of brain neural networks leads to formulation of sophisticated connectivity analysis methods. Recent studies have applied Granger causality based on standard multivariate autoregressive (MAR) modeling to assess the brain connectivity. Nevertheless, one important flaw of this commonly proposed method is that it requires the analyzed time series to be stationary, whereas such assumption is mostly violated due to the weakly nonstationary nature of functional magnetic resonance imaging (fMRI) time series. Therefore, we propose an approach to dynamic Granger causality in the frequency domain for evaluating functional network connectivity in fMRI data. The effectiveness and robustness of the dynamic approach was significantly improved by combining a forward and backward Kalman filter that improved estimates compared to the standard time-invariant MAR modeling. In our method, the functional networks were first detected by independent component analysis (ICA), a computational method for separating a multivariate signal into maximally independent components. Then the measure of Granger causality was evaluated using generalized partial directed coherence that is suitable for bivariate as well as multivariate data. Moreover, this metric provides identification of causal relation in frequency domain, which allows one to distinguish the frequency components related to the experimental paradigm. The procedure of evaluating Granger causality via dynamic MAR was demonstrated on simulated time series as well as on two sets of group fMRI data collected during an auditory sensorimotor (SM) or auditory oddball discrimination (AOD) tasks. Finally, a comparison with the results obtained from a standard time-invariant MAR model was provided. PMID:20561919
A Multilevel Multiset Time-Series Model for Describing Complex Developmental Processes
Ma, Xin; Shen, Jianping
2017-01-01
The authors sought to develop an analytical platform where multiple sets of time series can be examined simultaneously. This multivariate platform capable of testing interaction effects among multiple sets of time series can be very useful in empirical research. The authors demonstrated that the multilevel framework can readily accommodate this analytical capacity. Given their intention to use the multilevel multiset time-series model to pursue complicated research purposes, their resulting model is relatively simple to specify, to run, and to interpret. These advantages make the adoption of their model relatively effortless as long as researchers have the basic knowledge and skills in working with multilevel growth modeling. With multiple potential extensions of their model, the establishment of this analytical platform for analysis of multiple sets of time series can inspire researchers to pursue far more advanced research designs to address complex developmental processes in reality. PMID:29881094
Multidimensional stock network analysis: An Escoufier's RV coefficient approach
NASA Astrophysics Data System (ADS)
Lee, Gan Siew; Djauhari, Maman A.
2013-09-01
The current practice of stock network analysis is based on the assumption that the time series of closing stock prices can represent the behaviour of each stock. This assumption leads one to consider the minimal spanning tree (MST) and sub-dominant ultrametric (SDU) as indispensable tools to filter the economic information contained in the network. Recently, researchers have attempted to represent each stock not only as a univariate time series of closing price but as a bivariate time series of closing price and volume. In this case, they developed the so-called multidimensional MST to filter the important economic information. However, in this paper, we show that their approach is applicable only to that bivariate time series. This leads us to introduce a new methodology to construct an MST where each stock is represented by a multivariate time series. An example from the Malaysian stock exchange will be presented and discussed to illustrate the advantages of the method.
1991-09-01
However, there is no guarantee that this would work; for instance if the data were generated by an ARCH model (Tong, 1990 pp. 116-117) then a simple...Hill, R., Griffiths, W., Lutkepohl, H., and Lee, T., Introduction to the Theory and Practice of Econometrics, 2nd ed., Wiley, 1985. Kendall, M., Stuart
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Hallac, David; Vare, Sagar; Boyd, Stephen; Leskovec, Jure
2018-01-01
Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (i.e., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios. PMID:29770257
A KST framework for correlation network construction from time series signals
NASA Astrophysics Data System (ADS)
Qi, Jin-Peng; Gu, Quan; Zhu, Ying; Zhang, Ping
2018-04-01
A KST (Kolmogorov-Smirnov test and T statistic) method is used for construction of a correlation network based on the fluctuation of each time series within the multivariate time signals. In this method, each time series is divided equally into multiple segments, and the maximal data fluctuation in each segment is calculated by a KST change detection procedure. Connections between each time series are derived from the data fluctuation matrix, and are used for construction of the fluctuation correlation network (FCN). The method was tested with synthetic simulations and the result was compared with those obtained using KS or T alone for detection of data fluctuation. The novelty of this study is that the correlation analysis was based on the data fluctuation in each segment of each time series rather than on the original time signals, which would be more meaningful for many real world applications and for analysis of large-scale time signals where prior knowledge is uncertain.
A new method for reconstruction of solar irradiance
NASA Astrophysics Data System (ADS)
Privalsky, Victor
2018-07-01
The purpose of this research is to show how time series should be reconstructed using an example with the data on total solar irradiation (TSI) of the Earth and on sunspot numbers (SSN) since 1749. The traditional approach through regression equation(s) is designed for time-invariant vectors of random variables and is not applicable to time series, which present random functions of time. The autoregressive reconstruction (ARR) method suggested here requires fitting a multivariate stochastic difference equation to the target/proxy time series. The reconstruction is done through the scalar equation for the target time series with the white noise term excluded. The time series approach is shown to provide a better reconstruction of TSI than the correlation/regression method. A reconstruction criterion is introduced which allows one to define in advance the achievable level of success in the reconstruction. The conclusion is that time series, including the total solar irradiance, cannot be reconstructed properly if the data are not treated as sample records of random processes and analyzed in both time and frequency domains.
Dimension reduction of frequency-based direct Granger causality measures on short time series.
Siggiridou, Elsa; Kimiskidis, Vasilios K; Kugiumtzis, Dimitris
2017-09-01
The mainstream in the estimation of effective brain connectivity relies on Granger causality measures in the frequency domain. If the measure is meant to capture direct causal effects accounting for the presence of other observed variables, as in multi-channel electroencephalograms (EEG), typically the fit of a vector autoregressive (VAR) model on the multivariate time series is required. For short time series of many variables, the estimation of VAR may not be stable, requiring dimension reduction resulting in restricted or sparse VAR models. The restricted VAR obtained by the modified backward-in-time selection method (mBTS) is adapted to the generalized partial directed coherence (GPDC), termed restricted GPDC (RGPDC). Dimension reduction on other frequency based measures, such as the direct directed transfer function (dDTF), is straightforward. First, a simulation study using linear stochastic multivariate systems is conducted and RGPDC is favorably compared to GPDC on short time series in terms of sensitivity and specificity. Then the two measures are tested for their ability to detect changes in brain connectivity during an epileptiform discharge (ED) from multi-channel scalp EEG. It is shown that RGPDC identifies better than GPDC the connectivity structure of the simulated systems, as well as changes in the brain connectivity, and is less dependent on the free parameter of VAR order. The proposed dimension reduction in frequency measures based on VAR constitutes an appropriate strategy to estimate reliably brain networks within short-time windows. Copyright © 2017 Elsevier B.V. All rights reserved.
Highlights from the previous volumes
NASA Astrophysics Data System (ADS)
Tong, Liu; Hadjihoseini, Ali et al.; Jörg, David J.; Gao, Zhong-Ke et al.; et al.
2018-01-01
Superconductivity at 7.3 K in quasi-one-dimensional RbCr3As3; Rogue waves as negative entropy events durations; Biological rhythms: What sets their amplitude?; Reconstructing multi-mode networks from multivariate time series
Simulation Exploration through Immersive Parallel Planes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brunhart-Lupo, Nicholas J; Bush, Brian W; Gruchalla, Kenny M
We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
Simulation Exploration through Immersive Parallel Planes: Preprint
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny
We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinates, which map the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, each individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used both to explore existing data and to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.
Nonlinear multivariate and time series analysis by neural network methods
NASA Astrophysics Data System (ADS)
Hsieh, William W.
2004-03-01
Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
Interpretable Early Classification of Multivariate Time Series
ERIC Educational Resources Information Center
Ghalwash, Mohamed F.
2013-01-01
Recent advances in technology have led to an explosion in data collection over time rather than in a single snapshot. For example, microarray technology allows us to measure gene expression levels in different conditions over time. Such temporal data grants the opportunity for data miners to develop algorithms to address domain-related problems,…
Fast and Flexible Multivariate Time Series Subsequence Search
NASA Technical Reports Server (NTRS)
Bhaduri, Kanishka; Oza, Nikunj C.; Zhu, Qiang; Srivastava, Ashok N.
2010-01-01
Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical monitoring, and financial systems. Domain experts are often interested in searching for interesting multivariate patterns from these MTS databases which often contain several gigabytes of data. Surprisingly, research on MTS search is very limited. Most of the existing work only supports queries with the same length of data, or queries on a fixed set of variables. In this paper, we propose an efficient and flexible subsequence search framework for massive MTS databases, that, for the first time, enables querying on any subset of variables with arbitrary time delays between them. We propose two algorithms to solve this problem: (1) a List Based Search (LBS) algorithm which uses sorted lists for indexing, and (2) an R*-tree Based Search (RBS) which uses Minimum Bounding Rectangles (MBR) to organize the subsequences. Both algorithms guarantee that all matching patterns within the specified thresholds will be returned (no false dismissals). The very few false alarms can be removed by a post-processing step. Since our framework is also capable of Univariate Time-Series (UTS) subsequence search, we first demonstrate the efficiency of our algorithms on several UTS datasets previously used in the literature. We follow this up with experiments using two large MTS databases from the aviation domain, each containing several millions of observations. Both these tests show that our algorithms have very high prune rates (>99%) thus needing actual disk access for only less than 1% of the observations. To the best of our knowledge, MTS subsequence search has never been attempted on datasets of the size we have used in this paper.
A first application of independent component analysis to extracting structure from stock returns.
Back, A D; Weigend, A S
1997-08-01
This paper explores the application of a signal processing technique known as independent component analysis (ICA) or blind source separation to multivariate financial time series such as a portfolio of stocks. The key idea of ICA is to linearly map the observed multivariate time series into a new space of statistically independent components (ICs). We apply ICA to three years of daily returns of the 28 largest Japanese stocks and compare the results with those obtained using principal component analysis. The results indicate that the estimated ICs fall into two categories, (i) infrequent large shocks (responsible for the major changes in the stock prices), and (ii) frequent smaller fluctuations (contributing little to the overall level of the stocks). We show that the overall stock price can be reconstructed surprisingly well by using a small number of thresholded weighted ICs. In contrast, when using shocks derived from principal components instead of independent components, the reconstructed price is less similar to the original one. ICA is shown to be a potentially powerful method of analyzing and understanding driving mechanisms in financial time series. The application to portfolio optimization is described in Chin and Weigend (1998).
Process fault detection and nonlinear time series analysis for anomaly detection in safeguards
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burr, T.L.; Mullen, M.F.; Wangen, L.E.
In this paper we discuss two advanced techniques, process fault detection and nonlinear time series analysis, and apply them to the analysis of vector-valued and single-valued time-series data. We investigate model-based process fault detection methods for analyzing simulated, multivariate, time-series data from a three-tank system. The model-predictions are compared with simulated measurements of the same variables to form residual vectors that are tested for the presence of faults (possible diversions in safeguards terminology). We evaluate two methods, testing all individual residuals with a univariate z-score and testing all variables simultaneously with the Mahalanobis distance, for their ability to detect loss of material from two different leak scenarios from the three-tank system: a leak without and with replacement of the lost volume. Nonlinear time-series analysis tools were compared with the linear methods popularized by Box and Jenkins. We compare prediction results using three nonlinear and two linear modeling methods on each of six simulated time series: two nonlinear and four linear. The nonlinear methods performed better at predicting the nonlinear time series and did as well as the linear methods at predicting the linear values.
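The two residual tests contrasted in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two-variable covariance matrix, the residual vector, and both thresholds are hypothetical values chosen to show a case where correlated residuals pass every univariate check yet fail the joint one.

```python
import numpy as np

def zscore_alarm(residual, sigma, threshold=3.0):
    """Flag a fault if any single residual exceeds `threshold` standard deviations."""
    z = np.abs(residual) / sigma
    return bool(np.any(z > threshold))

def mahalanobis_alarm(residual, cov, threshold_sq):
    """Flag a fault if the squared Mahalanobis distance exceeds `threshold_sq`."""
    d2 = float(residual @ np.linalg.solve(cov, residual))
    return d2 > threshold_sq

# Hypothetical covariance of two strongly correlated residuals
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
sigma = np.sqrt(np.diag(cov))

# Residuals that move in opposite directions, against the usual correlation
residual = np.array([2.0, -2.0])

univariate_flag = zscore_alarm(residual, sigma)
# 9.21 is roughly the 0.99 quantile of a chi-square with 2 degrees of freedom
multivariate_flag = mahalanobis_alarm(residual, cov, threshold_sq=9.21)
```

Each residual here is only two standard deviations from zero, so the z-score test stays silent, while the joint test fires because the pattern contradicts the assumed correlation structure.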
A time domain frequency-selective multivariate Granger causality approach.
Leistritz, Lutz; Witte, Herbert
2016-08-01
The investigation of effective connectivity is one of the major topics in computational neuroscience to understand the interaction between spatially distributed neuronal units of the brain. Thus, a wide variety of methods has been developed during the last decades to investigate functional and effective connectivity in multivariate systems. Their spectrum ranges from model-based to model-free approaches with a clear separation into time and frequency range methods. We present in this simulation study a novel time domain approach based on Granger's principle of predictability, which allows frequency-selective considerations of directed interactions. It is based on a comparison of prediction errors of multivariate autoregressive models fitted to systematically modified time series. These modifications are based on signal decompositions, which enable a targeted cancellation of specific signal components with specific spectral properties. Depending on the embedded signal decomposition method, a frequency-selective or data-driven signal-adaptive Granger Causality Index may be derived.
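For orientation, the classical time-domain Granger causality index that underlies the comparison of prediction errors can be sketched as below. This is the standard bivariate index, not the frequency-selective variant proposed in the abstract above; the helper names, the model order, and the simulated data are assumptions for illustration only.

```python
import numpy as np

def lagged_design(series, order):
    """Stack lags 1..order of each column of `series` (T x k) into a design matrix."""
    T = series.shape[0]
    return np.hstack([series[order - lag:T - lag] for lag in range(1, order + 1)])

def residual_variance(target, predictors, order):
    """Variance of the least-squares prediction error of `target` from lagged predictors."""
    X = lagged_design(predictors, order)
    y = target[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ coef)

def granger_index(x, y, order=2):
    """ln(var_restricted / var_full) for the direction x -> y."""
    restricted = residual_variance(y, y[:, None], order)           # y's own past only
    full = residual_variance(y, np.column_stack([y, x]), order)    # past of y and x
    return np.log(restricted / full)

# Hypothetical data in which x drives y with a one-step delay
rng = np.random.default_rng(0)
T = 2000
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()

gci_x_to_y = granger_index(x, y)  # clearly positive
gci_y_to_x = granger_index(y, x)  # close to zero
```

A frequency-selective version in the spirit of the abstract would apply the same index after removing chosen signal components from the series; that step is not shown here.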
Deconvolution of mixing time series on a graph
Blocker, Alexander W.; Airoldi, Edoardo M.
2013-01-01
In many applications we are interested in making inference on latent time series from indirect measurements, which are often low-dimensional projections resulting from mixing or aggregation. Positron emission tomography, super-resolution, and network traffic monitoring are some examples. Inference in such settings requires solving a sequence of ill-posed inverse problems, y_t = A x_t, where the projection mechanism provides information on A. We consider problems in which A specifies mixing on a graph of time series that are bursty and sparse. We develop a multilevel state-space model for mixing time series and an efficient approach to inference. A simple model is used to calibrate regularization parameters that lead to efficient inference in the multilevel state-space model. We apply this method to the problem of estimating point-to-point traffic flows on a network from aggregate measurements. Our solution outperforms existing methods for this problem, and our two-stage approach suggests an efficient inference strategy for multilevel models of multivariate time series. PMID:25309135
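To make the ill-posedness of the inverse problem in the abstract above concrete, the sketch below recovers three flows from two aggregate measurements using plain Tikhonov (ridge) regularization. This is a deliberately simple stand-in, not the multilevel state-space model of the paper; the routing matrix `A`, the flows `x_true`, and the regularization weight are hypothetical.

```python
import numpy as np

def ridge_inverse(A, y, lam):
    """Minimise ||y - A x||^2 + lam * ||x||^2 (Tikhonov regularisation)."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

# Hypothetical routing matrix: 2 link measurements aggregating 3 flows
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
x_true = np.array([3.0, 0.0, 5.0])  # sparse flows, as in bursty traffic
y = A @ x_true

x_hat = ridge_inverse(A, y, lam=1e-6)
```

The estimate reproduces the measurements almost exactly but spreads traffic across all three flows, so it differs from the sparse `x_true`. That gap is why the measurements alone do not determine the flows and extra structure, such as the sparsity and burstiness the abstract mentions, is needed.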
Time Series Modelling of Syphilis Incidence in China from 2005 to 2012
Zhang, Xingyu; Zhang, Tao; Pei, Jiao; Liu, Yuanyuan; Li, Xiaosong; Medrano-Gracia, Pau
2016-01-01
Background: The infection rate of syphilis in China has increased dramatically in recent decades, becoming a serious public health concern. Early prediction of syphilis is therefore of great importance for health planning and management. Methods: In this paper, we analyzed surveillance time series data for primary, secondary, tertiary, congenital and latent syphilis in mainland China from 2005 to 2012. Seasonality and long-term trend were explored with decomposition methods. Autoregressive integrated moving average (ARIMA) was used to fit a univariate time series model of syphilis incidence. A separate multi-variable time series for each syphilis type was also tested using an autoregressive integrated moving average model with exogenous variables (ARIMAX). Results: The syphilis incidence rates have increased three-fold from 2005 to 2012. All syphilis time series showed strong seasonality and increasing long-term trend. Both ARIMA and ARIMAX models fitted and estimated syphilis incidence well. All univariate time series showed highest goodness-of-fit results with the ARIMA(0,0,1)×(0,1,1) model. Conclusion: Time series analysis was an effective tool for modelling the historical and future incidence of syphilis in China. The ARIMAX model performed better than the ARIMA model for the modelling of syphilis incidence. Time series correlations existed between the models for primary, secondary, tertiary, congenital and latent syphilis. PMID:26901682
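For readers unfamiliar with the notation, the seasonal ARIMA(0,0,1)×(0,1,1) model named in the abstract above can be written out as follows. The seasonal period s is not stated in the abstract (s = 12 would be the natural assumption for monthly data), and the plus-sign convention for the moving-average terms is a choice made here; some texts use minus signs instead.

```latex
% B is the backshift operator, B y_t = y_{t-1}; \varepsilon_t is white noise.
% \theta_1: non-seasonal MA(1) coefficient; \Theta_1: seasonal MA(1) coefficient.
(1 - B^{s})\, y_t = (1 + \theta_1 B)\,(1 + \Theta_1 B^{s})\, \varepsilon_t
% Expanded form:
y_t - y_{t-s} = \varepsilon_t + \theta_1 \varepsilon_{t-1}
              + \Theta_1 \varepsilon_{t-s} + \theta_1 \Theta_1 \varepsilon_{t-s-1}
```

In words: one seasonal difference removes the repeating annual pattern, and what remains is modelled by a short-memory shock at lag 1 and another at the seasonal lag.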
Time Series Modelling of Syphilis Incidence in China from 2005 to 2012.
Zhang, Xingyu; Zhang, Tao; Pei, Jiao; Liu, Yuanyuan; Li, Xiaosong; Medrano-Gracia, Pau
2016-01-01
The infection rate of syphilis in China has increased dramatically in recent decades, becoming a serious public health concern. Early prediction of syphilis is therefore of great importance for health planning and management. In this paper, we analyzed surveillance time series data for primary, secondary, tertiary, congenital and latent syphilis in mainland China from 2005 to 2012. Seasonality and long-term trend were explored with decomposition methods. Autoregressive integrated moving average (ARIMA) was used to fit a univariate time series model of syphilis incidence. A separate multi-variable time series for each syphilis type was also tested using an autoregressive integrated moving average model with exogenous variables (ARIMAX). The syphilis incidence rates have increased three-fold from 2005 to 2012. All syphilis time series showed strong seasonality and increasing long-term trend. Both ARIMA and ARIMAX models fitted and estimated syphilis incidence well. All univariate time series showed highest goodness-of-fit results with the ARIMA(0,0,1)×(0,1,1) model. Time series analysis was an effective tool for modelling the historical and future incidence of syphilis in China. The ARIMAX model performed better than the ARIMA model for the modelling of syphilis incidence. Time series correlations existed between the models for primary, secondary, tertiary, congenital and latent syphilis.
[Multivariate Adaptive Regression Splines (MARS), an alternative for the analysis of time series].
Vanegas, Jairo; Vásquez, Fabián
Multivariate Adaptive Regression Splines (MARS) is a non-parametric modelling method that extends the linear model, incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of predictive models: selecting relevant variables, transforming the predictor variables, processing missing values and preventing overshooting using a self-test. It is also able to predict, taking into account structural factors that might influence the outcome variable, thereby generating hypothetical models. The end result could identify relevant cut-off points in data series. It is rarely used in health, so it is proposed as a tool for the evaluation of relevant public health indicators. For demonstrative purposes, data series regarding the mortality of children under 5 years of age in Costa Rica were used, comprising the period 1978-2008. Copyright © 2016 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
Testing for Granger Causality in the Frequency Domain: A Phase Resampling Method.
Liu, Siwei; Molenaar, Peter
2016-01-01
This article introduces phase resampling, an existing but rarely used surrogate data method for making statistical inferences of Granger causality in frequency domain time series analysis. Granger causality testing is essential for establishing causal relations among variables in multivariate dynamic processes. However, testing for Granger causality in the frequency domain is challenging due to the nonlinear relation between frequency domain measures (e.g., partial directed coherence, generalized partial directed coherence) and time domain data. Through a simulation study, we demonstrate that phase resampling is a general and robust method for making statistical inferences even with short time series. With Gaussian data, phase resampling yields satisfactory type I and type II error rates in all but one condition we examine: when a small effect size is combined with an insufficient number of data points. Violations of normality lead to slightly higher error rates but are mostly within acceptable ranges. We illustrate the utility of phase resampling with two empirical examples involving multivariate electroencephalography (EEG) and skin conductance data.
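The core of a phase-resampling surrogate can be sketched in a few lines: keep the amplitude spectrum of a series and replace its Fourier phases with random ones. The sketch below handles a single series and is an assumption-laden illustration rather than the authors' full testing procedure; the function name `phase_resample` and the random-walk test signal are made up for this example.

```python
import numpy as np

def phase_resample(x, rng):
    """Return a surrogate with the same amplitude spectrum as x but random phases."""
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    surrogate = np.abs(spectrum) * np.exp(1j * phases)
    surrogate[0] = spectrum[0]            # keep the mean (DC term) unchanged
    if x.size % 2 == 0:
        surrogate[-1] = spectrum[-1]      # Nyquist term must stay real
    return np.fft.irfft(surrogate, n=x.size)

rng = np.random.default_rng(42)
x = np.cumsum(rng.standard_normal(256))   # hypothetical autocorrelated signal
x_surr = phase_resample(x, rng)
```

Because each series keeps its own spectrum while its phases are scrambled, repeating this for the series in a multivariate set yields data with the original autocorrelation but no systematic cross-dependence, which is the kind of null reference a causality statistic can be compared against.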
2013-01-01
Background: Matching pursuit algorithm (MP), especially with recent multivariate extensions, offers unique advantages in analysis of EEG and MEG. Methods: We propose a novel construction of an optimal Gabor dictionary, based upon the metrics introduced in this paper. We implement this construction in a freely available software for MP decomposition of multivariate time series, with a user friendly interface via the Svarog package (Signal Viewer, Analyzer and Recorder On GPL, http://braintech.pl/svarog), and provide a hands-on introduction to its application to EEG. Finally, we describe numerical and mathematical optimizations used in this implementation. Results: Optimal Gabor dictionaries, based on the metric introduced in this paper, for the first time allowed for a priori assessment of maximum one-step error of the MP algorithm. Variants of multivariate MP, implemented in the accompanying software, are organized according to the mathematical properties of the algorithms, relevant in the light of EEG/MEG analysis. Some of these variants have been successfully applied to both multichannel and multitrial EEG and MEG in previous studies, improving preprocessing for EEG/MEG inverse solutions and parameterization of evoked potentials in single trials; we mention also ongoing work and possible novel applications. Conclusions: Mathematical results presented in this paper improve our understanding of the basics of the MP algorithm. Simple introduction of its properties and advantages, together with the accompanying stable and user-friendly Open Source software package, paves the way for a widespread and reproducible analysis of multivariate EEG and MEG time series and novel applications, while retaining a high degree of compatibility with the traditional, visual analysis of EEG. PMID:24059247
Evaluation of a Multivariate Syndromic Surveillance System for West Nile Virus.
Faverjon, Céline; Andersson, M Gunnar; Decors, Anouk; Tapprest, Jackie; Tritz, Pierre; Sandoz, Alain; Kutasi, Orsolya; Sala, Carole; Leblond, Agnès
2016-06-01
Various methods are currently used for the early detection of West Nile virus (WNV) but their outputs are not quantitative and/or do not take into account all available information. Our study aimed to test a multivariate syndromic surveillance system to evaluate if the sensitivity and the specificity of detection of WNV could be improved. Weekly time series data on nervous syndromes in horses and mortality in both horses and wild birds were used. Baselines were fitted to the three time series and used to simulate 100 years of surveillance data. WNV outbreaks were simulated and inserted into the baselines based on historical data and expert opinion. Univariate and multivariate syndromic surveillance systems were tested to gauge how well they detected the outbreaks; detection was based on an empirical Bayesian approach. The systems' performances were compared using measures of sensitivity, specificity, and area under receiver operating characteristic curve (AUC). When data sources were considered separately (i.e., univariate systems), the best detection performance was obtained using the data set of nervous symptoms in horses compared to those of bird and horse mortality (AUCs equal to 0.80, 0.75, and 0.50, respectively). A multivariate outbreak detection system that used nervous symptoms in horses and bird mortality generated the best performance (AUC = 0.87). The proposed approach is suitable for performing multivariate syndromic surveillance of WNV outbreaks. This is particularly relevant, given that a multivariate surveillance system performed better than a univariate approach. Such a surveillance system could be especially useful in serving as an alert for the possibility of human viral infections. This approach can be also used for other diseases for which multiple sources of evidence are available.
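The AUC values used to compare the systems can be read as the probability that a randomly chosen outbreak week receives a higher alarm score than a randomly chosen non-outbreak week. A minimal sketch of that pairwise definition follows; the scores are invented for illustration and the function name is an assumption.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented alarm scores for outbreak and non-outbreak weeks.
outbreak_scores = [0.9, 0.4]
baseline_scores = [0.5, 0.1]
print(auc(outbreak_scores, baseline_scores))
```

An AUC of 0.5 corresponds to a detector no better than chance, which is how the value reported for horse mortality alone can be read.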
Statistical Evaluation of Time Series Analysis Techniques
NASA Technical Reports Server (NTRS)
Benignus, V. A.
1973-01-01
The performance of a modified version of NASA's multivariate spectrum analysis program is discussed. A multiple regression model was used to make the revisions. Performance improvements were documented and compared to the standard fast Fourier transform by Monte Carlo techniques.
1981-08-01
RATIO TEST STATISTIC FOR SPHERICITY OF COMPLEX MULTIVARIATE NORMAL DISTRIBUTION* C. Fang P. R. Krishnaiah B. N. Nagarsenker** August 1981 Technical...and their applications in time series, the reader is referred to Krishnaiah (1976). Motivated by the applications in the area of inference on multiple...for practical purposes. Here, we note that Krishnaiah, Lee and Chang (1976) approximated the null distribution of certain power of the likeli
Kaier, Klaus; Hagist, Christian; Frank, Uwe; Conrad, Andreas; Meyer, Elisabeth
2009-04-01
To determine the impact of antibiotic consumption and alcohol-based hand disinfection on the incidences of nosocomial methicillin-resistant Staphylococcus aureus (MRSA) infection and Clostridium difficile infection (CDI). Two multivariate time-series analyses were performed that used as dependent variables the monthly incidences of nosocomial MRSA infection and CDI at the Freiburg University Medical Center during the period January 2003 through October 2007. The volume of alcohol-based hand rub solution used per month was quantified in liters per 1,000 patient-days. Antibiotic consumption was calculated in terms of the number of defined daily doses per 1,000 patient-days per month. The use of alcohol-based hand rub was found to have a significant impact on the incidence of nosocomial MRSA infection (P< .001). The multivariate analysis (R2=0.66) showed that a higher volume of use of alcohol-based hand rub was associated with a lower incidence of nosocomial MRSA infection. Conversely, a higher level of consumption of selected antimicrobial agents was associated with a higher incidence of nosocomial MRSA infection. This analysis showed this relationship was the same for the use of second-generation cephalosporins (P= .023), third-generation cephalosporins (P= .05), fluoroquinolones (P= .01), and lincosamides (P= .05). The multivariate analysis (R2=0.55) showed that a higher level of consumption of third-generation cephalosporins (P= .008), fluoroquinolones (P= .084), and/or macrolides (P= .007) was associated with a higher incidence of CDI. A correlation with use of alcohol-based hand rub was not detected. In 2 multivariate time-series analyses, we were able to show the impact of hand hygiene and antibiotic use on the incidence of nosocomial MRSA infection, but we found no association between hand hygiene and incidence of CDI.
NASA Astrophysics Data System (ADS)
Chen, Yonghong; Bressler, Steven L.; Knuth, Kevin H.; Truccolo, Wilson A.; Ding, Mingzhou
2006-06-01
In this article we consider the stochastic modeling of neurobiological time series from cognitive experiments. Our starting point is the variable-signal-plus-ongoing-activity model. From this model a differentially variable component analysis strategy is developed from a Bayesian perspective to estimate event-related signals on a single trial basis. After subtracting out the event-related signal from recorded single trial time series, the residual ongoing activity is treated as a piecewise stationary stochastic process and analyzed by an adaptive multivariate autoregressive modeling strategy which yields power, coherence, and Granger causality spectra. Results from applying these methods to local field potential recordings from monkeys performing cognitive tasks are presented.
Mei, Jiangyuan; Liu, Meizhu; Wang, Yuan-Fang; Gao, Huijun
2016-06-01
Multivariate time series (MTS) datasets broadly exist in numerous fields, including health care, multimedia, finance, and biometrics. How to classify MTS accurately has become a hot research topic since it is an important element in many computer vision and pattern recognition applications. In this paper, we propose a Mahalanobis distance-based dynamic time warping (DTW) measure for MTS classification. The Mahalanobis distance builds an accurate relationship between each variable and its corresponding category. It is utilized to calculate the local distance between vectors in MTS. Then we use DTW to align those MTS which are out of synchronization or with different lengths. After that, how to learn an accurate Mahalanobis distance function becomes another key problem. This paper establishes a LogDet divergence-based metric learning with triplet constraint model which can learn Mahalanobis matrix with high precision and robustness. Furthermore, the proposed method is applied on nine MTS datasets selected from the University of California, Irvine machine learning repository and Robert T. Olszewski's homepage, and the results demonstrate the improved performance of the proposed approach.
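The combination described here, a Mahalanobis local distance inside a DTW alignment, can be sketched as follows. The matrix M is taken as given (the metric-learning step that the paper uses to obtain it is not shown), the unsquared distance is an assumption, and the function names are hypothetical.

```python
import math


def mahalanobis(u, v, M):
    """Mahalanobis distance sqrt((u - v)^T M (u - v)) for a PSD matrix M."""
    d = [a - b for a, b in zip(u, v)]
    q = sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))
    return math.sqrt(max(q, 0.0))


def dtw(X, Y, M):
    """Dynamic time warping cost between two multivariate series.

    X and Y are lists of equal-dimension vectors and may differ in length.
    """
    n, m = len(X), len(Y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = mahalanobis(X[i - 1], Y[j - 1], M)
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


identity = [[1.0, 0.0], [0.0, 1.0]]  # with M = I this reduces to Euclidean DTW
a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
b = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]  # same shape, delayed start
print(dtw(a, b, identity))
```

A nearest-neighbour classifier over this distance is one common way to use it for MTS classification.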
Le Strat, Yann
2017-01-01
The objective of this paper is to evaluate a panel of statistical algorithms for temporal outbreak detection. Based on a large dataset of simulated weekly surveillance time series, we performed a systematic assessment of 21 statistical algorithms, 19 implemented in the R package surveillance and two other methods. We estimated false positive rate (FPR), probability of detection (POD), probability of detection during the first week, sensitivity, specificity, negative and positive predictive values and F1-measure for each detection method. Then, to identify the factors associated with these performance measures, we ran multivariate Poisson regression models adjusted for the characteristics of the simulated time series (trend, seasonality, dispersion, outbreak sizes, etc.). The FPR ranged from 0.7% to 59.9% and the POD from 43.3% to 88.7%. Some methods had a very high specificity, up to 99.4%, but a low sensitivity. Methods with a high sensitivity (up to 79.5%) had a low specificity. All methods had a high negative predictive value, over 94%, while positive predictive values ranged from 6.5% to 68.4%. Multivariate Poisson regression models showed that performance measures were strongly influenced by the characteristics of time series. Past or current outbreak size and duration strongly influenced detection performances. PMID:28715489
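Most of the performance measures listed in this abstract follow from the four cells of a confusion table. The sketch below uses the standard textbook definitions, which may differ in detail from the per-series definitions of FPR and POD used in the study; the counts are invented.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard detection measures from confusion-table counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "fpr": fp / (fp + tn),  # equals 1 - specificity
    }


# Invented counts: 50 outbreak weeks and 950 non-outbreak weeks.
m = detection_metrics(tp=40, fp=10, tn=940, fn=10)
print(m)
```

Because non-outbreak weeks dominate, the negative predictive value stays high even for a mediocre detector, which is consistent with the pattern the abstract reports.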
Copula-based prediction of economic movements
NASA Astrophysics Data System (ADS)
García, J. E.; González-López, V. A.; Hirsh, I. D.
2016-06-01
In this paper we model the discretized returns of two paired time series, the BM&FBOVESPA Dividend Index and the BM&FBOVESPA Public Utilities Index, using multivariate Markov models. The discretization corresponds to three categories: high losses, high profits and the complementary periods of the series. In technical terms, the maximal memory that can be considered for a Markov model can be derived from the size of the alphabet and of the dataset. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the two marginal processes and the partition corresponding to the multivariate Markov chain. In order to estimate the transition probabilities, all the partitions are linked using a copula. In our application this strategy provides a significant improvement in the movement predictions.
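The exponential growth in parameters can be made concrete with a short count. For a full (unrestricted) chain of order o over d series with m categories each, there are m^d joint states, (m^d)^o possible histories, and m^d - 1 free transition probabilities per history. The function name below is an assumption.

```python
def n_free_parameters(alphabet_size, dimension, order):
    """Free transition probabilities of a full multivariate Markov chain."""
    states = alphabet_size ** dimension  # joint states per time step
    histories = states ** order          # possible conditioning pasts
    return histories * (states - 1)      # each row of probabilities sums to one


# Three categories and two series, as in the abstract.
for order in (1, 2, 3):
    print(order, n_free_parameters(3, 2, order))
```

With three categories and two series, the count rises from 72 at order 1 to 5832 at order 3, which illustrates why a dataset of moderate size cannot support a consistent estimate at higher orders.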
NASA Astrophysics Data System (ADS)
Donges, Jonathan F.; Heitzig, Jobst; Beronov, Boyan; Wiedermann, Marc; Runge, Jakob; Feng, Qing Yi; Tupikina, Liubov; Stolbova, Veronika; Donner, Reik V.; Marwan, Norbert; Dijkstra, Henk A.; Kurths, Jürgen
2015-11-01
We introduce the pyunicorn (Pythonic unified complex network and recurrence analysis toolbox) open source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in the language Python. It allows for the construction of functional networks such as climate networks in climatology or functional brain networks in neuroscience representing the structure of statistical interrelationships in large data sets of time series and, subsequently, investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics, or network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems as recorded in uni- and multivariate time series from a non-traditional perspective by means of recurrence quantification analysis, recurrence networks, visibility graphs, and construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology.
Prediction of energy expenditure and physical activity in preschoolers
USDA-ARS's Scientific Manuscript database
Accurate, nonintrusive, and feasible methods are needed to predict energy expenditure (EE) and physical activity (PA) levels in preschoolers. Herein, we validated cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on accelerometry and heart rate (HR) ...
Automated smoother for the numerical decoupling of dynamics models.
Vilela, Marco; Borges, Carlos C H; Vinga, Susana; Vasconcelos, Ana Tereza R; Santos, Helena; Voit, Eberhard O; Almeida, Jonas S
2007-08-21
Structure identification of dynamic models for complex biological systems is the cornerstone of their reverse engineering. Biochemical Systems Theory (BST) offers a particularly convenient solution because its parameters are kinetic-order coefficients which directly identify the topology of the underlying network of processes. We have previously proposed a numerical decoupling procedure that allows the identification of multivariate dynamic models of complex biological processes. While described here within the context of BST, this procedure has a general applicability to signal extraction. Our original implementation relied on artificial neural networks (ANN), which caused slight, undesirable bias during the smoothing of the time courses. As an alternative, we propose here an adaptation of the Whittaker's smoother and demonstrate its role within a robust, fully automated structure identification procedure. In this report we propose a robust, fully automated solution for signal extraction from time series, which is the prerequisite for the efficient reverse engineering of biological systems models. The Whittaker's smoother is reformulated within the context of information theory and extended by the development of adaptive signal segmentation to account for heterogeneous noise structures. The resulting procedure can be used on arbitrary time series with a nonstationary noise process; it is illustrated here with metabolic profiles obtained from in-vivo NMR experiments. The smoothed solution that is free of parametric bias permits differentiation, which is crucial for the numerical decoupling of systems of differential equations. 
The method is applicable in signal extraction from time series with nonstationary noise structure and can be applied in the numerical decoupling of systems of differential equations into algebraic equations, and thus constitutes a rather general tool for the reverse engineering of mechanistic model descriptions from multivariate experimental time series.
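For orientation, the basic Whittaker smoother (without the information-theoretic reformulation or the adaptive segmentation described in the abstract) minimizes the sum of squared residuals plus lambda times the sum of squared second differences, which leads to the linear system (I + lambda * D^T D) z = y. The sketch below solves that system with a small dense solver; the function names and the choice of second-order differences are assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def whittaker_smooth(y, lam):
    """Basic Whittaker smoother with a second-order difference penalty."""
    n = len(y)
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    coef = (1.0, -2.0, 1.0)  # one row of the second-difference matrix D
    for r in range(n - 2):
        for a, ca in enumerate(coef):
            for b, cb in enumerate(coef):
                A[r + a][r + b] += lam * ca * cb  # accumulate lam * D^T D
    return solve(A, list(y))


def roughness(z):
    """Sum of squared second differences."""
    return sum((z[i] - 2 * z[i + 1] + z[i + 2]) ** 2 for i in range(len(z) - 2))


noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smooth = whittaker_smooth(noisy, lam=10.0)
```

Larger values of lam give smoother curves. A straight line passes through unchanged because its second differences are zero.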
Hopke, P K; Liu, C; Rubin, D B
2001-03-01
Many chemical and environmental data sets are complicated by the existence of fully missing values or censored values known to lie below detection thresholds. For example, week-long samples of airborne particulate matter were obtained at Alert, NWT, Canada, between 1980 and 1991, where some of the concentrations of 24 particulate constituents were coarsened in the sense of being either fully missing or below detection limits. To facilitate scientific analysis, it is appealing to create complete data by filling in missing values so that standard complete-data methods can be applied. We briefly review commonly used strategies for handling missing values and focus on the multiple-imputation approach, which generally leads to valid inferences when faced with missing data. Three statistical models are developed for multiply imputing the missing values of airborne particulate matter. We expect that these models are useful for creating multiple imputations in a variety of incomplete multivariate time series data sets.
Detecting synchronization clusters in multivariate time series via coarse-graining of Markov chains.
Allefeld, Carsten; Bialonski, Stephan
2007-12-01
Synchronization cluster analysis is an approach to the detection of underlying structures in data sets of multivariate time series, starting from a matrix R of bivariate synchronization indices. A previous method utilized the eigenvectors of R for cluster identification, analogous to several recent attempts at group identification using eigenvectors of the correlation matrix. All of these approaches assumed a one-to-one correspondence of dominant eigenvectors and clusters, which has however been shown to be wrong in important cases. We clarify the usefulness of eigenvalue decomposition for synchronization cluster analysis by translating the problem into the language of stochastic processes, and derive an enhanced clustering method harnessing recent insights from the coarse-graining of finite-state Markov processes. We illustrate the operation of our method using a simulated system of coupled Lorenz oscillators, and we demonstrate its superior performance over the previous approach. Finally we investigate the question of robustness of the algorithm against small sample size, which is important with regard to field applications.
Multi-frequency complex network from time series for uncovering oil-water flow structure.
Gao, Zhong-Ke; Yang, Yu-Xuan; Fang, Peng-Cheng; Jin, Ning-De; Xia, Cheng-Yi; Hu, Li-Dan
2015-02-04
Uncovering complex oil-water flow structure represents a challenge in diverse scientific disciplines. This challenge stimulates us to develop a new distributed conductance sensor for measuring local flow signals at different positions and then propose a novel approach based on multi-frequency complex network to uncover the flow structures from experimental multivariate measurements. In particular, based on the Fast Fourier transform, we demonstrate how to derive multi-frequency complex network from multivariate time series. We construct complex networks at different frequencies and then detect community structures. Our results indicate that the community structures faithfully represent the structural features of oil-water flow patterns. Furthermore, we investigate the network statistic at different frequencies for each derived network and find that the frequency clustering coefficient makes it possible to uncover the evolution of flow patterns and yields deep insights into the formation of flow structures. Current results present a first step towards a network visualization of complex flow patterns from a community structure perspective.
NASA Astrophysics Data System (ADS)
Zhu, Zhe
2017-08-01
The free and open access to all archived Landsat images in 2008 has completely changed the way of using Landsat data. Many novel change detection algorithms based on Landsat time series have been developed. We present a comprehensive review of four important aspects of change detection studies based on Landsat time series, including frequencies, preprocessing, algorithms, and applications. We observed the trend that the more recent the study, the higher the frequency of Landsat time series used. We reviewed a series of image preprocessing steps, including atmospheric correction, cloud and cloud shadow detection, and composite/fusion/metrics techniques. We divided all change detection algorithms into six categories, including thresholding, differencing, segmentation, trajectory classification, statistical boundary, and regression. Within each category, six major characteristics of different algorithms, such as frequency, change index, univariate/multivariate, online/offline, abrupt/gradual change, and sub-pixel/pixel/spatial were analyzed. Moreover, some of the widely-used change detection algorithms were also discussed. Finally, we reviewed different change detection applications by dividing these applications into two categories, change target and change agent detection.
ERIC Educational Resources Information Center
Jung, Kwanghee; Takane, Yoshio; Hwang, Heungsun; Woodward, Todd S.
2012-01-01
We propose a new method of structural equation modeling (SEM) for longitudinal and time series data, named Dynamic GSCA (Generalized Structured Component Analysis). The proposed method extends the original GSCA by incorporating a multivariate autoregressive model to account for the dynamic nature of data taken over time. Dynamic GSCA also…
NASA Astrophysics Data System (ADS)
Niedzielski, Tomasz; Kosek, Wiesław
2008-02-01
This article presents the application of a multivariate prediction technique for predicting universal time (UT1-UTC), length of day (LOD) and the axial component of atmospheric angular momentum (AAM χ 3). The multivariate predictions of LOD and UT1-UTC are generated by means of the combination of (1) least-squares (LS) extrapolation of models for annual, semiannual, 18.6-year, 9.3-year oscillations and for the linear trend, and (2) multivariate autoregressive (MAR) stochastic prediction of LS residuals (LS + MAR). The MAR technique enables the use of the AAM χ 3 time-series as the explanatory variable for the computation of LOD or UT1-UTC predictions. In order to evaluate the performance of this approach, two other prediction schemes are also applied: (1) LS extrapolation, (2) combination of LS extrapolation and univariate autoregressive (AR) prediction of LS residuals (LS + AR). The multivariate predictions of AAM χ 3 data, however, are computed as a combination of the extrapolation of the LS model for annual and semiannual oscillations and the LS + MAR. The AAM χ 3 predictions are also compared with LS extrapolation and LS + AR prediction. It is shown that the predictions of LOD and UT1-UTC based on LS + MAR taking into account the axial component of AAM are more accurate than the predictions of LOD and UT1-UTC based on LS extrapolation or on LS + AR. In particular, the UT1-UTC predictions based on LS + MAR during El Niño/La Niña events exhibit considerably smaller prediction errors than those calculated by means of LS or LS + AR. The AAM χ 3 time-series is predicted using LS + MAR with higher accuracy than applying LS extrapolation itself in the case of medium-term predictions (up to 100 days in the future). However, the predictions of AAM χ 3 reveal the best accuracy for LS + AR.
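The LS + MAR scheme described above can be summarized in two equations. The notation is an assumption introduced here for illustration (the abstract gives no symbols): x(t) is the predicted series, T_k are the fixed periods (annual, semiannual, 18.6-year and 9.3-year), and the vector of LS residuals stacks the residuals of the target series together with those of the AAM χ 3 series.

```latex
% Deterministic least-squares model: linear trend plus harmonics with known periods T_k
x(t) = a + b\,t + \sum_{k=1}^{K}\left[ c_k \cos\frac{2\pi t}{T_k}
       + s_k \sin\frac{2\pi t}{T_k} \right] + \varepsilon(t)

% Multivariate autoregressive model of order p for the stacked LS residuals
\boldsymbol{\varepsilon}_t = \sum_{j=1}^{p} A_j\, \boldsymbol{\varepsilon}_{t-j} + \mathbf{e}_t
```

The prediction is then the extrapolated deterministic part plus the MAR forecast of the residuals. The LS + AR scheme is the special case in which the residual vector has a single component.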
Structural Equation Modeling of Multivariate Time Series
ERIC Educational Resources Information Center
du Toit, Stephen H. C.; Browne, Michael W.
2007-01-01
The covariance structure of a vector autoregressive process with moving average residuals (VARMA) is derived. It differs from other available expressions for the covariance function of a stationary VARMA process and is compatible with current structural equation methodology. Structural equation modeling programs, such as LISREL, may therefore be…
Measures of dependence for multivariate Lévy distributions
NASA Astrophysics Data System (ADS)
Boland, J.; Hurd, T. R.; Pivato, M.; Seco, L.
2001-02-01
Recent statistical analysis of a number of financial databases is summarized. Increasing agreement is found that logarithmic equity returns show a certain type of asymptotic behavior of the largest events, namely that the probability density functions have power law tails with an exponent α≈3.0. This behavior does not vary much over different stock exchanges or over time, despite large variations in trading environments. The present paper proposes a class of multivariate distributions which generalizes the observed qualities of univariate time series. A new consequence of the proposed class is the "spectral measure" which completely characterizes the multivariate dependences of the extreme tails of the distribution. This measure on the unit sphere in M-dimensions, in principle completely general, can be determined empirically by looking at extreme events. If it can be observed and determined, it will prove to be of importance for scenario generation in portfolio risk management.
A Regularized Linear Dynamical System Framework for Multivariate Time Series Analysis.
Liu, Zitao; Hauskrecht, Milos
2015-01-01
Linear Dynamical System (LDS) is an elegant mathematical framework for modeling and learning Multivariate Time Series (MTS). However, in general, it is difficult to set the dimension of an LDS's hidden state space. A small number of hidden states may not be able to model the complexities of a MTS, while a large number of hidden states can lead to overfitting. In this paper, we study learning methods that impose various regularization penalties on the transition matrix of the LDS model and propose a regularized LDS learning framework (rLDS) which aims to (1) automatically shut down LDSs' spurious and unnecessary dimensions, and consequently, address the problem of choosing the optimal number of hidden states; (2) prevent the overfitting problem given a small amount of MTS data; and (3) support accurate MTS forecasting. To learn the regularized LDS from data we incorporate a second order cone program and a generalized gradient descent method into the Maximum a Posteriori framework and use Expectation Maximization to obtain a low-rank transition matrix of the LDS model. We propose two priors for modeling the matrix which lead to two instances of our rLDS. We show that our rLDS is able to recover well the intrinsic dimensionality of the time series dynamics and it improves the predictive performance when compared to baselines on both synthetic and real-world MTS datasets.
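For reference, a linear dynamical system in its standard state-space form is written below. The symbols are the conventional ones and are assumptions here, since the abstract does not fix a notation: z_t is the hidden state, y_t the observed multivariate time series, A the transition matrix that the regularization acts on, and C the emission matrix.

```latex
% State transition and observation equations with Gaussian noise
\mathbf{z}_t = A\,\mathbf{z}_{t-1} + \mathbf{w}_t, \qquad \mathbf{w}_t \sim \mathcal{N}(\mathbf{0}, Q)
\mathbf{y}_t = C\,\mathbf{z}_t + \mathbf{v}_t, \qquad \mathbf{v}_t \sim \mathcal{N}(\mathbf{0}, R)

% Generic MAP objective: data log-likelihood plus a log-prior on the transition matrix
\hat{\theta} = \arg\max_{\theta}\; \log p(\mathbf{y}_{1:T} \mid \theta) + \log p(A)
```

In this reading, a prior that favours a low-rank A effectively switches off hidden dimensions that the data do not need, which is how the number of hidden states is chosen automatically.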
Nonlinear Dynamics, Poor Data, and What to Make of Them?
NASA Astrophysics Data System (ADS)
Ghil, M.; Zaliapin, I. V.
2005-12-01
The analysis of univariate or multivariate time series provides crucial information to describe, understand, and predict variability in the geosciences. The discovery and implementation of a number of novel methods for extracting useful information from time series has recently revitalized this classical field of study. Considerable progress has also been made in interpreting the information so obtained in terms of dynamical systems theory. In this talk we will describe the connections between time series analysis and nonlinear dynamics, discuss signal-to-noise enhancement, and present some of the novel methods for spectral analysis. These fall into two broad categories: (i) methods that try to ferret out regularities of the time series; and (ii) methods aimed at describing the characteristics of irregular processes. The former include singular-spectrum analysis (SSA), the multi-taper method (MTM), and the maximum-entropy method (MEM). The various steps, as well as the advantages and disadvantages of these methods, will be illustrated by their application to several important climatic time series, such as the Southern Oscillation Index (SOI), paleoclimatic time series, and instrumental temperature time series. The SOI index captures major features of interannual climate variability and is used extensively in its prediction. The other time series cover interdecadal and millennial time scales. The second category includes the calculation of fractional dimension, leading Lyapunov exponents, and Hurst exponents. More recently, multi-trend analysis (MTA), binary-decomposition analysis (BDA), and related methods have attempted to describe the structure of time series that include both regular and irregular components. Within the time available, I will try to give a feeling for how these methods work, and how well.
Multivariate singular spectrum analysis and the road to phase synchronization
NASA Astrophysics Data System (ADS)
Groth, Andreas; Ghil, Michael
2010-05-01
Singular spectrum analysis (SSA) and multivariate SSA (M-SSA) are based on the classical work of Kosambi (1943), Loeve (1945) and Karhunen (1946) and are closely related to principal component analysis. They have been introduced into information theory by Bertero, Pike and co-workers (1982, 1984) and into dynamical systems analysis by Broomhead and King (1986a,b). Ghil, Vautard and associates have applied SSA and M-SSA to the temporal and spatio-temporal analysis of short and noisy time series in climate dynamics and other fields in the geosciences since the late 1980s. M-SSA provides insight into the unknown or partially known dynamics of the underlying system by decomposing the delay-coordinate phase space of a given multivariate time series into a set of data-adaptive orthonormal components. These components can be classified essentially into trends, oscillatory patterns and noise, and allow one to reconstruct a robust "skeleton" of the dynamical system's structure. For an overview we refer to Ghil et al. (Rev. Geophys., 2002). In this talk, we present M-SSA in the context of synchronization analysis and illustrate its ability to unveil information about the mechanisms behind the adjustment of rhythms in coupled dynamical systems. The focus of the talk is on the special case of phase synchronization between coupled chaotic oscillators (Rosenblum et al., PRL, 1996). Several ways of measuring phase synchronization are in use, and the robust definition of a reasonable phase for each oscillator is critical in each of them. We illustrate here the advantages of M-SSA in the automatic identification of oscillatory modes and in drawing conclusions about the transition to phase synchronization. Without using any a priori definition of a suitable phase, we show that M-SSA is able to detect phase synchronization in a chain of coupled chaotic oscillators (Osipov et al., PRE, 1996). Recently, Muller et al. (PRE, 2005) and Allefeld et al. (Intl. J. Bif. Chaos, 2007) have demonstrated the usefulness of principal component analysis in detecting phase synchronization from multivariate time series. The present talk provides a generalization of this idea and presents a robust implementation thereof via M-SSA.
Comparative Research of Navy Voluntary Education at Operational Commands
2017-03-01
return on investment, ROI, logistic regression, multivariate analysis, descriptive statistics, Markov, time-series, linear programming
An Efficient Pattern Mining Approach for Event Detection in Multivariate Temporal Data
Batal, Iyad; Cooper, Gregory; Fradkin, Dmitriy; Harrison, James; Moerchen, Fabian; Hauskrecht, Milos
2015-01-01
This work proposes a pattern mining approach to learn event detection models from complex multivariate temporal data, such as electronic health records. We present Recent Temporal Pattern mining, a novel approach for efficiently finding predictive patterns for event detection problems. This approach first converts the time series data into time-interval sequences of temporal abstractions. It then constructs more complex time-interval patterns backward in time using temporal operators. We also present the Minimal Predictive Recent Temporal Patterns framework for selecting a small set of predictive and non-spurious patterns. We apply our methods for predicting adverse medical events in real-world clinical data. The results demonstrate the benefits of our methods in learning accurate event detection models, which is a key step for developing intelligent patient monitoring and decision support systems. PMID:26752800
Interpretation of a compositional time series
NASA Astrophysics Data System (ADS)
Tolosana-Delgado, R.; van den Boogaart, K. G.
2012-04-01
Common methods for multivariate time series analysis use linear operations, from the definition of a time-lagged covariance/correlation to the prediction of new outcomes. However, when the time series response is a composition (a vector of positive components showing the relative importance of a set of parts in a total, like percentages and proportions), linear operations suffer from several problems. For instance, it has long been recognised that (auto/cross-)correlations between raw percentages are spurious, depending more on which other components are being considered than on any natural link between the components of interest. Also, a long-term forecast of a composition in models with a linear trend will ultimately predict negative components. In general terms, compositional data should not be treated on a raw scale, but after a log-ratio transformation (Aitchison, 1986: The statistical analysis of compositional data. Chapman and Hall). This is because the information conveyed by compositional data is relative, as stated in their definition. The principle of working in coordinates allows one to apply any sort of multivariate analysis to a log-ratio transformed composition, as long as this transformation is invertible. This principle applies fully to time series analysis. We will discuss how results (both auto/cross-correlation functions and predictions) can be back-transformed, viewed and interpreted in a meaningful way. One view is to use the exhaustive set of all possible pairwise log-ratios, which allows the results to be expressed as D(D - 1)/2 separate, interpretable sets of one-dimensional models showing the behaviour of each possible pairwise log-ratio. Another view is the interpretation of estimated coefficients or correlations back-transformed in terms of compositions. These two views are compatible and complementary. These issues are illustrated with time series of seasonal precipitation patterns at different rain gauges of the USA.
In this data set, the proportion of annual precipitation falling in winter, spring, summer and autumn is considered a 4-component time series. Three invertible log-ratios are defined for calculations, balancing rainfall in autumn vs. winter, in summer vs. spring, and in autumn-winter vs. spring-summer. Results suggest a 2-year correlation range, and certain oscillatory behaviour in the last balance, which does not occur in the other two.
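The three balances named in this abstract can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: it assumes the standard isometric log-ratio balance formula, and the function names and the example seasonal shares are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def to_balances(winter, spring, summer, autumn):
    """Map a 4-part seasonal composition to three log-ratio balances."""
    b1 = math.log(autumn / winter) / SQRT2   # autumn vs. winter
    b2 = math.log(summer / spring) / SQRT2   # summer vs. spring
    # autumn-winter vs. spring-summer (two parts against two parts)
    b3 = 0.5 * math.log((autumn * winter) / (spring * summer))
    return b1, b2, b3

def from_balances(b1, b2, b3):
    """Invert the balances back to proportions that sum to 1."""
    clr = (
        -b1 / SQRT2 + b3 / 2.0,  # winter
        -b2 / SQRT2 - b3 / 2.0,  # spring
        b2 / SQRT2 - b3 / 2.0,   # summer
        b1 / SQRT2 + b3 / 2.0,   # autumn
    )
    weights = [math.exp(c) for c in clr]
    total = sum(weights)
    return tuple(w / total for w in weights)

# Hypothetical shares of annual precipitation (winter, spring, summer, autumn).
shares = (0.40, 0.20, 0.10, 0.30)
balances = to_balances(*shares)
recovered = from_balances(*balances)
```

Because any real-valued balances map back to strictly positive proportions, a forecast made in balance coordinates cannot produce negative components, which is the failure of raw-scale linear trends that the abstract points out.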
Multivariate time series modeling of short-term system scale irrigation demand
NASA Astrophysics Data System (ADS)
Perera, Kushan C.; Western, Andrew W.; George, Biju; Nawarathna, Bandara
2015-12-01
Travel time limits the ability of irrigation system operators to react to short-term irrigation demand fluctuations that result from variations in weather, including very hot periods and rainfall events, as well as the various other pressures and opportunities that farmers face. Short-term system-wide irrigation demand forecasts can assist in system operation. Here we developed a multivariate time series (ARMAX) model to forecast irrigation demands with respect to aggregated service points flows (IDCGi, ASP) and off take regulator flows (IDCGi, OTR) across 5 command areas, which included the area covered by four irrigation channels and the study area. These command area specific ARMAX models forecast daily IDCGi, ASP and IDCGi, OTR 1-5 days ahead using the real time flow data recorded at the service points and the uppermost regulators and observed meteorological data collected from automatic weather stations. The model efficiency and the predictive performance were quantified using the root mean squared error (RMSE), Nash-Sutcliffe model efficiency coefficient (NSE), anomaly correlation coefficient (ACC) and mean square skill score (MSSS). During the evaluation period, NSE for IDCGi, ASP and IDCGi, OTR across the 5 command areas ranged from 0.78 to 0.98. These models were capable of generating skillful forecasts (MSSS ⩾ 0.5 and ACC ⩾ 0.6) of IDCGi, ASP and IDCGi, OTR for all 5 lead days, and the IDCGi, ASP and IDCGi, OTR forecasts were better than using the long term monthly mean irrigation demand. Overall, the predictive performance of the ARMAX time series models was higher than in almost all the previous studies of which we are aware.
Further, IDCGi, ASP and IDCGi, OTR forecasts have improved the operators' ability to react to near-future irrigation demand fluctuations, as the developed ARMAX time series models were self-adaptive and reflected the short-term changes in the irrigation demand with respect to various pressures and opportunities that farmers face, such as changing water policy, continued development of water markets, drought and changing technology.
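Three of the skill measures named in this abstract (RMSE, NSE and MSSS) can be sketched as follows. This is a minimal illustration using the textbook definitions; the flow values are hypothetical, and the use of a constant long-term mean as the MSSS reference forecast is an assumption made for the example.

```python
import math

def rmse(obs, fc):
    """Root mean squared error between observations and forecasts."""
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(obs, fc)) / len(obs))

def nse(obs, fc):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - f) ** 2 for o, f in zip(obs, fc))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def msss(obs, fc, ref):
    """Mean square skill score of `fc` relative to a reference forecast `ref`."""
    mse_fc = sum((o - f) ** 2 for o, f in zip(obs, fc)) / len(obs)
    mse_ref = sum((o - r) ** 2 for o, r in zip(obs, ref)) / len(obs)
    return 1.0 - mse_fc / mse_ref

# Hypothetical daily demand values (arbitrary units).
observed = [10.0, 12.0, 15.0, 11.0]
forecast = [11.0, 12.0, 14.0, 11.0]
long_term_mean = [12.0] * 4  # assumed reference forecast

scores = (rmse(observed, forecast), nse(observed, forecast),
          msss(observed, forecast, long_term_mean))
```

An MSSS above zero means the forecast beats the reference, which is how the abstract's threshold of MSSS ⩾ 0.5 should be read.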
Multivariate Autoregressive Modeling and Granger Causality Analysis of Multiple Spike Trains
Krumin, Michael; Shoham, Shy
2010-01-01
Recent years have seen the emergence of microelectrode arrays and optical methods allowing simultaneous recording of spiking activity from populations of neurons in various parts of the nervous system. The analysis of multiple neural spike train data could benefit significantly from existing methods for multivariate time-series analysis which have proven to be very powerful in the modeling and analysis of continuous neural signals like EEG signals. However, those methods have not generally been well adapted to point processes. Here, we use our recent results on correlation distortions in multivariate Linear-Nonlinear-Poisson spiking neuron models to derive generalized Yule-Walker-type equations for fitting "hidden" Multivariate Autoregressive models. We use this new framework to perform Granger causality analysis in order to extract the directed information flow pattern in networks of simulated spiking neurons. We discuss the relative merits and limitations of the new method. PMID:20454705
ERIC Educational Resources Information Center
Mistler, Stephen A.; Enders, Craig K.
2017-01-01
Multiple imputation methods can generally be divided into two broad frameworks: joint model (JM) imputation and fully conditional specification (FCS) imputation. JM draws missing values simultaneously for all incomplete variables using a multivariate distribution, whereas FCS imputes variables one at a time from a series of univariate conditional…
Evaluation of statistical protocols for quality control of ecosystem carbon dioxide fluxes
Jorge F. Perez-Quezada; Nicanor Z. Saliendra; William E. Emmerich; Emilio A. Laca
2007-01-01
The process of quality control of micrometeorological and carbon dioxide (CO2) flux data can be subjective and may lack repeatability, which would undermine the results of many studies. Multivariate statistical methods and time series analysis were used together and independently to detect and replace outliers in CO2 flux...
ERIC Educational Resources Information Center
Hamaker, Ellen L.; Dolan, Conor V.; Molenaar, Peter C. M.
2005-01-01
Results obtained with interindividual techniques in a representative sample of a population are not necessarily generalizable to the individual members of this population. In this article the specific condition is presented that must be satisfied to generalize from the interindividual level to the intraindividual level. A way to investigate…
Whomersley, P; Schratzberger, M; Huxham, M; Bates, H; Rees, H
2007-01-01
Sewage sludge was disposed of in Liverpool Bay for over 100 years. Annual amounts increased from 0.5 million tonnes per annum in 1900 to approximately 2 million tonnes per annum by 1995. Macrofauna and a suite of environmental variables were collected at a station adjacent to, and a reference station distant from, the disposal site over 13 years, spanning a pre- (1990-1998) and post- (1999-2003) cessation period. Univariate and multivariate analyses of the time-series data showed significant community differences between reference and disposal site stations and multivariate analyses revealed station-specific community development post-disposal. Temporal variability of communities collected at the disposal station post-cessation was higher than during years of disposal, when temporally stable dominance patterns of disturbance-tolerant species had established. Alterations of community structure post-disturbance reflected successional changes possibly driven by facilitation. Subtle faunistic changes at the Liverpool Bay disposal site indicate that the near-field effects of the disposal of sewage sludge were small and therefore could be considered environmentally acceptable.
Grigoryeva, Lyudmila; Henriques, Julie; Larger, Laurent; Ortega, Juan-Pablo
2014-07-01
Reservoir computing is a recently introduced machine learning paradigm that has already shown excellent performance in the processing of empirical data. We study a particular kind of reservoir computers called time-delay reservoirs that are constructed out of the sampling of the solution of a time-delay differential equation and show their good performance in the forecasting of the conditional covariances associated with multivariate discrete-time nonlinear stochastic processes of VEC-GARCH type as well as in the prediction of factual daily market realized volatilities computed with intraday quotes, using as training input daily log-return series of moderate size. We tackle some problems associated with the lack of task-universality for individually operating reservoirs and propose a solution based on the use of parallel arrays of time-delay reservoirs. Copyright © 2014 Elsevier Ltd. All rights reserved.
Rio, Daniel E.; Rawlings, Robert R.; Woltz, Lawrence A.; Gilman, Jodi; Hommer, Daniel W.
2013-01-01
A linear time-invariant model based on statistical time series analysis in the Fourier domain for single subjects is further developed and applied to functional MRI (fMRI) blood-oxygen level-dependent (BOLD) multivariate data. This methodology was originally developed to analyze multiple stimulus input evoked response BOLD data. However, to analyze clinical data generated using a repeated measures experimental design, the model has been extended to handle multivariate time series data and demonstrated on control and alcoholic subjects taken from data previously analyzed in the temporal domain. Analysis of BOLD data is typically carried out in the time domain where the data has a high temporal correlation. These analyses generally employ parametric models of the hemodynamic response function (HRF) where prewhitening of the data is attempted using autoregressive (AR) models for the noise. However, this data can be analyzed in the Fourier domain. Here, assumptions made on the noise structure are less restrictive, and hypothesis tests can be constructed based on voxel-specific nonparametric estimates of the hemodynamic transfer function (HRF in the Fourier domain). This is especially important for experimental designs involving multiple states (either stimulus or drug induced) that may alter the form of the response function. PMID:23840281
Kaier, K; Meyer, E; Dettenkofer, M; Frank, U
2010-10-01
Two multivariate time-series analyses were carried out to identify the impact of bed occupancy rates, turnover intervals and the average length of hospital stay on the spread of multidrug-resistant bacteria in a teaching hospital. Epidemiological data on the incidences of meticillin-resistant Staphylococcus aureus (MRSA) and extended-spectrum beta-lactamase (ESBL)-producing bacteria were collected. Time-series of bed occupancy rates, turnover intervals and the average length of stay were tested for inclusion in the models as independent variables. Incidence was defined as nosocomial cases per 1000 patient-days. This included all patients infected or colonised with MRSA/ESBL more than 48h after admission. Between January 2003 and July 2008, a mean incidence of 0.15 nosocomial MRSA cases was identified. ESBL was not included in the surveillance until January 2005. Between January 2005 and July 2008 the mean incidence of nosocomial ESBL was also 0.15 cases per 1000 patient-days. The two multivariate models demonstrate a temporal relationship between bed occupancy rates in general wards and the incidence of nosocomial MRSA and ESBL. Similarly, the temporal relationship between the monthly average length of stay in intensive care units (ICUs) and the incidence of nosocomial MRSA and ESBL was demonstrated. Overcrowding in general wards and long periods of ICU stay were identified as factors influencing the spread of multidrug-resistant bacteria in hospital settings. Copyright 2010 The Hospital Infection Society. Published by Elsevier Ltd. All rights reserved.
Moorman, J. Randall; Delos, John B.; Flower, Abigail A.; Cao, Hanqing; Kovatchev, Boris P.; Richman, Joshua S.; Lake, Douglas E.
2014-01-01
We have applied principles of statistical signal processing and non-linear dynamics to analyze heart rate time series from premature newborn infants in order to assist in the early diagnosis of sepsis, a common and potentially deadly bacterial infection of the bloodstream. We began with the observation of reduced variability and transient decelerations in heart rate interval time series for hours up to days prior to clinical signs of illness. We find that measurements of standard deviation, sample asymmetry and sample entropy are highly related to imminent clinical illness. We developed multivariable statistical predictive models, and an interface to display the real-time results to clinicians. Using this approach, we have observed numerous cases in which incipient neonatal sepsis was diagnosed and treated without any clinical illness at all. This review focuses on the mathematical and statistical time series approaches used to detect these abnormal heart rate characteristics and present predictive monitoring information to the clinician. PMID:22026974
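Sample entropy, one of the measures named in this abstract, can be sketched in plain Python. This is a minimal illustration of the textbook definition, not the authors' monitoring code: the embedding length m = 2 and the tolerance of 0.2 standard deviations are conventional choices assumed here, and the input series are synthetic.

```python
import math
import random
import statistics

def sample_entropy(series, m=2, r=0.2):
    """Sample entropy with tolerance r times the standard deviation of the series."""
    n = len(series)
    tol = r * statistics.pstdev(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches are excluded
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    matches += 1
        return matches

    b = count_matches(m)      # template matches of length m
    a = count_matches(m + 1)  # matches that persist at length m + 1
    if a == 0 or b == 0:
        return math.inf       # undefined for this series length and tolerance
    return -math.log(a / b)

# A perfectly regular series has zero sample entropy.
regular = [0.0, 1.0] * 20
# A synthetic irregular series (seeded Gaussian noise) for comparison.
rng = random.Random(0)
irregular = [rng.gauss(0.0, 1.0) for _ in range(200)]

entropy_regular = sample_entropy(regular)
entropy_irregular = sample_entropy(irregular)
```

Lower values indicate a more regular, more predictable series; the abstract reports that this measure, together with standard deviation and sample asymmetry, is highly related to imminent clinical illness.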
Multivariate survivorship analysis using two cross-sectional samples.
Hill, M E
1999-11-01
As an alternative to survival analysis with longitudinal data, I introduce a method that can be applied when one observes the same cohort in two cross-sectional samples collected at different points in time. The method allows for the estimation of log-probability survivorship models that estimate the influence of multiple time-invariant factors on survival over a time interval separating two samples. This approach can be used whenever the survival process can be adequately conceptualized as an irreversible single-decrement process (e.g., mortality, the transition to first marriage among a cohort of never-married individuals). Using data from the Integrated Public Use Microdata Series (Ruggles and Sobek 1997), I illustrate the multivariate method through an investigation of the effects of race, parity, and educational attainment on the survival of older women in the United States.
Multiscale analysis of information dynamics for linear multivariate processes.
Faes, Luca; Montalto, Alessandro; Stramaglia, Sebastiano; Nollo, Giandomenico; Marinazzo, Daniele
2016-08-01
In the study of complex physical and physiological systems represented by multivariate time series, an issue of great interest is the description of the system dynamics over a range of different temporal scales. While information-theoretic approaches to the multiscale analysis of complex dynamics are being increasingly used, the theoretical properties of the applied measures are poorly understood. This study introduces for the first time a framework for the analytical computation of information dynamics for linear multivariate stochastic processes explored at different time scales. After showing that the multiscale processing of a vector autoregressive (VAR) process introduces a moving average (MA) component, we describe how to represent the resulting VARMA process using state-space (SS) models and how to exploit the SS model parameters to compute analytical measures of information storage and information transfer for the original and rescaled processes. The framework is then used to quantify multiscale information dynamics for simulated unidirectionally and bidirectionally coupled VAR processes, showing that rescaling may lead to insightful patterns of information storage and transfer but also to potentially misleading behaviors.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steed, Chad A.; Halsey, William; Dehoff, Ryan; ...
2017-02-16
Flexible visual analysis of long, high-resolution, and irregularly sampled time series data from multiple sensor streams is a challenge in several domains. In the field of additive manufacturing, this capability is critical for realizing the full potential of large-scale 3D printers. Here, we propose a visual analytics approach that helps additive manufacturing researchers acquire a deep understanding of patterns in log and imagery data collected by 3D printers. Our specific goals include discovering patterns related to defects and system performance issues, optimizing build configurations to avoid defects, and increasing production efficiency. We introduce Falcon, a new visual analytics system that allows users to interactively explore large, time-oriented data sets from multiple linked perspectives. Falcon provides overviews, detailed views, and unique segmented time series visualizations, all with adjustable scale options. To illustrate the effectiveness of Falcon at providing thorough and efficient knowledge discovery, we present a practical case study involving experts in additive manufacturing and data from a large-scale 3D printer. The techniques described are applicable to the analysis of any quantitative time series, though the focus of this paper is on additive manufacturing.
Bayesian wavelet PCA methodology for turbomachinery damage diagnosis under uncertainty
NASA Astrophysics Data System (ADS)
Xu, Shengli; Jiang, Xiaomo; Huang, Jinzhi; Yang, Shuhua; Wang, Xiaofang
2016-12-01
Centrifugal compressors often suffer from various defects such as impeller cracking, resulting in forced outage of the total plant. Damage diagnostics and condition monitoring of such a turbomachinery system have become an increasingly important and powerful tool to prevent potential failure in components and reduce unplanned forced outage and further maintenance costs, while improving reliability, availability and maintainability of a turbomachinery system. This paper presents a probabilistic signal processing methodology for damage diagnostics using multiple time history data collected from different locations of a turbomachine, considering data uncertainty and multivariate correlation. The proposed methodology is based on the integration of three advanced state-of-the-art data mining techniques: discrete wavelet packet transform, Bayesian hypothesis testing, and probabilistic principal component analysis. The multiresolution wavelet analysis approach is employed to decompose a time series signal into different levels of wavelet coefficients. These coefficients represent multiple time-frequency resolutions of a signal. Bayesian hypothesis testing is then applied to each level of wavelet coefficient to remove possible imperfections. The ratio of posterior odds Bayesian approach provides a direct means to assess whether there is imperfection in the decomposed coefficients, thus avoiding over-denoising. Power spectral density estimated by the Welch method is utilized to evaluate the effectiveness of the Bayesian wavelet cleansing method. Furthermore, the probabilistic principal component analysis approach is developed to reduce dimensionality of multiple time series and to address multivariate correlation and data uncertainty for damage diagnostics.
The proposed methodology and generalized framework are demonstrated with a set of sensor data collected from a real-world centrifugal compressor with impeller cracks, through both time series and contour analyses of vibration signal and principal components.
Application of multivariate autoregressive spectrum estimation to ULF waves
NASA Technical Reports Server (NTRS)
Ioannidis, G. A.
1975-01-01
The estimation of the power spectrum of a time series by fitting a finite autoregressive model to the data has recently found widespread application in the physical sciences. The extension of this method to the analysis of vector time series is presented here through its application to ULF waves observed in the magnetosphere by the ATS 6 synchronous satellite. Autoregressive spectral estimates of the power and cross-power spectra of these waves are computed with computer programs developed by the author and are compared with the corresponding Blackman-Tukey spectral estimates. The resulting spectral density matrices are then analyzed to determine the direction of propagation and polarization of the observed waves.
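The idea of autoregressive spectral estimation can be sketched for the simpler univariate case. This is a minimal illustration, not the author's programs: it assumes the autocovariances are already available, solves the Yule-Walker equations with the Levinson-Durbin recursion, and evaluates the AR spectrum up to a normalization convention. The function names and the AR(1) example values are hypothetical.

```python
import cmath
import math

def levinson_durbin(autocov, order):
    """Solve the Yule-Walker equations for x_t = sum_k a[k-1] * x_{t-k} + e_t."""
    coeffs = []
    noise_var = autocov[0]
    for k in range(1, order + 1):
        acc = autocov[k] - sum(coeffs[j] * autocov[k - 1 - j] for j in range(k - 1))
        refl = acc / noise_var  # reflection coefficient at lag k
        coeffs = [coeffs[j] - refl * coeffs[k - 2 - j] for j in range(k - 1)] + [refl]
        noise_var *= 1.0 - refl * refl
    return coeffs, noise_var

def ar_spectrum(coeffs, noise_var, freq):
    """AR power spectral density at `freq`, given in cycles per sample."""
    transfer = 1.0 - sum(
        a * cmath.exp(-2j * math.pi * freq * (k + 1)) for k, a in enumerate(coeffs)
    )
    return noise_var / abs(transfer) ** 2

# Theoretical autocovariances of an AR(1) process with coefficient 0.5
# and unit innovation variance (lags 0 and 1).
autocov = [4.0 / 3.0, 2.0 / 3.0]
coeffs, noise_var = levinson_durbin(autocov, order=1)
spectrum_at_zero = ar_spectrum(coeffs, noise_var, 0.0)
```

In the vector case described in the abstract, the scalar coefficients become matrices and the result is a spectral density matrix whose off-diagonal entries are the cross-power spectra.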
Alternative method to validate the seasonal land cover regions of the conterminous United States
Zhiliang Zhu; Donald O. Ohlen; Raymond L. Czaplewski; Robert E. Burgan
1996-01-01
An accuracy assessment method involving double sampling and the multivariate composite estimator has been used to validate the prototype seasonal land cover characteristics database of the conterminous United States. The database consists of 159 land cover classes, classified using time series of 1990 1-km satellite data and augmented with ancillary data including...
The Recoverability of P-Technique Factor Analysis
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
2009-01-01
It seems that just when we are about to lay P-technique factor analysis finally to rest as obsolete because of newer, more sophisticated multivariate time-series models using latent variables--dynamic factor models--it rears its head to inform us that an obituary may be premature. We present the results of some simulations demonstrating that even…
High Resolution Time Series Observations of Bio-Optical and Physical Variability in the Arabian Sea
1998-09-30
1995-October 20, 1995). Multi-variable moored systems (MVMS) were deployed by our group at 35 and 80 m. The MVMS utilizes a VMCM to measure currents...similar to that of the UCSB MVMSs. WORK COMPLETED Our MVMS interdisciplinary systems with sampling intervals of a few minutes were placed on a mooring
USDA-ARS's Scientific Manuscript database
Prediction equations of energy expenditure (EE) using accelerometers and miniaturized heart rate (HR) monitors have been developed in older children and adults but not in preschool-aged children. Because the relationships between accelerometer counts (ACs), HR, and EE are confounded by growth and ma...
NASA Astrophysics Data System (ADS)
Donges, Jonathan; Heitzig, Jobst; Beronov, Boyan; Wiedermann, Marc; Runge, Jakob; Feng, Qing Yi; Tupikina, Liubov; Stolbova, Veronika; Donner, Reik; Marwan, Norbert; Dijkstra, Henk; Kurths, Jürgen
2016-04-01
We introduce the pyunicorn (Pythonic unified complex network and recurrence analysis toolbox) open source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in the language Python. It allows for the construction of functional networks such as climate networks in climatology or functional brain networks in neuroscience representing the structure of statistical interrelationships in large data sets of time series and, subsequently, investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics, or network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems as recorded in uni- and multivariate time series from a non-traditional perspective by means of recurrence quantification analysis, recurrence networks, visibility graphs, and construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology. pyunicorn is available online at https://github.com/pik-copan/pyunicorn. Reference: J.F. Donges, J. Heitzig, B. Beronov, M. Wiedermann, J. Runge, Q.-Y. Feng, L. Tupikina, V. Stolbova, R.V. Donner, N. Marwan, H.A. Dijkstra, and J. Kurths, Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package, Chaos 25, 113101 (2015), DOI: 10.1063/1.4934554, Preprint: arxiv.org:1507.01571 [physics.data-an].
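The recurrence-based methods mentioned in this abstract start from a recurrence matrix. The sketch below shows that idea in plain Python for a scalar series; it deliberately does not use or imitate the pyunicorn API, and the function names, the threshold and the example series are all hypothetical.

```python
def recurrence_matrix(series, eps):
    """Binary matrix with entry 1 where two states are within distance eps."""
    return [[1 if abs(xi - xj) <= eps else 0 for xj in series] for xi in series]

def recurrence_rate(matrix):
    """Fraction of recurrent entries, main diagonal included."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)

# Hypothetical scalar time series that alternates between two regimes.
series = [0.0, 1.0, 0.1, 1.1, 0.0]
matrix = recurrence_matrix(series, eps=0.15)
rate = recurrence_rate(matrix)
```

Reading the matrix as the adjacency matrix of a graph gives a recurrence network, which is one way the package combines nonlinear time series analysis with complex network theory.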
The Statistical Consulting Center for Astronomy (SCCA)
NASA Technical Reports Server (NTRS)
Akritas, Michael
2001-01-01
The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like chi-squared minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers with the use of sophisticated tools and helped match these tools with specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression; density estimation and smoothing; general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997.
It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.
Huang, Xin; Zeng, Jun; Zhou, Lina; Hu, Chunxiu; Yin, Peiyuan; Lin, Xiaohui
2016-08-31
Time-series metabolomics studies can provide insight into the dynamics of disease development and facilitate the discovery of prospective biomarkers. To improve the performance of early risk identification, a new strategy for analyzing time-series data based on dynamic networks (ATSD-DN) in a systematic time dimension is proposed. In ATSD-DN, the non-overlapping ratio was applied to measure the changes in feature ratios during the process of disease development and to construct dynamic networks. Dynamic concentration analysis and network topological structure analysis were performed to extract early warning information. This strategy was applied to the study of time-series lipidomics data from a stepwise hepatocarcinogenesis rat model. A ratio of lyso-phosphatidylcholine (LPC) 18:1/free fatty acid (FFA) 20:5 was identified as the potential biomarker for hepatocellular carcinoma (HCC). It can be used to classify HCC and non-HCC rats, and the area under the curve values in the discovery and external validation sets were 0.980 and 0.972, respectively. This strategy was also compared with a weighted relative difference accumulation algorithm (wRDA), multivariate empirical Bayes statistics (MEBA) and support vector machine-recursive feature elimination (SVM-RFE). The better performance of ATSD-DN suggests its potential for a more complete presentation of time-series changes and effective extraction of early warning information.
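The area under the curve reported for the ratio biomarker can be computed directly from its pairwise definition. This is a minimal sketch, not the authors' pipeline: the ratio values are invented for illustration, and it assumes that a higher ratio indicates HCC, which the abstract does not state.

```python
def auc(positive_scores, negative_scores):
    """Probability that a random positive scores above a random negative (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical biomarker ratios for HCC and non-HCC animals.
hcc = [2.1, 1.8, 2.5, 1.2]
non_hcc = [0.9, 1.1, 1.3, 0.7]
area = auc(hcc, non_hcc)
```

An area of 1.0 means the ratio separates the two groups perfectly, and 0.5 means it does no better than chance.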
Tuarob, Suppawong; Tucker, Conrad S; Kumara, Soundar; Giles, C Lee; Pincus, Aaron L; Conroy, David E; Ram, Nilam
2017-04-01
It is believed that anomalous mental states such as stress and anxiety not only cause suffering for the individuals, but also lead to tragedies in some extreme cases. The ability to predict the mental state of an individual at both current and future time periods could prove critical to healthcare practitioners. Currently, the practical way to predict an individual's mental state is through mental examinations that involve psychological experts performing the evaluations. However, such methods can be time- and resource-consuming, limiting their broad applicability to a wide population. Furthermore, some individuals may also be unaware of their mental states or may feel uncomfortable expressing themselves during the evaluations. Hence, their anomalous mental states could remain undetected for a prolonged period of time. The objective of this work is to demonstrate the ability of advanced machine-learning-based approaches to generate mathematical models that predict current and future mental states of an individual. The problem of mental state prediction is transformed into a time series forecasting problem, where an individual is represented as a multivariate time series stream of monitored physical and behavioral attributes. A personalized mathematical model is then automatically generated to capture the dependencies among these attributes, which is used for prediction of mental states for each individual. In particular, we first illustrate the drawbacks of traditional multivariate time series forecasting methodologies such as vector autoregression. Then, we show that such issues could be mitigated by using machine learning regression techniques which are modified for capturing temporal dependencies in time series data. A case study using the data from 150 human participants illustrates that the proposed machine-learning-based forecasting methods are more suitable for high-dimensional psychological data than the traditional vector autoregressive model in terms of both magnitude of error and directional accuracy. These results not only present a successful usage of machine learning techniques in psychological studies, but also serve as a building block for multiple medical applications that could rely on an automated system to gauge individuals' mental states. Copyright © 2017 Elsevier Inc. All rights reserved.
Giassi, Pedro; Okida, Sergio; Oliveira, Maurício G; Moraes, Raimes
2013-11-01
Short-term cardiovascular regulation mediated by the sympathetic and parasympathetic branches of the autonomic nervous system has been investigated by multivariate autoregressive (MVAR) modeling, providing insightful analysis. MVAR models employ, as inputs, heart rate (HR), systolic blood pressure (SBP) and respiratory waveforms. ECG (from which HR series is obtained) and respiratory flow waveform (RFW) can be easily sampled from the patients. Nevertheless, the available methods for acquisition of beat-to-beat SBP measurements during exams hamper the wider use of MVAR models in clinical research. Recent studies show an inverse correlation between pulse wave transit time (PWTT) series and SBP fluctuations. PWTT is the time interval between the ECG R-wave peak and photoplethysmography waveform (PPG) base point within the same cardiac cycle. This study investigates the feasibility of using inverse PWTT (IPWTT) series as an alternative input to SBP for MVAR modeling of the cardiovascular regulation. For that, HR, RFW, and IPWTT series acquired from volunteers during postural changes and autonomic blockade were used as input of MVAR models. Obtained results show that IPWTT series can be used as input of MVAR models, replacing SBP measurements in order to overcome practical difficulties related to the continuous sampling of the SBP during clinical exams.
NASA Astrophysics Data System (ADS)
Ronsmans, Gaétane; Wespes, Catherine; Hurtmans, Daniel; Clerbaux, Cathy; Coheur, Pierre-François
2018-04-01
This study aims to understand the spatial and temporal variability of HNO3 total columns in terms of explanatory variables. To achieve this, multiple linear regressions are used to fit satellite-derived time series of HNO3 daily averaged total columns. First, an analysis of the IASI 9-year time series (2008-2016) is conducted based on various equivalent latitude bands. The strong and systematic denitrification of the southern polar stratosphere is observed very clearly. It is also possible to distinguish, within the polar vortex, three regions which are differently affected by the denitrification. Three exceptional denitrification episodes in 2011, 2014 and 2016 are also observed in the Northern Hemisphere, due to unusually low arctic temperatures. The time series are then fitted by multivariate regressions to identify what variables are responsible for HNO3 variability in global distributions and time series, and to quantify their respective influence. Out of an ensemble of proxies (annual cycle, solar flux, quasi-biennial oscillation, multivariate ENSO index, Arctic and Antarctic oscillations and volume of polar stratospheric clouds), only those defined as significant (p-value < 0.05) by a selection algorithm are retained for each equivalent latitude band. Overall, the regression gives a good representation of HNO3 variability, with especially good results at high latitudes (60-80% of the observed variability explained by the model). The regressions show the dominance of annual variability in all latitudinal bands, which is related to specific chemistry and dynamics depending on the latitudes. We find that the polar stratospheric clouds (PSCs) also have a major influence in the polar regions, and that their inclusion in the model improves the correlation coefficients and the residuals. However, there is still a relatively large portion of HNO3 variability that remains unexplained by the model, especially in the intertropical regions, where factors not included in the regression model (such as vegetation fires or lightning) may be at play.
NASA Astrophysics Data System (ADS)
Rogowitz, Bernice E.; Rabenhorst, David A.; Gerth, John A.; Kalin, Edward B.
1996-04-01
This paper describes a set of visual techniques, based on principles of human perception and cognition, which can help users analyze and develop intuitions about tabular data. Collections of tabular data are widely available, including, for example, multivariate time series data, customer satisfaction data, stock market performance data, multivariate profiles of companies and individuals, and scientific measurements. In our approach, we show how visual cues can help users perform a number of data mining tasks, including identifying correlations and interaction effects, finding clusters and understanding the semantics of cluster membership, identifying anomalies and outliers, and discovering multivariate relationships among variables. These cues are derived from psychological studies on perceptual organization, visual search, perceptual scaling, and color perception. These visual techniques are presented as a complement to the statistical and algorithmic methods more commonly associated with these tasks, and provide an interactive interface for the human analyst.
Multivariate frequency domain analysis of protein dynamics
NASA Astrophysics Data System (ADS)
Matsunaga, Yasuhiro; Fuchigami, Sotaro; Kidera, Akinori
2009-03-01
Multivariate frequency domain analysis (MFDA) is proposed to characterize the collective vibrational dynamics of proteins obtained by a molecular dynamics (MD) simulation. MFDA performs principal component analysis (PCA) for a bandpass filtered multivariate time series using the multitaper method of spectral estimation. By applying MFDA to MD trajectories of bovine pancreatic trypsin inhibitor, we determined the collective vibrational modes in the frequency domain, which were identified by their vibrational frequencies and eigenvectors. At near zero temperature, the vibrational modes determined by MFDA agreed well with those calculated by normal mode analysis. At 300 K, the vibrational modes exhibited characteristic features that were considerably different from the principal modes of the static distribution given by the standard PCA. The influences of aqueous environments were discussed based on two different sets of vibrational modes, one derived from an MD simulation in water and the other from a simulation in vacuum. Using the varimax rotation, an algorithm from multivariate statistical analysis, the representative orthogonal set of eigenmodes was determined at each vibrational frequency.
NASA Astrophysics Data System (ADS)
Azami, Hamed; Escudero, Javier
2017-01-01
Multiscale entropy (MSE) is an appealing tool to characterize the complexity of time series over multiple temporal scales. Recent developments in the field have tried to extend the MSE technique in different ways. Building on these trends, we propose the so-called refined composite multivariate multiscale fuzzy entropy (RCmvMFE) whose coarse-graining step uses variance (RCmvMFEσ2) or mean (RCmvMFEμ). We investigate the behavior of these multivariate methods on multichannel white Gaussian and 1/f noise signals, and two publicly available biomedical recordings. Our simulations demonstrate that RCmvMFEσ2 and RCmvMFEμ lead to more stable results and are less sensitive to the signals' length in comparison with the other existing multivariate multiscale entropy-based methods. The classification results also show that using both the variance and mean in the coarse-graining step offers complexity profiles with complementary information for biomedical signal analysis. We have also made all the Matlab code used in this paper freely available.
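The coarse-graining step named in this abstract can be illustrated with a minimal sketch. It is not the authors' Matlab code: the function names, the `offset` argument, and the choice of the population variance are assumptions made for illustration, and the fuzzy-entropy computation itself is left out.

```python
from statistics import fmean, pvariance

def coarse_grain(channel, scale, stat="mean", offset=0):
    """Summarise non-overlapping windows of length `scale` by mean or variance.

    `offset` shifts the start of the first window; sweeping it over
    0..scale-1 yields the shifted series that a composite scheme would
    combine (hypothetical parameter, not taken from the paper).
    """
    # Population variance is an assumption; the abstract does not specify it.
    summarise = fmean if stat == "mean" else pvariance
    usable = (len(channel) - offset) // scale
    return [
        summarise(channel[offset + k * scale : offset + (k + 1) * scale])
        for k in range(usable)
    ]

def coarse_grain_multivariate(channels, scale, stat="mean", offset=0):
    """Apply the same coarse-graining independently to every channel."""
    return [coarse_grain(ch, scale, stat, offset) for ch in channels]
```

An entropy estimate would then be computed on each coarse-grained multichannel series, one scale at a time, to obtain the complexity profile.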
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
NASA Astrophysics Data System (ADS)
Aoyama, Hideaki; Fujiwara, Yoshi; Ikeda, Yuichi; Iyetomi, Hiroshi; Souma, Wataru; Yoshikawa, Hiroshi
2017-07-01
Preface; Foreword; Acknowledgements; List of tables; List of figures; Prologue; 1. Introduction: reconstructing macroeconomics; 2. Basic concepts in statistical physics and stochastic models; 3. Income and firm-size distributions; 4. Productivity distribution and related topics; 5. Multivariate time-series analysis; 6. Business cycles; 7. Price dynamics and inflation/deflation; 8. Complex network, community analysis, visualization; 9. Systemic risks; Appendix A: computer program for beginners; Epilogue; Bibliography; Index.
Contamination Event Detection with Multivariate Time-Series Data in Agricultural Water Monitoring
Mao, Yingchi; Qi, Hai; Ping, Ping; Li, Xiaofang
2017-01-01
Time series data of multiple water quality parameters are obtained from the water sensor networks deployed in the agricultural water supply network. The accurate and efficient detection and warning of contamination events to prevent pollution from spreading is one of the most important issues when pollution occurs. In order to comprehensively reduce the event detection deviation, a spatial–temporal-based event detection approach with multivariate time-series data for water quality monitoring (M-STED) was proposed. The M-STED approach includes three parts. First, M-STED adopts a Rule K algorithm to select backbone nodes as the nodes of the connected dominating set (CDS), which forward the sensed data of multiple water parameters. Second, the state of each backbone node at the current timestamp is determined with back propagation neural network models and sequential Bayesian analysis. Third, a spatial model is established with Bayesian networks to estimate the state of the backbone nodes at the next timestamp and to trace the “outlier” node to its neighborhood to detect a contamination event. The experimental results indicate that, with M-STED, the average detection rate is more than 80% and the false detection rate is lower than 9%. The M-STED approach can improve the rate of detection by about 40% and reduce the false alarm rate by about 45%, compared with S-STED, an event detection algorithm that uses a single water parameter. Moreover, the proposed M-STED exhibits better performance in terms of detection delay and scalability. PMID:29207535
Predictive modeling of EEG time series for evaluating surgery targets in epilepsy patients.
Steimer, Andreas; Müller, Michael; Schindler, Kaspar
2017-05-01
During the last 20 years, predictive modeling in epilepsy research has largely been concerned with the prediction of seizure events, whereas the inference of effective brain targets for resective surgery has received surprisingly little attention. In this exploratory pilot study, we describe a distributional clustering framework for the modeling of multivariate time series and use it to predict the effects of brain surgery in epilepsy patients. By analyzing the intracranial EEG, we demonstrate how patients who became seizure free after surgery are clearly distinguished from those who did not. More specifically, for 5 out of 7 patients who obtained seizure freedom (= Engel class I) our method predicts that the specific collection of brain areas that were actually resected during surgery yields a markedly lower posterior probability for the seizure-related clusters, when compared to the resection of random or empty collections. Conversely, for 4 out of 5 Engel class III/IV patients who still suffer from postsurgical seizures, the performance of the actually resected collection is not significantly better than the performance displayed by random or empty collections. As the number of possible collections ranges into billions and more, this is a substantial contribution to a problem that today is still solved by visual EEG inspection. Apart from epilepsy research, our clustering methodology is also of general interest for the analysis of multivariate time series and as a generative model for temporally evolving functional networks in the neurosciences and beyond. Hum Brain Mapp 38:2509-2531, 2017. © 2017 Wiley Periodicals, Inc.
Faes, Luca; Nollo, Giandomenico; Krohova, Jana; Czippelova, Barbora; Turianikova, Zuzana; Javorka, Michal
2017-07-01
To fully elucidate the complex physiological mechanisms underlying the short-term autonomic regulation of heart period (H), systolic and diastolic arterial pressure (S, D) and respiratory (R) variability, the joint dynamics of these variables need to be explored using multivariate time series analysis. This study proposes the utilization of information-theoretic measures to measure causal interactions between nodes of the cardiovascular/cardiorespiratory network and to assess the nature (synergistic or redundant) of these directed interactions. Indexes of information transfer and information modification are extracted from the H, S, D and R series measured from healthy subjects in a resting state and during postural stress. Computations are performed in the framework of multivariate linear regression, using bootstrap techniques to assess on a single-subject basis the statistical significance of each measure and of its transitions across conditions. We find patterns of information transfer and modification which are related to specific cardiovascular and cardiorespiratory mechanisms in resting conditions and to their modification induced by the orthostatic stress.
Ramdani, Sofiane; Bonnet, Vincent; Tallon, Guillaume; Lagarde, Julien; Bernard, Pierre Louis; Blain, Hubert
2016-08-01
Entropy measures are often used to quantify the regularity of postural sway time series. Recent methodological developments provided both multivariate and multiscale approaches allowing the extraction of complexity features from physiological signals; see "Dynamical complexity of human responses: A multivariate data-adaptive framework," in Bulletin of Polish Academy of Science and Technology, vol. 60, p. 433, 2012. The resulting entropy measures are good candidates for the analysis of bivariate postural sway signals exhibiting nonstationarity and multiscale properties. These methods are dependent on several input parameters such as embedding parameters. Using two data sets collected from institutionalized frail older adults, we numerically investigate the behavior of a recent multivariate and multiscale entropy estimator; see "Multivariate multiscale entropy: A tool for complexity analysis of multichannel data," Physical Review E, vol. 84, p. 061918, 2011. We propose criteria for the selection of the input parameters. Using these optimal parameters, we statistically compare the multivariate and multiscale entropy values of postural sway data of non-faller subjects to those of fallers. These two groups are discriminated by the resulting measures over multiple time scales. We also demonstrate that the typical parameter settings proposed in the literature lead to entropy measures that do not distinguish the two groups. This last result confirms the importance of the selection of appropriate input parameters.
NASA Astrophysics Data System (ADS)
Jia, Xiaoliang; An, Haizhong; Sun, Xiaoqi; Huang, Xuan; Gao, Xiangyun
2016-04-01
The globalization and regionalization of crude oil trade inevitably give rise to differences in crude oil prices. Understanding the pattern of the mutual propagation of crude oil prices is essential for analyzing the development of global oil trade. Previous research has focused mainly on the fuzzy long- or short-term one-to-one propagation of bivariate oil prices, generally ignoring various patterns of periodical multivariate propagation. This study presents a wavelet-based network approach to help uncover the multipath propagation of multivariable crude oil prices in a joint time-frequency period. The weekly oil spot prices of the OPEC member states from June 1999 to March 2011 are adopted as the sample data. First, we used wavelet analysis to find different subseries based on an optimal decomposing scale to describe the periodical feature of the original oil price time series. Second, a complex network model was constructed based on an optimal threshold selection to describe the structural feature of multivariable oil prices. Third, Bayesian network analysis (BNA) was conducted to find the probabilistic causal relationships based on periodical structural features to describe the various patterns of periodical multivariable propagation. Finally, the significance of the leading and intermediary oil prices is discussed. These findings are beneficial for the implementation of periodical target-oriented pricing policies and investment strategies.
Tormene, Paolo; Giorgino, Toni; Quaglini, Silvana; Stefanelli, Mario
2009-01-01
The purpose of this study was to assess the performance of a real-time ("open-end") version of the dynamic time warping (DTW) algorithm for the recognition of motor exercises. Given a possibly incomplete input stream of data and a reference time series, the open-end DTW algorithm computes both the size of the prefix of reference which is best matched by the input, and the dissimilarity between the matched portions. The algorithm was used to provide real-time feedback to neurological patients undergoing motor rehabilitation. We acquired a dataset of multivariate time series from a sensorized long-sleeve shirt which contains 29 strain sensors distributed on the upper limb. Seven typical rehabilitation exercises were recorded in several variations, both correctly and incorrectly executed, and at various speeds, totaling a data set of 840 time series. Nearest-neighbour classifiers were built according to the outputs of open-end DTW alignments and their global counterparts on exercise pairs. The classifiers were also tested on well-known public datasets from heterogeneous domains. Nonparametric tests show that (1) on full time series the two algorithms achieve the same classification accuracy (p-value = 0.32); (2) on partial time series, classifiers based on open-end DTW have a far higher accuracy (kappa = 0.898 versus kappa = 0.447; p < 10^-5); and (3) the prediction of the matched fraction follows closely the ground truth (root mean square < 10%). The results hold for the motor rehabilitation and the other datasets tested, as well. The open-end variant of the DTW algorithm is suitable for the classification of truncated quantitative time series, even in the presence of noise. Early recognition and accurate class prediction can be achieved, provided that enough variance is available over the time span of the reference. Therefore, the proposed technique expands the use of DTW to a wider range of applications, such as real-time biofeedback systems.
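The open-end idea described in this abstract (consume the whole input, but let the reference stop at whichever prefix matches best) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name `open_end_dtw` is hypothetical, the local cost is Euclidean distance between samples, the step pattern is the basic symmetric one, and no path-length normalization is applied.

```python
import math

def open_end_dtw(query, reference):
    """Align a complete query against the best-matching prefix of reference.

    Both arguments are sequences of equal-dimension points (tuples of floats).
    Returns (prefix_length, cumulative_distance).
    """
    n, m = len(query), len(reference)
    inf = math.inf
    # cost[i][j]: minimal cumulative cost of aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(query[i - 1], reference[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # query advances, reference stays
                cost[i][j - 1],      # reference advances, query stays
                cost[i - 1][j - 1],  # both advance
            )
    # Open end: the query is fully consumed, the reference may end at any j.
    best_j = min(range(1, m + 1), key=lambda j: cost[n][j])
    return best_j, cost[n][best_j]
```

For a query that reproduces the first three samples of a five-sample reference, the function reports a matched prefix of length 3 with zero dissimilarity; dividing the prefix length by the reference length gives a matched fraction of the kind the abstract refers to.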
Oczeretko, Edward; Swiatecka, Jolanta; Kitlas, Agnieszka; Laudanski, Tadeusz; Pierzynski, Piotr
2006-01-01
In physiological research, we often study multivariate data sets, containing two or more simultaneously recorded time series. The aim of this paper is to present the cross-correlation and the wavelet cross-correlation methods to assess synchronization between contractions in different topographic regions of the uterus. From a medical point of view, it is important to identify time delays between contractions, which may be of potential diagnostic significance in various pathologies. The cross-correlation was computed in a moving window with a width corresponding to approximately two or three contractions. As a result, the running cross-correlation function was obtained. The propagation% parameter assessed from this function allows quantitative description of synchronization in bivariate time series. In general, the uterine contraction signals are very complicated. Wavelet transforms provide insight into the structure of the time series at various frequencies (scales). To show the changes of the propagation% parameter along scales, a wavelet running cross-correlation was used. First, the continuous wavelet transforms of the uterine contraction signals were computed; afterwards, a running cross-correlation analysis was conducted for each pair of transformed time series. The findings show that running functions are very useful in the analysis of uterine contractions.
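A running cross-correlation of the kind described above can be sketched as follows. The function names, the use of the Pearson coefficient at each lag, and the window and step parameters are assumptions for illustration; the paper's propagation% parameter is not reproduced here because the abstract does not define it.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def best_lag(x, y, max_lag):
    """Return (lag, correlation) maximising the correlation of x[t] with y[t + lag].

    A positive lag means y is delayed with respect to x.
    """
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[: len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: len(y) + lag]
        scores[lag] = pearson(a, b)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]

def running_best_lag(x, y, window, step, max_lag):
    """Slide a window over both series and estimate the delay in each window."""
    return [
        best_lag(x[s : s + window], y[s : s + window], max_lag)
        for s in range(0, len(x) - window + 1, step)
    ]
```

Applying `running_best_lag` to two recording sites gives one delay estimate per window, which is the raw material for judging whether contractions propagate from one region to the other.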
NASA Astrophysics Data System (ADS)
Ferrera, Elisabetta; Giammanco, Salvatore; Cannata, Andrea; Montalto, Placido
2013-04-01
From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol® probe located on the upper NE flank of Mt. Etna volcano, close either to the Piano Provenzana fault or to the NE-Rift. Seismic and volcanological data were analyzed together with radon data. We also analyzed air and soil temperature, barometric pressure, snow and rain fall data. In order to find possible correlations among the above parameters, and hence to reveal possible anomalies in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R² = 0.31). When using 100-day time windows, the R² values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, as in the multivariate linear regression analysis, the summer period is characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach allows the relations among different signals to be studied in either the time or the frequency domain. It confirmed the results of the previous methods, but also allowed us to recognize correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitations, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Our work suggests that an accurate analysis of the relations among distinct signals requires different techniques that give complementary analytical information. In particular, the wavelet analysis proved very effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
Fast Multivariate Search on Large Aviation Datasets
NASA Technical Reports Server (NTRS)
Bhaduri, Kanishka; Zhu, Qiang; Oza, Nikunj C.; Srivastava, Ashok N.
2010-01-01
Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical monitoring, and financial systems. Domain experts are often interested in searching for interesting multivariate patterns from these MTS databases which can contain up to several gigabytes of data. Surprisingly, research on MTS search is very limited. Most existing work only supports queries with the same length of data, or queries on a fixed set of variables. In this paper, we propose an efficient and flexible subsequence search framework for massive MTS databases that, for the first time, enables querying on any subset of variables with arbitrary time delays between them. We propose two provably correct algorithms to solve this problem: (1) an R-tree Based Search (RBS) which uses Minimum Bounding Rectangles (MBR) to organize the subsequences, and (2) a List Based Search (LBS) algorithm which uses sorted lists for indexing. We demonstrate the performance of these algorithms using two large MTS databases from the aviation domain, each containing several millions of observations. Both these tests show that our algorithms have very high prune rates (>95%) thus needing actual
HydroClimATe: hydrologic and climatic analysis toolkit
Dickinson, Jesse; Hanson, Randall T.; Predmore, Steven K.
2014-01-01
The potential consequences of climate variability and climate change have been identified as major issues for the sustainability and availability of the worldwide water resources. Unlike global climate change, climate variability represents deviations from the long-term state of the climate over periods of a few years to several decades. Currently, rich hydrologic time-series data are available, but the combination of data preparation and statistical methods developed by the U.S. Geological Survey as part of the Groundwater Resources Program is relatively unavailable to hydrologists and engineers who could benefit from estimates of climate variability and its effects on periodic recharge and water-resource availability. This report documents HydroClimATe, a computer program for assessing the relations between variable climatic and hydrologic time-series data. HydroClimATe was developed for a Windows operating system. The software includes statistical tools for (1) time-series preprocessing, (2) spectral analysis, (3) spatial and temporal analysis, (4) correlation analysis, and (5) projections. The time-series preprocessing tools include spline fitting, standardization using a normal or gamma distribution, and transformation by a cumulative departure. The spectral analysis tools include discrete Fourier transform, maximum entropy method, and singular spectrum analysis. The spatial and temporal analysis tool is empirical orthogonal function analysis. The correlation analysis tools are linear regression and lag correlation. The projection tools include autoregressive time-series modeling and generation of many realizations. These tools are demonstrated in four examples that use stream-flow discharge data, groundwater-level records, gridded time series of precipitation data, and the Multivariate ENSO Index.
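Of the preprocessing tools listed above, the cumulative departure transformation is the simplest to illustrate. The sketch below assumes the common definition (running sum of deviations from the series mean); the abstract does not state the exact form or any normalization HydroClimATe applies, and the function name is hypothetical.

```python
from itertools import accumulate
from statistics import fmean

def cumulative_departure(series):
    """Running sum of departures from the long-term mean of the series.

    Rising segments mark periods above the mean, falling segments periods below it.
    """
    mean = fmean(series)
    return list(accumulate(value - mean for value in series))
```

For example, `cumulative_departure([1.0, 2.0, 3.0])` returns `[-1.0, -1.0, 0.0]`; by construction the final value of the transformed series is zero up to rounding.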
Time-Varying Transition Probability Matrix Estimation and Its Application to Brand Share Analysis
Chiba, Tomoaki; Hino, Hideitsu; Akaho, Shotaro; Murata, Noboru
2017-01-01
In a product market or stock market, different products or stocks compete for the same consumers or purchasers. We propose a method to estimate the time-varying transition matrix of the product share using a multivariate time series of the product share. The method is based on the assumption that each of the observed time series of shares is a stationary distribution of the underlying Markov processes characterized by transition probability matrices. We estimate transition probability matrices for every observation under natural assumptions. We demonstrate, on a real-world dataset of the share of automobiles, that the proposed method can find intrinsic transition of shares. The resulting transition matrices reveal interesting phenomena, for example, the change in flows between TOYOTA group and GM group for the fiscal year where TOYOTA group’s sales beat GM’s sales, which is a reasonable scenario. PMID:28076383
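The modelling assumption in this abstract, that an observed share vector is the stationary distribution of a Markov chain, can be made concrete with a small sketch of the forward relation. It does not implement the paper's estimator, which solves the harder inverse problem of recovering the matrix from the shares; the function name, the row-stochastic convention, and the fixed iteration count are assumptions.

```python
def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution pi of a row-stochastic matrix P.

    Repeatedly applies pi <- pi P starting from the uniform distribution.
    Convergence is assumed (e.g. an irreducible, aperiodic chain).
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    total = sum(pi)  # renormalise to guard against rounding drift
    return [p / total for p in pi]
```

With two brands and `P = [[0.9, 0.1], [0.5, 0.5]]` (90% of the first brand's customers stay, half of the second brand's customers switch), the shares settle at 5/6 and 1/6.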
Kelava, Augustin; Muma, Michael; Deja, Marlene; Dagdagan, Jack Y.; Zoubir, Abdelhak M.
2015-01-01
Emotion eliciting situations are accompanied by changes of multiple variables associated with subjective, physiological and behavioral responses. The quantification of the overall simultaneous synchrony of psychophysiological reactions plays a major role in emotion theories and has received increased attention in recent years. From a psychometric perspective, the reactions represent multivariate non-stationary intra-individual time series. In this paper, a new time-frequency based latent variable approach for the quantification of the synchrony of the responses is presented. The approach is applied to empirical data, collected during an emotion eliciting situation. The results are compared with a complementary inter-individual approach of Hsieh et al. (2011). Finally, the proposed approach is discussed in the context of emotion theories, and possible future applications and limitations are provided. PMID:25653624
1984-10-26
test for independence; ...of the product life estimator; dependent risks...the failure times associated with different failure modes when we really should use a bivariate (or multivariate) distribution, then what is the...dependencies may be present, then what is the magnitude of the estimation error? The third specific aim will attempt to obtain bounds on the
NASA Astrophysics Data System (ADS)
Song, Biao; Lu, Dan; Peng, Ming; Li, Xia; Zou, Ye; Huang, Meizhen; Lu, Feng
2017-02-01
Raman spectroscopy is developed as a fast and non-destructive method for the discrimination and classification of hydroxypropyl methyl cellulose (HPMC) samples. 44 E series and 41 K series HPMC samples are measured by a self-developed portable Raman spectrometer (Hx-Raman), which is excited by a 785 nm diode laser and covers a spectral range of 200-2700 cm⁻¹ with a resolution (FWHM) of 6 cm⁻¹. Multivariate analysis is applied for discrimination of E series from K series. By methods of principal components analysis (PCA) and Fisher discriminant analysis (FDA), a discrimination result with sensitivity of 90.91% and specificity of 95.12% is achieved. The corresponding receiver operating characteristic (ROC) is 0.99, indicating the accuracy of the predictive model. This result demonstrates the prospect of portable Raman spectrometers for rapid, non-destructive classification and discrimination of E series and K series samples of HPMC.
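As a worked check of the reported rates, the sketch below reproduces them from the sample sizes. The counts of correctly classified samples (40 of 44 and 39 of 41) are not stated in the abstract; they are an assumption chosen because they match the reported percentages, as is the choice of E series as the positive class.

```python
def rate(correct, total):
    """Percentage of correctly classified samples in one class."""
    return 100.0 * correct / total

# Assumed counts (not given in the abstract): E series treated as positives.
sensitivity = rate(40, 44)  # correctly identified E series samples
specificity = rate(39, 41)  # correctly identified K series samples
print(f"sensitivity = {sensitivity:.2f}%, specificity = {specificity:.2f}%")
```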
Tan, Chao; Zhao, Jia; Dong, Feng
2015-03-01
Flow behavior characterization is important for understanding gas-liquid two-phase flow mechanics and for establishing a model that describes it. Electrical Resistance Tomography (ERT) provides information regarding flow conditions in the different directions where the sensing electrodes are installed. We extracted the multivariate sample entropy (MSampEn) by treating ERT data as a multivariate time series. The dynamic experimental results indicate that the MSampEn is sensitive to complexity changes of flow patterns including bubbly flow, stratified flow, plug flow and slug flow. MSampEn can characterize the flow behavior in different directions of two-phase flow, and reveal the transition between flow patterns when flow velocity changes. The proposed method is effective for analyzing two-phase flow pattern transition by incorporating information of different scales and different spatial directions. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Jinajai, Nattapong; Rattanavich, Saowalak
2015-01-01
This research aims to study the development of ninth grade students' reading and writing abilities and interests in learning English taught through computer-assisted instruction (CAI) based on the top-level structure (TLS) method. An experimental group time series design was used, and the data was analyzed by multivariate analysis of variance…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fast, J; Zhang, Q; Tilp, A
Significantly improved returns in their aerosol chemistry data can be achieved via the development of a value-added product (VAP) of deriving OA components, called Organic Aerosol Components (OACOMP). OACOMP is primarily based on multivariate analysis of the measured organic mass spectral matrix. The key outputs of OACOMP are the concentration time series and the mass spectra of OA factors that are associated with distinct sources, formation and evolution processes, and physicochemical properties.
Reliable Early Classification on Multivariate Time Series with Numerical and Categorical Attributes
2015-05-22
design a procedure of feature extraction in REACT named MEG (Mining Equivalence classes with shapelet Generators) based on the concept of...Equivalence Classes Mining [12, 15]. MEG can efficiently and effectively generate the discriminative features. In addition, several strategies are proposed...technique of parallel computing [4] to propose a process of parallel MEG for substantially reducing the computational overhead of discovering shapelet
The Multivariate Largest Lyapunov Exponent as an Age-Related Metric of Quiet Standing Balance
Liu, Kun; Wang, Hongrui; Xiao, Jinzhuang
2015-01-01
The largest Lyapunov exponent has been researched as a metric of balance ability during human quiet standing. However, the sensitivity and accuracy of this measurement method are not good enough for clinical use. The present research proposes a metric of the human body's standing balance ability based on the multivariate largest Lyapunov exponent (MLLE), which can quantify human standing balance. The dynamic multivariate time series of ankle, knee, and hip were measured by multiple electrical goniometers. Thirty-six normal people of different ages participated in the test. With the acquired data, the multivariate largest Lyapunov exponent was calculated. Finally, the results of the proposed approach were analysed and compared with the traditional method, for which the largest Lyapunov exponent and power spectral density from the centre of pressure were also calculated. The following conclusions can be drawn. The multivariate largest Lyapunov exponent differentiates balance more strongly in eyes-closed conditions. The MLLE value reflects the overall coordination between multisegment movements. Individuals of different ages can be distinguished by their MLLE values. Human standing stability decreases with increasing age. PMID:26064182
Multivariate-$t$ nonlinear mixed models with application to censored multi-outcome AIDS studies.
Lin, Tsung-I; Wang, Wan-Lun
2017-10-01
In multivariate longitudinal HIV/AIDS studies, multi-outcome repeated measures on each patient over time may contain outliers, and the viral loads are often subject to an upper or lower limit of detection depending on the quantification assays. In this article, we consider an extension of the multivariate nonlinear mixed-effects model by adopting a joint multivariate-$t$ distribution for random effects and within-subject errors and taking the censoring information of multiple responses into account. The proposed model is called the multivariate-$t$ nonlinear mixed-effects model with censored responses (MtNLMMC), allowing for analyzing multi-outcome longitudinal data exhibiting nonlinear growth patterns with censorship and fat-tailed behavior. Utilizing the Taylor-series linearization method, a pseudo-data version of the expectation conditional maximization either (ECME) algorithm is developed for iteratively carrying out maximum likelihood estimation. We illustrate our techniques with two data examples from HIV/AIDS studies. Experimental results signify that the MtNLMMC performs favorably compared to its Gaussian analogue and some existing approaches. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A generalized conditional heteroscedastic model for temperature downscaling
NASA Astrophysics Data System (ADS)
Modarres, R.; Ouarda, T. B. M. J.
2014-11-01
This study describes a method for deriving the time varying second order moment, or heteroscedasticity, of local daily temperature and its association to large Coupled Canadian General Circulation Models predictors. This is carried out by applying a multivariate generalized autoregressive conditional heteroscedasticity (MGARCH) approach to construct the conditional variance-covariance structure between General Circulation Models (GCMs) predictors and maximum and minimum temperature time series during 1980-2000. Two MGARCH specifications, namely diagonal VECH and dynamic conditional correlation (DCC), are applied and 25 GCM predictors were selected for a bivariate temperature heteroscedastic modeling. It is observed that the conditional covariance between predictors and temperature is not very strong and mostly depends on the interaction between the random process governing temporal variation of predictors and predictands. The DCC model reveals a time varying conditional correlation between GCM predictors and temperature time series. No remarkable increasing or decreasing change is observed for correlation coefficients between GCM predictors and observed temperature during 1980-2000, while weak winter-summer seasonality is clear for both conditional covariance and correlation. Furthermore, the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) stationarity test and the Brock-Dechert-Scheinkman (BDS) nonlinearity test showed that GCM predictors, temperature and their conditional correlation time series are nonlinear but stationary during 1980-2000. However, the degree of nonlinearity of temperature time series is higher than most of the GCM predictors.
MEMD-enhanced multivariate fuzzy entropy for the evaluation of complexity in biomedical signals.
Azami, Hamed; Smith, Keith; Escudero, Javier
2016-08-01
Multivariate multiscale entropy (mvMSE) has been proposed as a combination of the coarse-graining process and multivariate sample entropy (mvSE) to quantify the irregularity of multivariate signals. However, both the coarse-graining process and mvSE may not be reliable for short signals. Although the coarse-graining process can be replaced with multivariate empirical mode decomposition (MEMD), the relative instability of mvSE for short signals remains a problem. Here, we address this issue by proposing the multivariate fuzzy entropy (mvFE) with a new fuzzy membership function. The results using white Gaussian noise show that the mvFE leads to more reliable and stable results, especially for short signals, in comparison with mvSE. Accordingly, we propose MEMD-enhanced mvFE to quantify the complexity of signals. The characteristics of brain regions influenced by partial epilepsy are investigated by focal and non-focal electroencephalogram (EEG) time series. In this sense, the proposed MEMD-enhanced mvFE and mvSE are employed to discriminate focal EEG signals from non-focal ones. The results demonstrate the MEMD-enhanced mvFE values have a smaller coefficient of variation in comparison with those obtained by the MEMD-enhanced mvSE, even for long signals. The results also show that the MEMD-enhanced mvFE has better performance to quantify focal and non-focal signals compared with multivariate multiscale permutation entropy.
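The coarse-graining step mentioned above can be sketched as follows. This is a minimal illustration of the standard non-overlapping averaging used in multiscale entropy, applied channel by channel; the function names and the toy data are hypothetical, and the entropy computation itself is not shown.

```python
def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(series) // scale  # trailing samples that do not fill a window are dropped
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def coarse_grain_multivariate(channels, scale):
    """Apply the same coarse-graining to every channel of a multivariate signal."""
    return [coarse_grain(ch, scale) for ch in channels]

channels = [[1, 2, 3, 4, 5, 6],
            [10, 20, 30, 40, 50, 60]]
print(coarse_grain_multivariate(channels, 2))
# [[1.5, 3.5, 5.5], [15.0, 35.0, 55.0]]
```

At scale s a signal of length N shrinks to about N/s samples, which is one reason entropy estimates at larger scales become unreliable for short signals.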
NASA Astrophysics Data System (ADS)
Gocheva-Ilieva, S.; Stoimenova, M.; Ivanov, A.; Voynikova, D.; Iliev, I.
2016-10-01
Fine particulate matter PM2.5 and PM10 air pollutants are a serious problem in many urban areas, affecting both the health of the population and the environment as a whole. The availability of large data arrays for the levels of these pollutants makes it possible to perform statistical analysis, to obtain relevant information, and to find patterns within the data. Research in this field is particularly topical for a number of cities in Bulgaria, a European country where regulatory air pollution health limits have been constantly exceeded in recent years. This paper examines average daily data for air pollution with PM2.5 and PM10, collected by 3 monitoring stations in the cities of Plovdiv and Asenovgrad between 2011 and 2016. The goal is to find and analyze actual relationships in data time series, to build adequate mathematical models, and to develop short-term forecasts. Modeling is carried out by stochastic univariate and multivariate time series analysis, based on Box-Jenkins methodology. The best models are selected following initial transformation of the data and using a set of standard and robust statistical criteria. The Mathematica and SPSS software were used to perform calculations. This examination showed that measured concentrations of PM2.5 and PM10 in the region of Plovdiv and Asenovgrad regularly exceed permissible European and national health and safety thresholds. We obtained adequate stochastic models with high statistical fit with the data and good quality forecasting when compared against actual measurements. The mathematical approach applied provides an independent alternative to standard official monitoring and control means for air pollution in urban areas.
Predictive and Prognostic Factors in Definition of Risk Groups in Endometrial Carcinoma
Sorbe, Bengt
2012-01-01
Background. The aim was to evaluate predictive and prognostic factors in a large consecutive series of endometrial carcinomas and to discuss pre- and postoperative risk groups based on these factors. Material and Methods. In a consecutive series of 4,543 endometrial carcinomas predictive and prognostic factors were analyzed with regard to recurrence rate and survival. The patients were treated with primary surgery and adjuvant radiotherapy. Two preoperative and three postoperative risk groups were defined. DNA ploidy was included in the definitions. Eight predictive or prognostic factors were used in multivariate analyses. Results. The overall recurrence rate of the complete series was 11.4%. Median time to relapse was 19.7 months. In a multivariate logistic regression analysis, FIGO grade, myometrial infiltration, and DNA ploidy were independent and statistically predictive factors with regard to recurrence rate. The 5-year overall survival rate was 73%. Tumor stage was the single most important factor with FIGO grade on the second place. DNA ploidy was also a significant prognostic factor. In the preoperative risk group definitions three factors were used: histology, FIGO grade, and DNA ploidy. Conclusions. DNA ploidy was an important and significant predictive and prognostic factor and should be used both in preoperative and postoperative risk group definitions. PMID:23209924
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
1998-01-01
Pseudo-Maximum Likelihood (p-ML) and Asymptotically Distribution Free (ADF) estimation methods for estimating dynamic factor model parameters within a covariance structure framework were compared through a Monte Carlo simulation. Both methods appear to give consistent model parameter estimates, but only ADF gives standard errors and chi-square…
NASA Technical Reports Server (NTRS)
Callier, F. M.; Desoer, C. A.
1973-01-01
A class of multivariable, nonlinear time-varying feedback systems with an unstable convolution subsystem as feedforward and a time-varying nonlinear gain as feedback was considered. The impulse response of the convolution subsystem is the sum of a finite number of increasing exponentials multiplied by nonnegative powers of the time t, a term that is absolutely integrable and an infinite series of delayed impulses. The main result is a theorem. It essentially states that if the unstable convolution subsystem can be stabilized by a constant feedback gain F and if incremental gain of the difference between the nonlinear gain function and F is sufficiently small, then the nonlinear system is L(p)-stable for any p between one and infinity. Furthermore, the solutions of the nonlinear system depend continuously on the inputs in any L(p)-norm. The fixed point theorem is crucial in deriving the above theorem.
Trend Detection and Bivariate Frequency Analysis for Nonstationary Rainfall Data
NASA Astrophysics Data System (ADS)
Joo, K.; Kim, H.; Shin, J. Y.; Heo, J. H.
2017-12-01
Multivariate frequency analysis has been developed for hydro-meteorological data such as rainfall, flood, and drought. In particular, the copula has been used as a useful tool for multivariate probability models because it places no limitation on the choice of marginal distributions. Time-series rainfall data can be separated into rainfall events by the inter-event time definition (IETD), and each rainfall event has a rainfall depth and a rainfall duration. In addition, nonstationarity in rainfall events has been studied recently due to climate change, and trend detection is important for determining whether the data are nonstationary. Using the rainfall depth and duration of each rainfall event, trend detection and nonstationary bivariate frequency analysis were performed in this study. Hourly data recorded over 30 years at 62 stations of the Korea Meteorological Administration (KMA) were used, and the suitability of the nonstationary copula for rainfall events was examined by the goodness-of-fit test.
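The event separation by IETD can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' procedure: a dry gap of at least `ietd_hours` is assumed to end an event, duration is counted from the first to the last wet hour (shorter dry spells inside an event are included), and the function name and sample data are hypothetical.

```python
def split_events(hourly_rain, ietd_hours):
    """Return a list of (depth, duration_hours) tuples, one per rainfall event."""
    events = []
    depth, start, last_wet = 0.0, None, None
    for t, r in enumerate(hourly_rain):
        if r <= 0:
            continue
        # A sufficiently long dry gap closes the previous event.
        if start is not None and t - last_wet - 1 >= ietd_hours:
            events.append((depth, last_wet - start + 1))
            depth, start = 0.0, None
        if start is None:
            start = t
        depth += r
        last_wet = t
    if start is not None:
        events.append((depth, last_wet - start + 1))
    return events

rain = [0, 2, 3, 0, 1, 0, 0, 0, 0, 4, 0]  # hypothetical hourly depths in mm
print(split_events(rain, ietd_hours=3))
# [(6.0, 4), (4.0, 1)]
```

The resulting depth and duration pairs are the two variables that a bivariate (copula-based) frequency analysis would then model jointly.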
A multiple-fan active control wind tunnel for outdoor wind speed and direction simulation
NASA Astrophysics Data System (ADS)
Wang, Jia-Ying; Meng, Qing-Hao; Luo, Bing; Zeng, Ming
2018-03-01
This article presents a new type of active controlled multiple-fan wind tunnel. The wind tunnel consists of swivel plates and arrays of direct current fans, and the rotation speed of each fan and the shaft angle of each swivel plate can be controlled independently for simulating different kinds of outdoor wind fields. To measure the similarity between the simulated wind field and the outdoor wind field, wind speed and direction time series of two kinds of wind fields are recorded by nine two-dimensional ultrasonic anemometers, and then statistical properties of the wind signals in different time scales are analyzed based on the empirical mode decomposition. In addition, the complexity of wind speed and direction time series is also investigated using multiscale entropy and multivariate multiscale entropy. Results suggest that the simulated wind field in the multiple-fan wind tunnel has a high degree of similarity with the outdoor wind field.
Dependency structure and scaling properties of financial time series are related
Morales, Raffaello; Di Matteo, T.; Aste, Tomaso
2014-01-01
We report evidence of a deep interplay between cross-correlations hierarchical properties and multifractality of New York Stock Exchange daily stock returns. The degree of multifractality displayed by different stocks is found to be positively correlated to their depth in the hierarchy of cross-correlations. We propose a dynamical model that reproduces this observation along with an array of other empirical properties. The structure of this model is such that the hierarchical structure of heterogeneous risks plays a crucial role in the time evolution of the correlation matrix, providing an interpretation to the mechanism behind the interplay between cross-correlation and multifractality in financial markets, where the degree of multifractality of stocks is associated to their hierarchical positioning in the cross-correlation structure. Empirical observations reported in this paper present a new perspective towards the merging of univariate multi scaling and multivariate cross-correlation properties of financial time series. PMID:24699417
Manikandan, Narayanan; Subha, Srinivasan
2016-01-01
Software development life cycle has been characterized by destructive disconnects between activities like planning, analysis, design, and programming. Particularly software developed with prediction based results is always a big challenge for designers. Time series data forecasting like currency exchange, stock prices, and weather report are some of the areas where an extensive research is going on for the last three decades. In the initial days, the problems with financial analysis and prediction were solved by statistical models and methods. For the last two decades, a large number of Artificial Neural Networks based learning models have been proposed to solve the problems of financial data and get accurate results in prediction of the future trends and prices. This paper addressed some architectural design related issues for performance improvement through vectorising the strengths of multivariate econometric time series models and Artificial Neural Networks. It provides an adaptive approach for predicting exchange rates and it can be called hybrid methodology for predicting exchange rates. This framework is tested for finding the accuracy and performance of parallel algorithms used. PMID:26881271
NASA Astrophysics Data System (ADS)
Gruszczynska, Marta; Rosat, Severine; Klos, Anna; Bogusz, Janusz
2017-04-01
Seasonal oscillations in the GPS position time series can arise from real geophysical effects and numerical artefacts. According to Dong et al. (2002), environmental loading effects can account for approximately 40% of the total variance of the annual signals in GPS time series; however, using generally acknowledged methods (e.g. Least Squares Estimation, Wavelet Decomposition, Singular Spectrum Analysis) to model seasonal signals, we are not able to separate real from spurious signals (effects of mismodelling aliased into the annual period, as well as the draconitic period). Therefore, we propose to use Multichannel Singular Spectrum Analysis (MSSA) to determine seasonal oscillations (with annual and semi-annual periods) from GPS position time series and environmental loading displacement models. The MSSA approach is an extension of the classical Karhunen-Loève method and it is a special case of SSA for multivariate time series. The main advantage of MSSA is the possibility to extract common seasonal signals for stations from a selected area and to investigate the causality between a set of time series as well. In this research, we explored the ability of MSSA to separate real geophysical effects from spurious effects in GPS time series. For this purpose, we used GPS position changes and environmental loading models. We analysed the topocentric time series from 250 selected stations located worldwide, delivered from the Network Solution obtained by the International GNSS Service (IGS) as a contribution to the latest realization of the International Terrestrial Reference System (namely ITRF2014, Rebischung et al., 2016). We also researched atmospheric, hydrological and non-tidal oceanic loading models provided by the EOST/IPGS Loading Service in the Centre-of-Figure (CF) reference frame. The analysed displacements were estimated from ERA-Interim (surface pressure), MERRA-land (soil moisture and snow) as well as ECCO2 ocean bottom pressure.
We used Multichannel Singular Spectrum Analysis to determine common seasonal signals in two case studies, adopting a 3-year lag-window as the optimal window size. We also inferred the statistical significance of oscillations through the Monte Carlo MSSA method (Allen and Robertson, 1996). In the first case study, we investigated the common spatio-temporal seasonal signals for all stations. For this purpose, we divided the selected stations by continent. For instance, for stations located in Europe, seasonal oscillations account for approximately 45% of the GPS-derived data variance. A much higher variance of seasonal signals, about 92%, is explained by hydrological loadings, while the non-tidal oceanic loading accounted for 31% of total variance. In the second case study, we analysed the capability of the MSSA method to establish causality between several time series. Each estimated Principal Component represents a pattern of the common signal for all analysed data. For ZIMM station (Zimmerwald, Switzerland), the 1st, 2nd and 9th, 10th Principal Components, which account for 35% of the variance, correspond to the annual and semi-annual signals. In this part, we applied the non-parametric MSSA approach to extract the common seasonal signals for GPS time series and environmental loadings for each of the 250 stations, with the clear statement that some part of the seasonal signal reflects real geophysical effects. REFERENCES: 1. Allen, M. and Robertson, A.: 1996, Distinguishing modulated oscillations from coloured noise in multivariate datasets. Climate Dynamics, 12, No. 11, 775-784. DOI: 10.1007/s003820050142. 2. Dong, D., Fang, P., Bock, Y., Cheng, M.K. and Miyazaki, S.: 2002, Anatomy of apparent seasonal variations from GPS-derived site position time series. Journal of Geophysical Research, 107, No. B4, 2075. DOI: 10.1029/2001JB000573. 3. Rebischung, P., Altamimi, Z., Ray, J. and Garayt, B.: 2016, The IGS contribution to ITRF2014. Journal of Geodesy, 90, No. 7, 611-630. DOI: 10.1007/s00190-016-0897-6.
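The embedding step of MSSA can be sketched as follows. This is only the construction of the multichannel trajectory matrix under one common convention (lagged windows of every channel placed side by side in each row); the decomposition, grouping and reconstruction steps are omitted, and the function name and toy data are hypothetical rather than taken from the study.

```python
def trajectory_matrix(channels, window):
    """Stack lagged windows of all channels side by side.

    Row t holds channel_1[t:t+window], channel_2[t:t+window], ...
    All channels are assumed to have the same length.
    """
    n = len(channels[0])
    k = n - window + 1  # number of lagged windows
    rows = []
    for t in range(k):
        row = []
        for ch in channels:
            row.extend(ch[t:t + window])
        rows.append(row)
    return rows

# Two toy channels, e.g. a GPS height series and a loading model series.
channels = [[1, 2, 3, 4],
            [10, 20, 30, 40]]
for row in trajectory_matrix(channels, window=2):
    print(row)
# [1, 2, 10, 20]
# [2, 3, 20, 30]
# [3, 4, 30, 40]
```

A singular value decomposition of this matrix would then yield components shared across channels, which is what allows common seasonal signals to be extracted from several series at once.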
Towards a New Generation of Time-Series Visualization Tools in the ESA Heliophysics Science Archives
NASA Astrophysics Data System (ADS)
Perez, H.; Martinez, B.; Cook, J. P.; Herment, D.; Fernandez, M.; De Teodoro, P.; Arnaud, M.; Middleton, H. R.; Osuna, P.; Arviset, C.
2017-12-01
During the last decades a varied set of Heliophysics missions have allowed the scientific community to gain a better knowledge of the solar atmosphere and activity. The remote sensing images of missions such as SOHO have paved the way for Helio-based spatial data visualization software such as JHelioViewer/Helioviewer. On the other hand, the huge amount of in-situ measurements provided by other missions such as Cluster provides a wide base for plot visualization software whose reach is still far from being fully exploited. The Heliophysics Science Archives within the ESAC Science Data Center (ESDC) already provide a first generation of tools for time-series visualization focusing on each mission's needs: visualization of quicklook plots, cross-calibration time series, pre-generated/on-demand multi-plot stacks (Cluster), basic plot zoom in/out options (Ulysses) and easy navigation through the plots in time (Ulysses, Cluster, ISS-Solaces). However, needs evolve, and the scientists involved in new missions require multi-variable data plotting, heat map stacks, interactive synchronization and axis variable selection, among other improvements. The new Heliophysics archives (such as Solar Orbiter) and the evolution of existing ones (Cluster) intend to address these new challenges. This paper provides an overview of the different approaches for visualizing time-series followed within the ESA Heliophysics Archives and their foreseen evolution.
Multivariate Markov chain modeling for stock markets
NASA Astrophysics Data System (ADS)
Maskawa, Jun-ichi
2003-06-01
We study a multivariate Markov chain model as a stochastic model of the price changes of portfolios in the framework of the mean field approximation. The time series of price changes are coded into the sequences of up and down spins according to their signs. We start with the discussion for small portfolios consisting of two stock issues. The generalization of our model to arbitrary size of portfolio is constructed by a recurrence relation. The resultant form of the joint probability of the stationary state coincides with Gibbs measure assigned to each configuration of spin glass model. Through the analysis of actual portfolios, it has been shown that the synchronization of the direction of the price changes is well described by the model.
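The coding of price changes into up and down spins can be sketched as below, together with a simple empirical first-order transition estimate for a single series. This is an illustration only, not the mean-field multivariate model of the paper; treating a zero change as a down spin is an assumption, and the function names and prices are hypothetical.

```python
from collections import Counter

def to_spins(prices):
    """Code each price change as +1 (up) or -1 (down or unchanged, by assumption)."""
    return [1 if b > a else -1 for a, b in zip(prices, prices[1:])]

def transition_probs(spins):
    """Empirical P(next spin | current spin) from consecutive pairs."""
    counts = Counter(zip(spins, spins[1:]))
    probs = {}
    for (a, b), c in counts.items():
        total = sum(v for (x, _), v in counts.items() if x == a)
        probs[(a, b)] = c / total
    return probs

prices = [10, 11, 12, 11, 12, 13]
spins = to_spins(prices)
print(spins)  # [1, 1, -1, 1, 1]
print(transition_probs(spins))
```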
Mathematical models for exploring different aspects of genotoxicity and carcinogenicity databases.
Benigni, R; Giuliani, A
1991-12-01
One great obstacle to understanding and using the information contained in the genotoxicity and carcinogenicity databases is the very size of such databases. Their vastness makes them difficult to read; this leads to inadequate exploitation of the information, which becomes costly in terms of time, labor, and money. In its search for adequate approaches to the problem, the scientific community has, curiously, almost entirely neglected an existent series of very powerful methods of data analysis: the multivariate data analysis techniques. These methods were specifically designed for exploring large data sets. This paper presents the multivariate techniques and reports a number of applications to genotoxicity problems. These studies show how biology and mathematical modeling can be combined and how successful this combination is.
Analysis models for the estimation of oceanic fields
NASA Technical Reports Server (NTRS)
Carter, E. F.; Robinson, A. R.
1987-01-01
A general model for statistically optimal estimates is presented for dealing with scalar, vector and multivariate datasets. The method deals with anisotropic fields and treats space and time dependence equivalently. Problems addressed include the analysis, or the production of synoptic time series of regularly gridded fields from irregular and gappy datasets, and the estimate of fields by compositing observations from several different instruments and sampling schemes. Technical issues are discussed, including the convergence of statistical estimates, the choice of representation of the correlations, the influential domain of an observation, and the efficiency of numerical computations.
Modelling world gold prices and USD foreign exchange relationship using multivariate GARCH model
NASA Astrophysics Data System (ADS)
Ping, Pung Yean; Ahmad, Maizah Hura Binti
2014-12-01
Gold is a popular investment commodity, and world gold price series have often been modeled using univariate models. The objective of this paper is to show that there is a co-movement between the gold price and the USD foreign exchange rate. Using the effect of the USD foreign exchange rate on the gold price, a model that can be used to forecast future gold prices is developed. For this purpose, the paper proposes a multivariate GARCH (bivariate GARCH) model. Using daily prices of both series from 01.01.2000 to 05.05.2014, a causal relation between the two series under study is found and a bivariate GARCH model is produced.
Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations
NASA Technical Reports Server (NTRS)
Scargle, Jeffrey D.; Norris, Jay P.; Jackson, Brad; Chiang, James
2013-01-01
This paper addresses the problem of detecting and characterizing local variability in time series and other forms of sequential data. The goal is to identify and characterize statistically significant variations, at the same time suppressing the inevitable corrupting observational errors. We present a simple nonparametric modeling technique and an algorithm implementing it, an improved and generalized version of Bayesian Blocks [Scargle 1998], that finds the optimal segmentation of the data in the observation interval. The structure of the algorithm allows it to be used in either a real-time trigger mode, or a retrospective mode. Maximum likelihood or marginal posterior functions to measure model fitness are presented for events, binned counts, and measurements at arbitrary times with known error distributions. Problems addressed include those connected with data gaps, variable exposure, extension to piecewise linear and piecewise exponential representations, multivariate time series data, analysis of variance, data on the circle, other data modes, and dispersed data. Simulations provide evidence that the detection efficiency for weak signals is close to a theoretical asymptotic limit derived by [Arias-Castro, Donoho and Huo 2003]. In the spirit of Reproducible Research [Donoho et al. (2008)] all of the code and data necessary to reproduce all of the figures in this paper are included as auxiliary material.
Similar resilience attributes in lakes with different management practices
Baho, Didier L.; Drakare, Stina; Johnson, Richard K.; Allen, Craig R.; Angeler, David G.
2014-01-01
Liming has been used extensively in Scandinavia and elsewhere since the 1970s to counteract the negative effects of acidification. Communities in limed lakes usually return to acidified conditions once liming is discontinued, suggesting that liming is unlikely to shift acidified lakes to a state equivalent to pre-acidification conditions that requires no further management intervention. While this suggests a low resilience of limed lakes, attributes that confer resilience have not been assessed, limiting our understanding of the efficiency of costly management programs. In this study, we assessed community metrics (diversity, richness, evenness, biovolume), multivariate community structure and the relative resilience of phytoplankton in limed, acidified and circum-neutral lakes from 1997 to 2009, using multivariate time series modeling. We identified dominant temporal frequencies in the data, allowing us to track community change at distinct temporal scales. We assessed two attributes of relative resilience (cross-scale and within-scale structure) of the phytoplankton communities, based on the fluctuation frequency patterns identified. We also assessed species with stochastic temporal dynamics. Liming increased phytoplankton diversity and richness; however, multivariate community structure differed in limed relative to acidified and circum-neutral lakes. Cross-scale and within-scale attributes of resilience were similar across all lakes studied but the contribution of those species exhibiting stochastic dynamics was higher in the acidified and limed compared to circum-neutral lakes. From a resilience perspective, our results suggest that limed lakes comprise a particular condition of an acidified lake state. This explains why liming does not move acidified lakes out of a “degraded” basin of attraction. 
In addition, our study demonstrates the potential of time series modeling to assess the efficiency of restoration and management outcomes through quantification of the attributes contributing to resilience in ecosystems.
ERIC Educational Resources Information Center
Trussell, James; Pebley, Anne R.
The relationship between changes in the timing and quantity of fertility, such as those that might result from an effective family planning program in developing countries, and changes in child and maternal mortality is examined. Results from five multivariate studies estimate the changes in mortality that might occur from altering maternal age,…
A lengthy look at the daily grind: time series analysis of events, mood, stress, and satisfaction.
Fuller, Julie A; Stanton, Jeffrey M; Fisher, Gwenith G; Spitzmuller, Christiane; Russell, Steven S; Smith, Patricia C
2003-12-01
The present study investigated processes by which job stress and satisfaction unfold over time by examining the relations between daily stressful events, mood, and these variables. Using a Web-based daily survey of stressor events, perceived strain, mood, and job satisfaction completed by 14 university workers, 1,060 occasions of data were collected. Transfer function analysis, a multivariate version of time series analysis, was used to examine the data for relationships among the measured variables after factoring out the contaminating influences of serial dependency. Results revealed a contrast effect in which a stressful event associated positively with higher strain on the same day and associated negatively with strain on the following day. Perceived strain increased over the course of a semester for a majority of participants, suggesting that effects of stress build over time. Finally, the data were consistent with the notion that job satisfaction is a distal outcome that is mediated by perceived strain. ((c) 2003 APA, all rights reserved)
Faes, Luca; Nollo, Giandomenico
2010-11-01
The Partial Directed Coherence (PDC) and its generalized formulation (gPDC) are popular tools for investigating, in the frequency domain, the concept of Granger causality among multivariate (MV) time series. PDC and gPDC are formalized in terms of the coefficients of an MV autoregressive (MVAR) model which describes only the lagged effects among the time series and forsakes instantaneous effects. However, instantaneous effects are known to affect linear parametric modeling, and are likely to occur in experimental time series. In this study, we investigate the impact on the assessment of frequency domain causality of excluding instantaneous effects from the model underlying PDC evaluation. Moreover, we propose the utilization of an extended MVAR model including both instantaneous and lagged effects. This model is used to assess PDC either in accordance with the definition of Granger causality when considering only lagged effects (iPDC), or with an extended form of causality, when we consider both instantaneous and lagged effects (ePDC). The approach is first evaluated on three theoretical examples of MVAR processes, which show that the presence of instantaneous correlations may produce misleading profiles of PDC and gPDC, while ePDC and iPDC derived from the extended model provide here a correct interpretation of extended and lagged causality. It is then applied to representative examples of cardiorespiratory and EEG MV time series. They suggest that ePDC and iPDC are better interpretable than PDC and gPDC in terms of the known cardiovascular and neural physiologies.
Fourier Series Optimization Opportunity
ERIC Educational Resources Information Center
Winkel, Brian
2008-01-01
This note discusses the introduction of Fourier series as an immediate application of optimization of a function of more than one variable. Specifically, it is shown how the study of Fourier series can be motivated to enrich a multivariable calculus class. This is done through discovery learning and use of technology wherein students build the…
NASA Astrophysics Data System (ADS)
Giammanco, S.; Ferrera, E.; Cannata, A.; Montalto, P.; Neri, M.
2013-12-01
From November 2009 to April 2011 soil radon activity was continuously monitored using a Barasol probe located on the upper NE flank of Mt. Etna volcano (Italy), close both to the Piano Provenzana fault and to the NE-Rift. Seismic, volcanological and radon data were analysed together with data on environmental parameters, such as air and soil temperature, barometric pressure, snow and rain fall. In order to find possible correlations among the above parameters, and hence to reveal possible anomalous trends in the radon time-series, we used different statistical methods: i) multivariate linear regression; ii) cross-correlation; iii) coherence analysis through wavelet transform. Multivariate regression indicated a modest influence on soil radon from environmental parameters (R2 = 0.31). When using 100-day time windows, the R2 values showed wide variations in time, reaching their maxima (~0.63-0.66) during summer. Cross-correlation analysis over 100-day moving averages showed that, similar to multivariate linear regression analysis, the summer period was characterised by the best correlation between radon data and environmental parameters. Lastly, the wavelet coherence analysis allowed a multi-resolution coherence analysis of the time series acquired. This approach made it possible to study the relations among different signals either in the time or in the frequency domain. It confirmed the results of the previous methods, but also revealed correlations between radon and environmental parameters at different observation scales (e.g., radon activity changed during strong precipitations, but also during anomalous variations of soil temperature uncorrelated with seasonal fluctuations). Using the above analysis, two periods were recognized when radon variations were significantly correlated with marked soil temperature changes and also with local seismic or volcanic activity. 
This made it possible to produce two different physical models of soil gas transport that explain the observed anomalies. Our work suggests that in order to make an accurate analysis of the relations among different signals it is necessary to use different techniques that give complementary analytical information. In particular, the wavelet analysis proved to be the most effective in discriminating radon changes due to environmental influences from those correlated with impending seismic or volcanic events.
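The idea of tracking R2 over moving time windows can be sketched as follows. This is a deliberate simplification and not the study's procedure: it uses a single predictor (so R2 equals the squared Pearson correlation) rather than a multivariate regression, and the function names, the window width of 3, and the sample values are assumptions made for illustration.

```python
def r_squared(x, y):
    """R2 of a one-predictor least-squares fit (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series explains no variance
    return sxy ** 2 / (sxx * syy)


def windowed_r_squared(x, y, width):
    """R2 in every window of `width` consecutive samples, advancing by one."""
    return [
        r_squared(x[i:i + width], y[i:i + width])
        for i in range(len(x) - width + 1)
    ]


# Hypothetical daily values; the study used 100-day windows.
soil_temperature = [1, 2, 3, 4, 5, 6]
radon = [2, 4, 6, 5, 3, 4]
r2_by_window = windowed_r_squared(soil_temperature, radon, width=3)
print(r2_by_window)
```

Plotting such a sequence against time would show the seasonal variation in explanatory power that the abstract reports.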
Levine, Matthew E; Albers, David J; Hripcsak, George
2016-01-01
Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
Enhancing e-waste estimates: Improving data quality by multivariate Input–Output Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Feng, E-mail: fwang@unu.edu; Design for Sustainability Lab, Faculty of Industrial Design Engineering, Delft University of Technology, Landbergstraat 15, 2628CE Delft; Huisman, Jaco
2013-11-15
Highlights: • A multivariate Input–Output Analysis method for e-waste estimates is proposed. • Applying multivariate analysis to consolidate data can enhance e-waste estimates. • We examine the influence of model selection and data quality on e-waste estimates. • Datasets of all e-waste related variables in a Dutch case study have been provided. • Accurate modeling of time-variant lifespan distributions is critical for estimate. - Abstract: Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input–Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. 
Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies.
Valdés, Julio J; Bonham-Carter, Graeme
2006-03-01
A computational intelligence approach is used to explore the problem of detecting internal state changes in time dependent processes described by heterogeneous, multivariate time series with imprecise data and missing values. Such processes are approximated by collections of time dependent non-linear autoregressive models represented by a special kind of neuro-fuzzy neural network. Grid and high throughput computing model mining procedures based on neuro-fuzzy networks and genetic algorithms generate: (i) collections of models composed of sets of time lag terms from the time series, and (ii) prediction functions represented by neuro-fuzzy networks. The composition of the models and their prediction capabilities allow the identification of changes in the internal structure of the process. These changes are associated with the alternation of steady and transient states, zones with abnormal behavior, instability, and other situations. This approach is general, and its sensitivity for detecting subtle changes of state is revealed by simulation experiments. Its potential in the study of complex processes in earth sciences and astrophysics is illustrated with applications using paleoclimate and solar data.
Santori, G; Fontana, I; Bertocchi, M; Gasloli, G; Magoni Rossi, A; Tagliamacco, A; Barocci, S; Nocera, A; Valente, U
2010-05-01
A useful approach to reduce the number of discarded marginal kidneys and to increase the nephron mass is double kidney transplantation (DKT). In this study, we retrospectively evaluated the potential predictors for patient and graft survival in a single-center series of 59 DKT procedures performed between April 21, 1999, and September 21, 2008. The kidney recipients of mean age 63.27 +/- 5.17 years included 16 women (27%) and 43 men (73%). The donors of mean age 69.54 +/- 7.48 years included 32 women (54%) and 27 men (46%). The mean posttransplant dialysis time was 2.37 +/- 3.61 days. The mean hospitalization was 20.12 +/- 13.65 days. Average serum creatinine (SCr) at discharge was 1.5 +/- 0.59 mg/dL. In view of the limited numbers of recipient deaths (n = 4) and graft losses (n = 8) that occurred in our series, the proportional hazards assumption for each Cox regression model with P < .05 was tested by using correlation coefficients between transformed survival times and scaled Schoenfeld residuals, and checked with smoothed plots of Schoenfeld residuals. For patient survival, the variables that reached statistical significance were donor SCr (P = .007), donor creatinine clearance (P = .023), and recipient age (P = .047). Each significant model passed the Schoenfeld test. By entering these variables into a multivariate Cox model for patient survival, no further significance was observed. In the univariate Cox models performed for graft survival, statistical significance was noted for donor SCr (P = .027), SCr 3 months post-DKT (P = .043), and SCr 6 months post-DKT (P = .017). All significant univariate models for graft survival passed the Schoenfeld test. A final multivariate model retained SCr at 6 months (beta = 1.746, P = .042) and donor SCr (beta = .767, P = .090). In our analysis, SCr at 6 months seemed to emerge from both univariate and multivariate Cox models as a potential predictor of graft survival among DKT. 
Multicenter studies with larger recipient populations and more graft losses should be performed to confirm our findings. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Testing for significance of phase synchronisation dynamics in the EEG.
Daly, Ian; Sweeney-Reed, Catherine M; Nasuto, Slawomir J
2013-06-01
A number of tests exist to check for statistical significance of phase synchronisation within the Electroencephalogram (EEG); however, the majority suffer from a lack of generality and applicability. They may also fail to account for temporal dynamics in the phase synchronisation, regarding synchronisation as a constant state instead of a dynamical process. Therefore, a novel test is developed for identifying the statistical significance of phase synchronisation based upon a combination of work characterising temporal dynamics of multivariate time-series and Markov modelling. We show how this method is better able to assess the significance of phase synchronisation than a range of commonly used significance tests. We also show how the method may be applied to identify and classify significantly different phase synchronisation dynamics in both univariate and multivariate datasets.
Modelling spatiotemporal change using multidimensional arrays
NASA Astrophysics Data System (ADS)
Lu, Meng; Appel, Marius; Pebesma, Edzer
2017-04-01
The large variety of remote sensors, model simulations, and in-situ records provide great opportunities to model environmental change. The massive amount of high-dimensional data calls for methods to integrate data from various sources and to analyse spatiotemporal and thematic information jointly. An array is a collection of elements ordered and indexed in arbitrary dimensions, which naturally represents spatiotemporal phenomena that are identified by their geographic locations and recording time. In addition, array regridding (e.g., resampling, down-/up-scaling), dimension reduction, and spatiotemporal statistical algorithms are readily applicable to arrays. However, the role of arrays in big geoscientific data analysis has not been systematically studied: How can arrays discretise continuous spatiotemporal phenomena? How can arrays facilitate the extraction of multidimensional information? How can arrays provide a clean, scalable and reproducible change modelling process that is communicable between mathematicians, computer scientists, Earth system scientists and stakeholders? This study focuses on detecting spatiotemporal change using satellite image time series. Current change detection methods using satellite image time series commonly analyse data in separate steps: 1) forming a vegetation index, 2) conducting time series analysis on each pixel, and 3) post-processing and mapping time series analysis results, which does not consider spatiotemporal correlations and ignores much of the spectral information. Multidimensional information can be better extracted by jointly considering spatial, spectral, and temporal information. To approach this goal, we use principal component analysis to extract multispectral information and spatial autoregressive models to account for spatial correlation in residual based time series structural change modelling. 
We also discuss the potential of multivariate non-parametric time series structural change methods, hierarchical modelling, and extreme event detection methods to model spatiotemporal change. We show how array operations can facilitate expressing these methods, and how the open-source array data management and analytics software SciDB and R can be used to scale the process and make it easily reproducible.
2011-01-01
Background Thousands of children experience cardiac arrest events every year in pediatric intensive care units. Most of these children die. Cardiac arrest prediction tools are used as part of medical emergency team evaluations to identify patients in standard hospital beds that are at high risk for cardiac arrest. There are no models to predict cardiac arrest in pediatric intensive care units though, where the risk of an arrest is 10 times higher than for standard hospital beds. Current tools are based on a multivariable approach that does not characterize deterioration, which often precedes cardiac arrests. Characterizing deterioration requires a time series approach. The purpose of this study is to propose a method that will allow for time series data to be used in clinical prediction models. Successful implementation of these methods has the potential to bring arrest prediction to the pediatric intensive care environment, possibly allowing for interventions that can save lives and prevent disabilities. Methods We reviewed prediction models from nonclinical domains that employ time series data, and identified the steps that are necessary for building predictive models using time series clinical data. We illustrate the method by applying it to the specific case of building a predictive model for cardiac arrest in a pediatric intensive care unit. Results Time course analysis studies from genomic analysis provided a modeling template that was compatible with the steps required to develop a model from clinical time series data. 
The steps include: 1) selecting candidate variables; 2) specifying measurement parameters; 3) defining data format; 4) defining time window duration and resolution; 5) calculating latent variables for candidate variables not directly measured; 6) calculating time series features as latent variables; 7) creating data subsets to measure model performance effects attributable to various classes of candidate variables; 8) reducing the number of candidate features; 9) training models for various data subsets; and 10) measuring model performance characteristics in unseen data to estimate their external validity. Conclusions We have proposed a ten step process that results in data sets that contain time series features and are suitable for predictive modeling by a number of methods. We illustrated the process through an example of cardiac arrest prediction in a pediatric intensive care setting. PMID:22023778
Kennedy, Curtis E; Turley, James P
2011-10-24
Thousands of children experience cardiac arrest events every year in pediatric intensive care units. Most of these children die. Cardiac arrest prediction tools are used as part of medical emergency team evaluations to identify patients in standard hospital beds that are at high risk for cardiac arrest. There are no models to predict cardiac arrest in pediatric intensive care units though, where the risk of an arrest is 10 times higher than for standard hospital beds. Current tools are based on a multivariable approach that does not characterize deterioration, which often precedes cardiac arrests. Characterizing deterioration requires a time series approach. The purpose of this study is to propose a method that will allow for time series data to be used in clinical prediction models. Successful implementation of these methods has the potential to bring arrest prediction to the pediatric intensive care environment, possibly allowing for interventions that can save lives and prevent disabilities. We reviewed prediction models from nonclinical domains that employ time series data, and identified the steps that are necessary for building predictive models using time series clinical data. We illustrate the method by applying it to the specific case of building a predictive model for cardiac arrest in a pediatric intensive care unit. Time course analysis studies from genomic analysis provided a modeling template that was compatible with the steps required to develop a model from clinical time series data. 
The steps include: 1) selecting candidate variables; 2) specifying measurement parameters; 3) defining data format; 4) defining time window duration and resolution; 5) calculating latent variables for candidate variables not directly measured; 6) calculating time series features as latent variables; 7) creating data subsets to measure model performance effects attributable to various classes of candidate variables; 8) reducing the number of candidate features; 9) training models for various data subsets; and 10) measuring model performance characteristics in unseen data to estimate their external validity. We have proposed a ten step process that results in data sets that contain time series features and are suitable for predictive modeling by a number of methods. We illustrated the process through an example of cardiac arrest prediction in a pediatric intensive care setting.
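Steps 4 and 6 of the process listed above (defining a time window and calculating time series features) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the choice of mean, range, and slope as features, and the heart-rate values are assumptions.

```python
def sliding_windows(values, width, step):
    """Yield consecutive windows of `width` samples, advancing by `step`."""
    for start in range(0, len(values) - width + 1, step):
        yield values[start:start + width]


def window_features(window):
    """Summarise one window by its mean, range, and least-squares slope."""
    n = len(window)
    mean_y = sum(window) / n
    mean_x = (n - 1) / 2  # mean of the sample indices 0..n-1
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(window))
    slope = sxy / sxx if sxx else 0.0  # a one-sample window has no slope
    return {"mean": mean_y, "range": max(window) - min(window), "slope": slope}


# Hypothetical heart-rate samples (beats per minute) at a fixed resolution.
heart_rate = [120, 122, 125, 129, 134, 140]
features = [
    window_features(w) for w in sliding_windows(heart_rate, width=3, step=1)
]
print(features)
```

A rising slope across successive windows is one simple way to express deterioration as a feature that a conventional classifier can consume.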
Cabrieto, Jedelyn; Tuerlinckx, Francis; Kuppens, Peter; Grassmann, Mariel; Ceulemans, Eva
2017-06-01
Change point detection in multivariate time series is a complex task since next to the mean, the correlation structure of the monitored variables may also alter when change occurs. DeCon was recently developed to detect such changes in mean and/or correlation by combining a moving windows approach and robust PCA. However, in the literature, several other methods have been proposed that employ other non-parametric tools: E-divisive, Multirank, and KCP. Since these methods use different statistical approaches, two issues need to be tackled. First, applied researchers may find it hard to appraise the differences between the methods. Second, a direct comparison of the relative performance of all these methods for capturing change points signaling correlation changes is still lacking. Therefore, we present the basic principles behind DeCon, E-divisive, Multirank, and KCP and the corresponding algorithms, to make them more accessible to readers. We further compared their performance through extensive simulations using the settings of Bulteel et al. (Biological Psychology, 98 (1), 29-42, 2014) implying changes in mean and in correlation structure and those of Matteson and James (Journal of the American Statistical Association, 109 (505), 334-345, 2014) implying different numbers of (noise) variables. KCP emerged as the best method in almost all settings. However, in case of more than two noise variables, only DeCon performed adequately in detecting correlation changes.
Unraveling spurious properties of interaction networks with tailored random networks.
Bialonski, Stephan; Wendler, Martin; Lehnertz, Klaus
2011-01-01
We investigate interaction networks that we derive from multivariate time series with methods frequently employed in diverse scientific fields such as biology, quantitative finance, physics, earth and climate sciences, and the neurosciences. Mimicking experimental situations, we generate time series with finite length and varying frequency content but from independent stochastic processes. Using the correlation coefficient and the maximum cross-correlation, we estimate interdependencies between these time series. With clustering coefficient and average shortest path length, we observe unweighted interaction networks, derived via thresholding the values of interdependence, to possess non-trivial topologies as compared to Erdös-Rényi networks, which would indicate small-world characteristics. These topologies reflect the mostly unavoidable finiteness of the data, which limits the reliability of typically used estimators of signal interdependence. We propose random networks that are tailored to the way interaction networks are derived from empirical data. Through an exemplary investigation of multichannel electroencephalographic recordings of epileptic seizures--known for their complex spatial and temporal dynamics--we show that such random networks help to distinguish network properties of interdependence structures related to seizure dynamics from those spuriously induced by the applied methods of analysis.
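The derivation of an unweighted interaction network by thresholding a correlation-based interdependence measure, and the clustering coefficient computed on it, can be sketched as follows. This covers only that derivation step, not the tailored random networks the authors propose. The function names, the threshold of 0.8, the convention that nodes with fewer than two neighbours contribute zero, and the four short channels are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def threshold_network(series, threshold):
    """Link two channels when |correlation| exceeds `threshold`."""
    adjacency = {name: set() for name in series}
    for a, b in combinations(series, 2):
        if abs(pearson(series[a], series[b])) > threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def clustering_coefficient(adjacency):
    """Average local clustering; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for neighbours in adjacency.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adjacency)


# Hypothetical short recordings from four channels.
channels = {
    "c1": [1, 2, 3, 4, 5],
    "c2": [2, 4, 6, 8, 11],
    "c3": [5, 4, 3, 2, 1],
    "c4": [1, -1, 1, -1, 1],
}
network = threshold_network(channels, threshold=0.8)
print(network, clustering_coefficient(network))
```

With series this short, strong correlations arise easily, which is the finite-data effect the abstract warns can induce spurious network structure.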
Unraveling Spurious Properties of Interaction Networks with Tailored Random Networks
Bialonski, Stephan; Wendler, Martin; Lehnertz, Klaus
2011-01-01
We investigate interaction networks that we derive from multivariate time series with methods frequently employed in diverse scientific fields such as biology, quantitative finance, physics, earth and climate sciences, and the neurosciences. Mimicking experimental situations, we generate time series with finite length and varying frequency content but from independent stochastic processes. Using the correlation coefficient and the maximum cross-correlation, we estimate interdependencies between these time series. With clustering coefficient and average shortest path length, we observe unweighted interaction networks, derived via thresholding the values of interdependence, to possess non-trivial topologies as compared to Erdös-Rényi networks, which would indicate small-world characteristics. These topologies reflect the mostly unavoidable finiteness of the data, which limits the reliability of typically used estimators of signal interdependence. We propose random networks that are tailored to the way interaction networks are derived from empirical data. Through an exemplary investigation of multichannel electroencephalographic recordings of epileptic seizures – known for their complex spatial and temporal dynamics – we show that such random networks help to distinguish network properties of interdependence structures related to seizure dynamics from those spuriously induced by the applied methods of analysis. PMID:21850239
VizieR Online Data Catalog: RR Lyrae in SDSS Stripe 82 (Suveges+, 2012)
NASA Astrophysics Data System (ADS)
Suveges, M.; Sesar, B.; Varadi, M.; Mowlavi, N.; Becker, A. C.; Ivezic, Z.; Beck, M.; Nienartowicz, K.; Rimoldini, L.; Dubath, P.; Bartholdi, P.; Eyer, L.
2013-05-01
We propose a robust principal component analysis framework for the exploitation of multiband photometric measurements in large surveys. Period search results are improved using the time-series of the first principal component due to its optimized signal-to-noise ratio. The presence of correlated excess variations in the multivariate time-series enables the detection of weaker variability. Furthermore, the direction of the largest variance differs for certain types of variable stars. This can be used as an efficient attribute for classification. The application of the method to a subsample of Sloan Digital Sky Survey Stripe 82 data yielded 132 high-amplitude delta Scuti variables. We also found 129 new RR Lyrae variables, complementary to the catalogue of Sesar et al., extending the halo area mapped by Stripe 82 RR Lyrae stars towards the Galactic bulge. The sample also comprises 25 multiperiodic or Blazhko RR Lyrae stars. (8 data files).
Temporal abstraction for the analysis of intensive care information
NASA Astrophysics Data System (ADS)
Hadad, Alejandro J.; Evin, Diego A.; Drozdowicz, Bartolomé; Chiotti, Omar
2007-11-01
This paper proposes a scheme for the analysis of time-stamped series data from multiple monitoring devices of intensive care units, using Temporal Abstraction concepts. This scheme is oriented to obtain a description of the patient state evolution in an unsupervised way. The case study is based on a dataset clinically classified with Pulmonary Edema. For this dataset a trend-based Temporal Abstraction mechanism is proposed, by means of a Behaviours Base of time-stamped series, and then used in a classification step. Combining this approach with the introduction of expert knowledge, using Fuzzy Logic, and multivariate analysis by means of Self-Organizing Maps, a state characterization model is obtained. This model can be extended to different patient groups and states. The proposed scheme yields descriptions of the intermediate states through which the patient passes, which could be used to anticipate alert situations.
Workshop on Algorithms for Time-Series Analysis
NASA Astrophysics Data System (ADS)
Protopapas, Pavlos
2012-04-01
This Workshop covered the four major subjects listed below in two 90-minute sessions. Each talk or tutorial allowed questions, and concluded with a discussion. Classification: Automatic classification using machine-learning methods is becoming a standard in surveys that generate large datasets. Ashish Mahabal (Caltech) reviewed various methods, and presented examples of several applications. Time-Series Modelling: Suzanne Aigrain (Oxford University) discussed autoregressive models and multivariate approaches such as Gaussian Processes. Meta-classification/mixture of expert models: Karim Pichara (Pontificia Universidad Católica, Chile) described the substantial promise which machine-learning classification methods are now showing in automatic classification, and discussed how the various methods can be combined together. Event Detection: Pavlos Protopapas (Harvard) addressed methods of fast identification of events with low signal-to-noise ratios, enlarging on the characterization and statistical issues of low signal-to-noise ratios and rare events.
NASA Astrophysics Data System (ADS)
Gerard-Marchant, P. G.
2008-12-01
Numpy is a free, open source C/Python interface designed for the fast and convenient manipulation of multidimensional numerical arrays. The base object, ndarray, can also easily be extended to define new objects meeting specific needs. Thanks to its simplicity, efficiency and modularity, numpy and its companion library Scipy have become increasingly popular in the scientific community over the last few years, with applications ranging from astronomy and engineering to finance and statistics. Its capacity to handle missing values is particularly appealing when analyzing environmental time series, where irregular data sampling might be an issue. After reviewing the main characteristics of numpy objects and the mechanism of subclassing, we will present the scikits.timeseries package, developed to manipulate single- and multi-variable arrays indexed in time. We will illustrate some typical applications of this package by introducing climpy, a set of extensions designed to help analyze the impacts of climate variability on environmental data such as precipitation or streamflows.
Comparative case study between D3 and highcharts on lustre data visualization
NASA Astrophysics Data System (ADS)
ElTayeby, Omar; John, Dwayne; Patel, Pragnesh; Simmerman, Scott
2013-12-01
One of the challenging tasks in visual analytics is to target clustered time-series data sets, since it is important for data analysts to discover patterns changing over time while keeping their focus on particular subsets. In order to leverage the human ability to quickly perceive these patterns visually, multivariate features should be implemented according to the attributes available. A comparative case study has been carried out using JavaScript libraries to demonstrate the differences in their capabilities. A web-based application to monitor the Lustre file system for the systems administrators and the operation teams has been developed using D3 and Highcharts. Lustre file systems are responsible for managing Remote Procedure Calls (RPCs), which include input/output (I/O) requests between clients and Object Storage Targets (OSTs). The objective of this application is to provide time-series visuals of these calls and storage patterns of users on Kraken, a University of Tennessee High Performance Computing (HPC) resource at Oak Ridge National Laboratory (ORNL).
NASA Astrophysics Data System (ADS)
Goodwell, Allison E.; Kumar, Praveen
2017-07-01
In an ecohydrologic system, components of atmospheric, vegetation, and root-soil subsystems participate in forcing and feedback interactions at varying time scales and intensities. The structure of this network of complex interactions varies in terms of connectivity, strength, and time scale due to perturbations or changing conditions such as rainfall, drought, or land use. However, characterization of these interactions is difficult due to multivariate and weak dependencies in the presence of noise, nonlinearities, and limited data. We introduce a framework for Temporal Information Partitioning Networks (TIPNets), in which time-series variables are viewed as nodes, and lagged multivariate mutual information measures are links. These links are partitioned into synergistic, unique, and redundant information components, where synergy is information provided only jointly, unique information is only provided by a single source, and redundancy is overlapping information. We construct TIPNets from 1 min weather station data over several hour time windows. From a comparison of dry, wet, and rainy conditions, we find that information strengths increase when solar radiation and surface moisture are present, and surface moisture and wind variability are redundant and synergistic influences, respectively. Over a growing season, network trends reveal patterns that vary with vegetation and rainfall patterns. The framework presented here enables us to interpret process connectivity in a multivariate context, which can lead to better inference of behavioral shifts due to perturbations in ecohydrologic systems. This work contributes to more holistic characterizations of system behavior, and can benefit a wide variety of studies of complex systems.
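The link measure underlying such networks, lagged mutual information between a source and a target variable, can be sketched as follows. This covers only the pairwise lagged mutual information for already discretized series; the partitioning into synergistic, unique, and redundant components is not attempted here, and the function name and toy data are assumptions.

```python
import math
from collections import Counter

def lagged_mutual_information(source, target, lag):
    """Mutual information (bits) between source[t] and target[t + lag].

    Both inputs are sequences of discrete symbols of equal length;
    continuous data would first need binning (not shown).
    """
    pairs = list(zip(source[:len(source) - lag], target[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    count_src = Counter(s for s, _ in pairs)
    count_tgt = Counter(t for _, t in pairs)
    mi = 0.0
    for (s, t), c in joint.items():
        # p(s,t) / (p(s) p(t)) simplifies to c * n / (count_s * count_t)
        mi += (c / n) * math.log2(c * n / (count_src[s] * count_tgt[t]))
    return mi

# Toy example: an alternating binary series predicts itself one step ahead.
x = [t % 2 for t in range(100)]
constant = [0] * 100
mi_self = lagged_mutual_information(x, x, lag=1)
mi_none = lagged_mutual_information(x, constant, lag=1)
```

In this toy case the alternating series shares close to one bit with its own next value, whereas a constant target shares no information with any source.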
DigOut: viewing differential expression genes as outliers.
Yu, Hui; Tu, Kang; Xie, Lu; Li, Yuan-Yuan
2010-12-01
For well-replicated two-conditional microarray datasets, the selection of differentially expressed (DE) genes is a well-studied computational topic, but for multi-conditional microarray datasets with limited or no replication, the same task has not been properly addressed by previous studies. This paper adopts multivariate outlier analysis to analyze replication-lacking multi-conditional microarray datasets, finding that it performs significantly better than the widely used limit fold change (LFC) model in a simulated comparative experiment. Compared with the LFC model, the multivariate outlier analysis also demonstrates improved stability against sample variations in a series of manipulated real expression datasets. The reanalysis of a real non-replicated multi-conditional expression dataset series leads to satisfactory results. In conclusion, a multivariate outlier analysis algorithm, like DigOut, is particularly useful for selecting DE genes from non-replicated multi-conditional gene expression datasets.
Revealing Real-Time Emotional Responses: a Personalized Assessment based on Heartbeat Dynamics
NASA Astrophysics Data System (ADS)
Valenza, Gaetano; Citi, Luca; Lanatá, Antonio; Scilingo, Enzo Pasquale; Barbieri, Riccardo
2014-05-01
Emotion recognition through computational modeling and analysis of physiological signals has been widely investigated in the last decade. Most of the proposed emotion recognition systems require relatively long time series of multivariate records and do not provide accurate real-time characterizations using short time series. To overcome these limitations, we propose a novel personalized probabilistic framework able to characterize the emotional state of a subject through the analysis of heartbeat dynamics exclusively. The study includes thirty subjects presented with a set of standardized images gathered from the international affective picture system, alternating levels of arousal and valence. Due to the intrinsic nonlinearity and nonstationarity of the RR interval series, a specific point-process model was devised for instantaneous identification considering autoregressive nonlinearities up to the third order according to the Wiener-Volterra representation, thus tracking very fast stimulus-response changes. Features from the instantaneous spectrum and bispectrum, as well as the dominant Lyapunov exponent, were extracted and considered as input features to a support vector machine for classification. Results, estimating emotions every 10 seconds, achieve an overall accuracy in recognizing four emotional states based on the circumplex model of affect of 79.29%, with 79.15% on the valence axis, and 83.55% on the arousal axis.
Revealing real-time emotional responses: a personalized assessment based on heartbeat dynamics.
Valenza, Gaetano; Citi, Luca; Lanatá, Antonio; Scilingo, Enzo Pasquale; Barbieri, Riccardo
2014-05-21
Emotion recognition through computational modeling and analysis of physiological signals has been widely investigated in the last decade. Most of the proposed emotion recognition systems require relatively long time series of multivariate records and do not provide accurate real-time characterizations using short time series. To overcome these limitations, we propose a novel personalized probabilistic framework able to characterize the emotional state of a subject through the analysis of heartbeat dynamics exclusively. The study includes thirty subjects presented with a set of standardized images gathered from the international affective picture system, alternating levels of arousal and valence. Due to the intrinsic nonlinearity and nonstationarity of the RR interval series, a specific point-process model was devised for instantaneous identification considering autoregressive nonlinearities up to the third order according to the Wiener-Volterra representation, thus tracking very fast stimulus-response changes. Features from the instantaneous spectrum and bispectrum, as well as the dominant Lyapunov exponent, were extracted and considered as input features to a support vector machine for classification. Results, estimating emotions every 10 seconds, achieve an overall accuracy in recognizing four emotional states based on the circumplex model of affect of 79.29%, with 79.15% on the valence axis, and 83.55% on the arousal axis.
NASA Astrophysics Data System (ADS)
Csatho, B. M.; Schenk, A. F.; Babonis, G. S.; van den Broeke, M. R.; Kuipers Munneke, P.; van der Veen, C. J.; Khan, S. A.; Porter, D. F.
2016-12-01
This study presents a new, comprehensive reconstruction of Greenland Ice Sheet elevation changes, generated using the Surface Elevation And Change detection (SERAC) approach. 35-year long elevation-change time series (1980-2015) were obtained at more than 150,000 locations from observations acquired by NASA's airborne and spaceborne laser altimeters (ATM, LVIS, ICESat), PROMICE laser altimetry data (2007-2011) and a DEM covering the ice sheet margin derived from stereo aerial photographs (1970s-80s). After removing the effect of Glacial Isostatic Adjustment (GIA) and the elastic crustal response to changes in ice loading, the time series were partitioned into changes due to surface processes and ice dynamics and then converted into mass change histories. Using gridded products, we examined ice sheet elevation, and mass change patterns, and compared them with other estimates at different scales from individual outlet glaciers through large drainage basins, on to the entire ice sheet. Both the SERAC time series and the grids derived from these time series revealed significant spatial and temporal variations of dynamic mass loss and widespread intermittent thinning, indicating the complexity of ice sheet response to climate forcing. To investigate the regional and local controls of ice dynamics, we examined thickness change time series near outlet glacier grounding lines. Changes on most outlet glaciers were consistent with one or more episodes of dynamic thinning that propagates upstream from the glacier terminus. The spatial pattern of the onset, duration, and termination of these dynamic thinning events suggest a regional control, such as warming ocean and air temperatures. However, the intricate spatiotemporal pattern of dynamic thickness change suggests that, regardless of the forcing responsible for initial glacier acceleration and thinning, the response of individual glaciers is modulated by local conditions. 
We use statistical methods, such as principal component analysis and multivariate regression to analyze the dynamic ice-thickness change time series derived by SERAC and to investigate the primary forcings and controls on outlet glacier changes.
Molenaar, Peter C M
2017-01-01
Equivalences of two classes of dynamic models for weakly stationary multivariate time series are discussed: dynamic factor models and autoregressive models. It is shown that exploratory dynamic factor models can be rotated, yielding an infinite set of equivalent solutions for any observed series. It also is shown that dynamic factor models with lagged factor loadings are not equivalent to the currently popular state-space models, and that restriction of attention to the latter type of models may yield invalid results. The known equivalent vector autoregressive model types, standard and structural, are given a new interpretation in which they are conceived of as the extremes of an innovating type of hybrid vector autoregressive models. It is shown that consideration of hybrid models solves many problems, in particular with Granger causality testing.
Selecting and applying indicators of ecosystem collapse for risk assessments.
Rowland, Jessica A; Nicholson, Emily; Murray, Nicholas J; Keith, David A; Lester, Rebecca E; Bland, Lucie M
2018-03-12
Ongoing ecosystem degradation and transformation are key threats to biodiversity. Measuring ecosystem change towards collapse relies on monitoring indicators that quantify key ecological processes. Yet little guidance is available on selecting and implementing indicators for ecosystem risk assessment. Here, we reviewed indicator use in ecological studies of decline towards collapse in marine pelagic and temperate forest ecosystems. We evaluated the use of indicator selection methods, indicator types (geographic distribution, abiotic, biotic), methods of assessing multiple indicators, and temporal quality of time series. We compared these ecological studies to risk assessments in the International Union for the Conservation of Nature Red List of Ecosystems (RLE), where indicators are used to estimate ecosystem collapse risk. We found that ecological studies and RLE assessments rarely reported how indicators were selected, particularly in terrestrial ecosystems. Few ecological studies and RLE assessments quantified ecosystem change with all three indicator types, and indicators types used varied between marine and terrestrial ecosystem. Several studies used indices or multivariate analyses to assess multiple indicators simultaneously, but RLE assessments did not, as RLE guidelines advise against them. Most studies and RLE assessments used time series spanning at least 30 years, increasing the chance of reliably detecting change. Limited use of indicator selection protocols and infrequent use of all three indicator types may hamper the ability to accurately detect changes. 
To improve the value of risk assessments for informing policy and management, we recommend using: (i) explicit protocols, including conceptual models, to identify and select indicators; (ii) a range of indicators spanning distributional, abiotic and biotic features; (iii) indices and multivariate analyses with extreme care until guidelines are developed; (iv) time series with sufficient data to increase ability to accurately diagnose directional change; (v) data from multiple sources to support assessments; and (vi) explicitly reporting steps in the assessment process. This article is protected by copyright. All rights reserved.
Application of reiteration of Hankel singular value decomposition in quality control
NASA Astrophysics Data System (ADS)
Staniszewski, Michał; Skorupa, Agnieszka; Boguszewicz, Łukasz; Michalczuk, Agnieszka; Wereszczyński, Kamil; Wicher, Magdalena; Konopka, Marek; Sokół, Maria; Polański, Andrzej
2017-07-01
Medical centres are obliged to store past medical records, including the results of quality assurance (QA) tests of the medical equipment, which is especially useful in checking reproducibility of medical devices and procedures. Analysis of multivariate time series is an important part of quality control of NMR data. In this work we propose an anomaly detection tool based on the Reiteration of Hankel Singular Value Decomposition method. The presented method was compared with external software, and the authors obtained comparable results.
Lie, Octavian V; van Mierlo, Pieter
2017-01-01
The visual interpretation of intracranial EEG (iEEG) is the standard method used in complex epilepsy surgery cases to map the regions of seizure onset targeted for resection. Still, visual iEEG analysis is labor-intensive and biased due to interpreter dependency. Multivariate parametric functional connectivity measures using adaptive autoregressive (AR) modeling of the iEEG signals based on the Kalman filter algorithm have been used successfully to localize the electrographic seizure onsets. Due to their high computational cost, these methods have been applied to a limited number of iEEG time-series (<60). The aim of this study was to test two Kalman filter implementations, a well-known multivariate adaptive AR model (Arnold et al. 1998) and a simplified, computationally efficient derivation of it, for their potential application to connectivity analysis of high-dimensional (up to 192 channels) iEEG data. When used on simulated seizures together with a multivariate connectivity estimator, the partial directed coherence, the two AR models were compared for their ability to reconstitute the designed seizure signal connections from noisy data. Next, focal seizures from iEEG recordings (73-113 channels) in three patients rendered seizure-free after surgery were mapped with the outdegree, a graph-theory index of outward directed connectivity. Simulation results indicated high levels of mapping accuracy for the two models in the presence of low-to-moderate noise cross-correlation. Accordingly, both AR models correctly mapped the real seizure onset to the resection volume. This study supports the possibility of conducting fully data-driven multivariate connectivity estimations on high-dimensional iEEG datasets using the Kalman filter approach.
Michalareas, George; Schoffelen, Jan-Mathijs; Paterson, Gavin; Gross, Joachim
2013-01-01
In this work, we investigate the feasibility of estimating causal interactions between brain regions based on multivariate autoregressive models (MAR models) fitted to magnetoencephalographic (MEG) sensor measurements. We first demonstrate the theoretical feasibility of estimating source level causal interactions after projection of the sensor-level model coefficients onto the locations of the neural sources. Next, we show with simulated MEG data that causality, as measured by partial directed coherence (PDC), can be correctly reconstructed if the locations of the interacting brain areas are known. We further demonstrate, if a very large number of brain voxels is considered as potential activation sources, that PDC as a measure to reconstruct causal interactions is less accurate. In such a case the MAR model coefficients alone contain meaningful causality information. The proposed method overcomes the problems of model nonrobustness and large computation times encountered during causality analysis by existing methods. These methods first project MEG sensor time-series onto a large number of brain locations after which the MAR model is built on this large number of source-level time-series. Instead, through this work, we demonstrate that by building the MAR model on the sensor-level and then projecting only the MAR coefficients in source space, the true causal pathways are recovered even when a very large number of locations are considered as sources. The main contribution of this work is that by this methodology entire brain causality maps can be efficiently derived without any a priori selection of regions of interest. Hum Brain Mapp, 2013. © 2012 Wiley Periodicals, Inc. PMID:22328419
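For readers unfamiliar with partial directed coherence, the following sketch computes it from given MAR coefficient matrices using the commonly cited definition: with A(f) = I − Σ_k A_k e^{−i2πfk}, the PDC from channel j to channel i is |A_ij(f)| divided by the norm of column j of A(f). The coefficient values are invented for illustration, and nothing here reproduces the sensor-to-source projection of the study.

```python
import cmath
import math

def partial_directed_coherence(coeffs, freq):
    """PDC matrix at one frequency from MAR lag matrices.

    coeffs: list [A_1, ..., A_p]; A_k[i][j] is the influence of channel j
            on channel i at lag k.
    freq:   normalized frequency in cycles per sample (0 to 0.5).
    Returns out[i][j] = PDC from channel j to channel i.
    """
    n = len(coeffs[0])
    # Build A(f) = I - sum_k A_k * exp(-i 2 pi f k)
    abar = [[(1.0 if i == j else 0.0) + 0j for j in range(n)] for i in range(n)]
    for k, a in enumerate(coeffs, start=1):
        phase = cmath.exp(-2j * math.pi * freq * k)
        for i in range(n):
            for j in range(n):
                abar[i][j] -= a[i][j] * phase
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_norm = math.sqrt(sum(abs(abar[m][j]) ** 2 for m in range(n)))
        for i in range(n):
            out[i][j] = abs(abar[i][j]) / col_norm
    return out

# Hypothetical bivariate MAR(1) model: channel 0 drives channel 1 only.
a1 = [[0.5, 0.0],
      [0.4, 0.5]]
pdc = partial_directed_coherence([a1], freq=0.1)
print(pdc[1][0], pdc[0][1])  # outflow 0 -> 1 is nonzero, 1 -> 0 is zero
```

Because each column is normalized by its own outflow, the squared PDC values in any column sum to one.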
NASA Astrophysics Data System (ADS)
Serinaldi, Francesco; Kilsby, Chris G.
2013-06-01
The information contained in hyetographs and hydrographs is often synthesized by using key properties such as the peak or maximum value Xp, volume V, duration D, and average intensity I. These variables play a fundamental role in hydrologic engineering as they are used, for instance, to define design hyetographs and hydrographs as well as to model and simulate the rainfall and streamflow processes. Given their inherent variability and the empirical evidence of the presence of a significant degree of association, such quantities have been studied as correlated random variables suitable to be modeled by multivariate joint distribution functions. The advent of copulas in geosciences simplified the inference procedures allowing for splitting the analysis of the marginal distributions and the study of the so-called dependence structure or copula. However, the attention paid to the modeling task has overlooked a more thorough study of the true nature and origin of the relationships that link Xp,V,D, and I. In this study, we apply a set of ad hoc bootstrap algorithms to investigate these aspects by analyzing the hyetographs and hydrographs extracted from 282 daily rainfall series from central eastern Europe, three 5 min rainfall series from central Italy, 80 daily streamflow series from the continental United States, and two sets of 200 simulated universal multifractal time series. Our results show that all the pairwise dependence structures between Xp,V,D, and I exhibit some key properties that can be reproduced by simple bootstrap algorithms that rely on a standard univariate resampling without resort to multivariate techniques. 
Therefore, the strong similarities between the observed dependence structures and the agreement between the observed and bootstrap samples suggest the existence of a numerical generating mechanism based on the superposition of the effects of sampling data at finite time steps and the process of summing realizations of independent random variables over random durations. We also show that the pairwise dependence structures are weakly dependent on the internal patterns of the hyetographs and hydrographs, meaning that the temporal evolution of the rainfall and runoff events marginally influences the mutual relationships of Xp,V,D, and I. Finally, our findings point out that subtle and often overlooked deterministic relationships between the properties of the event hyetographs and hydrographs exist. Confusing these relationships with genuine stochastic relationships can lead to an incorrect application of multivariate distributions and copulas and to misleading results.
LSST Astroinformatics And Astrostatistics: Data-oriented Astronomical Research
NASA Astrophysics Data System (ADS)
Borne, Kirk D.; Stassun, K.; Brunner, R. J.; Djorgovski, S. G.; Graham, M.; Hakkila, J.; Mahabal, A.; Paegert, M.; Pesenson, M.; Ptak, A.; Scargle, J.; Informatics, LSST; Statistics Team
2011-01-01
The LSST Informatics and Statistics Science Collaboration (ISSC) focuses on research and scientific discovery challenges posed by the very large and complex data collection that LSST will generate. Application areas include astroinformatics, machine learning, data mining, astrostatistics, visualization, scientific data semantics, time series analysis, and advanced signal processing. Research problems to be addressed with these methodologies include transient event characterization and classification, rare class discovery, correlation mining, outlier/anomaly/surprise detection, improved estimators (e.g., for photometric redshift or early onset supernova classification), exploration of highly dimensional (multivariate) data catalogs, and more. We present sample science results from these data-oriented approaches to large-data astronomical research. We present results from LSST ISSC team members, including the EB (Eclipsing Binary) Factory, the environmental variations in the fundamental plane of elliptical galaxies, and outlier detection in multivariate catalogs.
Structural changes in cross-border liabilities: A multidimensional approach
NASA Astrophysics Data System (ADS)
Araújo, Tanya; Spelta, Alessandro
2014-01-01
We study the international interbank market through a geometric analysis of empirical data. The geometric analysis of the time series of cross-country liabilities shows that the systematic information of the interbank international market is contained in a space of small dimension. Geometric spaces of financial relations across countries are developed, for which the space volume, multivariate skewness and multivariate kurtosis are computed. The behavior of these coefficients reveals an important modification acting in the financial linkages since 1997 and allows us to relate the shape of the geometric space that emerges in recent years to the globally turbulent period that has characterized financial systems since the late 1990s. Here we show that, besides a persistent decrease in the volume of the geometric space since 1997, the observation of a generalized increase in the values of the multivariate skewness and kurtosis sheds some light on the behavior of cross-border interdependencies during periods of financial crises. This was found to occur in such a systematic fashion, that these coefficients may be used as a proxy for systemic risk.
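The abstract does not state which definitions of multivariate skewness and kurtosis were used. As one common choice, the sketch below computes Mardia's coefficients, which are built from Mahalanobis-type products with the inverse sample covariance. The data and function names are illustrative assumptions only.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mardia(data):
    """Return (skewness, kurtosis) in Mardia's sense for rows of observations.

    Uses the biased (divide by n) covariance, as in Mardia's definition.
    """
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(p)]
           for a in range(p)]
    inv = invert(cov)
    # whitened[i] = S^{-1} (x_i - mean)
    whitened = [[sum(inv[a][b] * c[b] for b in range(p)) for a in range(p)]
                for c in centered]
    g = [[sum(ci[a] * wj[a] for a in range(p)) for wj in whitened]
         for ci in centered]
    skewness = sum(g[i][j] ** 3 for i in range(n) for j in range(n)) / n ** 2
    kurtosis = sum(g[i][i] ** 2 for i in range(n)) / n
    return skewness, kurtosis

# Point cloud symmetric about the origin: skewness should vanish.
points = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0),
          (1.0, 1.0), (-1.0, -1.0)]
skew, kurt = mardia(points)
```

Tracking these two numbers over successive time windows would give the kind of coefficient series the abstract describes, under the stated assumption about the definitions.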
NASA Technical Reports Server (NTRS)
Feiveson, Alan H.; Fiedler, James; Lee, Stuart M. M.; Westby, Christian M.; Stenger, Michael B.; Platts, Steven H.
2014-01-01
Orthostatic Intolerance (OI) is the propensity to develop symptoms of fainting during upright standing. OI is associated with changes in heart rate, blood pressure and other measures of cardiac function. Problem: NASA astronauts have shown increased susceptibility to OI on return from space missions. Current methods for counteracting OI in astronauts include fluid loading and the use of compression garments. Multivariate trajectory spread is greater as OI increases. Pairwise comparisons at the same time within subjects allow incorporation of pass/fail outcomes. Path length, convex hull area, and covariance matrix determinant do well as statistics to summarize this spread. Open issues: missing data problems; time series analysis needs many more time points per OTT session; treatment of trend; how to incorporate survival information.
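The three spread statistics named above can be computed for a two-dimensional trajectory as in the sketch below. The choice of two variables, the function names, and the example points are assumptions made for illustration; the original analysis may have used more dimensions or different conventions.

```python
import math

def path_length(points):
    """Sum of Euclidean distances between consecutive trajectory points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_area(points):
    """Area of the convex hull by the shoelace formula."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    s = sum(a[0] * b[1] - b[0] * a[1] for a, b in zip(hull, hull[1:] + hull[:1]))
    return abs(s) / 2.0

def covariance_determinant(points):
    """Determinant of the 2 x 2 sample covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx * syy - sxy ** 2

# Hypothetical trajectory of two standardized measures over time.
trajectory = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
summary = {
    "path_length": path_length(trajectory),
    "hull_area": hull_area(trajectory),
    "cov_det": covariance_determinant(trajectory),
}
```

Path length is sensitive to the ordering of the points in time, while hull area and the covariance determinant depend only on where the points lie.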
Hu, Yanzhu; Ai, Xinbo
2016-01-01
Complex network methodology is very useful for exploring complex systems. However, the relationships among variables in a complex system are usually not clear. Therefore, inferring association networks among variables from their observed data has been a popular research topic. We propose a synthetic method, named small-shuffle partial symbolic transfer entropy spectrum (SSPSTES), for inferring association networks from multivariate time series. The method synthesizes surrogate data, partial symbolic transfer entropy (PSTE) and Granger causality. Proper threshold selection is crucial for common correlation identification methods, and it is not easy for users. The proposed method can not only identify strong correlations without selecting a threshold but can also quantify correlation, identify direction and identify temporal relations. The method can be divided into three layers, i.e. data layer, model layer and network layer. In the model layer, the method identifies all possible pairwise correlations. In the network layer, we introduce a filter algorithm to remove indirect weak correlations and retain strong correlations. We then build a weighted adjacency matrix, in which the value of each entry represents the correlation level between a pair of variables, and obtain the weighted directed association network. Two numerically simulated datasets, from a linear system and a nonlinear system, illustrate the steps and performance of the proposed approach. Finally, the ability of the proposed method is demonstrated by an application. PMID:27832153
Biogeochemical Response to Mesoscale Physical Forcing in the California Current System
NASA Technical Reports Server (NTRS)
Niiler, Pearn P.; Letelier, Ricardo; Moisan, John R.; Marra, John A. (Technical Monitor)
2001-01-01
In the first part of the project, we investigated the local response of the coastal ocean ecosystems (changes in chlorophyll concentration and chlorophyll fluorescence quantum yield) to physical forcing by developing and deploying Autonomous Drifting Ocean Stations (ADOS) within several mesoscale features along the U.S. west coast. Also, we compared the temporal and spatial variability registered by sensors mounted in the drifters to that registered by the sensors mounted in the satellites in order to assess the scales of variability that are not resolved by the ocean color satellite. The second part of the project used the existing WOCE SVP Surface Lagrangian drifters to track individual water parcels through time. The individual drifter tracks were used to generate multivariate time series by interpolating/extracting the biological and physical data fields retrieved by remote sensors (ocean color, SST, wind speed and direction, wind stress curl, and sea level topography). The individual time series of the physical data (AVHRR, TOPEX, NCEP) were analyzed against the ocean color (SeaWiFS) time-series to determine the time scale of biological response to the physical forcing. The results from this part of the research are being used to compare the decorrelation scales of chlorophyll in Lagrangian and Eulerian frameworks. The results from both parts of this research augmented the necessary time series data needed to investigate the interactions between the ocean mesoscale features, wind, and the biogeochemical processes. Using the historical Lagrangian data sets, we have completed a comparison of the decorrelation scales in both the Eulerian and Lagrangian reference frames for the SeaWiFS data set. We are continuing to investigate how these results might be used in objective mapping efforts.
Enhancing e-waste estimates: improving data quality by multivariate Input-Output Analysis.
Wang, Feng; Huisman, Jaco; Stevels, Ab; Baldé, Cornelis Peter
2013-11-01
Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input-Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
Cohen, Michael X
2017-09-27
The number of simultaneously recorded electrodes in neuroscience is steadily increasing, providing new opportunities for understanding brain function, but also new challenges for appropriately dealing with the increase in dimensionality. Multivariate source separation analysis methods have been particularly effective at improving signal-to-noise ratio while reducing the dimensionality of the data and are widely used for cleaning, classifying and source-localizing multichannel neural time series data. Most source separation methods produce a spatial component (that is, a weighted combination of channels to produce one time series); here, this is extended to apply source separation to a time series, with the idea of obtaining a weighted combination of successive time points, such that the weights are optimized to satisfy some criteria. This is achieved via a two-stage source separation procedure, in which an optimal spatial filter is first constructed and then its optimal temporal basis function is computed. This second stage is achieved with a time-delay-embedding matrix, in which additional rows of a matrix are created from time-delayed versions of existing rows. The optimal spatial and temporal weights can be obtained by solving a generalized eigendecomposition of covariance matrices. The method is demonstrated in simulated data and in an empirical electroencephalogram study on theta-band activity during response conflict. Spatiotemporal source separation has several advantages, including defining empirical filters without the need to apply sinusoidal narrowband filters. © 2017 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
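The spatial stage of such a procedure can be sketched with a generalized eigendecomposition of two covariance matrices. The example below is illustrative only: the data are synthetic, the sampling rate, source frequency and variable names are assumptions, and the delay-embedding helper only shows how the matrix for the temporal stage could be built, not the published pipeline.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
n_chan, n_time = 8, 5000
t = np.arange(n_time) / 250.0                 # assumed 250 Hz sampling rate

# Synthetic data: one 6 Hz source mixed into the channels, plus white noise.
pattern = rng.normal(size=n_chan)
source = 3.0 * np.sin(2 * np.pi * 6.0 * t)
data_signal = np.outer(pattern, source) + rng.normal(size=(n_chan, n_time))
data_ref = rng.normal(size=(n_chan, n_time))  # reference data without the source

S = np.cov(data_signal)   # covariance to be maximized
R = np.cov(data_ref)      # reference covariance

# Generalized eigendecomposition S w = lambda R w; eigh sorts eigenvalues ascending,
# so the last eigenvector is the spatial filter with the largest S-to-R ratio.
evals, evecs = eigh(S, R)
w = evecs[:, -1]
component = w @ data_signal                   # one time series from all channels

def delay_embed(x, n_delays):
    """Stack time-delayed copies of a 1-D series as rows of a matrix."""
    n = x.size - n_delays + 1
    return np.stack([x[d:d + n] for d in range(n_delays)])

# Rows are delayed versions of the component; covariance matrices of such a
# matrix could feed a second eigendecomposition for temporal weights.
embedded = delay_embed(component, n_delays=20)
```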
A chaotic model for the epidemic of Ebola virus disease in West Africa (2013-2016)
NASA Astrophysics Data System (ADS)
Mangiarotti, Sylvain; Peyre, Marisa; Huc, Mireille
2016-11-01
An epidemic of Ebola Virus Disease (EVD) broke out in Guinea in December 2013. It was only identified in March 2014, by which time it had already spread to Liberia and Sierra Leone. The spillover of the disease became uncontrollable and the epidemic could not be stopped before 2016. The time evolution of this epidemic is revisited here with the global modeling technique, which was designed to obtain deterministic models from single time series. A generalized formulation of this technique for multivariate time series is introduced. It is applied to the epidemic of EVD in West Africa, focusing on the period between March 2014 and January 2015, that is, before any detected signs of weakening. Data gathered by the World Health Organization, based on the official publications of the Ministries of Health of the three main countries involved in this epidemic, are considered in our analysis. Two observed time series are used: the daily numbers of infections and deaths. A four-dimensional model producing a very complex dynamical behavior is obtained. The model is tested in order to investigate its skills and drawbacks. Our global analysis clearly helps to distinguish three main stages during the epidemic. A characterization of the obtained attractor is also performed. In particular, the topology of the chaotic attractor is analyzed and a skeleton is obtained for its structure.
Detecting dynamic causal inference in nonlinear two-phase fracture flow
NASA Astrophysics Data System (ADS)
Faybishenko, Boris
2017-08-01
Identifying dynamic causal inference involved in flow and transport processes in complex fractured-porous media is generally a challenging task, because nonlinear and chaotic variables may be positively coupled or correlated for some periods of time, but can then become spontaneously decoupled or non-correlated. In his 2002 paper (Faybishenko, 2002), the author performed a nonlinear dynamical and chaotic analysis of time-series data obtained from the fracture flow experiment conducted by Persoff and Pruess (1995), and, based on the visual examination of time series data, hypothesized that the observed pressure oscillations at both inlet and outlet edges of the fracture result from a superposition of both forward and return waves of pressure propagation through the fracture. In the current paper, the author explores an application of a combination of methods for detecting nonlinear chaotic dynamics behavior along with the multivariate Granger Causality (G-causality) time series test. Based on the G-causality test, the author infers that his hypothesis is correct, and presents a causation loop diagram of the spatial-temporal distribution of gas, liquid, and capillary pressures measured at the inlet and outlet of the fracture. The causal modeling approach can be used for the analysis of other hydrological processes, for example, infiltration and pumping tests in heterogeneous subsurface media, and climatic processes, for example, to find correlations between various meteorological parameters, such as temperature, solar radiation, barometric pressure, etc.
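The idea behind a Granger causality test can be sketched as an F-test comparing two autoregressions, one with and one without lags of the candidate driver. The example is a two-variable simplification on synthetic data; the function names, lag order and coefficients are assumptions, and it does not reproduce the multivariate test used in the paper.

```python
import numpy as np

def lagged(x, p):
    """Columns are x[t-1], ..., x[t-p] for t = p, ..., len(x)-1."""
    return np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])

def granger_f(target, driver, p=2):
    """F-statistic for 'driver helps predict target beyond target's own past'."""
    y = target[p:]
    ones = np.ones((y.size, 1))
    X_restricted = np.hstack([ones, lagged(target, p)])
    X_full = np.hstack([X_restricted, lagged(driver, p)])

    def rss(X):
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        return np.sum((y - X @ beta) ** 2)

    rss_r, rss_f = rss(X_restricted), rss(X_full)
    dof = y.size - X_full.shape[1]
    return ((rss_r - rss_f) / p) / (rss_f / dof)

# Synthetic example in which x drives y, but y does not drive x.
rng = np.random.default_rng(42)
n = 1000
x, y = np.zeros(n), np.zeros(n)
ex, ey = rng.standard_normal(n), rng.standard_normal(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + ex[t]
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + ey[t]

f_x_to_y = granger_f(y, x)   # expected to be large
f_y_to_x = granger_f(x, y)   # expected to be near 1
```

A large F in one direction only indicates predictive precedence under a linear model; for strongly nonlinear or intermittently coupled variables, as in the abstract, the result should be read with that limitation in mind.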
NASA Astrophysics Data System (ADS)
Byakatonda, Jimmy; Parida, B. P.; Kenabatho, Piet K.; Moalafhi, D. B.
2018-03-01
Arid and semi-arid environments have been identified with locations prone to impacts of climate variability and change. Investigating long-term trends is one way of tracing climate change impacts. This study investigates variability through annual and seasonal meteorological time series. Possible inhomogeneities and years of intervention are analysed using four absolute homogeneity tests. Trends in the climatic variables were determined using Mann-Kendall and Sen's Slope estimator statistics. Association of El Niño Southern Oscillation (ENSO) with local climate is also investigated through multivariate analysis. Results from the study show that rainfall time series are fully homogeneous with 78.6 and 50% of the stations for maximum and minimum temperature, respectively, showing homogeneity. Trends also indicate a general decrease of 5.8, 7.4 and 18.1% in annual, summer and winter rainfall, respectively. Warming trends are observed in annual and winter temperature at 0.3 and 1.5% for maximum temperature and 1.7 and 6.5% for minimum temperature, respectively. Rainfall reported a positive correlation with Southern Oscillation Index (SOI) and at the same time negative association with Sea Surface Temperatures (SSTs). Strong relationships between SSTs and maximum temperature are observed during the El Niño and La Niña years. These study findings could facilitate planning and management of agricultural and water resources in Botswana.
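For readers unfamiliar with the two trend statistics named above, a minimal implementation is sketched below. It assumes a series without ties and without serial correlation (no tie or autocorrelation correction is applied), and the rainfall values are synthetic, not data from the study.

```python
import numpy as np
from math import erf, sqrt

def mann_kendall(x):
    """Mann-Kendall S statistic, normal-approximation Z score and two-sided p-value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    i, j = np.triu_indices(n, k=1)            # all pairs with i < j
    s = np.sum(np.sign(x[j] - x[i]))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance without tie correction
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return s, z, p

def sens_slope(x):
    """Sen's slope: median of all pairwise slopes (per time step)."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)
    return np.median((x[j] - x[i]) / (j - i))

# Synthetic annual rainfall (mm) with a weak downward trend plus noise.
rng = np.random.default_rng(5)
years = np.arange(40)
rain = 450.0 - 2.0 * years + rng.normal(scale=30.0, size=years.size)
s_stat, z_score, p_value = mann_kendall(rain)
slope_mm_per_year = sens_slope(rain)
```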
NASA Astrophysics Data System (ADS)
Rokita, Pawel
Classical portfolio diversification methods do not take account of any dependence between extreme returns (losses). Many researchers provide, however, some empirical evidence for various assets that extreme losses co-occur. If the co-occurrence is frequent enough to be statistically significant, it may seriously influence portfolio risk. Such effects may result from a few different properties of financial time series, for instance: (1) extreme dependence in a (long-term) unconditional distribution, (2) extreme dependence in subsequent conditional distributions, (3) time-varying conditional covariance, (4) time-varying (long-term) unconditional covariance, (5) market contagion. Moreover, a mix of these properties may be present in return time series. Modeling each of them requires different approaches. It seems reasonable to investigate whether distinguishing between the properties is highly significant for portfolio risk measurement. If it is, identifying the effect responsible for high loss co-occurrence would be of great importance. If it is not, the best solution would be selecting the easiest-to-apply model. This article concentrates on two of the aforementioned properties: extreme dependence (in a long-term unconditional distribution) and time-varying conditional covariance.
Detection of bifurcations in noisy coupled systems from multiple time series
DOE Office of Scientific and Technical Information (OSTI.GOV)
Williamson, Mark S., E-mail: m.s.williamson@exeter.ac.uk; Lenton, Timothy M.
We generalize a method of detecting an approaching bifurcation in a time series of a noisy system from the special case of one dynamical variable to multiple dynamical variables. For a system described by a stochastic differential equation consisting of an autonomous deterministic part with one dynamical variable and an additive white noise term, small perturbations away from the system's fixed point will decay slower the closer the system is to a bifurcation. This phenomenon is known as critical slowing down and all such systems exhibit this decay-type behaviour. However, when the deterministic part has multiple coupled dynamical variables, the possible dynamics can be much richer, exhibiting oscillatory and chaotic behaviour. In our generalization to the multi-variable case, we find additional indicators to decay rate, such as frequency of oscillation. In the case of approaching a homoclinic bifurcation, there is no change in decay rate but there is a decrease in frequency of oscillations. The expanded method therefore adds extra tools to help detect and classify approaching bifurcations given multiple time series, where the underlying dynamics are not fully known. Our generalisation also allows bifurcation detection to be applied spatially if one treats each spatial location as a new dynamical variable. One may then determine the unstable spatial mode(s). This is also something that has not been possible with the single variable method. The method is applicable to any set of time series regardless of its origin, but may be particularly useful when anticipating abrupt changes in the multi-dimensional climate system.
Small-world bias of correlation networks: From brain to climate
NASA Astrophysics Data System (ADS)
Hlinka, Jaroslav; Hartman, David; Jajcay, Nikola; Tomeček, David; Tintěra, Jaroslav; Paluš, Milan
2017-03-01
Complex systems are commonly characterized by the properties of their graph representation. Dynamical complex systems are then typically represented by a graph of temporal dependencies between time series of state variables of their subunits. It has been shown recently that graphs constructed in this way tend to have relatively clustered structure, potentially leading to spurious detection of small-world properties even in the case of systems with no or randomly distributed true interactions. However, the strength of this bias depends heavily on a range of parameters and its relevance for real-world data has not yet been established. In this work, we assess the relevance of the bias using two examples of multivariate time series recorded in natural complex systems. The first is the time series of local brain activity as measured by functional magnetic resonance imaging in resting healthy human subjects, and the second is the time series of average monthly surface air temperature coming from a large reanalysis of climatological data over the period 1948-2012. In both cases, the clustering in the thresholded correlation graph is substantially higher compared with a realization of a density-matched random graph, while the shortest paths are relatively short, thus showing the distinguishing features of small-world structure. However, comparable or even stronger small-world properties were reproduced in correlation graphs of model processes with randomly scrambled interconnections. This suggests that the small-world properties of the correlation matrices of these real-world systems do not genuinely reflect the properties of the underlying interaction structure, but rather result from the inherent properties of the correlation matrix.
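The comparison described above (clustering in a thresholded correlation graph versus a density-matched random graph) can be set up in a few lines. The sketch below uses independent synthetic series, an assumed graph density of 10% and hypothetical helper names; it shows the mechanics of the comparison and makes no claim about the size of the bias.

```python
import numpy as np

def clustering_coefficient(adj):
    """Mean local clustering coefficient of an undirected 0/1 adjacency matrix."""
    adj = adj.astype(float)
    deg = adj.sum(axis=1)
    triangles = np.diag(adj @ adj @ adj) / 2.0      # triangles through each node
    possible = deg * (deg - 1) / 2.0                # neighbour pairs of each node
    local = np.divide(triangles, possible,
                      out=np.zeros_like(triangles), where=possible > 0)
    return local.mean()

def threshold_graph(corr, density):
    """Keep the strongest |correlations| so that the graph has the given density."""
    n = corr.shape[0]
    iu = np.triu_indices(n, k=1)
    vals = np.abs(corr[iu])
    k = int(round(density * vals.size))
    cutoff = np.sort(vals)[-k]
    adj = np.zeros((n, n), dtype=int)
    adj[iu] = vals >= cutoff
    return adj + adj.T

rng = np.random.default_rng(0)
series = rng.standard_normal((60, 40))   # 60 independent nodes, 40 samples each
adj = threshold_graph(np.corrcoef(series), density=0.1)
c_corr = clustering_coefficient(adj)

# Density-matched random graph: same number of edges, placed uniformly at random.
n = adj.shape[0]
iu = np.triu_indices(n, k=1)
n_edges = int(adj[iu].sum())
chosen = rng.choice(iu[0].size, size=n_edges, replace=False)
rand = np.zeros((n, n), dtype=int)
rand[iu[0][chosen], iu[1][chosen]] = 1
rand = rand + rand.T
c_rand = clustering_coefficient(rand)
```

Comparing `c_corr` with `c_rand` over many random realizations, rather than a single one as here, gives a more reliable picture of how much clustering the correlation construction alone introduces.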
Stochastic Simulation and Forecast of Hydrologic Time Series Based on Probabilistic Chaos Expansion
NASA Astrophysics Data System (ADS)
Li, Z.; Ghaith, M.
2017-12-01
Hydrological processes are characterized by many complex features, such as nonlinearity, dynamics and uncertainty. How to quantify and address such complexities and uncertainties has been a challenging task for water engineers and managers for decades. To support robust uncertainty analysis, an innovative approach for the stochastic simulation and forecast of hydrologic time series is developed in this study. Probabilistic Chaos Expansions (PCEs) are established through probabilistic collocation to tackle uncertainties associated with the parameters of traditional hydrological models. The uncertainties are quantified in model outputs as Hermite polynomials with regard to standard normal random variables. Subsequently, multivariate analysis techniques are used to analyze the complex nonlinear relationships between meteorological inputs (e.g., temperature, precipitation, evapotranspiration, etc.) and the coefficients of the Hermite polynomials. With the established relationships between model inputs and PCE coefficients, forecasts of hydrologic time series can be generated and the uncertainties in the future time series can be further tackled. The proposed approach is demonstrated using a case study in China and is compared to a traditional stochastic simulation technique, the Markov-Chain Monte-Carlo (MCMC) method. Results show that the proposed approach can serve as a reliable proxy to complicated hydrological models. It can provide probabilistic forecasting in a more computationally efficient manner, compared to the traditional MCMC method. This work provides technical support for addressing uncertainties associated with hydrological modeling and for enhancing the reliability of hydrological modeling results. Applications of the developed approach can be extended to many other complicated geophysical and environmental modeling systems to support the associated uncertainty quantification and risk analysis.
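Expressing a model output as Hermite polynomials of a standard normal variable can be illustrated in one dimension. In the sketch below the "model" is a hypothetical stand-in function, the expansion degree and sample size are assumptions, and the coefficients are fitted by least squares on random sample points, which is a simplification of the collocation procedure mentioned in the abstract.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermevander

def toy_model(xi):
    """Hypothetical stand-in for a hydrological model with one uncertain parameter."""
    return np.exp(0.3 * xi)

degree = 4
rng = np.random.default_rng(1)
xi = rng.standard_normal(200)                 # standard normal input samples

# Design matrix of probabilists' Hermite polynomials He_0 ... He_degree.
V = hermevander(xi, degree)
coef = np.linalg.lstsq(V, toy_model(xi), rcond=None)[0]

# Orthogonality (E[He_j He_k] = k! if j == k, else 0) gives the output
# moments directly from the coefficients.
pce_mean = coef[0]
pce_var = sum(coef[k] ** 2 * factorial(k) for k in range(1, degree + 1))

# Closed-form moments of exp(0.3 * xi) for comparison.
exact_mean = np.exp(0.3 ** 2 / 2)
exact_var = np.exp(0.3 ** 2) * (np.exp(0.3 ** 2) - 1)
```

Once the coefficients are known, the expansion can be evaluated far more cheaply than the original model, which is what makes it usable as a proxy.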
Detection of bifurcations in noisy coupled systems from multiple time series
NASA Astrophysics Data System (ADS)
Williamson, Mark S.; Lenton, Timothy M.
2015-03-01
We generalize a method of detecting an approaching bifurcation in a time series of a noisy system from the special case of one dynamical variable to multiple dynamical variables. For a system described by a stochastic differential equation consisting of an autonomous deterministic part with one dynamical variable and an additive white noise term, small perturbations away from the system's fixed point will decay slower the closer the system is to a bifurcation. This phenomenon is known as critical slowing down and all such systems exhibit this decay-type behaviour. However, when the deterministic part has multiple coupled dynamical variables, the possible dynamics can be much richer, exhibiting oscillatory and chaotic behaviour. In our generalization to the multi-variable case, we find additional indicators to decay rate, such as frequency of oscillation. In the case of approaching a homoclinic bifurcation, there is no change in decay rate but there is a decrease in frequency of oscillations. The expanded method therefore adds extra tools to help detect and classify approaching bifurcations given multiple time series, where the underlying dynamics are not fully known. Our generalisation also allows bifurcation detection to be applied spatially if one treats each spatial location as a new dynamical variable. One may then determine the unstable spatial mode(s). This is also something that has not been possible with the single variable method. The method is applicable to any set of time series regardless of its origin, but may be particularly useful when anticipating abrupt changes in the multi-dimensional climate system.
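One simple way to obtain both a decay rate and an oscillation frequency from several time series is to fit a first-order vector autoregression and inspect the eigenvalues of its coefficient matrix. The sketch below follows that general idea on synthetic data; it is not the authors' estimator, and the function name, time step and test system are assumptions.

```python
import numpy as np

def var1_indicators(X, dt=1.0):
    """Fit x[t+1] = A x[t] + noise and return decay rates and frequencies.

    X has shape (n_time, n_var). The modulus of each eigenvalue of A gives a
    decay rate, and its argument gives an oscillation frequency.
    """
    X = X - X.mean(axis=0)
    # Least squares solves X[1:] ~ X[:-1] @ B, so the propagator is A = B.T.
    B = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0]
    eig = np.linalg.eigvals(B.T)
    decay = -np.log(np.abs(eig)) / dt
    freq = np.abs(np.angle(eig)) / (2 * np.pi * dt)
    return decay, freq

# Synthetic noisy damped rotation: modulus 0.95, 0.05 cycles per time step.
rng = np.random.default_rng(3)
r, theta = 0.95, 2 * np.pi * 0.05
A_true = r * np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
n = 20000
X = np.zeros((n, 2))
noise = rng.standard_normal((n, 2))
for t in range(1, n):
    X[t] = A_true @ X[t - 1] + noise[t]

decay, freq = var1_indicators(X)
```

Applied in a sliding window, a decay rate drifting toward zero would indicate critical slowing down, while a falling frequency at constant decay rate would correspond to the oscillatory indicator discussed in the abstract.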
Prolonged instability prior to a regime shift
Spanbauer, Trisha; Allen, Craig R.; Angeler, David G.; Eason, Tarsha; Fritz, Sherilyn C.; Garmestani, Ahjond S.; Nash, Kirsty L.; Stone, Jeffery R.
2014-01-01
Regime shifts are generally defined as the point of ‘abrupt’ change in the state of a system. However, a seemingly abrupt transition can be the product of a system reorganization that has been ongoing much longer than is evident in statistical analysis of a single component of the system. Using both univariate and multivariate statistical methods, we tested a long-term high-resolution paleoecological dataset with a known change in species assemblage for a regime shift. Analysis of this dataset with Fisher Information and multivariate time series modeling showed that there was a ~2000-year period of instability prior to the regime shift. This period of instability and the subsequent regime shift coincide with regional climate change, indicating that the system is undergoing extrinsic forcing. Paleoecological records offer a unique opportunity to test tools for the detection of thresholds and stable-states, and thus to examine the long-term stability of ecosystems over periods of multiple millennia.
Tracking the time-varying cortical connectivity patterns by adaptive multivariate estimators.
Astolfi, L; Cincotti, F; Mattia, D; De Vico Fallani, F; Tocci, A; Colosimo, A; Salinari, S; Marciani, M G; Hesse, W; Witte, H; Ursino, M; Zavaglia, M; Babiloni, F
2008-03-01
The directed transfer function (DTF) and the partial directed coherence (PDC) are frequency-domain estimators that are able to describe interactions between cortical areas in terms of the concept of Granger causality. However, the classical estimation of these methods is based on the multivariate autoregressive modelling (MVAR) of time series, which requires the stationarity of the signals. As a result, transient pathways of information transfer remain hidden. The objective of this study is to test a time-varying multivariate method for the estimation of rapidly changing connectivity relationships between cortical areas of the human brain, based on DTF/PDC and on the use of adaptive MVAR modelling (AMVAR), and to apply it to a set of real high resolution EEG data. This approach will allow the observation of rapidly changing influences between the cortical areas during the execution of a task. The simulation results indicated that time-varying DTF and PDC are able to estimate correctly the imposed connectivity patterns under reasonable operative conditions of signal-to-noise ratio (SNR) and number of trials. An SNR of five and a number of trials of at least 20 provide a good accuracy in the estimation. After testing the method by the simulation study, we provide an application to the cortical estimations obtained from high resolution EEG data recorded from a group of healthy subjects during a combined foot-lips movement and present the time-varying connectivity patterns resulting from the application of both DTF and PDC. Two different cortical networks were detected with the proposed methods, one constant across the task and the other evolving during the preparation of the joint movement.
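The relation between an MVAR model and partial directed coherence can be sketched for the stationary case. The example below fits the model by ordinary least squares on synthetic two-channel data; it does not implement the adaptive (time-varying) estimation that the study is about, and all names and coefficients are assumptions.

```python
import numpy as np

def fit_var(X, p):
    """Least-squares VAR(p). X: (n_time, n_var). Returns A with shape (p, m, m),
    where x[t] = sum_k A[k-1] @ x[t-k] + noise."""
    n, m = X.shape
    Z = np.hstack([X[p - k:n - k] for k in range(1, p + 1)])   # lagged regressors
    B = np.linalg.lstsq(Z, X[p:], rcond=None)[0]               # shape (m*p, m)
    return np.stack([B[k * m:(k + 1) * m].T for k in range(p)])

def pdc(A, freqs):
    """Partial directed coherence; out[f, i, j] is the influence of j on i.
    Frequencies are in cycles per sample."""
    p, m, _ = A.shape
    out = np.empty((freqs.size, m, m))
    for n_f, f in enumerate(freqs):
        Af = np.eye(m, dtype=complex)
        for k in range(p):
            Af -= A[k] * np.exp(-2j * np.pi * f * (k + 1))
        col_norm = np.sqrt(np.sum(np.abs(Af) ** 2, axis=0, keepdims=True))
        out[n_f] = np.abs(Af) / col_norm
    return out

# Synthetic example: channel 0 drives channel 1, not the other way round.
rng = np.random.default_rng(11)
n = 5000
X = np.zeros((n, 2))
e = rng.standard_normal((n, 2))
for t in range(1, n):
    X[t, 0] = 0.6 * X[t - 1, 0] + e[t, 0]
    X[t, 1] = 0.5 * X[t - 1, 1] + 0.7 * X[t - 1, 0] + e[t, 1]

A_hat = fit_var(X, p=1)
freqs = np.linspace(0.0, 0.5, 26)
P = pdc(A_hat, freqs)
```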
A new algorithm for automatic Outlier Detection in GPS Time Series
NASA Astrophysics Data System (ADS)
Cannavo', Flavio; Mattia, Mario; Rossi, Massimo; Palano, Mimmo; Bruno, Valentina
2010-05-01
Nowadays continuous GPS time series are considered a crucial product of GPS permanent networks, useful in many geo-science fields, such as active tectonics, seismology, crustal deformation and volcano monitoring (Altamimi et al. 2002, Elósegui et al. 2006, Aloisi et al. 2009). Although GPS data processing software has increased in reliability, the time series are still affected by different kinds of noise, from intrinsic noise (e.g. tropospheric delay) to un-modeled noise (e.g. cycle slips, satellite faults, parameter changes). Typically, GPS time series present characteristic noise that is a linear combination of white noise and correlated colored noise, and this characteristic is fractal in the sense that it is evident for every considered time scale or sampling rate. The un-modeled noise sources result in spikes, outliers and steps. These kinds of errors can appreciably influence the estimation of velocities of the monitored sites. Outlier detection in generic time series is a widely treated problem in the literature (Wei, 2005), but it is not fully developed for the specific case of GPS series. We propose a robust automatic procedure for cleaning the GPS time series of outliers and, especially for long daily series, of steps due to strong seismic or volcanic events or merely to instrumentation changes such as antenna and receiver upgrades. The procedure is basically divided into two steps: a first step for colored noise reduction and a second step for outlier detection through adaptive series segmentation. Both algorithms present novel ideas and are nearly unsupervised. In particular, we propose an algorithm to estimate an autoregressive model for colored noise in GPS time series in order to subtract the effect of non-Gaussian noise on the series. This step is useful for the subsequent step (i.e. adaptive segmentation), which requires the hypothesis of Gaussian noise.
The proposed algorithms are tested in a benchmark case study and the results confirm that the algorithms are effective and reasonable. Bibliography - Aloisi M., A. Bonaccorso, F. Cannavò, S. Gambino, M. Mattia, G. Puglisi, E. Boschi, A new dyke intrusion style for the Mount Etna May 2008 eruption modelled through continuous tilt and GPS data, Terra Nova, Volume 21 Issue 4 , Pages 316 - 321, doi: 10.1111/j.1365-3121.2009.00889.x (August 2009) - Altamimi Z., Sillard P., Boucher C., ITRF2000: A new release of the International Terrestrial Reference frame for earth science applications, J Geophys Res-Solid Earth, 107 (B10): art. no.-2214, (Oct 2002) - Elósegui, P., J. L. Davis, D. Oberlander, R. Baena, and G. Ekström , Accuracy of high-rate GPS for seismology, Geophys. Res. Lett., 33, L11308, doi:10.1029/2006GL026065 (2006) - Wei W. S., Time Series Analysis: Univariate and Multivariate Methods, Addison Wesley (2 edition), ISBN-10: 0321322169 (July, 2005)
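The two-step idea (reduce the colored noise with an autoregressive model, then look for outliers in the whitened residuals) can be sketched as follows. This is a simplified stand-in: an AR(1) model replaces the authors' noise model, a fixed median-absolute-deviation threshold replaces their adaptive segmentation, and the series is synthetic.

```python
import numpy as np

def ar1_residuals(x):
    """Fit x[t] = phi * x[t-1] + e[t] by least squares and return (phi, residuals)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    phi = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])
    return phi, x[1:] - phi * x[:-1]

def mad_outliers(res, k=5.0):
    """Indices whose residual deviates from the median by more than k robust sigmas."""
    med = np.median(res)
    robust_sigma = 1.4826 * np.median(np.abs(res - med))
    return np.flatnonzero(np.abs(res - med) > k * robust_sigma)

# Synthetic daily position series (mm): AR(1) colored noise plus one spike.
rng = np.random.default_rng(9)
n = 1000
series = np.zeros(n)
e = rng.standard_normal(n)
for t in range(1, n):
    series[t] = 0.6 * series[t - 1] + e[t]
series[500] += 15.0                      # injected outlier

phi_hat, res = ar1_residuals(series)
flagged = mad_outliers(res) + 1          # residual i belongs to sample i + 1
```

A single spike usually shows up in two consecutive residuals (the sample itself and the next one, through the autoregressive term), so flagged indices are best reviewed in small groups rather than one by one.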
Bookstaver, P Brandon; Foster, Jenna L; Lu, Z Kevin; Mann, Joshua R; Ambrose, Chelsea; Grant, Amy; Burgess, Stephanie
2016-01-01
To investigate the hepatitis B virus (HBV) seroconversion rate among health sciences students. The study included pharmacy, doctor of nursing, and medical students over 18 years of age enrolled at the University of South Carolina between 2007 and 2011. The primary end point was HBV seroconversion rates among students at the initial reporting period. Seroconversion was defined as hepatitis B surface antibody (anti-HBs) level greater than or equal to 10 mIU/mL. Multivariate regression analysis was used to determine predictive factors of seroconversion. Of 777 records, data were available for 709 students. An 83.9% seroconversion rate was observed after a mean of 10 years between vaccine receipt and anti-HBs evaluation. Students with incomplete HBV vaccine series and longer time between initial series and evaluation were less likely to exhibit antibody response. These data highlight the importance of assessment and documentation of HBV vaccination series among health sciences students prior to direct patient care activities.
Multivariate statistical analysis of wildfires in Portugal
NASA Astrophysics Data System (ADS)
Costa, Ricardo; Caramelo, Liliana; Pereira, Mário
2013-04-01
Several studies demonstrate that wildfires in Portugal present high temporal and spatial variability as well as cluster behavior (Pereira et al., 2005, 2011). This study aims to contribute to the characterization of the fire regime in Portugal with the multivariate statistical analysis of the time series of number of fires and area burned in Portugal during the 1980-2009 period. The data used in the analysis is an extended version of the Rural Fire Portuguese Database (PRFD) (Pereira et al., 2011), provided by the National Forest Authority (Autoridade Florestal Nacional, AFN), the Portuguese Forest Service, which includes information for more than 500,000 fire records. There are many advanced techniques for examining the relationships among multiple time series at the same time (e.g., canonical correlation analysis, principal components analysis, factor analysis, path analysis, multiple analyses of variance, clustering systems). This study compares and discusses the results obtained with these different techniques. Pereira, M.G., Trigo, R.M., DaCamara, C.C., Pereira, J.M.C., Leite, S.M., 2005: "Synoptic patterns associated with large summer forest fires in Portugal". Agricultural and Forest Meteorology. 129, 11-25. Pereira, M. G., Malamud, B. D., Trigo, R. M., and Alves, P. I.: The history and characteristics of the 1980-2005 Portuguese rural fire database, Nat. Hazards Earth Syst. Sci., 11, 3343-3358, doi:10.5194/nhess-11-3343-2011, 2011. This work is supported by European Union Funds (FEDER/COMPETE - Operational Competitiveness Programme) and by national funds (FCT - Portuguese Foundation for Science and Technology) under the project FCOMP-01-0124-FEDER-022692, the project FLAIR (PTDC/AAC-AMB/104702/2008) and the EU 7th Framework Program through FUME (contract number 243888).
Using chaotic forcing to detect damage in a structure
Moniz, L.; Nichols, J.; Trickey, S.; Seaver, M.; Pecora, D.; Pecora, L.
2005-01-01
In this work we develop a numerical test for Hölder continuity and apply it and another test for continuity to the difficult problem of detecting damage in structures. We subject a thin metal plate with incremental damage to chaotic excitation of various bandwidths. Damage to the plate changes its filtering properties and therefore the phase space trajectories of the response. Because the data are multivariate (the plate is instrumented with multiple sensors) we use a singular value decomposition of the set of the output time series to reduce the embedding dimension of the response time series. We use two geometric tests to compare an attractor reconstructed from data from an undamaged structure to that reconstructed from data from a damaged structure. These two tests translate to testing for both generalized and differentiable synchronization between responses. We show loss of synchronization of responses with damage to the structure. © 2005 American Institute of Physics.
Imputation of missing data in time series for air pollutants
NASA Astrophysics Data System (ADS)
Junger, W. L.; Ponce de Leon, A.
2015-02-01
Missing data are a major concern in epidemiological studies of the health effects of environmental air pollutants. This article presents an imputation-based method that is suitable for multivariate time series data, which uses the EM algorithm under the assumption of normal distribution. Different approaches are considered for filtering the temporal component. A simulation study was performed to assess the validity and performance of the proposed method in comparison with some frequently used methods. Simulations showed that when the amount of missing data was as low as 5%, the complete data analysis yielded satisfactory results regardless of the generating mechanism of the missing data, whereas the validity began to degenerate when the proportion of missing values exceeded 10%. The proposed imputation method exhibited good accuracy and precision in different settings with respect to the patterns of missing observations. Most of the imputations obtained valid results, even under missing not at random. The methods proposed in this study are implemented as a package called mtsdi for the statistical software system R.
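The core of EM-based imputation under a multivariate normal assumption can be sketched in a few dozen lines. The code below is a generic illustration in Python, not the interface of the mtsdi package; it ignores the temporal filtering step described in the abstract, assumes every row has at least one observed value, and uses synthetic data.

```python
import numpy as np

def em_impute(X, n_iter=50):
    """EM for a multivariate normal with missing values (NaN). Returns the
    data with conditional-mean imputations, plus the fitted mean and covariance."""
    X = np.array(X, dtype=float)
    miss = np.isnan(X)
    n, p = X.shape
    mu = np.nanmean(X, axis=0)
    filled = np.where(miss, mu, X)
    sigma = np.cov(filled, rowvar=False)
    for _ in range(n_iter):
        extra = np.zeros((p, p))          # conditional covariance of imputed parts
        for i in np.flatnonzero(miss.any(axis=1)):
            m, o = miss[i], ~miss[i]
            S_oo = sigma[np.ix_(o, o)]
            S_mo = sigma[np.ix_(m, o)]
            K = np.linalg.solve(S_oo, S_mo.T).T          # S_mo @ inv(S_oo)
            filled[i, m] = mu[m] + K @ (X[i, o] - mu[o])  # E-step
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - K @ S_mo.T
        mu = filled.mean(axis=0)                          # M-step
        d = filled - mu
        sigma = (d.T @ d + extra) / n
    return filled, mu, sigma

# Synthetic correlated "pollutant" series with about 10% of values removed at random.
rng = np.random.default_rng(7)
cov_true = np.array([[1.0, 0.9, 0.8], [0.9, 1.0, 0.85], [0.8, 0.85, 1.0]])
full = rng.multivariate_normal(np.zeros(3), cov_true, size=500)
mask = rng.random(full.shape) < 0.10
mask[mask.all(axis=1), 0] = False        # keep at least one observed value per row
X = full.copy()
X[mask] = np.nan

filled, mu_hat, sigma_hat = em_impute(X)
rmse_em = np.sqrt(np.mean((filled[mask] - full[mask]) ** 2))
mean_filled = np.where(mask, np.nanmean(X, axis=0), X)
rmse_mean = np.sqrt(np.mean((mean_filled[mask] - full[mask]) ** 2))
```

Comparing `rmse_em` with `rmse_mean` shows how much the correlation between variables contributes to the imputation relative to simply filling in column means.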
Using "big data" to optimally model hydrology and water quality across expansive regions
Roehl, E.A.; Cook, J.B.; Conrads, P.A.
2009-01-01
This paper describes a new divide and conquer approach that leverages big environmental data, utilizing all available categorical and time-series data without subjectivity, to empirically model hydrologic and water-quality behaviors across expansive regions. The approach decomposes large, intractable problems into smaller ones that are optimally solved; decomposes complex signals into behavioral components that are easier to model with "sub-models"; and employs a sequence of numerically optimizing algorithms that include time-series clustering, nonlinear, multivariate sensitivity analysis and predictive modeling using multi-layer perceptron artificial neural networks, and classification for selecting the best sub-models to make predictions at new sites. This approach has many advantages over traditional modeling approaches, including being faster and less expensive, more comprehensive in its use of available data, and more accurate in representing a system's physical processes. This paper describes the application of the approach to model groundwater levels in Florida, stream temperatures across Western Oregon and Wisconsin, and water depths in the Florida Everglades. © 2009 ASCE.
A dynamic factor model of the evaluation of the financial crisis in Turkey.
Sezgin, F; Kinay, B
2010-01-01
Factor analysis has been widely used in economics and finance in situations where a relatively large number of variables are believed to be driven by few common causes of variation. Dynamic factor analysis (DFA), which is a combination of factor and time series analysis, involves autocorrelation matrices calculated from multivariate time series. Dynamic factor models were traditionally used for constructing economic indicators, macroeconomic analysis, business cycle analysis and forecasting. In recent years, dynamic factor models have become more popular in empirical macroeconomics. They have advantages over other methods in various respects. Factor models can, for instance, cope with many variables without running into the scarce degrees of freedom problems often faced in regression-based analysis. In this study, a model which determines the effect of the global crisis on Turkey is proposed. The main aim of the paper is to analyze how several macroeconomic quantities change before the crisis develops and to decide whether a crisis can be forecast.
NASA Astrophysics Data System (ADS)
Cannon, Alex J.
2018-01-01
Most bias correction algorithms used in climatology, for example quantile mapping, are applied to univariate time series. They neglect the dependence between different variables. Those that are multivariate often correct only limited measures of joint dependence, such as Pearson or Spearman rank correlation. Here, an image processing technique designed to transfer colour information from one image to another—the N-dimensional probability density function transform—is adapted for use as a multivariate bias correction algorithm (MBCn) for climate model projections/predictions of multiple climate variables. MBCn is a multivariate generalization of quantile mapping that transfers all aspects of an observed continuous multivariate distribution to the corresponding multivariate distribution of variables from a climate model. When applied to climate model projections, changes in quantiles of each variable between the historical and projection period are also preserved. The MBCn algorithm is demonstrated on three case studies. First, the method is applied to an image processing example with characteristics that mimic a climate projection problem. Second, MBCn is used to correct a suite of 3-hourly surface meteorological variables from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) across a North American domain. Components of the Canadian Forest Fire Weather Index (FWI) System, a complicated set of multivariate indices that characterizes the risk of wildfire, are then calculated and verified against observed values. Third, MBCn is used to correct biases in the spatial dependence structure of CanRCM4 precipitation fields. Results are compared against a univariate quantile mapping algorithm, which neglects the dependence between variables, and two multivariate bias correction algorithms, each of which corrects a different form of inter-variable correlation structure. 
MBCn outperforms these alternatives, often by a large margin, particularly for annual maxima of the FWI distribution and spatiotemporal autocorrelation of precipitation fields.
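For readers unfamiliar with the univariate baseline mentioned above, empirical quantile mapping can be sketched as follows. This is a minimal illustration, not the MBCn algorithm and not the authors' code: the function names, the step-function empirical CDF and the linearly interpolated quantile are all assumptions made for the example.

```python
from bisect import bisect_right


def empirical_cdf(sorted_sample, x):
    """Fraction of sample values that are <= x (step-function CDF)."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def empirical_quantile(sorted_sample, p):
    """Linearly interpolated quantile of the sample for p in [0, 1]."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1 - frac) + sorted_sample[hi] * frac


def quantile_map(model_hist, obs_hist, value):
    """Map a model value onto the observed distribution.

    The value's non-exceedance probability is read from the historical
    model sample, then the observed value at that probability is returned.
    Each variable is treated on its own, so dependence between variables
    is ignored (the limitation the abstract points out).
    """
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(obs_hist)
    p = empirical_cdf(model_sorted, value)
    return empirical_quantile(obs_sorted, p)


# Hypothetical data: a model sample that is biased low relative to observations.
model_hist = [1.0, 2.0, 3.0, 4.0, 5.0]
obs_hist = [10.0, 20.0, 30.0, 40.0, 50.0]
corrected = [quantile_map(model_hist, obs_hist, v) for v in model_hist]
```

Because the mapping is applied variable by variable, the corrected values reproduce the observed marginal distribution but carry over whatever inter-variable dependence the model had.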
ERIC Educational Resources Information Center
Inbar-Furst, Hagit; Gumpel, Thomas P.
2015-01-01
Questionnaires were given to 392 elementary school teachers to examine help-seeking or help-avoidance in dealing with classroom behavioral problems. Scale validity was examined through a series of exploratory and confirmatory factor analyses. Using a series of multivariate regression analyses and structural equation modeling, we identified…
Causal diagrams and multivariate analysis III: confound it!
Jupiter, Daniel C
2015-01-01
This commentary concludes my series concerning inclusion of variables in multivariate analyses. We take up the issues of confounding and effect modification and summarize the work we have thus far done. Finally, we provide a rough algorithm to help guide us through the maze of possibilities that we have outlined. Copyright © 2015 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Lizier, Joseph T; Heinzle, Jakob; Horstmann, Annette; Haynes, John-Dylan; Prokopenko, Mikhail
2011-02-01
The human brain undertakes highly sophisticated information processing facilitated by the interaction between its sub-regions. We present a novel method for interregional connectivity analysis, using multivariate extensions to the mutual information and transfer entropy. The method allows us to identify the underlying directed information structure between brain regions, and how that structure changes according to behavioral conditions. This method is distinguished in using asymmetric, multivariate, information-theoretical analysis, which captures not only directional and non-linear relationships, but also collective interactions. Importantly, the method is able to estimate multivariate information measures with only relatively little data. We demonstrate the method to analyze functional magnetic resonance imaging time series to establish the directed information structure between brain regions involved in a visuo-motor tracking task. Importantly, this results in a tiered structure, with known movement planning regions driving visual and motor control regions. Also, we examine the changes in this structure as the difficulty of the tracking task is increased. We find that task difficulty modulates the coupling strength between regions of a cortical network involved in movement planning and between motor cortex and the cerebellum which is involved in the fine-tuning of motor control. It is likely these methods will find utility in identifying interregional structure (and experimentally induced changes in this structure) in other cognitive tasks and data modalities.
A Multivariate Granger Causality Concept towards Full Brain Functional Connectivity.
Schmidt, Christoph; Pester, Britta; Schmid-Hertel, Nicole; Witte, Herbert; Wismüller, Axel; Leistritz, Lutz
2016-01-01
Detecting changes of spatially high-resolution functional connectivity patterns in the brain is crucial for improving the fundamental understanding of brain function in both health and disease, yet still poses one of the biggest challenges in computational neuroscience. Currently, classical multivariate Granger Causality analyses of directed interactions between single process components in coupled systems are commonly restricted to spatially low-dimensional data, which requires a pre-selection or aggregation of time series as a preprocessing step. In this paper we propose a new fully multivariate Granger Causality approach with embedded dimension reduction that makes it possible to obtain a representation of functional connectivity for spatially high-dimensional data. The resulting functional connectivity networks may consist of several thousand vertices and thus contain more detailed information compared to connectivity networks obtained from approaches based on particular regions of interest. Our large scale Granger Causality approach is applied to synthetic and resting state fMRI data with a focus on how well network community structure, which represents a functional segmentation of the network, is preserved. It is demonstrated that a number of different community detection algorithms, which utilize a variety of algorithmic strategies and exploit topological features differently, reveal meaningful information on the underlying network module structure.
Effect of noise in principal component analysis with an application to ozone pollution
NASA Astrophysics Data System (ADS)
Tsakiri, Katerina G.
This thesis analyzes the effect of independent noise in principal components of k normally distributed random variables defined by a covariance matrix. We prove that the principal components as well as the canonical variate pairs determined from joint distribution of original sample affected by noise can be essentially different in comparison with those determined from the original sample. However when the differences between the eigenvalues of the original covariance matrix are sufficiently large compared to the level of the noise, the effect of noise in principal components and canonical variate pairs proved to be negligible. The theoretical results are supported by simulation study and examples. Moreover, we compare our results about the eigenvalues and eigenvectors in the two dimensional case with other models examined before. This theory can be applied in any field for the decomposition of the components in multivariate analysis. One application is the detection and prediction of the main atmospheric factor of ozone concentrations on the example of Albany, New York. Using daily ozone, solar radiation, temperature, wind speed and precipitation data, we determine the main atmospheric factor for the explanation and prediction of ozone concentrations. A methodology is described for the decomposition of the time series of ozone and other atmospheric variables into the global term component which describes the long term trend and the seasonal variations, and the synoptic scale component which describes the short term variations. By using the Canonical Correlation Analysis, we show that solar radiation is the only main factor between the atmospheric variables considered here for the explanation and prediction of the global and synoptic scale component of ozone. The global term components are modeled by a linear regression model, while the synoptic scale components by a vector autoregressive model and the Kalman filter. 
The coefficient of determination, R2, for the prediction of the synoptic scale ozone component was found to be the highest when we consider the synoptic scale component of the time series for solar radiation and temperature. KEY WORDS: multivariate analysis; principal component; canonical variate pairs; eigenvalue; eigenvector; ozone; solar radiation; spectral decomposition; Kalman filter; time series prediction
Daily Mean Temperature and Urolithiasis Presentation in Six Cities in Korea: Time-Series Analysis.
Chi, Byung Hoon; Chang, In Ho; Choi, Se Young; Suh, Dong Churl; Chang, Chong Won; Choi, Yun Jung; Lee, Seo Yeon
2017-06-01
Seasonal variation in urinary stone presentation is well described in the literature. However, previous studies have some limitations. To explore overall cumulative exposure-response and the heterogeneity in the relationships between daily meteorological factors and urolithiasis incidence in 6 major Korean cities, we analyzed data on 687,833 urolithiasis patients from 2009 to 2013 for 6 large cities in Korea: Seoul, Incheon, Daejeon, Gwangju, Daegu, and Busan. Using a time-series design and distributed lag nonlinear methods, we estimated the relative risk (RR) of mean daily urolithiasis incidence (MDUI) associated with mean daily meteorological factors, including the cumulative RR for a 20-day period. The estimated location-specific associations were then pooled using multivariate meta-regression models. A positive association was confirmed between MDUI and mean daily temperature (MDT), and a negative association was shown between MDUI and mean daily relative humidity (MDRH) in all cities. The lag effect was within 5 days. The multivariate Cochran Q test for heterogeneity at MDT was 12.35 (P = 0.136), and the related I² statistic accounted for 35.2% of the variability. Additionally, the Cochran Q test for heterogeneity and I² statistic at MDRH were 26.73 (P value = 0.148) and 24.7% of variability in the total group. Association was confirmed between daily temperature, relative humidity and urolithiasis incidence, and the differences in urolithiasis incidence might have been partially attributable to the different frequencies and the ranges in temperature and humidity between cities in Korea. © 2017 The Korean Academy of Medical Sciences.
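The relationship between Cochran's Q and the I² statistic quoted above can be illustrated with a short sketch. The formula I² = max(0, (Q − df) / Q) is the standard definition; the degrees of freedom used below (df = 8) are not stated in the abstract and are an assumption, chosen only because they reproduce the reported 35.2% for the temperature result.

```python
def i_squared(q, df):
    """I² heterogeneity statistic, as a percentage, from Cochran's Q.

    I² = max(0, (Q - df) / Q) * 100, where df is the degrees of freedom
    of the Q test. Returns 0 when Q is not positive.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Q = 12.35 is taken from the abstract; df = 8 is an assumed value.
i2_temperature = i_squared(12.35, 8)
```

Under that assumption, `round(i2_temperature, 1)` gives 35.2, matching the figure reported for mean daily temperature.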
NASA Astrophysics Data System (ADS)
Adarsh, S.; Reddy, M. Janga
2017-07-01
In this paper, the Hilbert-Huang transform (HHT) approach is used for the multiscale characterization of All India Summer Monsoon Rainfall (AISMR) time series and monsoon rainfall time series from five homogeneous regions in India. The study employs the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) for multiscale decomposition of monsoon rainfall in India and uses the Normalized Hilbert Transform and Direct Quadrature (NHT-DQ) scheme for the time-frequency characterization. The cross-correlation analysis between orthogonal modes of All India monthly monsoon rainfall time series and that of five climate indices such as Quasi Biennial Oscillation (QBO), El Niño Southern Oscillation (ENSO), Sunspot Number (SN), Atlantic Multi Decadal Oscillation (AMO), and Equatorial Indian Ocean Oscillation (EQUINOO) in the time domain showed that the links of different climate indices with monsoon rainfall are expressed well only for few low-frequency modes and for the trend component. Furthermore, this paper investigated the hydro-climatic teleconnection of ISMR in multiple time scales using the HHT-based running correlation analysis technique called time-dependent intrinsic correlation (TDIC). The results showed that both the strength and nature of association between different climate indices and ISMR vary with time scale. Stemming from this finding, a methodology employing Multivariate extension of EMD and Stepwise Linear Regression (MEMD-SLR) is proposed for prediction of monsoon rainfall in India. The proposed MEMD-SLR method clearly exhibited superior performance over the IMD operational forecast, M5 Model Tree (MT), and multiple linear regression methods in ISMR predictions and displayed excellent predictive skill during 1989-2012 including the four extreme events that have occurred during this period.
Björklund, M; Gustafsson, L
2017-07-01
Understanding the magnitude and long-term patterns of selection in natural populations is of importance, for example, when analysing the evolutionary impact of climate change. We estimated univariate and multivariate directional, quadratic and correlational selection on four morphological traits (adult wing, tarsus and tail length, body mass) over a time period of 33 years (≈ 19 000 observations) in a nest-box breeding population of collared flycatchers (Ficedula albicollis). In general, selection was weak in both males and females over the years regardless of fitness measure (fledged young, recruits and survival) with only few cases with statistically significant selection. When data were analysed in a multivariate context and as time series, a number of patterns emerged; there was a consistent, but weak, selection for longer wings in both sexes, selection was stronger on females when the number of fledged young was used as a fitness measure, there were no indications of sexually antagonistic selection, and we found a negative correlation between selection on tarsus and wing length in both sexes but using different fitness measures. Uni- and multivariate selection gradients were correlated only for wing length and mass. Multivariate selection gradient vectors were longer than corresponding vector of univariate gradients and had more constrained direction. Correlational selection had little importance. Overall, the fitness surface was more or less flat with few cases of significant curvature, indicating that the adaptive peak with regard to body size in this species is broader than the phenotypic distribution, which has resulted in weak estimates of selection. © 2017 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2017 European Society For Evolutionary Biology.
Patel, Ameera X; Bullmore, Edward T
2016-11-15
Connectome mapping using techniques such as functional magnetic resonance imaging (fMRI) has become a focus of systems neuroscience. There remain many statistical challenges in analysis of functional connectivity and network architecture from BOLD fMRI multivariate time series. One key statistic for any time series is its (effective) degrees of freedom, df, which will generally be less than the number of time points (or nominal degrees of freedom, N). If we know the df, then probabilistic inference on other fMRI statistics, such as the correlation between two voxel or regional time series, is feasible. However, we currently lack good estimators of df in fMRI time series, especially after the degrees of freedom of the "raw" data have been modified substantially by denoising algorithms for head movement. Here, we used a wavelet-based method both to denoise fMRI data and to estimate the (effective) df of the denoised process. We show that seed voxel correlations corrected for locally variable df could be tested for false positive connectivity with better control over Type I error and greater specificity of anatomical mapping than probabilistic connectivity maps using the nominal degrees of freedom. We also show that wavelet despiked statistics can be used to estimate all pairwise correlations between a set of regional nodes, assign a P value to each edge, and then iteratively add edges to the graph in order of increasing P. These probabilistically thresholded graphs are likely more robust to regional variation in head movement effects than comparable graphs constructed by thresholding correlations. Finally, we show that time-windowed estimates of df can be used for probabilistic connectivity testing or dynamic network analysis so that apparent changes in the functional connectome are appropriately corrected for the effects of transient noise bursts. 
Wavelet despiking is both an algorithm for fMRI time series denoising and an estimator of the (effective) df of denoised fMRI time series. Accurate estimation of df offers many potential advantages for probabilistically thresholding functional connectivity and network statistics tested in the context of spatially variant and non-stationary noise. Code for wavelet despiking, seed correlational testing and probabilistic graph construction is freely available to download as part of the BrainWavelet Toolbox at www.brainwavelet.org. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
De Lena, M; Barletta, A; Marzullo, F; Rabinovich, M; Leone, B; Vallejo, C; Machiavelli, M; Romero, A; Perez, J; Lacava, J; Cuevas, M A; Rodriguez, R; Schittulli, F; Paradisco, A
1996-01-01
The presence of early metastases to distant sites in breast cancer patients is an infrequent event whose mechanisms are still not clear. The aim of this study was to evaluate the biologic and clinical role of DNA ploidy and cell nuclear grade of primary tumors in the metastatic process of a series of stage IV previously untreated breast cancer patients with only visceral metastases. DNA flow cytometry analysis on paraffin-embedded material and cell nuclear grading of primary tumors was performed on a series of 50 breast cancer patients with only visceral metastases at the time of initial diagnosis. Aneuploidy was found in 28/46 (61%) of evaluable cases and was independent of site of involvement, clinical response, time of progression and overall survival of patients. Of the 46 cases evaluable for nuclear grade, 5 (11%), 16 (35%) and 25 (54%) were classified as G1 (well-differentiated) G2 and G3, respectively. Nuclear grade also was unrelated to response to therapy and overall survival, whereas time to progression was significantly longer in G1-2 than G3 tumors with the logrank test (P < 0.03) and multivariate analysis. Our results seem to stress the difficulty to individualize different prognostic subsets from a series of breast cancer patients with only visceral metastases at initial diagnosis according to DNA flow cytometry and nuclear grade.
Visibility in the topology of complex networks
NASA Astrophysics Data System (ADS)
Tsiotas, Dimitrios; Charakopoulos, Avraam
2018-09-01
Taking its inspiration from the visibility algorithm, which was proposed by Lacasa et al. (2008) to convert a time-series into a complex network, this paper develops and proposes a novel expansion of this algorithm that allows generating a visibility graph from a complex network, rather than only from a time-series as is currently possible. The purpose of this approach is to apply the idea of visibility from the field of time-series to complex networks in order to interpret the network topology as a landscape. Visibility in complex networks is a multivariate property producing an associated visibility graph that maps the ability of a node "to see" other nodes in the network that lie beyond the range of its neighborhood, in terms of a control-attribute. Within this context, this paper examines the visibility topology produced by connectivity (degree) in comparison with the original (source) network, in order to detect what patterns or forces describe the mechanism under which a network is converted to a visibility graph. The overall analysis shows that visibility is a property that increases the connectivity in networks, that it may contribute to pattern recognition (including the detection of scale-free topology), and that it is worth applying to complex networks in order to reveal the potential of signal processing beyond the range of its neighborhood. Generally, this paper promotes interdisciplinary research in complex networks providing new insights to network science.
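The original time-series visibility algorithm that this paper builds on can be sketched briefly. The code below is a minimal illustration of the natural visibility criterion for an evenly sampled series, not the network extension proposed in the paper; the function name and the brute-force loop over all pairs are assumptions made for clarity.

```python
def visibility_graph(series):
    """Return the edge set of the natural visibility graph of a series.

    Samples a and b (a < b) are linked when every sample c between them
    lies strictly below the straight line joining (a, y_a) and (b, y_b):
        y_c < y_b + (y_a - y_b) * (b - c) / (b - a)
    Neighbouring samples are always linked because nothing lies between them.
    """
    n = len(series)
    edges = set()
    for a in range(n - 1):
        for b in range(a + 1, n):
            ya, yb = series[a], series[b]
            visible = all(
                series[c] < yb + (ya - yb) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


# Hypothetical three-point series: a valley lets the end points see each other,
# while a peak in the middle blocks the view.
valley_edges = visibility_graph([3.0, 1.0, 2.0])
peak_edges = visibility_graph([1.0, 3.0, 1.0])
```

The brute-force check costs on the order of n³ comparisons in the worst case, which is acceptable for short illustrative series but not for long records.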
NASA Astrophysics Data System (ADS)
Pham, M. T.; Vanhaute, W. J.; Vandenberghe, S.; De Baets, B.; Verhoest, N. E. C.
2013-12-01
Of all natural disasters, the economic and environmental consequences of droughts are among the highest because of their longevity and widespread spatial extent. Because of their extreme behaviour, studying droughts generally requires long time series of historical climate data. Rainfall is a very important variable for calculating drought statistics, for quantifying historical droughts or for assessing the impact on other hydrological (e.g. water stage in rivers) or agricultural (e.g. irrigation requirements) variables. Unfortunately, time series of historical observations are often too short for such assessments. To circumvent this, one may rely on the synthetic rainfall time series from stochastic point process rainfall models, such as Bartlett-Lewis models. The present study investigates whether drought statistics are preserved when simulating rainfall with Bartlett-Lewis models. Therefore, a 105 yr 10 min rainfall time series obtained at Uccle, Belgium is used as a test case. First, drought events were identified on the basis of the Effective Drought Index (EDI), and each event was characterized by two variables, i.e. drought duration (D) and drought severity (S). As both parameters are interdependent, a multivariate distribution function, which makes use of a copula, was fitted. Based on the copula, four types of drought return periods are calculated for observed as well as simulated droughts and are used to evaluate the ability of the rainfall models to simulate drought events with the appropriate characteristics. Overall, all Bartlett-Lewis model types studied fail to preserve extreme drought statistics, which is attributed to the model structure and to the model stationarity caused by maintaining the same parameter set during the whole simulation period.
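How a copula turns the marginal distributions of duration and severity into joint return periods can be sketched as follows. The abstract names neither the copula family nor the four return-period types, so everything specific here is an assumption: the Gumbel-Hougaard copula is used purely as an example, and only the two widely used "AND" (both thresholds exceeded) and "OR" (either threshold exceeded) definitions are shown.

```python
from math import exp, log


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta = 1 corresponds to independence."""
    if not (0.0 < u < 1.0 and 0.0 < v < 1.0):
        raise ValueError("u and v must lie strictly between 0 and 1")
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    s = (-log(u)) ** theta + (-log(v)) ** theta
    return exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, theta, mean_interarrival):
    """Return (T_and, T_or) for marginal probabilities u = F_D(d), v = F_S(s).

    T_and = mu / (1 - u - v + C(u, v))   both D > d and S > s
    T_or  = mu / (1 - C(u, v))           D > d or S > s
    where mu is the mean time between drought events.
    """
    c = gumbel_copula(u, v, theta)
    t_and = mean_interarrival / (1.0 - u - v + c)
    t_or = mean_interarrival / (1.0 - c)
    return t_and, t_or


# Hypothetical inputs: 90th-percentile duration and severity, independent
# margins (theta = 1), and one drought event per year on average.
t_and, t_or = joint_return_periods(0.9, 0.9, theta=1.0, mean_interarrival=1.0)
```

The "OR" return period is never longer than the "AND" return period, since exceeding either threshold is at least as likely as exceeding both.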
NASA Astrophysics Data System (ADS)
Pham, M. T.; Vanhaute, W. J.; Vandenberghe, S.; De Baets, B.; Verhoest, N. E. C.
2013-06-01
Of all natural disasters, the economic and environmental consequences of droughts are among the highest because of their longevity and widespread spatial extent. Because of their extreme behaviour, studying droughts generally requires long time series of historical climate data. Rainfall is a very important variable for calculating drought statistics, for quantifying historical droughts or for assessing the impact on other hydrological (e.g. water stage in rivers) or agricultural (e.g. irrigation requirements) variables. Unfortunately, time series of historical observations are often too short for such assessments. To circumvent this, one may rely on the synthetic rainfall time series from stochastic point process rainfall models, such as Bartlett-Lewis models. The present study investigates whether drought statistics are preserved when simulating rainfall with Bartlett-Lewis models. Therefore, a 105 yr 10 min rainfall time series obtained at Uccle, Belgium is used as a test case. First, drought events were identified on the basis of the Effective Drought Index (EDI), and each event was characterized by two variables, i.e. drought duration (D) and drought severity (S). As both parameters are interdependent, a multivariate distribution function, which makes use of a copula, was fitted. Based on the copula, four types of drought return periods are calculated for observed as well as simulated droughts and are used to evaluate the ability of the rainfall models to simulate drought events with the appropriate characteristics. Overall, all Bartlett-Lewis type models studied fail to preserve extreme drought statistics, which is attributed to the model structure and to the model stationarity caused by maintaining the same parameter set during the whole simulation period.
NASA Astrophysics Data System (ADS)
Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan
2017-11-01
Precipitation plays an important role in determining the climate of a region. Precise estimation of precipitation is required to manage and plan water resources, as well as other related applications such as hydrology, climatology, meteorology and agriculture. Time series of hydrologic variables such as precipitation are composed of deterministic and stochastic parts. Despite this fact, the stochastic part of the precipitation data is not usually considered in modeling of precipitation process. As an innovation, the present study introduces three new hybrid models by integrating soft computing methods including multivariate adaptive regression splines (MARS), Bayesian networks (BN) and gene expression programming (GEP) with a time series model, namely generalized autoregressive conditional heteroscedasticity (GARCH) for modeling of the monthly precipitation. For this purpose, the deterministic (obtained by soft computing methods) and stochastic (obtained by GARCH time series model) parts are combined with each other. To carry out this research, monthly precipitation data of Babolsar, Bandar Anzali, Gorgan, Ramsar, Tehran and Urmia stations with different climates in Iran were used during the period of 1965-2014. Root mean square error (RMSE), relative root mean square error (RRMSE), mean absolute error (MAE) and determination coefficient (R2) were employed to evaluate the performance of conventional/single MARS, BN and GEP, as well as the proposed MARS-GARCH, BN-GARCH and GEP-GARCH hybrid models. It was found that the proposed novel models are more precise than single MARS, BN and GEP models. Overall, MARS-GARCH and BN-GARCH models yielded better accuracy than GEP-GARCH. The results of the present study confirmed the suitability of proposed methodology for precise modeling of precipitation.
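The four evaluation measures named above can be written out in a few lines. This is a sketch under stated assumptions rather than the authors' definitions: RRMSE is taken here as RMSE divided by the mean of the observations, and R2 as 1 − SS_res / SS_tot; both conventions vary between papers, and the abstract does not say which were used.

```python
from math import sqrt


def rmse(obs, pred):
    """Root mean square error."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def rrmse(obs, pred):
    """Relative RMSE: RMSE divided by the mean observation (assumed convention)."""
    return rmse(obs, pred) / (sum(obs) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Determination coefficient as 1 - SS_res / SS_tot (assumed convention)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly values: predictions biased high by one unit.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
scores = (rmse(observed, predicted), rrmse(observed, predicted),
          mae(observed, predicted), r2(observed, predicted))
```

With this definition R2 can be negative when predictions are worse than simply using the observed mean, which is one reason some studies report the squared correlation coefficient instead.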
Chen, Zewei; Zhang, Xin; Zhang, Zhuoyong
2016-12-01
Timely risk assessment of chronic kidney disease (CKD) and proper community-based CKD monitoring are important to prevent patients with potential risk from further kidney injuries. As many symptoms are associated with the progressive development of CKD, evaluating risk of CKD through a set of clinical data of symptoms coupled with multivariate models can be considered as an available method for prevention of CKD and would be useful for community-based CKD monitoring. Three commonly used multivariate models, i.e., K-nearest neighbor (KNN), support vector machine (SVM), and soft independent modeling of class analogy (SIMCA), were used to evaluate risk of 386 patients based on a series of clinical data taken from the UCI machine learning repository. Different types of composite data, in which proportional disturbances were added to simulate measurement deviations caused by environment and instrument noises, were also utilized to evaluate the feasibility and robustness of these models in risk assessment of CKD. For the original data set, the three multivariate models can differentiate patients with CKD and non-CKD with overall accuracies over 93%. KNN and SVM perform better than SIMCA in this study. For the composite data set, the SVM model has the best ability to tolerate noise disturbance and thus is more robust than the other two models. Using a clinical data set on symptoms coupled with multivariate models has been shown to be a feasible approach for assessment of patients with potential CKD risk. The SVM model can be used as a useful and robust tool in this study.
Lü, Yiran; Hao, Shuxin; Zhang, Guoqing; Liu, Jie; Liu, Yue; Xu, Dongqun
2018-01-01
The aim was to implement an online statistical analysis function in the information system for air pollution and health impact monitoring, and to obtain data analysis results in real time. Descriptive statistics, time-series analysis and multivariate regression analysis were implemented online on top of database software, using SQL and visualization tools. The system generates basic statistical tables and summary tables of air pollution exposure and health impact data online; generates trend charts for each part of the data online, with an interactive connection to the database; and generates interface sheets online that can be imported directly into R, SAS and SPSS. The information system for air pollution and health impact monitoring thus implements the statistical analysis function online and can provide real-time analysis results to its users.
Robust Nonlinear Causality Analysis of Nonstationary Multivariate Physiological Time Series.
Schack, Tim; Muma, Michael; Feng, Mengling; Guan, Cuntai; Zoubir, Abdelhak M
2018-06-01
An important research area in biomedical signal processing is that of quantifying the relationship between simultaneously observed time series and revealing interactions between the signals. Since biomedical signals are potentially nonstationary and the measurements may contain outliers and artifacts, we introduce a robust time-varying generalized partial directed coherence (rTV-gPDC) function. The proposed method, which is based on a robust estimator of the time-varying autoregressive (TVAR) parameters, is capable of revealing directed interactions between signals. By definition, the rTV-gPDC only displays the linear relationships between the signals. We therefore suggest approximating the residuals of the TVAR process, which potentially carry information about the nonlinear causality, by a piece-wise linear time-varying moving-average model. The performance of the proposed method is assessed via extensive simulations. To illustrate the method's applicability to real-world problems, it is applied to a neurophysiological study that involves intracranial pressure, arterial blood pressure, and brain tissue oxygenation level (PtiO2) measurements. The rTV-gPDC reveals causal patterns that are in accordance with expected cardiosudoral mechanisms and potentially provides new insights regarding traumatic brain injuries. The rTV-gPDC is not restricted to the above problem but can be useful in revealing interactions in a broad range of applications.
Economics of Gypsum Production in Iran
NASA Astrophysics Data System (ADS)
Esmaeili, Abdoulkarim
The purpose of this research is to analyze the economics of gypsum production in Iran. The trends in production cost, selling price and profit are used to investigate the economics of gypsum production. In addition, the multivariate time series method is used to determine factors affecting the gypsum price in the domestic market. The results indicated that, due to increases in production and inflation, the profitability of gypsum production has decreased during recent years. It is concluded that tariff and non-tariff barriers on mining machinery are among the reasons for increasing production cost in Iranian gypsum mines. Decreasing such barriers could increase the profitability of gypsum production in Iran.
PERIODIC AUTOREGRESSIVE-MOVING AVERAGE (PARMA) MODELING WITH APPLICATIONS TO WATER RESOURCES.
Vecchia, A.V.
1985-01-01
Results involving correlation properties and parameter estimation for autoregressive-moving average models with periodic parameters are presented. A multivariate representation of the PARMA model is used to derive parameter space restrictions and difference equations for the periodic autocorrelations. Close approximation to the likelihood function for Gaussian PARMA processes results in efficient maximum-likelihood estimation procedures. Terms in the Fourier expansion of the parameters are sequentially included, and a selection criterion is given for determining the optimal number of harmonics to be included. Application of the techniques is demonstrated through analysis of a monthly streamflow time series.
Vedeld, Hege Marie; Merok, Marianne; Jeanmougin, Marine; Danielsen, Stine A.; Honne, Hilde; Presthus, Gro Kummeneje; Svindland, Aud; Sjo, Ole H.; Hektoen, Merete; Eknæs, Mette; Nesbakken, Arild; Lothe, Ragnhild A.
2017-01-01
The prognostic value of CpG island methylator phenotype (CIMP) in colorectal cancer remains unsettled. We aimed to assess the prognostic value of this phenotype analyzing a total of 1126 tumor samples obtained from two Norwegian consecutive colorectal cancer series. CIMP status was determined by analyzing the 5‐markers CACNA1G, IGF2, NEUROG1, RUNX3 and SOCS1 by quantitative methylation specific PCR (qMSP). The effect of CIMP on time to recurrence (TTR) and overall survival (OS) was determined by uni‐ and multivariate analyses. Subgroup analyses were conducted according to MSI and BRAF mutation status, disease stage, and also age at time of diagnosis (<60, 60‐74, ≥75 years). Patients with CIMP positive tumors demonstrated significantly shorter TTR and worse OS compared to those with CIMP negative tumors (multivariate hazard ratio [95% CI] 1.86 [1.31‐2.63] and 1.89 [1.34‐2.65], respectively). In stratified analyses, CIMP tumors showed significantly worse outcome among patients with microsatellite stable (MSS, P < 0.001), and MSS BRAF mutated tumors (P < 0.001), a finding that persisted in patients with stage II, III or IV disease, and that remained significant in multivariate analysis (P < 0.01). Consistent results were found for all three age groups. To conclude, CIMP is significantly associated with inferior outcome for colorectal cancer patients, and can stratify the poor prognostic patients with MSS BRAF mutated tumors. PMID:28542846
Tourism demand in the Algarve region: Evolution and forecast using SVARMA models
NASA Astrophysics Data System (ADS)
Lopes, Isabel Cristina; Soares, Filomena; Silva, Eliana Costa e.
2017-06-01
Tourism is one of the Portuguese economy's key sectors, and its relative weight has grown over recent years. The Algarve region is particularly focused on attracting foreign tourists and has built over the years a large offer of diversified hotel units. In this paper we present a multivariate time series approach to forecast the number of overnight stays in hotel units (hotels, guesthouses or hostels, and tourist apartments) in the Algarve. We fit a seasonal vector autoregressive moving average (SVARMA) model to monthly data between 2006 and 2016. The forecast values were compared with the actual values of the overnight stays in the Algarve in 2016 and led to a MAPE of 15.1% and RMSE = 53847.28. The MAPE for the Hotel series was merely 4.56%. These forecast values can be used by a hotel manager to predict their occupancy and to determine the best pricing policy.
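The two accuracy measures quoted in the abstract above (MAPE and RMSE) can be computed as in the following minimal sketch. The function names and the three-point series are illustrative assumptions; they are not the Algarve data and not the authors' code.

```python
import math

def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

def rmse(actual, forecast):
    """Root mean squared error, in the units of the series itself."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical monthly values, used only to exercise the functions.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print("MAPE (%):", mape(actual, forecast))
print("RMSE:", rmse(actual, forecast))
```

MAPE is scale-free, whereas RMSE is expressed in the units of the series (here, overnight stays), so a large RMSE can coexist with a moderate MAPE when the series values themselves are large.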
The incidence of total hip arthroplasty after hip arthroscopy in osteoarthritic patients
2010-01-01
Objective: To assess the incidence of total hip arthroplasty (THA) in osteoarthritic patients who were treated by arthroscopic debridement and to evaluate factors that might influence the time interval from the first hip arthroscopy to THA. Design: Retrospective clinical series. Methods: Follow-up data and surgical reports were retrieved from 564 records of osteoarthritic patients who had hip arthroscopy between 2002 and 2009, with a mean follow-up time of 3.2 years (range, 1-6.4 years). The time interval between the first hip arthroscopy and THA was modelled as a function of patient age, level of cartilage damage, procedures performed and repeated arthroscopies with the use of multivariate regression analysis. Results: Ninety (16%) of all participants eventually required THA. The waiting time from the first arthroscopy to a hip replacement was found to be longer in patients younger than 55 years and in a milder osteoarthritic stage. Patients who underwent repeated hip scopes had a longer time to THA than those with only a single procedure. Procedures performed concomitant with debridement and lavage did not affect the time interval to THA. Conclusions: In our series of arthroscopic treatment of hip osteoarthritis, 16% required THA over a period of 7 years. Factors that influence the time to arthroplasty were age, degree of osteoarthritis and recurrent procedures. PMID:20670440
Chadsuthi, Sudarat; Modchang, Charin; Lenbury, Yongwimon; Iamsirithaworn, Sopon; Triampo, Wannapong
2012-07-01
To study the number of leptospirosis cases in relation to the seasonal pattern, and its association with climate factors. Time series analysis was used to study the time variations in the number of leptospirosis cases. The Autoregressive Integrated Moving Average (ARIMA) model was used in data curve fitting and predicting the next leptospirosis cases. We found that the amount of rainfall was correlated to leptospirosis cases in both regions of interest, namely the northern and northeastern regions of Thailand, while the temperature played a role in the northeastern region only. The use of a multivariate ARIMA (ARIMAX) model showed that factoring in rainfall (with an 8-month lag) yields the best model for the northern region, while the model that factors in rainfall (with a 10-month lag) and temperature (with an 8-month lag) was the best for the northeastern region. The models are able to show the trend in leptospirosis cases and closely fit the recorded data in both regions. The models can also be used to predict the next seasonal peak quite accurately. Copyright © 2012 Hainan Medical College. Published by Elsevier B.V. All rights reserved.
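The idea of a lagged climate covariate, as in the abstract above, can be illustrated with a small lag search. This is only a sketch under strong simplifications: it regresses cases on a single lagged regressor by ordinary least squares and omits the ARIMA error structure of a real ARIMAX model. All names and the synthetic series are assumptions made for illustration.

```python
def lagged_pairs(y, x, lag):
    """Pair y[t] with x[t - lag] for every t where both values exist."""
    return [(x[t - lag], y[t]) for t in range(lag, len(y))]

def fit_line(pairs):
    """Ordinary least squares for y = a + b * x; returns (a, b, sse)."""
    n = len(pairs)
    mx = sum(px for px, _ in pairs) / n
    my = sum(py for _, py in pairs) / n
    sxx = sum((px - mx) ** 2 for px, _ in pairs)
    sxy = sum((px - mx) * (py - my) for px, py in pairs)
    b = sxy / sxx
    a = my - b * mx
    sse = sum((py - (a + b * px)) ** 2 for px, py in pairs)
    return a, b, sse

def best_lag(y, x, max_lag):
    """Return the lag in 0..max_lag with the smallest mean squared residual."""
    scores = {}
    for lag in range(max_lag + 1):
        pairs = lagged_pairs(y, x, lag)
        _, _, sse = fit_line(pairs)
        scores[lag] = sse / len(pairs)
    return min(scores, key=scores.get)

# Synthetic example: "cases" follow "rain" with a delay of 3 time steps.
rain = [5, 1, 7, 3, 9, 2, 8, 4, 6, 0, 7, 3, 5, 9, 1, 6]
cases = [0, 0, 0] + [2 * r + 1 for r in rain[:-3]]
print("selected lag:", best_lag(cases, rain, 5))
```

In practice the lag would be chosen with an information criterion on the full model rather than on the raw regression error, since longer lags leave fewer observations to fit.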
NASA Astrophysics Data System (ADS)
von Larcher, Thomas; Harlander, Uwe; Alexandrov, Kiril; Wang, Yongtai
2010-05-01
Experiments on baroclinic wave instabilities in a rotating cylindrical gap have long been performed, e.g., to uncover regular waves of different zonal wave number, to better understand the transition to the quasi-chaotic regime, and to reveal the underlying dynamical processes of complex wave flows. We present the application of appropriate multivariate data analysis methods to time series data sets acquired by non-intrusive measurement techniques of quite different nature. While highly accurate Laser-Doppler-Velocimetry (LDV) is used for measurements of the radial velocity component at equidistant azimuthal positions, a highly sensitive thermographic camera measures the surface temperature field. The measurements are performed at particular parameter points, where our former studies show that kinds of complex wave patterns occur [1, 2]. Owing to the particular measurement techniques, the temperature data set has much more information content than the velocity data set. Both sets of time series data are analyzed by using multivariate statistical techniques. While the LDV data sets are studied by applying Multi-Channel Singular Spectrum Analysis (M-SSA), the temperature data sets are analyzed by applying Empirical Orthogonal Functions (EOF). Our goal is (a) to verify the results obtained from the analysis of the velocity data and (b) to compare the data analysis methods. Therefore, the temperature data are processed so as to become comparable to the LDV data, i.e. the size of the data set is reduced as if the temperature measurements had been performed at equidistant azimuthal positions only. This approach initially results in a great loss of information, but applying the M-SSA to the reduced temperature data sets enables us to compare the methods. [1] Th. von Larcher and C. Egbers, Experiments on transitions of baroclinic waves in a differentially heated rotating annulus, Nonlinear Processes in Geophysics, 2005, 12, 1033-1041, NPG Print: ISSN 1023-5809, NPG Online: ISSN 1607-7946 [2] U. Harlander, Th. von Larcher, Y. Wang and C. Egbers, PIV- and LDV-measurements of baroclinic wave interactions in a thermally driven rotating annulus, Experiments in Fluids, 2009, DOI: 10.1007/s00348-009-0792-5
Enhancements of Bayesian Blocks; Application to Large Light Curve Databases
NASA Technical Reports Server (NTRS)
Scargle, Jeff
2015-01-01
Bayesian Blocks are optimal piecewise linear representations (step function fits) of light-curves. The simple algorithm implementing this idea, using dynamic programming, has been extended to include more data modes and fitness metrics, multivariate analysis, and data on the circle (Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations, Scargle, Norris, Jackson and Chiang 2013, ApJ, 764, 167), as well as new results on background subtraction and refinement of the procedure for precise timing of transient events in sparse data. Example demonstrations will include exploratory analysis of the Kepler light curve archive in a search for "star-tickling" signals from extraterrestrial civilizations. (The Cepheid Galactic Internet, Learned, Kudritzki, Pakvasa, and Zee, 2008, arXiv:0809.0339; Walkowicz et al., in progress).
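The dynamic-programming idea behind a step-function fit can be sketched as follows. Note the substitution: this sketch uses a plain squared-error cost with a fixed penalty per block as a stand-in, not the Bayesian fitness functions or priors of Scargle et al. (2013). The function name, the penalty value and the toy light curve are assumptions.

```python
def segment(values, penalty):
    """Optimal piecewise-constant segmentation by dynamic programming.

    Minimises the within-block squared error plus `penalty` per block
    and returns the start index of each block.
    """
    n = len(values)
    s1 = [0.0] * (n + 1)  # prefix sums of the values
    s2 = [0.0] * (n + 1)  # prefix sums of the squared values
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i, j):
        """Squared error of values[i:j] around its own mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [float("inf")] * n  # best[j]: optimal cost of values[:j]
    last = [0] * (n + 1)               # last[j]: start of final block of values[:j]
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j], last[j] = candidate, i

    # Walk back through the stored block starts.
    starts = []
    j = n
    while j > 0:
        starts.append(last[j])
        j = last[j]
    return starts[::-1]

# Toy light curve with one brightening episode (hypothetical values).
flux = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 4.9, 5.1, 1.0, 0.9]
print(segment(flux, penalty=1.0))
```

The double loop makes the search quadratic in the number of points; the penalty plays the role that the prior on the number of blocks plays in the Bayesian formulation.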
Aboagye-Sarfo, Patrick; Mai, Qun; Sanfilippo, Frank M; Preen, David B; Stewart, Louise M; Fatovich, Daniel M
2015-10-01
To develop multivariate vector-ARMA (VARMA) forecast models for predicting emergency department (ED) demand in Western Australia (WA) and compare them to the benchmark univariate autoregressive moving average (ARMA) and Winters' models. Seven-year monthly WA state-wide public hospital ED presentation data from 2006/07 to 2012/13 were modelled. Graphical and VARMA modelling methods were used for descriptive analysis and model fitting. The VARMA models were compared to the benchmark univariate ARMA and Winters' models to determine their accuracy to predict ED demand. The best models were evaluated by using error correction methods for accuracy. Descriptive analysis of all the dependent variables showed an increasing pattern of ED use with seasonal trends over time. The VARMA models provided a more precise and accurate forecast with smaller confidence intervals and better measures of accuracy in predicting ED demand in WA than the ARMA and Winters' method. VARMA models are a reliable forecasting method to predict ED demand for strategic planning and resource allocation. While the ARMA models are a closely competing alternative, they under-estimated future ED demand. Copyright © 2015 Elsevier Inc. All rights reserved.
Lunisolar tidal waves, geomagnetic activity and epilepsy in the light of multivariate coherence.
Mikulecky, M; Moravcikova, C; Czanner, S
1996-08-01
The computed daily values of lunisolar tidal waves, the observed daily values of Ap index, a measure of the planetary geomagnetic activity, and the daily numbers of patients with epileptic attacks for a group of 28 neurology patients between 1987 and 1992 were analyzed by common, multiple and partial cross-spectral analysis to search for relationships between periodicities in these time series. Significant common and multiple coherence between them was found for rhythms with a period length over 3-4 months, in agreement with seasonal variations of all three variables. If, however, the coherence between tides and epilepsy was studied excluding the influence of geomagnetism, two joint infradian periodicities with period lengths of 8.5 and 10.7 days became significant. On the other hand, there were no joint rhythms for geomagnetism and epilepsy when the influence of tidal waves was excluded. The result suggests a more primary role of gravitation, compared with geomagnetism, in the multivariate process studied.
Ecological forecasting in the presence of abrupt regime shifts
NASA Astrophysics Data System (ADS)
Dippner, Joachim W.; Kröncke, Ingrid
2015-10-01
Regime shifts may cause an intrinsic decrease in the potential predictability of marine ecosystems. In such cases, forecasts of biological variables fail. To improve prediction of long-term variability in environmental variables, we constructed a multivariate climate index and applied it to forecast ecological time series. The concept is demonstrated herein using climate and macrozoobenthos data from the southern North Sea. Special emphasis is given to the influence of selection of length of fitting period to the quality of forecast skill especially in the presence of regime shifts. Our results indicate that the performance of multivariate predictors in biological forecasts is much better than that of single large-scale climate indices, especially in the presence of regime shifts. The approach used to develop the index is generally applicable to all geographical regions in the world and to all areas of marine biology, from the species level up to biodiversity. Such forecasts are of vital interest for practical aspects of the sustainable management of marine ecosystems and the conservation of ecosystem goods and services.
State-Space Analysis of Granger-Geweke Causality Measures with Application to fMRI.
Solo, Victor
2016-05-01
The recent interest in the dynamics of networks and the advent, across a range of applications, of measuring modalities that operate on different temporal scales have put the spotlight on some significant gaps in the theory of multivariate time series. Fundamental to the description of network dynamics is the direction of interaction between nodes, accompanied by a measure of the strength of such interactions. Granger causality and its associated frequency domain strength measures (GEMs) (due to Geweke) provide a framework for the formulation and analysis of these issues. In pursuing this setup, three significant unresolved issues emerge. First, computing GEMs involves computing submodels of vector time series models, for which reliable methods do not exist. Second, the impact of filtering on GEMs has never been definitively established. Third, the impact of downsampling on GEMs has never been established. In this work, using state-space methods, we resolve all these issues and illustrate the results with some simulations. Our analysis is motivated by some problems in (fMRI) brain imaging, to which we apply it, but it is of general applicability.
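For orientation, the time-domain strength measure that underlies Granger causality can be sketched as the log ratio of residual variances from a restricted and a full autoregression. This is only a one-lag, least-squares illustration; it does not reproduce the state-space computation of frequency-domain GEMs described in the abstract. The function names and simulated series are assumptions.

```python
import math
import random

def centered(series):
    """Subtract the sample mean so that no intercept term is needed."""
    mean = sum(series) / len(series)
    return [v - mean for v in series]

def granger_strength(x, y):
    """ln(restricted residual sum of squares / full one) for x -> y, one lag."""
    x, y = centered(x), centered(y)
    target, ylag, xlag = y[1:], y[:-1], x[:-1]
    syy = sum(v * v for v in ylag)
    sxx = sum(v * v for v in xlag)
    sxy = sum(u * v for u, v in zip(xlag, ylag))
    ty = sum(t * v for t, v in zip(target, ylag))
    tx = sum(t * v for t, v in zip(target, xlag))
    # Restricted model: y[t] = a * y[t-1] + error.
    a = ty / syy
    rss_r = sum((t - a * v) ** 2 for t, v in zip(target, ylag))
    # Full model: y[t] = a2 * y[t-1] + c2 * x[t-1] + error (2x2 normal equations).
    det = syy * sxx - sxy * sxy
    a2 = (ty * sxx - tx * sxy) / det
    c2 = (tx * syy - ty * sxy) / det
    rss_f = sum((t - a2 * yl - c2 * xl) ** 2
                for t, yl, xl in zip(target, ylag, xlag))
    return math.log(rss_r / rss_f)

# Simulated pair in which x drives y but not the other way round.
random.seed(0)
n = 500
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0.0, 0.1))

print("x -> y:", round(granger_strength(x, y), 3))
print("y -> x:", round(granger_strength(y, x), 3))
```

A value near zero means the lagged second series adds no predictive power; filtering or downsampling the series before this step can change the result, which is the kind of effect the paper analyses.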
State-Space Analysis of Granger-Geweke Causality Measures with Application to fMRI
Solo, Victor
2017-01-01
The recent interest in the dynamics of networks and the advent, across a range of applications, of measuring modalities that operate on different temporal scales have put the spotlight on some significant gaps in the theory of multivariate time series. Fundamental to the description of network dynamics is the direction of interaction between nodes, accompanied by a measure of the strength of such interactions. Granger causality and its associated frequency domain strength measures (GEMs) (due to Geweke) provide a framework for the formulation and analysis of these issues. In pursuing this setup, three significant unresolved issues emerge. First, computing GEMs involves computing submodels of vector time series models, for which reliable methods do not exist. Second, the impact of filtering on GEMs has never been definitively established. Third, the impact of downsampling on GEMs has never been established. In this work, using state-space methods, we resolve all these issues and illustrate the results with some simulations. Our analysis is motivated by some problems in (fMRI) brain imaging, to which we apply it, but it is of general applicability. PMID:26942749
Bostanmaneshrad, Farshid; Partani, Sadegh; Noori, Roohollah; Nachtnebel, Hans-Peter; Berndtsson, Ronny; Adamowski, Jan Franklin
2018-10-15
To date, few studies have investigated the simultaneous effects of macro-scale parameters (MSPs) such as land use, population density, geology, and erosion layers on micro-scale water quality variables (MSWQVs). This research focused on an evaluation of the relationship between MSPs and MSWQVs in the Siminehrood River Basin, Iran. In addition, we investigated the importance of water particle travel time (hydrological distance) on this relationship. The MSWQVs included 13 physicochemical and biochemical parameters observed at 15 stations during three seasons. Primary screening was performed by utilizing three multivariate statistical analyses (Pearson's correlation, cluster and discriminant analyses) in seven series of observed data. These series included three separate seasonal data, three two-season data, and aggregated three-season data for investigation of relationships between MSPs and MSWQVs. Coupled data (pairs of MSWQVs and MSPs) repeated in at least two out of three statistical analyses were selected for final screening. The primary screening results demonstrated significant relationships between land use and phosphorus, total solids and turbidity, erosion levels and electrical conductivity, and erosion and total solids. Furthermore, water particle travel time effects were considered through three geographical pattern definitions of distance for each MSP by using two weighting methods. To find effective MSP factors on MSWQVs, a multivariate linear regression analysis was employed. Then, preliminary equations that estimated MSWQVs were developed. The preliminary equations were modified to adaptive equations to obtain the final models. The final models indicated that a new metric, referred to as hydrological distance, provided better MSWQV estimation and water quality prediction compared to the National Sanitation Foundation Water Quality Index. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
Jerlström, Tomas; Gårdmark, Truls; Carringer, Malcolm; Holmäng, Sten; Liedberg, Fredrik; Hosseini, Abolfazl; Malmström, Per-Uno; Ljungberg, Börje; Hagberg, Oskar; Jahnson, Staffan
2014-08-01
Cystectomy combined with pelvic lymph-node dissection and urinary diversion entails high morbidity and mortality. Improvements are needed, and a first step is to collect information on the current situation. In 2011, this group took the initiative to start a population-based database in Sweden (population 9.5 million in 2011) with prospective registration of patients and complications until 90 days after cystectomy. This article reports findings from the first year of registration. Participation was voluntary, and data were reported by local urologists or research nurses. Perioperative parameters and early complications classified according to the modified Clavien system were registered, and selected variables of possible importance for complications were analysed by univariate and multivariate logistic regression. During 2011, 285 (65%) of 435 cystectomies performed in Sweden were registered in the database, the majority reported by the seven academic centres. Median blood loss was 1000 ml, operating time 318 min, and length of hospital stay 15 days. Any complications were registered for 103 patients (36%). Clavien grades 1-2 and 3-5 were noted in 19% and 15%, respectively. Thirty-seven patients (13%) were reoperated on at least once. In logistic regression analysis elevated risk of complications was significantly associated with operating time exceeding 318 min in both univariate and multivariate analysis, and with age 76-89 years only in multivariate analysis. It was feasible to start a national population-based registry of radical cystectomies for bladder cancer. The evaluation of the first year shows an increased risk of complications in patients with longer operating time and higher age. The results agree with some previously published series but should be interpreted with caution considering the relatively low coverage, which is expected to be higher in the future.
Using state-space models to predict the abundance of juvenile and adult sea lice on Atlantic salmon.
Elghafghuf, Adel; Vanderstichel, Raphael; St-Hilaire, Sophie; Stryhn, Henrik
2018-04-11
Sea lice are marine parasites affecting salmon farms, and are considered one of the most costly pests of the salmon aquaculture industry. Infestations of sea lice on farms significantly increase opportunities for the parasite to spread in the surrounding ecosystem, making control of this pest a challenging issue for salmon producers. The complexity of controlling sea lice on salmon farms requires frequent monitoring of the abundance of different sea lice stages over time. Industry-based data sets of counts of lice are amenable to multivariate time-series data analyses. In this study, two sets of multivariate autoregressive state-space models were applied to Chilean sea lice data from six Atlantic salmon production cycles on five isolated farms (at least 20 km seaway distance away from other known active farms), to evaluate the utility of these models for predicting sea lice abundance over time on farms. The models were constructed with different parameter configurations, and the analysis demonstrated large heterogeneity between production cycles for the autoregressive parameter, the effects of chemotherapeutant bath treatments, and the process-error variance. A model allowing for different parameters across production cycles had the best fit and the smallest overall prediction errors. However, pooling information across cycles for the drift and observation error parameters did not substantially affect model performance, thus reducing the number of necessary parameters in the model. Bath treatments had strong but variable effects for reducing sea lice burdens, and these effects were stronger for adult lice than juvenile lice. Our multivariate state-space models were able to handle different sea lice stages and provide predictions for sea lice abundance with reasonable accuracy up to five weeks out. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
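The parameters named in the abstract above (autoregressive parameter, drift, process-error and observation-error variances) map onto a state-space model that can be filtered with a Kalman recursion. The following is a scalar sketch only: the study used multivariate models with treatment effects, and every number and name here is a hypothetical assumption.

```python
def kalman_filter(obs, b, u, q, r, x0=0.0, p0=1.0):
    """Filter a scalar autoregressive state-space model.

    State:       x[t] = b * x[t-1] + u + w,  w ~ N(0, q)  (b: AR parameter, u: drift)
    Observation: y[t] = x[t] + v,            v ~ N(0, r)
    `obs` may contain None for missing counts. Returns (mean, variance) pairs.
    """
    x, p = x0, p0
    out = []
    for y in obs:
        # Prediction step.
        x = b * x + u
        p = b * b * p + q
        # Update step, skipped when the observation is missing.
        if y is not None:
            gain = p / (p + r)
            x = x + gain * (y - x)
            p = (1.0 - gain) * p
        out.append((x, p))
    return out

def forecast(x, p, b, u, q, steps):
    """Propagate the last filtered state `steps` periods ahead."""
    path = []
    for _ in range(steps):
        x = b * x + u
        p = b * b * p + q
        path.append((x, p))
    return path

# Hypothetical weekly log-abundances with one missing week.
log_counts = [0.2, 0.5, None, 0.9, 1.1, 1.0]
filtered = kalman_filter(log_counts, b=0.9, u=0.1, q=0.05, r=0.1)
x_last, p_last = filtered[-1]
for step, (mean, var) in enumerate(forecast(x_last, p_last, 0.9, 0.1, 0.05, steps=5), 1):
    print(step, round(mean, 3), round(var, 3))
```

The forecast variance grows with each step ahead, which is why predictions several weeks out are necessarily less certain than the one-week forecast.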
Attractor States in Teaching and Learning Processes: A Study of Out-of-School Science Education.
Geveke, Carla H; Steenbeek, Henderien W; Doornenbal, Jeannette M; Van Geert, Paul L C
2017-01-01
In order for out-of-school science activities that take place during school hours but outside the school context to be successful, instructors must have sufficient pedagogical content knowledge (PCK) to guarantee high-quality teaching and learning. We argue that PCK is a quality of the instructor-pupil system that is constructed in real-time interaction. When PCK is evident in real-time interaction, we define it as Expressed Pedagogical Content Knowledge (EPCK). The aim of this study is to empirically explore whether EPCK shows a systematic pattern of variation, and if so whether the pattern occurs in recurrent and temporary stable attractor states as predicted in the complex dynamic systems theory. This study concerned nine out-of-school activities in which pupils of upper primary school classes participated. A multivariate coding scheme was used to capture EPCK in real time. A principal component analysis of the time series of all the variables reduced the number of components. A cluster analysis revealed general descriptions of the components across all cases. Cluster analyses of individual cases divided the time series into sequences, revealing High-, Low-, and Non-EPCK states. High-EPCK attractor states emerged at particular moments during activities, rather than being present all the time. Such High-EPCK attractor states were only found in a few cases, namely those where the pupils were prepared for the visit and the instructors were trained.
Modeling Individual Cyclic Variation in Human Behavior.
Pierson, Emma; Althoff, Tim; Leskovec, Jure
2018-04-01
Cycles are fundamental to human health and behavior. Examples include mood cycles, circadian rhythms, and the menstrual cycle. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present Cyclic Hidden Markov Models (CyHMMs) for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered in modeling real-world cycles: they can model multivariate data with both discrete and continuous dimensions; they explicitly model and are robust to missing data; and they can share information across individuals to accommodate variation both within and between individual time series. Experiments on synthetic and real-world health-tracking data demonstrate that CyHMMs infer cycle lengths more accurately than existing methods, with 58% lower error on simulated data and 63% lower error on real-world data compared to the best-performing baseline. CyHMMs can also perform functions which baselines cannot: they can model the progression of individual features/symptoms over the course of the cycle, identify the most variable features, and cluster individual time series into groups with distinct characteristics. Applying CyHMMs to two real-world health-tracking datasets (human menstrual cycle symptoms and physical activity tracking data) yields important insights including which symptoms to expect at each point during the cycle. We also find that people fall into several groups with distinct cycle patterns, and that these groups differ along dimensions not provided to the model. For example, by modeling missing data in the menstrual cycles dataset, we are able to discover a medically relevant group of birth control users even though information on birth control is not given to the model.
Modeling Individual Cyclic Variation in Human Behavior
Pierson, Emma; Althoff, Tim; Leskovec, Jure
2018-01-01
Cycles are fundamental to human health and behavior. Examples include mood cycles, circadian rhythms, and the menstrual cycle. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present Cyclic Hidden Markov Models (CyHMMs) for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered in modeling real-world cycles: they can model multivariate data with both discrete and continuous dimensions; they explicitly model and are robust to missing data; and they can share information across individuals to accommodate variation both within and between individual time series. Experiments on synthetic and real-world health-tracking data demonstrate that CyHMMs infer cycle lengths more accurately than existing methods, with 58% lower error on simulated data and 63% lower error on real-world data compared to the best-performing baseline. CyHMMs can also perform functions which baselines cannot: they can model the progression of individual features/symptoms over the course of the cycle, identify the most variable features, and cluster individual time series into groups with distinct characteristics. Applying CyHMMs to two real-world health-tracking datasets (human menstrual cycle symptoms and physical activity tracking data) yields important insights including which symptoms to expect at each point during the cycle. We also find that people fall into several groups with distinct cycle patterns, and that these groups differ along dimensions not provided to the model. For example, by modeling missing data in the menstrual cycles dataset, we are able to discover a medically relevant group of birth control users even though information on birth control is not given to the model. PMID:29780976
Untangling Trends and Drivers of Changing River Discharge Along Florida's Gulf Coast
NASA Astrophysics Data System (ADS)
Glodzik, K.; Kaplan, D. A.; Klarenberg, G.
2017-12-01
Along the relatively undeveloped Big Bend coastline of Florida, discharge in many rivers and springs is decreasing. The causes are unclear, though they likely include a combination of groundwater extraction for water supply, climate variability, and altered land use. Saltwater intrusion from altered freshwater influence and sea level rise is causing transformative ecosystem impacts along this flat coastline, including coastal forest die-off and oyster reef collapse. A key uncertainty for understanding river discharge change is predicting discharge from rainfall, since Florida's karstic bedrock stores large amounts of groundwater, which has a long residence time. This study uses Dynamic Factor Analysis (DFA), a multivariate data reduction technique for time series, to find common trends in flow and reveal hydrologic variables affecting flow in eight Big Bend rivers since 1965. The DFA uses annual river flows as response time series, and climate data (annual rainfall and evapotranspiration by watershed) and climatic indices (El Niño Southern Oscillation [ENSO] Index and North Atlantic Oscillation [NAO] Index) as candidate explanatory variables. Significant explanatory variables (one evapotranspiration and three rainfall time series) explained roughly 50% of discharge variation across rivers. Significant trends (representing unexplained variation) were shared among rivers, with geographical grouping of five northern rivers and three southern rivers, along with a strong downward trend affecting six out of eight systems. ENSO and NAO had no significant impact. Advancing knowledge of these dynamics is necessary for forecasting how altered rainfall and temperatures from climate change may impact flows. Improved forecasting is especially important given Florida's reliance on groundwater extraction to support its growing population.
NASA Astrophysics Data System (ADS)
Gruszczynska, M.; Rosat, S.; Klos, A.; Bogusz, J.
2017-12-01
In this study, Singular Spectrum Analysis (SSA) along with its multivariate extension MSSA (Multichannel SSA) were used to estimate the long-term trend and the gravimetric factor at the Chandler wobble frequency from superconducting gravimeter (SG) records. We have used data from seven stations located worldwide and contributing to the International Geodynamics and Earth Tides Service (IGETS). The timespan ranged from 15 to 19 years. Before applying SSA and MSSA, we had removed local tides, atmospheric (ECMWF data), hydrological (MERRA2 products) loadings and non-tidal ocean loading (ECCO2 products) effects. In the first part of the analysis, we used the SSA approach in order to estimate the long-term trends from SG observations. We use the technique based on the classical Karhunen-Loève spectral decomposition of time series into long-term trend, oscillations and noise. In the second part, we present the determination of the common time-varying pole tide (annual and Chandler wobble) to estimate the gravimetric factor from SG time series using the MSSA approach. The presented method has an advantage over traditional methods like Least Squares Estimation in that it determines common modes of variability which reflect a common geophysical field. We adopted a 6-year lag-window as the optimal length to extract common seasonal signals and the Chandler components of the Earth polar motion. The signals characterized by annual and Chandler wobble account for approximately 62% of the total variance of residual SG data. Then, we estimated the amplitude factors and phase lags of the Chandler wobble with respect to the IERS (International Earth Rotation and Reference Systems Service) polar motion observations. The resulting gravimetric factors at the Chandler wobble period are finally compared with previous estimates. A robust estimate of the gravimetric Earth response to the Chandlerian component of the polar motion is required to better constrain the mantle anelasticity at this frequency and hence the attenuation models of the Earth interior.
Hakoun, Vivien; Orban, Philippe; Dassargues, Alain; Brouyère, Serge
2017-04-01
Factors governing spatial and temporal patterns of pesticide compounds (pesticides and metabolites) concentrations in chalk aquifers remain unclear due to complex flow processes and multiple sources. To uncover which factors govern pesticide compound concentrations in a chalk aquifer, we develop a methodology based on time series analyses, uni- and multivariate statistics accounting for concentrations below detection limits. The methodology is applied to long records (1996-2013) of a restricted compound (bentazone), three banned compounds (atrazine, diuron and simazine) and two metabolites (deethylatrazine (DEA) and 2,6-dichlorobenzamide (BAM)) sampled in the Hesbaye chalk aquifer in Belgium. In the confined area, all compounds had non-detects fractions >80%. By contrast, maximum concentrations exceeded EU's drinking-water standard (100 ng L⁻¹) in the unconfined area. This contrast confirms that recent recharge and polluted water have not yet reached the confined area. Multivariate analyses based on variables representative of the hydrogeological setting revealed higher diuron and simazine concentrations in the southeast of the unconfined area, where urban activities dominate land use and where the aquifer lacks protection from a less permeable layer of hardened chalk. At individual sites, positive correlations (up to τ=0.48 for bentazone) between pesticide compound concentrations and multi-annual groundwater level fluctuations confirm occurrences of remobilization. A downward temporal trend of atrazine concentrations likely reflects decreasing use of this compound over the last 28 years. However, the lack of a break in concentration time series and maximum concentrations of atrazine, simazine, DEA and BAM exceeding EU's standard post-ban years provide evidence of persistence. Contrasting upward trends in bentazone concentrations show that a time lag is required for restriction measures to be efficient.
These results shed light on factors governing pesticide compound concentrations in chalk aquifers. The developed methodology is not restricted to chalk aquifers; it could be transposed to study other pollutants with concentrations below detection limits. Copyright © 2017 Elsevier Ltd. All rights reserved.
Onozuka, Daisuke; Hagihara, Akihito
2015-07-01
Although the impact of extreme heat and cold on mortality has been documented in recent years, few studies have investigated whether variation in susceptibility to extreme temperatures has changed in Japan. We used data on daily total mortality and mean temperatures in Fukuoka, Japan, for 1973-2012. We used time-series analysis to assess the effects of extreme high and low temperatures on all-cause mortality, stratified by decade, gender, and age, adjusting for time trends. We used a multivariate meta-analysis with a distributed lag non-linear model to estimate pooled non-linear lag-response relationships associated with extreme temperatures on mortality. The relative risk of mortality increased during heat extremes in all decades, with a declining trend over time. The mortality risk was higher during cold extremes for the entire study period, with a dispersed pattern across decades. Meta-analysis showed that both heat and cold extremes increased the risk of mortality. Cold effects were delayed and lasted for several days, whereas heat effects appeared quickly and did not last long. Our study provides quantitative evidence that extreme high and low temperatures were significantly and non-linearly associated with the increased risk of mortality with substantial variation. Our results suggest that timely preventative measures are important for extreme high temperatures, whereas several days' protection should be provided for extreme low temperatures. Copyright © 2015 Elsevier Inc. All rights reserved.
Lawes, Timothy; Edwards, Becky; López-Lozano, José-Maria; Gould, Ian
2012-01-01
To describe secular trends in Staphylococcus aureus bacteraemia (SAB) and to assess the impacts of infection control practices, including universal methicillin-resistant Staphylococcus aureus (MRSA) admission screening on associated clinical burdens. Retrospective cohort study and multivariate time-series analysis linking microbiology, patient management and health intelligence databases. Teaching hospital in North East Scotland. All patients admitted to Aberdeen Royal Infirmary between 1 January 2006 and 31 December 2010: n=420 452 admissions and 1 430 052 acute occupied bed days (AOBDs). Universal admission screening programme for MRSA (August 2008) incorporating isolation and decolonisation. Primary and secondary measures: Hospital-wide prevalence density, hospital-associated incidence density and death within 30 days of MRSA or methicillin-sensitive Staphylococcus aureus (MSSA) bacteraemia. Between 2006 and 2010, prevalence density of all SAB declined by 41%, from 0.73 to 0.50 cases/1000 AOBDs (p=0.002 for trend), and 30-day mortality from 26% to 14% (p=0.013). Significant reductions were observed in MRSA bacteraemia only. Overnight admissions screened for MRSA rose from 43% during selective screening to >90% within 4 months of universal screening. In multivariate time-series analysis (R² 0.45 to 0.68), universal screening was associated with a 19% reduction in prevalence density of MRSA bacteraemia (-0.035, 95% CI -0.049 to -0.021/1000 AOBDs; p<0.001), a 29% fall in hospital-associated incidence density (-0.029, 95% CI -0.035 to -0.023/1000 AOBDs; p<0.001) and a 46% reduction in 30-day mortality (-15.6, 95% CI -24.1% to -7.1%; p<0.001). Positive associations with fluoroquinolone and cephalosporin use suggested that antibiotic stewardship reduced prevalence density of MRSA bacteraemia by 0.027 (95% CI 0.015 to 0.039)/1000 AOBDs. Rates of MSSA bacteraemia were not significantly affected by screening or antibiotic use.
Declining clinical burdens from SAB were attributable to reductions in MRSA infections. Universal admission screening and antibiotic stewardship were associated with decreases in MRSA bacteraemia and associated early mortality. Control of MSSA bacteraemia remains a priority.
Vedeld, Hege Marie; Merok, Marianne; Jeanmougin, Marine; Danielsen, Stine A; Honne, Hilde; Presthus, Gro Kummeneje; Svindland, Aud; Sjo, Ole H; Hektoen, Merete; Eknaes, Mette; Nesbakken, Arild; Lothe, Ragnhild A; Lind, Guro E
2017-09-01
The prognostic value of CpG island methylator phenotype (CIMP) in colorectal cancer remains unsettled. We aimed to assess the prognostic value of this phenotype analyzing a total of 1126 tumor samples obtained from two Norwegian consecutive colorectal cancer series. CIMP status was determined by analyzing the five markers CACNA1G, IGF2, NEUROG1, RUNX3 and SOCS1 by quantitative methylation specific PCR (qMSP). The effect of CIMP on time to recurrence (TTR) and overall survival (OS) was determined by uni- and multivariate analyses. Subgroup analyses were conducted according to MSI and BRAF mutation status, disease stage, and also age at time of diagnosis (<60, 60-74, ≥75 years). Patients with CIMP positive tumors demonstrated significantly shorter TTR and worse OS compared to those with CIMP negative tumors (multivariate hazard ratio [95% CI] 1.86 [1.31-2.63] and 1.89 [1.34-2.65], respectively). In stratified analyses, CIMP tumors showed significantly worse outcome among patients with microsatellite stable (MSS, P < 0.001), and MSS BRAF mutated tumors (P < 0.001), a finding that persisted in patients with stage II, III or IV disease, and that remained significant in multivariate analysis (P < 0.01). Consistent results were found for all three age groups. To conclude, CIMP is significantly associated with inferior outcome for colorectal cancer patients, and can stratify the poor prognostic patients with MSS BRAF mutated tumors. © 2017 The Authors International Journal of Cancer published by John Wiley & Sons Ltd on behalf of UICC.
Jupiter, Daniel C
2012-01-01
In this first of a series of statistical methodology commentaries for the clinician, we discuss the use of multivariate linear regression. Copyright © 2012 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
The relevance of timing in nonconvulsive status epilepticus: A series of 38 cases.
Gutiérrez-Viedma, Álvaro; Parejo-Carbonell, Beatriz; Cuadrado, María-Luz; Serrano-García, Irene; Abarrategui, Belén; García-Morales, Irene
2018-05-01
Timing in the management of nonconvulsive status epilepticus (NCSE) seems to be one of the most important modifiable prognostic factors. We aimed to determine the precise relationship between timing in NCSE management and its outcome. We performed a cross-sectional study in which clinical data were prospectively obtained from all consecutive adults with NCSE admitted to our hospital from 2014 to 2016. Univariate and multivariable regression analyses were performed to identify clinical and timing variables associated with NCSE prognosis. Among 38 NCSE cases, 59.9% were women, and 39.5% had prior epilepsy history. The median time to treatment (TTT) initiation and the median time to assessment by a neurologist (TTN) were 5h, and the median time to first electroencephalography assessment was 18.5h; in the cases with out-of-hospital onset (n=24), the median time to hospital (TTH) arrival was 2.8h. The median time to NCSE control (TTC) was 16.5h, and it positively correlated with both the TTH (Spearman's rho: 0.439) and the TTT (Spearman's rho: 0.683). In the multivariable regression analyses, the TTC was extended 1.7h for each hour of hospital arrival delay (p=0.01) and 2.7h for each hour of treatment delay (p<0.001). Recognition delay was more common in the episodes with in-hospital onset, which also had longer TTN and TTC, and increased morbidity. There were pervasive delays in all phases of NCSE management. Delays in hospital arrival or treatment initiation may result in prolonged TTC. Recognition of in-hospital episodes may be more delayed, which may lead to poorer prognosis in these cases. Copyright © 2018 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Dodson, J. B.; Taylor, P. C.
2016-12-01
The diurnal cycle of convection (CDC) greatly influences the water, radiative, and energy budgets in convectively active regions. For example, previous research of the Amazonian CDC has identified significant monthly covariability between the satellite-observed radiative and precipitation diurnal cycles and multiple reanalysis-derived atmospheric state variables (ASVs) representing convective instability. However, disagreements between retrospective analysis products (reanalyses) over monthly ASV anomalies create significant uncertainty in the resulting covariability. Satellite observations of convective clouds can be used to characterize monthly anomalies in convective activity. CloudSat observes multiple properties of both deep convective cores and the associated anvils, and so is useful as an alternative to the use of reanalyses. CloudSat cannot observe the full diurnal cycle, but it can detect differences between daytime and nighttime convection. Initial efforts to use CloudSat data to characterize convective activity showed that the results are highly dependent on the choice of variable used to characterize the cloud. This is caused by a series of inverse relationships between convective frequency, cloud top height, radar reflectivity vertical profile, and other variables. A single, multi-variable index for convective activity based on CloudSat data may be useful to clarify the results. Principal component analysis (PCA) provides a method to create a multivariable index, where the first principal component (PC1) corresponds with convective instability. The time series of PC1 can then be used as a proxy for monthly variability in convective activity.
The primary challenge presented involves determining the utility of PCA for creating a robust index for convective activity that accounts for the complex relationships of multiple convective cloud variables, and yields information about the interactions between convection, the convective environment, and radiation beyond the previous single-variable approaches. The choice of variables used to calculate PC1 may influence any results based on PC1, so it is necessary to test the sensitivity of the results to different variable combinations.
NASA Astrophysics Data System (ADS)
Anwar, Faizan; Bárdossy, András; Seidel, Jochen
2017-04-01
Estimating missing values in a time series of a hydrological variable is an everyday task for a hydrologist. Existing methods such as inverse distance weighting, multivariate regression, and kriging, though simple to apply, provide no indication of the quality of the estimated value and depend mainly on the values of neighboring stations at a given step in the time series. Copulas have the advantage of representing the pure dependence structure between two or more variables (given the relationship between them is monotonic). They rid us of questions such as transforming the data before use or calculating functions that model the relationship between the considered variables. A copula-based approach is suggested to infill discharge, precipitation, and temperature data. As a first step the normal copula is used; subsequently, the necessity to use non-normal / non-symmetrical dependence is investigated. Discharge and temperature are treated as regular continuous variables and can be used without processing for infilling and quality checking. Due to the mixed distribution of precipitation values, precipitation has to be treated differently. This is done by assigning a discrete probability to the zeros and treating the rest as a continuous distribution. Building on the work of others, along with infilling, the normal copula is also utilized to identify values in a time series that might be erroneous. This is done by treating the available value as missing, infilling it using the normal copula and checking if it lies within a confidence band (5 to 95% in our case) of the obtained conditional distribution. Hydrological data from two catchments, Upper Neckar River (Germany) and Santa River (Peru), are used to demonstrate the application for datasets with different data quality. The Python code used here is also made available on GitHub. The required input is the time series of a given variable at different stations.
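The infilling and quality-check idea described above can be sketched for the simplest case of one neighboring station and one target station. This is a minimal illustration and not the authors' GitHub code: the function names, the rank-based plotting positions, and the sample values are all assumptions, ties are not handled, and the mixed (zero-inflated) treatment of precipitation is left out.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def normal_scores(values):
    """Map values to standard-normal scores through their ranks (ties not handled)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores

def pearson(a, b):
    """Pearson correlation of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def empirical_cdf(sample, x):
    """Plotting-position CDF, clipped so the result stays inside (0, 1)."""
    n = len(sample)
    below = sum(1 for s in sample if s <= x)
    return min(max(below, 1), n) / (n + 1)

def empirical_quantile(sample, u):
    """Inverse of the plotting-position CDF with linear interpolation."""
    s = sorted(sample)
    n = len(s)
    pos = min(max(u * (n + 1) - 1, 0.0), n - 1.0)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def infill_band(neighbor, target, x_new, probs=(0.05, 0.5, 0.95)):
    """Conditional quantiles of the target given a neighbor value (normal copula)."""
    rho = pearson(normal_scores(neighbor), normal_scores(target))
    spread = max(0.0, 1.0 - rho ** 2) ** 0.5
    z_x = STD_NORMAL.inv_cdf(empirical_cdf(neighbor, x_new))
    band = []
    for p in probs:
        z_q = rho * z_x + spread * STD_NORMAL.inv_cdf(p)
        band.append(empirical_quantile(target, STD_NORMAL.cdf(z_q)))
    return band

def is_suspect(value, band):
    """Flag an observed value that falls outside the conditional band."""
    return value < band[0] or value > band[-1]

# Hypothetical paired observations at two stations
neighbor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
target = [2.1, 3.9, 3.5, 7.8, 6.9, 12.2, 13.8, 11.0]
band = infill_band(neighbor, target, 4.5)
print(band, is_suspect(100.0, band))
```

The middle element of `band` serves as the infilled estimate, and the outer elements give the 5 to 95% band used for the quality check. With several neighboring stations the scalar correlation would be replaced by a correlation matrix and the conditional normal by its multivariate form.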
Forecasting daily attendances at an emergency department to aid resource planning
Sun, Yan; Heng, Bee Hoon; Seow, Yian Tay; Seow, Eillyne
2009-01-01
Background: Accurate forecasting of emergency department (ED) attendances can be a valuable tool for micro and macro level planning. Methods: Data for analysis were the counts of daily patient attendances at the ED of an acute care regional general hospital from July 2005 to Mar 2008. Patients were stratified into three acuity categories, i.e. P1, P2 and P3, with P1 being the most acute and P3 being the least acute. The autoregressive integrated moving average (ARIMA) method was separately applied to each of the three acuity categories and total patient attendances. Independent variables included in the model were public holiday (yes or no), ambient air quality measured by pollution standard index (PSI), daily ambient average temperature and daily relative humidity. The seasonal components of weekly and yearly periodicities in the time series of daily attendances were also studied. Univariate analysis by t-tests and multivariate time series analysis were carried out in SPSS version 15. Results: By time series analyses, P1 attendances did not show any weekly or yearly periodicity and were only predicted by ambient air quality of PSI > 50. P2 and total attendances showed weekly periodicities, and were also significantly predicted by public holiday. P3 attendances were significantly correlated with day of the week, month of the year, public holiday, and ambient air quality of PSI > 50. After applying the developed models to validate the forecast, the MAPEs of prediction by the models were 16.8%, 6.7%, 8.6% and 4.8% for P1, P2, P3 and total attendances, respectively. The models were able to account for most of the significant autocorrelations present in the data. Conclusion: Time series analysis has been shown to provide a useful, readily available tool for predicting emergency department workload that can be used to plan staff rosters and resources. PMID:19178716
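For readers unfamiliar with the accuracy measure reported above, the mean absolute percentage error (MAPE) can be computed as in the short sketch below. The attendance counts and forecasts are made-up numbers for illustration only and do not come from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

# Hypothetical daily attendance counts and model forecasts
actual = [100, 120, 80, 100]
forecast = [110, 114, 84, 95]
print(round(mape(actual, forecast), 2))  # 6.25
```

Because each error is scaled by the actual count, low-volume categories tend to show larger MAPE values for the same absolute error, which is one possible reading of why the smallest category had the highest MAPE.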
Employment as a health promotion intervention for persons with multiple sclerosis.
Chiu, Chung-Yi; Chan, Fong; Edward Sharp, Seneca; Dutta, Alo; Hartman, Ellie; Bezyak, Jill
2015-01-01
To examine the relationship between employment status (no employment, part-time employment, and full-time employment) and functional disability, health-related quality of life, and life satisfaction of people with MS. 157 individuals with MS completed a survey packet, including employment status, self-report disability severity, and health-related scales. A series of multivariate analysis of variance was performed to determine the differences between employment groups in health-related outcomes. The unemployed group had the highest levels of incapacity and social impairments among the three groups. They also had the lowest physical health-related quality of life and life satisfaction. The part-time employed group had the lowest levels of depression and higher levels of physical activity participation among the three groups of individuals with MS. Employment is significantly related to health-related quality of life, and as a result, it should be considered an important public health intervention for people with MS.
Tamayo Uria, Ibon; Mateu Mahiques, Jorge; Mughini Gras, Lapo
2013-06-01
Urban Norway rats are challenging pests, posing significant health and economic threats. Implementing ecologically based integrated rodent management (EBIRM) programmes relies primarily on the understanding of ecological relationships between rodents and their environments, with emphasis on the processes influencing rodent populations in the target ecosystem. We investigated the temporal distribution of urban Norway rat infestations in Madrid, Spain, and tested for the association of such infestations with temperature, relative humidity and precipitation by fitting a multivariate Poisson generalized linear model to a 3-year (2006-2008) daily time series of 4,689 Norway rat sightings. Norway rat infestations showed a marked seasonality, peaking in the summer. Most Norway rat sightings were reported on Mondays. Minimum temperature and relative humidity were positively associated with Norway rat infestation, whereas the association with precipitation was negative. The time series was adequately explained by the model. We identified previously unrecognized time periods that are more prone to Norway rat infestation than others and generated hypotheses about the association between weather, human outdoor activity, resource availability, rodent activity and population size. This provided local authorities engaged in preserving urban ecosystem health with basic research information to predict future rodent outbreaks and support the implementation of EBIRM programmes in urban areas.
Adam Smith in the Mathematics Classroom
ERIC Educational Resources Information Center
Lipsey, Sally I.
1975-01-01
The author describes a series of current economic ideas and situations which can be used in the mathematics classroom to illustrate the use of signed numbers, the coordinate system, univariate and multivariate functions, linear programing, and variation. (SD)
A false dichotomy? Mental illness and lone-actor terrorism.
Corner, Emily; Gill, Paul
2015-02-01
We test whether significant differences in mental illness exist in a matched sample of lone- and group-based terrorists. We then test whether there are distinct behavioral differences between lone-actor terrorists with and without mental illness. We then stratify our sample across a range of diagnoses and again test whether significant differences exist. We conduct a series of bivariate, multivariate, and multinomial statistical tests using a unique dataset of 119 lone-actor terrorists and a matched sample of group-based terrorists. The odds of a lone-actor terrorist having a mental illness are 13.49 times higher than the odds of a group actor having a mental illness. Lone actors who were mentally ill were 18.07 times more likely to have a spouse or partner who was involved in a wider movement than those without a history of mental illness. Those with a mental illness were more likely to have a proximate upcoming life change, more likely to have been a recent victim of prejudice, and experienced proximate and chronic stress. The results identify behaviors and traits that security agencies can utilize to monitor and prevent lone-actor terrorism events. The correlated behaviors provide an image of how risk can crystallize within the individual offender and that our understanding of lone-actor terrorism should be multivariate in nature.
Roehl, Edwin A.; Conrads, Paul
2010-01-01
This is the second of two papers that describe how data mining can aid natural-resource managers with the difficult problem of controlling the interactions between hydrologic and man-made systems. Data mining is a new science that assists scientists in converting large databases into knowledge, and is uniquely able to leverage the large amounts of real-time, multivariate data now being collected for hydrologic systems. Part 1 gives a high-level overview of data mining, and describes several applications that have addressed major water resource issues in South Carolina. This Part 2 paper describes how various data mining methods are integrated to produce predictive models for controlling surface- and groundwater hydraulics and quality. The methods include: - signal processing to remove noise and decompose complex signals into simpler components; - time series clustering that optimally groups hundreds of signals into "classes" that behave similarly for data reduction and (or) divide-and-conquer problem solving; - classification which optimally matches new data to behavioral classes; - artificial neural networks which optimally fit multivariate data to create predictive models; - model response surface visualization that greatly aids in understanding data and physical processes; and, - decision support systems that integrate data, models, and graphics into a single package that is easy to use.
Attractor States in Teaching and Learning Processes: A Study of Out-of-School Science Education
Geveke, Carla H.; Steenbeek, Henderien W.; Doornenbal, Jeannette M.; Van Geert, Paul L. C.
2017-01-01
In order for out-of-school science activities that take place during school hours but outside the school context to be successful, instructors must have sufficient pedagogical content knowledge (PCK) to guarantee high-quality teaching and learning. We argue that PCK is a quality of the instructor-pupil system that is constructed in real-time interaction. When PCK is evident in real-time interaction, we define it as Expressed Pedagogical Content Knowledge (EPCK). The aim of this study is to empirically explore whether EPCK shows a systematic pattern of variation, and if so whether the pattern occurs in recurrent and temporarily stable attractor states as predicted in the complex dynamic systems theory. This study concerned nine out-of-school activities in which pupils of upper primary school classes participated. A multivariate coding scheme was used to capture EPCK in real time. A principal component analysis of the time series of all the variables reduced the number of components. A cluster analysis revealed general descriptions of the components across all cases. Cluster analyses of individual cases divided the time series into sequences, revealing High-, Low-, and Non-EPCK states. High-EPCK attractor states emerged at particular moments during activities, rather than being present all the time. Such High-EPCK attractor states were only found in a few cases, namely those where the pupils were prepared for the visit and the instructors were trained. PMID:28316578
Assessment of resampling methods for causality testing: A note on the US inflation behavior
Kyrtsou, Catherine; Kugiumtzis, Dimitris; Diks, Cees
2017-01-01
Different resampling methods for the null hypothesis of no Granger causality are assessed in the setting of multivariate time series, taking into account that the driving-response coupling is conditioned on the other observed variables. As an appropriate test statistic for this setting, the partial transfer entropy (PTE), an information and model-free measure, is used. Two resampling techniques, time-shifted surrogates and the stationary bootstrap, are combined with three independence settings (giving a total of six resampling methods), all approximating the null hypothesis of no Granger causality. In these three settings, the level of dependence is changed, while the conditioning variables remain intact. The empirical null distribution of the PTE, as the surrogate and bootstrapped time series become more independent, is examined along with the size and power of the respective tests. Additionally, we consider a seventh resampling method by contemporaneously resampling the driving and the response time series using the stationary bootstrap. Although this case does not comply with the no causality hypothesis, one can obtain an accurate sampling distribution for the mean of the test statistic since its value is zero under H0. Results indicate that as the resampling setting gets more independent, the test becomes more conservative. Finally, we conclude with a real application. More specifically, we investigate the causal links among the growth rates for the US CPI, money supply and crude oil. Based on the PTE and the seven resampling methods, we consistently find that changes in crude oil cause inflation conditioning on money supply in the post-1986 period. However, this relationship cannot be explained on the basis of traditional cost-push mechanisms. PMID:28708870
Assessment of resampling methods for causality testing: A note on the US inflation behavior.
Papana, Angeliki; Kyrtsou, Catherine; Kugiumtzis, Dimitris; Diks, Cees
2017-01-01
Different resampling methods for the null hypothesis of no Granger causality are assessed in the setting of multivariate time series, taking into account that the driving-response coupling is conditioned on the other observed variables. As an appropriate test statistic for this setting, the partial transfer entropy (PTE), an information and model-free measure, is used. Two resampling techniques, time-shifted surrogates and the stationary bootstrap, are combined with three independence settings (giving a total of six resampling methods), all approximating the null hypothesis of no Granger causality. In these three settings, the level of dependence is changed, while the conditioning variables remain intact. The empirical null distribution of the PTE, as the surrogate and bootstrapped time series become more independent, is examined along with the size and power of the respective tests. Additionally, we consider a seventh resampling method by contemporaneously resampling the driving and the response time series using the stationary bootstrap. Although this case does not comply with the no causality hypothesis, one can obtain an accurate sampling distribution for the mean of the test statistic since its value is zero under H0. Results indicate that as the resampling setting gets more independent, the test becomes more conservative. Finally, we conclude with a real application. More specifically, we investigate the causal links among the growth rates for the US CPI, money supply and crude oil. Based on the PTE and the seven resampling methods, we consistently find that changes in crude oil cause inflation conditioning on money supply in the post-1986 period. However, this relationship cannot be explained on the basis of traditional cost-push mechanisms.
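The time-shifted surrogate idea mentioned above can be illustrated with a small, self-contained sketch. Two loud assumptions apply: the test statistic here is a plain lag-1 absolute cross-correlation used as a stand-in (it is not the partial transfer entropy, and no conditioning variables are involved), and the data, function names, and parameter values are hypothetical. The stationary bootstrap and the independence settings from the paper are not shown.

```python
import random

def time_shift(series, shift):
    """Circularly shift a series; its values and autocorrelation are preserved."""
    shift %= len(series)
    return series[shift:] + series[:shift]

def lagged_abs_corr(driver, response, lag=1):
    """Stand-in statistic: |corr(driver[t], response[t + lag])|."""
    a, b = driver[:-lag], response[lag:]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return abs(cov) / (va * vb) ** 0.5

def surrogate_p_value(driver, response, n_surrogates=99, min_shift=20, seed=0):
    """One-sided rank p-value of the observed statistic against shifted drivers."""
    rng = random.Random(seed)
    observed = lagged_abs_corr(driver, response)
    n = len(driver)
    exceed = 0
    for _ in range(n_surrogates):
        # Shifting only the driver destroys its coupling to the response.
        shift = rng.randint(min_shift, n - min_shift)
        if lagged_abs_corr(time_shift(driver, shift), response) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_surrogates)

# Hypothetical coupled pair: y at time t+1 depends on x at time t
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t] + 0.3 * data_rng.gauss(0, 1) for t in range(299)]
p = surrogate_p_value(x, y)
print(p)
```

The circular shift keeps the marginal distribution and most of the temporal structure of the driver while breaking its alignment with the response, which is what makes the surrogates an approximation of the no-causality null.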
Prognostic importance of DNA ploidy in non-endometrioid, high-risk endometrial carcinomas.
Sorbe, Bengt
2016-03-01
The present study investigated the predictive and prognostic impact of DNA ploidy together with other well-known prognostic factors in a series of non-endometrioid, high-risk endometrial carcinomas. From a complete consecutive series of 4,543 endometrial carcinomas of International Federation of Gynecology and Obstetrics (FIGO) stages I-IV, 94 serous carcinomas, 48 clear cell carcinomas and 231 carcinosarcomas were selected as a non-endometrioid, high-risk group for further studies regarding prognosis. The impact of DNA ploidy, as assessed by flow cytometry, was of particular focus. The age of the patients, FIGO stage, depth of myometrial infiltration and tumor expression of p53 were also included in the analyses (univariate and multivariate). In the complete series of cases, the recurrence rate was 37%, and the 5-year overall survival rate was 39% with no difference between the three histological subtypes. The primary cure rate (78%) was also similar for all tumor types studied. DNA ploidy was a significant predictive factor (on univariate analysis) for primary tumor cure rate, and a prognostic factor for survival rate (on univariate and multivariate analyses). The predictive and prognostic impact of DNA ploidy was higher in carcinosarcomas than in serous and clear cell carcinomas. In the majority of multivariate analyses, FIGO stage and depth of myometrial infiltration were the most important predictive (tumor recurrence) and prognostic (survival rate) factors. DNA ploidy status is a less important predictive and prognostic factor in non-endometrioid, high-risk endometrial carcinomas than in the common endometrioid carcinomas, in which FIGO and nuclear grade also are highly significant and important factors.
NASA Astrophysics Data System (ADS)
Ahmadijamal, M.; Hasanlou, M.
2017-09-01
Studying the hydrological parameters of lakes and examining variations in water level are important for managing water resources. The purpose of this study is to investigate and model changes in the water level of Urmia Lake caused by changes in the climatic and hydrological indicators that affect the level and area of the lake. For this purpose, Landsat satellite images, hydrological data, and the daily precipitation, daily surface evaporation and daily discharge over the whole lake basin during the period 2010-2016 were used. Based on time-series analysis conducted on each dataset independently with the same procedure, we modeled the variation of the Urmia Lake level using a polynomial regression technique and a polynomial combined with periodic behavior. In the first scenario, we fit a multivariate linear polynomial to our datasets and determined the RMSE, NRMSE and R² values. We found that a fourth-degree polynomial fits our datasets best, with the lowest RMSE value of about 9 cm. In the second scenario, we combined the polynomial with periodic behavior for modeling. The second scenario is superior to the first, with an RMSE value of about 3 cm.
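The difference between the two scenarios can be illustrated with an ordinary least-squares fit on synthetic data. This is only a sketch under stated assumptions: the series is a made-up quadratic trend plus an annual sine, the degree and period are chosen for the illustration (the abstract reports a fourth-degree polynomial on real data), and all function names are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def design_row(t, degree, period=None):
    """Polynomial terms 1, t, ..., t^degree, plus one harmonic if period is given."""
    row = [t ** k for k in range(degree + 1)]
    if period is not None:
        angle = 2 * math.pi * t / period
        row += [math.sin(angle), math.cos(angle)]
    return row

def fit_rmse(times, levels, degree, period=None):
    """Least-squares fit through the normal equations; returns (coefficients, RMSE)."""
    rows = [design_row(t, degree, period) for t in times]
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, levels)) for i in range(p)]
    coef = solve_linear(ata, aty)
    resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, levels)]
    return coef, math.sqrt(sum(e * e for e in resid) / len(resid))

# Synthetic monthly "water level" over six years: trend plus annual cycle
times = [i / 12 for i in range(72)]
levels = [2.0 - 0.4 * t + 0.02 * t * t + 0.5 * math.sin(2 * math.pi * t) for t in times]

_, rmse_poly = fit_rmse(times, levels, degree=2)
_, rmse_combined = fit_rmse(times, levels, degree=2, period=1.0)
print(rmse_poly, rmse_combined)
```

On this synthetic series the polynomial-only fit leaves the seasonal cycle in the residuals, while adding the sine and cosine terms removes it. For high polynomial degrees or long records, a QR-based solver would be numerically safer than the normal equations used here.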
Prolonged Instability Prior to a Regime Shift
Regime shifts are generally defined as the point of ‘abrupt’ change in the state of a system. However, a seemingly abrupt transition can be the product of a system reorganization that has been ongoing much longer than is evident in statistical analysis of a single component of the system. Using both univariate and multivariate statistical methods, we tested a long-term high-resolution paleoecological dataset with a known change in species assemblage for a regime shift. Analysis of this dataset with Fisher Information and multivariate time series modeling showed that there was a ~2000-year period of instability prior to the regime shift. This period of instability and the subsequent regime shift coincide with regional climate change, indicating that the system is undergoing extrinsic forcing. Paleoecological records offer a unique opportunity to test tools for the detection of thresholds and stable-states, and thus to examine the long-term stability of ecosystems over periods of multiple millennia. This manuscript explores various methods of assessing the transition between alternative states in an ecological system described by a long-term high-resolution paleoecological dataset.
NASA Astrophysics Data System (ADS)
Penland, C.
2017-12-01
One way to test for the linearity of a multivariate system is to apply Linear Inverse Modeling (LIM) to a multivariate time series. LIM yields an estimated operator by combining a lagged covariance matrix with the contemporaneous covariance matrix. If the underlying dynamics are linear, the resulting dynamical description should not depend on the particular lag at which the lagged covariance matrix is estimated. This test is known as the "tau test." The tau test will be severely compromised if the lag at which the analysis is performed is approximately half the period of an internal oscillation. In this case, the tau test will fail even though the dynamics are actually linear. Thus, until now, the tau test has only been possible for lags smaller than this "Nyquist lag." In this poster, we investigate the use of Hilbert transforms as a way to avoid the problems associated with Nyquist lags. By augmenting the data with dimensions orthogonal to those spanning the original system, information that would be inaccessible to LIM in its original form may be sampled.
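The lag-independence idea behind the tau test can be sketched numerically. The example assumes the common formulation in which the propagator is G(τ) = C(τ) C(0)⁻¹, so that linear dynamics imply G(2τ) ≈ G(τ)². It uses a synthetic two-variable linear system and does not implement the Hilbert-transform augmentation discussed in the abstract. The function names are hypothetical.

```python
import numpy as np

def lagged_cov(x, lag):
    """C(lag) = <x(t+lag) x(t)^T> for an array of shape (time, variables)."""
    x = x - x.mean(axis=0)
    n = x.shape[0] - lag
    return x[lag:].T @ x[:n] / n

def propagator(x, lag):
    """G(lag) = C(lag) C(0)^{-1} (assumed LIM propagator estimate)."""
    return lagged_cov(x, lag) @ np.linalg.inv(lagged_cov(x, 0))

# Synthetic linear system x[t] = A x[t-1] + noise, used only for illustration.
rng = np.random.default_rng(1)
A = np.array([[0.9, 0.1], [-0.1, 0.8]])
noise = rng.normal(size=(50_000, 2))
x = np.zeros_like(noise)
for t in range(1, len(x)):
    x[t] = A @ x[t - 1] + noise[t]

G1 = propagator(x, 1)
G2 = propagator(x, 2)
# For linear dynamics the lag-2 propagator should match the squared lag-1 propagator.
tau_test_gap = float(np.max(np.abs(G2 - G1 @ G1)))
print(tau_test_gap)
```

A small gap is consistent with linear dynamics. A large gap at some lag would indicate either nonlinearity or, as the abstract notes, a lag close to the Nyquist lag.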
Risk of portfolio with simulated returns based on copula model
NASA Astrophysics Data System (ADS)
Razak, Ruzanna Ab; Ismail, Noriszura
2015-02-01
The commonly used tool for measuring the risk of a portfolio of equally weighted stocks is the variance-covariance method. Under extreme circumstances, this method significantly underestimates actual risk because it assumes that the joint distribution of the stocks is multivariate normal. The purpose of this research is to compare the actual risk of a portfolio with the simulated risk of a portfolio in which the joint distribution of two return series is predetermined. The data used are daily stock prices from the ASEAN market for the period January 2000 to December 2012. The copula approach is applied to capture the time-varying dependence among the return series. The results show that the chosen copula families are not suitable for representing the dependence structure of each pair of returns, except for the Philippines-Thailand pair, for which the t copula appears to be the appropriate choice. Assuming that the t copula is the joint distribution of each paired series, simulated returns are generated and value-at-risk (VaR) is then applied to evaluate the risk of each portfolio consisting of two simulated return series. The VaR estimates were found to be symmetrical because the returns were simulated via an elliptical copula-GARCH approach. By comparison, the actual risks are found to be underestimated for all pairs of portfolios except Philippines-Thailand. This study shows that disregarding the non-normal dependence structure of two series results in underestimation of the actual risk of the portfolio.
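For orientation, the variance-covariance method mentioned at the start of the abstract can be sketched next to a purely empirical VaR. This is a minimal illustration on synthetic Gaussian returns, not the copula-GARCH procedure of the study. The function names, the 99% level, and the covariance values are assumptions.

```python
import numpy as np
from statistics import NormalDist

def parametric_var(returns, weights, level=0.99):
    """Variance-covariance VaR (a positive loss) under joint normality."""
    mu = returns.mean(axis=0) @ weights
    cov = np.cov(returns, rowvar=False)
    sigma = np.sqrt(weights @ cov @ weights)
    z = NormalDist().inv_cdf(level)
    return float(z * sigma - mu)

def historical_var(returns, weights, level=0.99):
    """Empirical VaR: the loss quantile of the realised portfolio returns."""
    portfolio = returns @ weights
    return float(-np.quantile(portfolio, 1.0 - level))

# Synthetic daily returns for two assets (illustrative values only).
rng = np.random.default_rng(5)
cov = np.array([[1.0e-4, 0.5e-4], [0.5e-4, 1.0e-4]])
returns = rng.multivariate_normal([0.0, 0.0], cov, size=20_000)
weights = np.array([0.5, 0.5])  # equally weighted portfolio

var_param = parametric_var(returns, weights)
var_hist = historical_var(returns, weights)
print(var_param, var_hist)
```

When the returns really are jointly normal, the two estimates agree closely. The abstract's point is that with non-normal dependence the parametric figure tends to fall below the actual risk.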
NASA Astrophysics Data System (ADS)
O'Brien, S. J.; Fitzpatrick, P. J.; Dzwonkowski, B.; Dykstra, S. L.; Wallace, D. J.; Church, I.; Wiggert, J. D.
2016-02-01
The Mississippi Sound is influenced by a high volume of sediment discharge from the Biloxi River, Mobile Bay via Pas aux Herons, Pascagoula River, Pearl River, Wolf River, and Lake Pontchartrain through the Rigolets. The river discharge, variable wind speed, wind direction and tides have a significant impact on the turbidity and transport of sediments in the Sound. Level 1 Moderate Resolution Imaging Spectroradiometer (MODIS) data is processed to extract the remote sensing reflectance at the wavelength of 645 nm and binned into an 8-day composite at a resolution of 500 m. The study uses a regional ocean color algorithm to compute suspended particulate matter (SPM) concentration based on these 8-day composite images. Multivariate analysis is applied between the SPM and time series of tides, wind, turbidity and river discharge measured at federal and academic institutions' stations and moorings. The multivariate analysis also includes in situ measurements of suspended sediment concentration and advective exchanges through the Mississippi Sound's tidal inlets between the coastal shelf and the nearshore estuarine waters. Mechanisms underlying the observed spatiotemporal distribution of SPM, including material exchange between the Sound and adjacent shelf waters, will be explored. The results of this study will contribute to current understanding of exchange mechanisms and pathways with the Mississippi Bight via the Mississippi Sound's tidal inlets.
Multivariate missing data in hydrology - Review and applications
NASA Astrophysics Data System (ADS)
Ben Aissia, Mohamed-Aymen; Chebana, Fateh; Ouarda, Taha B. M. J.
2017-12-01
Water resources planning and management require complete data sets of a number of hydrological variables, such as flood peaks and volumes. However, hydrologists are often faced with the problem of missing data (MD) in hydrological databases. Several methods are used to deal with the imputation of MD. During the last decade, multivariate approaches have gained popularity in the field of hydrology, especially in hydrological frequency analysis (HFA). However, treating the MD remains neglected in the multivariate HFA literature whereas the focus has been mainly on the modeling component. For a complete analysis and in order to optimize the use of data, MD should also be treated in the multivariate setting prior to modeling and inference. Imputation of MD in the multivariate hydrological framework can have direct implications on the quality of the estimation. Indeed, the dependence between the series represents important additional information that can be included in the imputation process. The objective of the present paper is to highlight the importance of treating MD in multivariate hydrological frequency analysis by reviewing and applying multivariate imputation methods and by comparing univariate and multivariate imputation methods. An application is carried out for multiple flood attributes on three sites in order to evaluate the performance of the different methods based on the leave-one-out procedure. The results indicate that, the performance of imputation methods can be improved by adopting the multivariate setting, compared to mean substitution and interpolation methods, especially when using the copula-based approach.
NASA Astrophysics Data System (ADS)
Yang, Eun-Su
2001-07-01
A new statistical approach is used to analyze Dobson Umkehr layer-ozone measurements at Arosa for 1979-1996 and Total Ozone Mapping Spectrometer (TOMS) Version 7 zonal mean ozone for 1979-1993, accounting for stratospheric aerosol optical depth (SAOD), quasi-biennial oscillation (QBO), and solar flux effects. A stepwise regression scheme selects statistically significant periodicities caused by season, SAOD, QBO, and solar variations and filters them out. Auto-regressive (AR) terms are included in ozone residuals and time lags are assumed for the residuals of exogenous variables. Then, the magnitudes of responses of ozone to the SAOD, QBO, and solar index (SI) series are derived from the stationary time series of the residuals. These Multivariate Auto-Regressive Combined Harmonics (MARCH) processes possess the following significant advantages: (1) the ozone trends are estimated more precisely than with previous methods; (2) the influences of the exogenous SAOD, QBO, and solar variations are clearly separated at various time lags; (3) the collinearity of the exogenous variables in the regression is significantly reduced; and (4) the probability of obtaining misleading correlations between ozone and exogenous time series is reduced. The MARCH results indicate that the Umkehr ozone response to SAOD (not a real ozone response but rather an optical interference effect), QBO, and solar effects is driven by combined dynamical radiative-chemical processes. These results are independently confirmed using the revised Standard models that include aerosol and solar forcing mechanisms with all possible time lags but not by the Standard model when restricted to a zero time lag in aerosol and solar ozone forcings. As for Dobson Umkehr ozone measurements at Arosa, the aerosol effects are most significant in layers 8, 7, and 6 with no time lag, as is to be expected due to the optical contamination of Umkehr measurements by SAOD. 
The QBO and solar UV effects appear in all layers 4-8, and in total ozone. In order to account for annual modulation of the equatorial winds that affects ozone at midlatitudes, a new QBO proxy is selected and applied to the Dobson Umkehr measurements at Arosa. The QBO proxy turns out to be more effective to filter the modulated ozone signals at midlatitudes than the mostly used QBO proxy, the Singapore winds at 30 mb. A statistically significant negative phase relationship is found between solar UV variation and ozone response, especially in layer 4, implying dynamical effects of solar variations on ozone at midlatitudes. Linear negative trends in ozone of -7.8 +/- 1.1 and -5.2 +/- 1.4 [%/decade +/- 2σ] are calculated in layers 7 (~35 km) and 8 (~40 km), respectively, for the period of 1979-1996, with smaller trends of -2.2 +/- 1.0, 1.8 +/- 0.9, and -1.4 +/- 1.1 in layers 6 (~30 km), 5 (~25 km), and 4 (~20 km), respectively. A trend in total ozone (layers 1 through 10) of -2.9 +/- 1.2 [%/decade +/- 2σ] is found over this same period. The aerosol effects obtained from the TOMS zonal means become significant at midlatitudes. QBO ozone contributes to the TOMS zonal means by +/-2 to 4% of their means. The negative solar ozone responses are also found at midlatitudes from the TOMS measurements. The most negative trends from TOMS zonal means are about -6.3 +/- 0.6%/decade at 40-50°N.
Scientific and Engineering Studies; Spectral Estimation.
1977-01-01
Approved for public release; distribution unlimited. TD 5419 FORTRAN PROGRAM FOR MULTIVARIATE LINEAR PREDICTIVE SPECTRAL ANALYSIS, EMPLOYING FORWARD...Time Series Analysis Symposium, Tulsa, Oklahoma, 14-15 May 1976.
2018-01-01
Many fault detection methods have been proposed for monitoring the health of various industrial systems. Characterizing the monitored signals is a prerequisite for selecting an appropriate detection method. However, fault detection methods tend to be chosen according to the user’s subjective knowledge or familiarity with a method, rather than by following a predefined selection rule. This study investigates the performance sensitivity of two detection methods with respect to the status signal characteristics of given systems: abrupt variance, characteristic indicator, discernable frequency, and discernable index. The relation between key characteristic indicators from four different real-world systems and the performance of two fault detection methods using pattern recognition is evaluated. PMID:29316731
Further summation formulae related to generalized harmonic numbers
NASA Astrophysics Data System (ADS)
Zheng, De-Yin
2007-11-01
By employing the univariate series expansion of classical hypergeometric series formulae, Shen [L.-C. Shen, Remarks on some integrals and series involving the Stirling numbers and ζ(n), Trans. Amer. Math. Soc. 347 (1995) 1391-1399] and Choi and Srivastava [J. Choi, H.M. Srivastava, Certain classes of infinite series, Monatsh. Math. 127 (1999) 15-25; J. Choi, H.M. Srivastava, Explicit evaluation of Euler and related sums, Ramanujan J. 10 (2005) 51-70] investigated the evaluation of infinite series related to generalized harmonic numbers. More summation formulae have systematically been derived by Chu [W. Chu, Hypergeometric series and the Riemann Zeta function, Acta Arith. 82 (1997) 103-118], who fully developed this approach for the multivariate case. The present paper will explore the hypergeometric series method further and establish numerous summation formulae expressing infinite series related to generalized harmonic numbers in terms of the Riemann Zeta function ζ(m) with m=5,6,7, including several known ones as examples.
Understanding Human Motion Skill with Peak Timing Synergy
NASA Astrophysics Data System (ADS)
Ueno, Ken; Furukawa, Koichi
Careful observation of motion phenomena is important for understanding skillful human motion. However, this is a difficult task because of the complexities in timing involved in the skillful control of anatomical structures. To investigate the dexterity of human motion, we decided to concentrate on timing with respect to motion, and we have proposed a method to extract the peak timing synergy from multivariate motion data. The peak timing synergy is defined as a frequent ordered graph with time stamps, whose nodes consist of turning points in motion waveforms. A proposed algorithm, PRESTO, automatically extracts the peak timing synergy. PRESTO comprises the following three processes: (1) detecting peak sequences with polygonal approximation; (2) generating peak-event sequences; and (3) finding frequent peak-event sequences using a sequential pattern mining method, generalized sequential patterns (GSP). Here, we measured right arm motion during the task of cello bowing and prepared a data set of right shoulder and arm motion. Using the PRESTO algorithm, we successfully extracted the peak timing synergy from the cello bowing data set; it consisted of skills common among cellists and personal skill differences. To evaluate the sequential pattern mining algorithm GSP in PRESTO, we compared the peak timing synergy obtained with the GSP algorithm with the one obtained with the filtering by reciprocal voting (FRV) algorithm, a non-time-series method. We found that the support is 95-100% with GSP but 83-96% with FRV, and that the GSP results reproduce human motion better than the FRV results. We therefore show that a sequential pattern mining approach is more effective for extracting the peak timing synergy than a non-time-series analysis approach.
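The notion of support used in step (3) can be illustrated with a minimal sketch. It counts how often an ordered pattern of peak events occurs as a subsequence across trials. It is not the GSP algorithm itself: it ignores time stamps and candidate generation. The event labels and trial data are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the events of `pattern` occur in `sequence` in the same order."""
    remaining = iter(sequence)
    return all(event in remaining for event in pattern)

def support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as an ordered subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)

# Hypothetical peak-event sequences, one per bowing trial.
trials = [
    ["shoulder_max", "elbow_max", "wrist_max"],
    ["shoulder_max", "wrist_max", "elbow_max"],
    ["elbow_max", "shoulder_max", "wrist_max"],
    ["wrist_max", "shoulder_max", "elbow_max"],
]

support_pair = support(["shoulder_max", "wrist_max"], trials)
support_triple = support(["shoulder_max", "elbow_max", "wrist_max"], trials)
print(support_pair, support_triple)  # 0.75 0.25
```

A pattern would count as frequent when its support exceeds a chosen threshold. The 95-100% figures in the abstract are supports in this sense.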
Multivariate studies on the genetics of dermal ridges.
Rostron, J
1977-10-01
In order to investigate the inheritance of the series of ten ridge counts, factor analysis was used to reduce the dimensionality to two. These two factors are inherited more or less independently and the heritability of the first is 0.97.
Exploring connectivity with large-scale Granger causality on resting-state functional MRI.
DSouza, Adora M; Abidin, Anas Z; Leistritz, Lutz; Wismüller, Axel
2017-08-01
Large-scale Granger causality (lsGC) is a recently developed, resting-state functional MRI (fMRI) connectivity analysis approach that estimates multivariate voxel-resolution connectivity. Unlike most commonly used multivariate approaches, which establish coarse-resolution connectivity by aggregating voxel time-series to avoid an underdetermined problem, lsGC estimates voxel-resolution, fine-grained connectivity by incorporating an embedded dimension reduction. We investigate application of lsGC on realistic fMRI simulations, modeling smoothing of neuronal activity by the hemodynamic response function and repetition time (TR), and on empirical resting-state fMRI data. Subsequently, functional subnetworks are extracted from lsGC connectivity measures for both datasets and validated quantitatively. We also provide guidelines to select lsGC free parameters. Results indicate that lsGC reliably recovers underlying network structure with area under receiver operator characteristic curve (AUC) of 0.93 at TR=1.5s for a 10-min session of fMRI simulations. Furthermore, subnetworks of closely interacting modules are recovered from the aforementioned lsGC networks. Results on empirical resting-state fMRI data demonstrate recovery of visual and motor cortex in close agreement with spatial maps obtained from (i) visuo-motor fMRI stimulation task-sequence (Accuracy=0.76) and (ii) independent component analysis (ICA) of resting-state fMRI (Accuracy=0.86). Compared with the conventional Granger causality approach (AUC=0.75), lsGC produces better network recovery on fMRI simulations. Furthermore, conventional Granger causality cannot recover functional subnetworks from empirical fMRI data, since quantifying voxel-resolution connectivity is not possible as a consequence of encountering an underdetermined problem. Functional network recovery from fMRI data suggests that lsGC gives useful insight into connectivity patterns from resting-state fMRI at a multivariate voxel-resolution. Copyright © 2017 Elsevier B.V. All rights reserved.
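As background for the comparison above, conventional pairwise Granger causality can be sketched as a ratio of residual variances from two autoregressive fits. This is not lsGC, which adds an embedded dimension reduction. The model order, function names, and synthetic series are assumptions made for illustration.

```python
import numpy as np

def lagged(x, order):
    """Columns x[t-1], ..., x[t-order] for t = order .. len(x)-1."""
    return np.column_stack([x[order - k: len(x) - k] for k in range(1, order + 1)])

def residual_var(target, regressors):
    """Residual variance of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(target)), regressors])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return float(np.var(target - A @ coef))

def granger_index(source, target, order=2):
    """log(var_restricted / var_full); larger values mean `source` helps predict `target`."""
    y = target[order:]
    own_past = lagged(target, order)
    full = np.column_stack([own_past, lagged(source, order)])
    return float(np.log(residual_var(y, own_past) / residual_var(y, full)))

# Synthetic pair in which x drives y with a one-step delay, but not the reverse.
rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=n)
e = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + e[t]

gc_x_to_y = granger_index(x, y)
gc_y_to_x = granger_index(y, x)
print(gc_x_to_y, gc_y_to_x)
```

With many voxels and short recordings, the full regression has more coefficients than observations. That is the underdetermined problem the abstract refers to.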
Luciani, Lorenzo G; Chiodini, Stefano; Donner, Davide; Cai, Tommaso; Vattovani, Valentino; Tiscione, Daniele; Giusti, Guido; Proietti, Silvia; Chierichetti, Franca; Malossini, Gianni
2016-06-01
To measure the early impact of robot-assisted partial nephrectomy (RAPN) on renal function as assessed by renal scan (Tc 99m-DTPA), addressing the issue of risk factors for ischemic damage to the kidney. All patients undergoing RAPN for cT1 renal masses between June 2013 and May 2014 were included in this prospective study. Renal function as expressed by glomerular filtration rate (GFR) was assessed by Technetium 99m-diethylenetriaminepentaacetic acid (Tc 99m-DTPA) renal scan preoperatively and postoperatively at 1 month in every patient. A multivariable analysis was used for the determination of independent factors predictive of GFR decrease of the operated kidney. Overall, 32 patients underwent RAPN in the time interval. Median tumor size, blood loss, and ischemia time were 4 cm, 200 mL, and 24 min, respectively. Two grade III complications occurred (postoperative bleeding in the renal fossa, urinoma). The GFR of the operated kidney decreased significantly from 51.7 ± 15.1 mL/min per 1.73 m² preoperatively to 40.12 ± 12.4 mL/min per 1.73 m² 1 month postoperatively (p = 0.001), a decrease of 22.4%. On multivariable analysis, only tumor size (p = 0.05) was a predictor of GFR decrease of the operated kidney. Robot-assisted partial nephrectomy had a detectable impact on early renal function in a series of relatively large tumors and prevailing intermediate nephrometric risk. A mean decrease of 22% of GFR as assessed by renal scan in the operated kidney was found at 1 month postoperatively. In multivariable analysis, only tumor size was a significant predictor of renal function loss.
Climate Cycles and Forecasts of Cutaneous Leishmaniasis, a Nonstationary Vector-Borne Disease
Chaves, Luis Fernando; Pascual, Mercedes
2006-01-01
Background Cutaneous leishmaniasis (CL) is one of the main emergent diseases in the Americas. As in other vector-transmitted diseases, its transmission is sensitive to the physical environment, but no study has addressed the nonstationary nature of such relationships or the interannual patterns of cycling of the disease. Methods and Findings We studied monthly data, spanning from 1991 to 2001, of CL incidence in Costa Rica using several approaches for nonstationary time series analysis in order to ensure robustness in the description of CL's cycles. Interannual cycles of the disease and the association of these cycles to climate variables were described using frequency and time-frequency techniques for time series analysis. We fitted linear models to the data using climatic predictors, and tested forecasting accuracy for several intervals of time. Forecasts were evaluated using “out of fit” data (i.e., data not used to fit the models). We showed that CL has cycles of approximately 3 y that are coherent with those of temperature and El Niño Southern Oscillation indices (Sea Surface Temperature 4 and Multivariate ENSO Index). Conclusions Linear models using temperature and MEI can predict satisfactorily CL incidence dynamics up to 12 mo ahead, with an accuracy that varies from 72% to 77% depending on prediction time. They clearly outperform simpler models with no climate predictors, a finding that further supports a dynamical link between the disease and climate. PMID:16903778
NASA Astrophysics Data System (ADS)
Flach, Milan; Mahecha, Miguel; Gans, Fabian; Rodner, Erik; Bodesheim, Paul; Guanche-Garcia, Yanira; Brenning, Alexander; Denzler, Joachim; Reichstein, Markus
2016-04-01
The number of available Earth observations (EOs) is currently increasing substantially. Detecting anomalous patterns in these multivariate time series is an important step in identifying changes in the underlying dynamical system. Likewise, data quality issues might result in anomalous multivariate data constellations and have to be identified before they corrupt subsequent analyses. In industrial applications, a common strategy is to monitor production chains with several sensors coupled to some statistical process control (SPC) algorithm. The basic idea is to raise an alarm when these sensor data depict some anomalous pattern according to the SPC, i.e. the production chain is considered 'out of control'. In fact, the industrial applications are conceptually similar to the on-line monitoring of EOs. However, algorithms used in the context of SPC or process monitoring are rarely considered for supervising multivariate spatio-temporal Earth observations. The objective of this study is to explore the potential and transferability of SPC concepts to Earth system applications. We compare a range of different algorithms typically applied by SPC systems and evaluate their capability to detect, e.g., known extreme events in land surface processes. Specifically, two main issues are addressed: (1) identifying the most suitable combination of data pre-processing and detection algorithm for a specific type of event and (2) analyzing the limits of the individual approaches with respect to the magnitude and spatio-temporal size of the event as well as the data's signal-to-noise ratio. Extensive artificial data sets that represent the typical properties of Earth observations are used in this study. Our results show that the majority of the algorithms used can be considered for the detection of multivariate spatiotemporal events and directly transferred to real Earth observation data as currently assembled in different projects at the European scale, e.g. http://baci-h2020.eu/index.php/ and http://earthsystemdatacube.net/. Known anomalies such as the Russian heatwave are detected, as well as anomalies which are not detectable with univariate methods.
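One classical multivariate SPC statistic is Hotelling's T², shown here as a minimal sketch of how a multivariate chart can flag a point that univariate limits would miss. The abstract does not say which algorithms were compared, so choosing T² is an assumption, as are the synthetic data and the empirical 99.9% control limit.

```python
import numpy as np

def hotelling_t2(reference, data):
    """Squared Mahalanobis distance of each row of `data` from the reference period."""
    mu = reference.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(reference, rowvar=False))
    d = data - mu
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

# Reference period: two strongly correlated variables (synthetic).
rng = np.random.default_rng(2)
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
reference = rng.multivariate_normal([0.0, 0.0], cov, size=2000)

# Control limit taken as an empirical quantile of the reference T2 values (assumed choice).
threshold = float(np.quantile(hotelling_t2(reference, reference), 0.999))

# Candidate point: unremarkable in each variable alone, but it breaks the correlation.
candidate = np.array([[1.5, -1.5]])
t2_candidate = float(hotelling_t2(reference, candidate)[0])
z_scores = np.abs((candidate[0] - reference.mean(axis=0)) / reference.std(axis=0))
print(t2_candidate > threshold, z_scores)
```

The candidate stays within two standard deviations on each axis yet lies far outside the multivariate limit, which is the kind of anomaly the last sentence of the abstract describes.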
Gain-scheduling multivariable LPV control of an irrigation canal system.
Bolea, Yolanda; Puig, Vicenç
2016-07-01
The purpose of this paper is to present a multivariable linear parameter varying (LPV) controller with a gain-scheduling Smith Predictor (SP) scheme applicable to open-flow canal systems. This SP-based LPV controller is designed taking into account the uncertainty in the estimation of the delay and the variation of plant parameters according to the operating point. This new methodology can be applied to a class of delay systems that can be represented by a set of models that can be factorized into a rational multivariable model in series with left/right diagonal (multiple) delays, as is the case for irrigation canals. A multiple-pool canal system is used to test and validate the proposed control approach. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
A Semi-parametric Multivariate Gap-filling Model for Eddy Covariance Latent Heat Flux
NASA Astrophysics Data System (ADS)
Li, M.; Chen, Y.
2010-12-01
Quantitative descriptions of latent heat fluxes are important for studying the water and energy exchanges between terrestrial ecosystems and the atmosphere. The eddy covariance approach has been recognized as the most reliable technique for measuring surface fluxes over time scales ranging from hours to years. However, unfavorable micrometeorological conditions, instrument failures, and applicable measurement limitations inevitably cause flux gaps in time series data. Development and application of suitable gap-filling techniques are crucial for estimating long-term fluxes. In this study, a semi-parametric multivariate gap-filling model was developed to fill latent heat flux gaps in eddy covariance measurements. Our approach combines the advantages of a multivariate statistical analysis (principal component analysis, PCA) and a nonlinear interpolation technique (K-nearest-neighbors, KNN). The PCA method was first used to resolve the multicollinearity among various hydrometeorological factors, such as radiation, soil moisture deficit, LAI, and wind speed. The KNN method was then applied as a nonlinear interpolation tool to estimate the flux gaps as the weighted sum of the latent heat fluxes at the K nearest distances in the PC domain. Two years (2008 and 2009) of eddy covariance and hydrometeorological data from a subtropical mixed evergreen forest (the Lien-Hua-Chih Site) were collected to calibrate and validate the proposed approach with artificial gaps after standard QC/QA procedures. The optimal K values and weighting factors were determined by the maximum likelihood test. The gap-filled latent heat fluxes show that the developed model successfully preserves energy balances at daily, monthly, and yearly time scales. Annual evapotranspiration from the study forest was 747 mm and 708 mm for 2008 and 2009, respectively. Nocturnal evapotranspiration was estimated with filled gaps and the results are comparable with other studies. Seasonal and daily variability of latent heat fluxes is also discussed.
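The PCA-then-KNN procedure described above can be sketched in a few lines. This is an illustration only. It uses inverse-distance weights and a fixed K rather than the maximum-likelihood selection mentioned in the abstract, and the driver variables and fluxes are synthetic. The function names are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardise the drivers and project them onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[:n_components].T

def knn_fill(scores, flux, k=5):
    """Fill NaNs with an inverse-distance-weighted mean of the k nearest observed fluxes."""
    filled = flux.copy()
    known = ~np.isnan(flux)
    for i in np.flatnonzero(~known):
        dist = np.linalg.norm(scores[known] - scores[i], axis=1)
        nearest = np.argsort(dist)[:k]
        w = 1.0 / (dist[nearest] + 1e-12)  # assumed weighting scheme
        filled[i] = np.sum(w * flux[known][nearest]) / np.sum(w)
    return filled

# Synthetic drivers; the second column is nearly collinear with the first.
rng = np.random.default_rng(3)
n = 500
radiation = rng.normal(size=n)
X = np.column_stack([
    radiation,
    radiation + 0.05 * rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])
truth = 2.0 * X[:, 0] + X[:, 2] + 0.1 * rng.normal(size=n)

# Artificial gaps, as in the validation described in the abstract.
flux = truth.copy()
gaps = rng.choice(n, size=50, replace=False)
flux[gaps] = np.nan

filled = knn_fill(pca_scores(X, 3), flux, k=5)
rmse = float(np.sqrt(np.mean((filled[gaps] - truth[gaps]) ** 2)))
print(rmse)
```

Projecting onto the leading components removes the redundant direction created by the collinear drivers before distances are computed, so that direction is not double-counted in the neighbor search.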
Heymann, S J; Toomey, S; Furstenberg, F
1999-08-01
A series of studies has demonstrated that sick children fare better when their parents are present. To examine working conditions that determine whether parents can spend time with and become involved in the care of their children when they are sick. Survey with a multivariate analysis of factors influencing parental care of sick children. Mixed-income urban working parents aged 26 to 29 years participating in the Baltimore Parenthood Study. Only 42% of working parents in our sample cared for their young children when they became sick. A multivariate logistic regression analysis was conducted to predict which parents stayed at home when their children were sick. Those parents who had either paid sick or vacation leave were 5.2 times as likely to care for their children themselves when they were sick. Of parents with less than a high school education, 17% received paid leave, compared with 57% of parents with a general equivalency diploma, 76% of parents with a high school diploma, and 92% of parents with more than a high school education (P<.001). The finding that many parents were unable to care for their sick children themselves is important for pediatric care. While low-income children are more likely to face marked health problems and to be in need of parental care, they are more likely to live in households in which parents lack paid leave and cannot afford to take unpaid leave.
Modarres, Reza; Ouarda, Taha B M J; Vanasse, Alain; Orzanco, Maria Gabriela; Gosselin, Pierre
2014-07-01
Changes in extreme meteorological variables and the demographic shift towards an older population have made it important to investigate the association of climate variables and hip fracture by advanced methods in order to determine the climate variables that most affect hip fracture incidence. The nonlinear autoregressive moving average with exogenous variable-generalized autoregressive conditional heteroscedasticity (ARMAX-GARCH) and multivariate GARCH (MGARCH) time series approaches were applied to investigate the nonlinear association between hip fracture rate in female and male patients aged 40-74 and 75+ years and climate variables in the period of 1993-2004, in Montreal, Canada. The models describe 50-56% of daily variation in hip fracture rate and identify snow depth, air temperature, day length and air pressure as the influencing variables on the time-varying mean and variance of the hip fracture rate. The conditional covariance between climate variables and hip fracture rate is increasing exponentially, showing that the effect of climate variables on hip fracture rate is most acute when rates are high and climate conditions are at their worst. In Montreal, climate variables, particularly snow depth and air temperature, appear to be important predictors of hip fracture incidence. The association of climate variables and hip fracture does not seem to change linearly with time, but increases exponentially under harsh climate conditions. The results of this study can be used to provide an adaptive climate-related public health program and to guide allocation of services for avoiding hip fracture risk.
Modeling rainfall-runoff relationship using multivariate GARCH model
NASA Astrophysics Data System (ADS)
Modarres, R.; Ouarda, T. B. M. J.
2013-08-01
The traditional hydrologic time series approaches are used for modeling, simulating and forecasting the conditional mean of hydrologic variables but neglect their time-varying variance, or second-order moment. This paper introduces the multivariate Generalized Autoregressive Conditional Heteroscedasticity (MGARCH) modeling approach to show how the variance-covariance relationship between hydrologic variables varies in time. These approaches are also useful for estimating the dynamic conditional correlation between hydrologic variables. To illustrate the novelty and usefulness of MGARCH models in hydrology, two major types of MGARCH models, the bivariate diagonal VECH and constant conditional correlation (CCC) models, are applied to show the variance-covariance structure and dynamic correlation in a rainfall-runoff process. The bivariate diagonal VECH-GARCH(1,1) and CCC-GARCH(1,1) models indicated both short-run and long-run persistency in the conditional variance-covariance matrix of the rainfall-runoff process. The conditional variance of rainfall appears to have a stronger persistency, especially long-run persistency, than the conditional variance of streamflow, which shows a short-lived drastic increasing pattern and a stronger short-run persistency. The conditional covariance and conditional correlation coefficients have different features for each bivariate rainfall-runoff process with different degrees of stationarity and dynamic nonlinearity. The spatial and temporal pattern of variance-covariance features may reflect the signature of different physical and hydrological variables such as drainage area, topography, soil moisture and ground water fluctuations on the strength, stationarity and nonlinearity of the conditional variance-covariance for a rainfall-runoff process.
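For reference, the two model families named above are usually written as follows in their textbook (1,1) forms. The notation is an assumption, not taken from the paper: $\varepsilon_{i,t}$ are the residuals of series $i$ (rainfall or runoff), and $h_{ij,t}$ are the elements of the conditional variance-covariance matrix.

```latex
% Diagonal VECH-GARCH(1,1): every element follows its own recursion
h_{ij,t} = \omega_{ij} + \alpha_{ij}\,\varepsilon_{i,t-1}\,\varepsilon_{j,t-1}
           + \beta_{ij}\,h_{ij,t-1}, \qquad i,j \in \{1,2\}

% CCC-GARCH(1,1): univariate GARCH variances with a constant correlation \rho_{12}
h_{ii,t} = \omega_{i} + \alpha_{i}\,\varepsilon_{i,t-1}^{2} + \beta_{i}\,h_{ii,t-1},
\qquad
h_{12,t} = \rho_{12}\,\sqrt{h_{11,t}\,h_{22,t}}
```

On the usual reading, the $\alpha$ coefficients measure short-run persistence (the response to the latest shock) and the $\beta$ coefficients measure long-run persistence (the carry-over of past variance). This appears to be the sense in which the abstract contrasts rainfall and streamflow.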
NASA Astrophysics Data System (ADS)
Modarres, Reza; Ouarda, Taha B. M. J.; Vanasse, Alain; Orzanco, Maria Gabriela; Gosselin, Pierre
2014-07-01
Changes in extreme meteorological variables and the demographic shift towards an older population have made it important to investigate the association of climate variables and hip fracture by advanced methods in order to determine the climate variables that most affect hip fracture incidence. The nonlinear autoregressive moving average with exogenous variable-generalized autoregressive conditional heteroscedasticity (ARMAX-GARCH) and multivariate GARCH (MGARCH) time series approaches were applied to investigate the nonlinear association between hip fracture rate in female and male patients aged 40-74 and 75+ years and climate variables in the period of 1993-2004, in Montreal, Canada. The models describe 50-56% of daily variation in hip fracture rate and identify snow depth, air temperature, day length and air pressure as the influencing variables on the time-varying mean and variance of the hip fracture rate. The conditional covariance between climate variables and hip fracture rate is increasing exponentially, showing that the effect of climate variables on hip fracture rate is most acute when rates are high and climate conditions are at their worst. In Montreal, climate variables, particularly snow depth and air temperature, appear to be important predictors of hip fracture incidence. The association of climate variables and hip fracture does not seem to change linearly with time, but increases exponentially under harsh climate conditions. The results of this study can be used to provide an adaptive climate-related public health program and to guide allocation of services for avoiding hip fracture risk.
Exploratory Long-Range Models to Estimate Summer Climate Variability over Southern Africa.
NASA Astrophysics Data System (ADS)
Jury, Mark R.; Mulenga, Henry M.; Mason, Simon J.
1999-07-01
Teleconnection predictors are explored using multivariate regression models in an effort to estimate southern African summer rainfall and climate impacts one season in advance. The preliminary statistical formulations include many variables influenced by the El Niño-Southern Oscillation (ENSO) such as tropical sea surface temperatures (SST) in the Indian and Atlantic Oceans. Atmospheric circulation responses to ENSO include the alternation of tropical zonal winds over Africa and changes in convective activity within oceanic monsoon troughs. Numerous hemispheric-scale datasets are employed to extract predictors and include global indexes (Southern Oscillation index and quasi-biennial oscillation), SST principal component scores for the global oceans, indexes of tropical convection (outgoing longwave radiation), air pressure, and surface and upper winds over the Indian and Atlantic Oceans. Climatic targets include subseasonal, area-averaged rainfall over South Africa and the Zambezi river basin, and South Africa's annual maize yield. Predictors and targets overlap in the years 1971-93, the defined training period. Each target time series is fitted by an optimum group of predictors from the preceding spring, in a linear multivariate formulation. To limit artificial skill, predictors are restricted to three, providing 17 degrees of freedom. Models with collinear predictors are screened out, and persistence of the target time series is considered. The late summer rainfall models achieve a mean r2 fit of 72%, contributed largely through ENSO modulation. Early summer rainfall cross validation correlations are lower (61%). A conceptual understanding of the climate dynamics and ocean-atmosphere coupling processes inherent in the exploratory models is outlined. Seasonal outlooks based on the exploratory models could help mitigate the impacts of southern Africa's fluctuating climate.
It is believed that an advance warning of drought risk and seasonal rainfall prospects will improve the economic growth potential of southern Africa and provide additional security for food and water supplies.
NASA Astrophysics Data System (ADS)
Kousari, Mohammad Reza; Hosseini, Mitra Esmaeilzadeh; Ahani, Hossein; Hakimelahi, Hemila
2017-01-01
An effective drought forecast offers considerable advantages for managing the water resources used in agriculture, industry, and household consumption. To introduce such a model using simple data inputs, this study presents a regional drought forecast method for Fars Province, Iran, based on artificial intelligence capabilities (artificial neural networks) and the Standardized Precipitation Index (SPI in 3-, 6-, 9-, 12-, 18-, and 24-month series). Precipitation data from 41 rain gauge stations were used to compute SPI values. In addition, weather signals including the Multivariate ENSO Index (MEI), North Atlantic Oscillation (NAO), Southern Oscillation Index (SOI), NINO1+2, anomaly NINO1+2, NINO3, anomaly NINO3, NINO4, anomaly NINO4, NINO3.4, and anomaly NINO3.4 were used as predictor variables to forecast the SPI time series for the next 12 months. Repeated testing and validation steps were carried out to obtain the best artificial neural network (ANN) models. The forecasted values were mapped in the verification stage and then compared with the observed maps for the same dates. Results showed considerable spatial and temporal relationships even among the maps of different SPI time series. Also, the forecasted maps for the first 6 months showed an average agreement of 73% with the observed ones. The most important finding and the strong point of this study was that, although the drought forecast for each station and time series was completely independent, the relationships between spatial and temporal predictions remained. This strong point is mainly attributed to the repeated testing and validation steps used to select the best drought forecast models from the many ANN models produced. Finally, the presented method can be applied in practice wherever precipitation data are available.
Generalized multiplicative error models: Asymptotic inference and empirical analysis
NASA Astrophysics Data System (ADS)
Li, Qian
This dissertation consists of two parts. The first part focuses on extended Multiplicative Error Models (MEM) that include two extreme cases for nonnegative series. These extreme cases are common phenomena in high-frequency financial time series. The Location MEM(p,q) model incorporates a location parameter so that the series are required to have positive lower bounds. The estimator for the location parameter turns out to be the minimum of all the observations and is shown to be consistent. The second case captures the nontrivial fraction of zero outcomes in a series and combines a so-called Zero-Augmented general F distribution with linear MEM(p,q). Under certain strict stationarity and moment conditions, we establish the consistency and asymptotic normality of the semiparametric estimation for these two new models. The second part of this dissertation examines the differences and similarities between trades in the home market and trades in the foreign market of cross-listed stocks. We exploit the multiplicative framework to model trading duration, volume per trade and price volatility for Canadian shares that are cross-listed in the New York Stock Exchange (NYSE) and the Toronto Stock Exchange (TSX). We explore the clustering effect, interaction between trading variables, and the time needed for price equilibrium after a perturbation for each market. The clustering effect is studied through the use of univariate MEM(1,1) on each variable, while the interactions among duration, volume and price volatility are captured by a multivariate system of MEM(p,q). After estimating these models by a standard QMLE procedure, we exploit the Impulse Response function to compute the calendar time for a perturbation in these variables to be absorbed into price variance, and use common statistical tests to identify the difference between the two markets in each aspect. These differences are of considerable interest to traders, stock exchanges and policy makers.
Generating synthetic wave climates for coastal modelling: a linear mixed modelling approach
NASA Astrophysics Data System (ADS)
Thomas, C.; Lark, R. M.
2013-12-01
Numerical coastline morphological evolution models require wave climate properties to drive morphological change through time. Wave climate properties (typically wave height, period and direction) may be temporally fixed, culled from real wave buoy data, or allowed to vary in some way defined by a Gaussian or other pdf. However, to examine sensitivity of coastline morphologies to wave climate change, it seems desirable to be able to modify wave climate time series from a current to some new state along a trajectory, but in a way consistent with, or initially conditioned by, the properties of existing data, or to generate fully synthetic data sets with realistic time series properties. For example, mean or significant wave height time series may have underlying periodicities, as revealed in numerous analyses of wave data. Our motivation is to develop a simple methodology to generate synthetic wave climate time series that can change in some stochastic way through time. We wish to use such time series in a coastline evolution model to test sensitivities of coastal landforms to changes in wave climate over decadal and centennial scales. We have worked initially on time series of significant wave height, based on data from a Waverider III buoy located off the coast of Yorkshire, England. The statistical framework for the simulation is the linear mixed model. The target variable, perhaps after transformation (Box-Cox), is modelled as a multivariate Gaussian, the mean modelled as a function of a fixed effect, and two random components, one of which is independently and identically distributed (iid) and the second of which is temporally correlated. The model was fitted to the data by likelihood methods. We considered the option of a periodic mean, the period either fixed (e.g. at 12 months) or estimated from the data. We considered two possible correlation structures for the second random effect. In one the correlation decays exponentially with time. 
In the second (spherical) model, it cuts off at a temporal range. Having fitted the model, multiple realisations were generated; the random effects were simulated by specifying a covariance matrix for the simulated values, with the estimated parameters. The Cholesky factorisation of the covariance matrix was computed and realisations of the random component of the model generated by pre-multiplying a vector of iid standard Gaussian variables by the lower triangular factor. The resulting random variate was added to the mean value computed from the fixed effects, and the result back-transformed to the original scale of the measurement. Realistic simulations result from the approach described above. Background exploratory data analysis was undertaken on 20-day sets of 30-minute buoy data, selected from days 5-24 of months January, April, July, October, 2011, to elucidate daily to weekly variations, and to keep numerical analysis tractable computationally. Work remains to be undertaken to develop suitable models for synthetic directional data. We suggest that the general principles of the method will have applications in other geomorphological modelling endeavours requiring time series of stochastically variable environmental parameters.
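The simulation step described in this abstract (build a covariance matrix, take its Cholesky factor, pre-multiply iid standard Gaussian draws, add the fixed-effect mean) can be sketched roughly as follows. This is a minimal sketch using only the exponential correlation option; the variance, decay, nugget and periodic-mean values are hypothetical, the helper names are invented here, and the Box-Cox back-transformation is left out.

```python
import math
import random

def exponential_covariance(n, variance, decay, nugget):
    """c(i, j) = variance * exp(-|i - j| / decay), plus an iid nugget on the diagonal."""
    return [[variance * math.exp(-abs(i - j) / decay) + (nugget if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def cholesky_lower(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def simulate(mean, cov, rng):
    """One realisation: mean + L z, where z is a vector of iid standard Gaussians."""
    low = cholesky_lower(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]

rng = random.Random(0)
n = 48
# Hypothetical fixed effect: a mean with a 12-step period.
mean = [1.5 + 0.3 * math.sin(2.0 * math.pi * t / 12.0) for t in range(n)]
cov = exponential_covariance(n, variance=0.2, decay=5.0, nugget=0.05)
series = simulate(mean, cov, rng)
```

Calling `simulate` repeatedly with the same `mean` and `cov` gives multiple realisations; for long series a library Cholesky routine would normally replace the hand-written one.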
Linking the Weather Generator with Regional Climate Model
NASA Astrophysics Data System (ADS)
Dubrovsky, Martin; Farda, Ales; Skalak, Petr; Huth, Radan
2013-04-01
One of the downscaling approaches, which transform the raw outputs from the climate models (GCMs or RCMs) into data with more realistic structure, is based on linking the stochastic weather generator with the climate model output. The present contribution, in which the parametric daily surface weather generator (WG) M&Rfi is linked to the RCM output, follows two aims: (1) Validation of the new simulations of the present climate (1961-1990) made by the ALADIN-Climate Regional Climate Model at 25 km resolution. The WG parameters are derived from the RCM-simulated surface weather series and compared to those derived from weather series observed in 125 Czech meteorological stations. The set of WG parameters will include statistics of the surface temperature and precipitation series (including probability of wet day occurrence). (2) Presenting a methodology for linking the WG with RCM output. This methodology, which is based on merging information from observations and RCM, may be interpreted as a downscaling procedure, whose product is a gridded WG capable of producing realistic synthetic multivariate weather series for weather-ungauged locations. In this procedure, WG is calibrated with RCM-simulated multi-variate weather series in the first step, and the grid specific WG parameters are then de-biased by spatially interpolated correction factors based on comparison of WG parameters calibrated with gridded RCM weather series and spatially scarcer observations. The quality of the weather series produced by the resultant gridded WG will be assessed in terms of selected climatic characteristics (focusing on characteristics related to variability and extremes of surface temperature and precipitation). 
Acknowledgements: The present experiment is made within the frame of projects ALARO-Climate (project P209/11/2405 sponsored by the Czech Science Foundation), WG4VALUE (project LD12029 sponsored by the Ministry of Education, Youth and Sports of CR) and VALUE (COST ES 1102 action).
Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo
2018-01-01
This article presents and investigates performance of a series of robust multivariate nonparametric tests for detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic) with both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555
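The permutation approach to p-values mentioned in this abstract can be illustrated with a small sketch. It assumes the simplest variant only: an unscaled statistic built from component-wise medians, with group labels reshuffled at random. The function names and simulated data are hypothetical, and the Hodges-Lehmann and extended U-statistic versions studied in the article are not implemented.

```python
import random
from statistics import median

def location_stat(x, y):
    """Squared Euclidean distance between component-wise medians of two samples."""
    dims = range(len(x[0]))
    mx = [median(row[d] for row in x) for d in dims]
    my = [median(row[d] for row in y) for d in dims]
    return sum((a - b) ** 2 for a, b in zip(mx, my))

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for a location shift between two multivariate samples."""
    rng = random.Random(seed)
    observed = location_stat(x, y)
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        if location_stat(pooled[:len(x)], pooled[len(x):]) >= observed:
            count += 1
    # Adding one to numerator and denominator keeps the p-value above zero.
    return (count + 1) / (n_perm + 1)

# Hypothetical bivariate outcomes for a control and an intervention group.
data_rng = random.Random(1)
control = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(20)]
treated = [(data_rng.gauss(1, 1), data_rng.gauss(1, 1)) for _ in range(20)]
p_value = permutation_pvalue(control, treated)
```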
How does spatial extent of fMRI datasets affect independent component analysis decomposition?
Aragri, Adriana; Scarabino, Tommaso; Seifritz, Erich; Comani, Silvia; Cirillo, Sossio; Tedeschi, Gioacchino; Esposito, Fabrizio; Di Salle, Francesco
2006-09-01
Spatial independent component analysis (sICA) of functional magnetic resonance imaging (fMRI) time series can generate meaningful activation maps and associated descriptive signals, which are useful to evaluate datasets of the entire brain or selected portions of it. Besides computational implications, variations in the input dataset combined with the multivariate nature of ICA may lead to different spatial or temporal readouts of brain activation phenomena. By reducing and increasing a volume of interest (VOI), we applied sICA to different datasets from real activation experiments with multislice acquisition and single or multiple sensory-motor task-induced blood oxygenation level-dependent (BOLD) signal sources with different spatial and temporal structure. Using receiver operating characteristics (ROC) methodology for accuracy evaluation and multiple regression analysis as benchmark, we compared sICA decompositions of reduced and increased VOI fMRI time-series containing auditory, motor and hemifield visual activation occurring separately or simultaneously in time. Both approaches yielded valid results; however, the results of the increased VOI approach were spatially more accurate compared to the results of the decreased VOI approach. This is consistent with the capability of sICA to take advantage of extended samples of statistical observations and suggests that sICA is more powerful with extended rather than reduced VOI datasets to delineate brain activity. (c) 2006 Wiley-Liss, Inc.
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. 
Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
Improvement on Exoplanet Detection Methods and Analysis via Gaussian Process Fitting Techniques
NASA Astrophysics Data System (ADS)
Van Ross, Bryce; Teske, Johanna
2018-01-01
Planetary signals in radial velocity (RV) data are often accompanied by signals coming solely from stellar photo- or chromospheric variation. Such variation can reduce the precision of planet detection and mass measurements, and cause misidentification of planetary signals. Recently, several authors have demonstrated the utility of Gaussian Process (GP) regression for disentangling planetary signals in RV observations (Aigrain et al. 2012; Angus et al. 2017; Czekala et al. 2017; Faria et al. 2016; Gregory 2015; Haywood et al. 2014; Rajpaul et al. 2015; Foreman-Mackey et al. 2017). GP models the covariance of multivariate data to make predictions about likely underlying trends in the data, which can be applied to regions where there are no existing observations. The potency of GP has been used to infer stellar rotation periods; to model and disentangle time series spectra; and to determine physical aspects, populations, and detection of exoplanets, among other astrophysical applications. Here, we implement similar analysis techniques on time series of the Ca II H and K activity indicator measured simultaneously with RVs in a small sample of stars from the large Keck/HIRES RV planet search program. Our goal is to characterize the pattern(s) of non-planetary variation so that we can tell what is and is not a planetary signal. We investigated ten different GP kernels and their respective hyperparameters to determine the optimal combination (e.g., the lowest Bayesian Information Criterion value) in each stellar data set. To assess the hyperparameters’ error, we sampled their posterior distributions using Markov chain Monte Carlo (MCMC) analysis on the optimized kernels. Our results demonstrate how GP analysis of stellar activity indicators alone can contribute to exoplanet detection in RV data, and highlight the challenges in applying GP analysis to relatively small, irregularly sampled time series.
Detrended fluctuation analysis made flexible to detect range of cross-correlated fluctuations
NASA Astrophysics Data System (ADS)
Kwapień, Jarosław; Oświecimka, Paweł; Drożdż, Stanisław
2015-11-01
The detrended cross-correlation coefficient ρDCCA has recently been proposed to quantify the strength of cross-correlations on different temporal scales in bivariate, nonstationary time series. It is based on the detrended cross-correlation and detrended fluctuation analyses (DCCA and DFA, respectively) and can be viewed as an analog of the Pearson coefficient in the case of the fluctuation analysis. The coefficient ρDCCA works well in many practical situations but by construction its applicability is limited to detection of whether two signals are generally cross-correlated, without the possibility to obtain information on the amplitude of fluctuations that are responsible for those cross-correlations. In order to introduce some related flexibility, here we propose an extension of ρDCCA that exploits the multifractal versions of DFA and DCCA: multifractal detrended fluctuation analysis and multifractal detrended cross-correlation analysis, respectively. The resulting new coefficient ρq not only is able to quantify the strength of correlations but also allows one to identify the range of detrended fluctuation amplitudes that are correlated in two signals under study. We show how the coefficient ρq works in practical situations by applying it to stochastic time series representing processes with long memory: autoregressive and multiplicative ones. Such processes are often used to model signals recorded from complex systems and complex physical phenomena like turbulence, so we are convinced that this new measure can successfully be applied in time-series analysis. In particular, we present an example of such application to highly complex empirical data from financial markets. The present formulation can straightforwardly be extended to multivariate data in terms of the q-dependent counterpart of the correlation matrices and then to the network representation.
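For orientation, the baseline coefficient ρDCCA that this abstract starts from can be sketched as below. The sketch makes several assumptions of its own: non-overlapping boxes, linear detrending inside each box, and a single scale per call. It does not implement the q-dependent extension ρq proposed in the paper, and the function names and test signals are hypothetical.

```python
import math
import random

def profile(x):
    """Cumulative sum of the mean-subtracted series (the integrated profile)."""
    m = sum(x) / len(x)
    out, total = [], 0.0
    for v in x:
        total += v - m
        out.append(total)
    return out

def detrended(segment):
    """Residuals of a least-squares straight-line fit to one box."""
    n = len(segment)
    t_mean = (n - 1) / 2.0
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(segment)]

def rho_dcca(x, y, scale):
    """rho_DCCA = F2_xy / sqrt(F2_xx * F2_yy) at one box size (scale >= 3)."""
    px, py = profile(x), profile(y)
    fxy = fxx = fyy = 0.0
    for start in range(0, len(px) - scale + 1, scale):
        rx = detrended(px[start:start + scale])
        ry = detrended(py[start:start + scale])
        fxy += sum(a * b for a, b in zip(rx, ry))
        fxx += sum(a * a for a in rx)
        fyy += sum(b * b for b in ry)
    # Box count and box length are shared by all three sums, so they cancel.
    return fxy / math.sqrt(fxx * fyy)

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [v + rng.gauss(0, 1) for v in x]  # partially correlated with x
rho_by_scale = {s: rho_dcca(x, y, s) for s in (5, 10, 20, 40)}
```

Evaluating the coefficient over a range of box sizes, as in `rho_by_scale`, is what gives the scale-dependent picture of cross-correlation that the abstract refers to.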
Understanding Information Flow Interaction along Separable Causal Paths in Environmental Signals
NASA Astrophysics Data System (ADS)
Jiang, P.; Kumar, P.
2017-12-01
Multivariate environmental signals reflect the outcome of complex inter-dependencies, such as those in ecohydrologic systems. Transfer entropy and information partitioning approaches have been used to characterize such dependencies. However, these approaches capture net information flow occurring through a multitude of pathways involved in the interaction and as a result mask our ability to discern the causal interaction within an interested subsystem through specific pathways. We build on recent developments of momentary information transfer along causal paths proposed by Runge [2015] to develop a framework for quantifying information decomposition along separable causal paths. Momentary information transfer along causal paths captures the amount of information flow between any two variables lagged at two specific points in time. Our approach expands this concept to characterize the causal interaction in terms of synergistic, unique and redundant information flow through separable causal paths. Multivariate analysis using this novel approach reveals precise understanding of causality and feedback. We illustrate our approach with synthetic and observed time series data. We believe the proposed framework helps better delineate the internal structure of complex systems in geoscience where huge amounts of observational datasets exist, and it will also help the modeling community by providing a new way to look at the complexity of real and modeled systems. Runge, Jakob. "Quantifying information transfer and mediation along causal pathways in complex systems." Physical Review E 92.6 (2015): 062829.
Online analysis: Deeper insights into water quality dynamics in spring water.
Page, Rebecca M; Besmer, Michael D; Epting, Jannis; Sigrist, Jürg A; Hammes, Frederik; Huggenberger, Peter
2017-12-01
We have studied the dynamics of water quality in three karst springs taking advantage of new technological developments that enable high-resolution measurements of bacterial load (total cell concentration: TCC) as well as online measurements of abiotic parameters. We developed a novel data analysis approach, using self-organizing maps and non-linear projection methods, to approximate the TCC dynamics using the multivariate data sets of abiotic parameter time-series, thus providing a method that could be implemented in an online water quality management system for water suppliers. The TCC data, obtained over several months, provided a good basis to study the microbiological dynamics in detail. Alongside the TCC measurements, online abiotic parameter time-series, including spring discharge, turbidity, spectral absorption coefficient at 254 nm (SAC254) and electrical conductivity, were obtained. High-density sampling over an extended period of time, i.e. every 45 min for 3 months, allowed a detailed analysis of the dynamics in karst spring water quality. Substantial increases in both the TCC and the abiotic parameters followed precipitation events in the catchment area. Differences between the parameter fluctuations were only apparent when analyzed at a high temporal scale. Spring discharge was always the first to react to precipitation events in the catchment area. Lag times between the onset of precipitation and a change in discharge varied between 0.2 and 6.7 h, depending on the spring and event. TCC mostly reacted second or approximately concurrently with turbidity and SAC254, whereby the fastest observed reaction in the TCC time series occurred after 2.3 h. The methodological approach described here enables a better understanding of bacterial dynamics in karst springs, which can be used to estimate risks and management options to avoid contamination of the drinking water. Copyright © 2017 Elsevier B.V. All rights reserved.
Scaling symmetry, renormalization, and time series modeling: the case of financial assets dynamics.
Zamparo, Marco; Baldovin, Fulvio; Caraglio, Michele; Stella, Attilio L
2013-12-01
We present and discuss a stochastic model of financial assets dynamics based on the idea of an inverse renormalization group strategy. With this strategy we construct the multivariate distributions of elementary returns based on the scaling with time of the probability density of their aggregates. In its simplest version the model is the product of an endogenous autoregressive component and a random rescaling factor designed to embody also exogenous influences. Mathematical properties like increments' stationarity and ergodicity can be proven. Thanks to the relatively low number of parameters, model calibration can be conveniently based on a method of moments, as exemplified in the case of historical data of the S&P500 index. The calibrated model accounts very well for many stylized facts, like volatility clustering, power-law decay of the volatility autocorrelation function, and multiscaling with time of the aggregated return distribution. In agreement with empirical evidence in finance, the dynamics is not invariant under time reversal, and, with suitable generalizations, skewness of the return distribution and leverage effects can be included. The analytical tractability of the model opens interesting perspectives for applications, for instance, in terms of obtaining closed formulas for derivative pricing. Further important features are the possibility of making contact, in certain limits, with autoregressive models widely used in finance and the possibility of partially resolving the long- and short-memory components of the volatility, with consistent results when applied to historical series.
The influence of labor market changes on first-time medical school applicant pools.
Cort, David A; Morrison, Emory
2014-12-01
To explore whether the number and composition of first-time applicants to U.S. MD-granting medical schools, which have fluctuated over the past 30 years, are related to changes in labor market strength, specifically the unemployment rate and wages. The authors merged time series data from 1980 through 2010 (inclusive) from five sources and used multivariate time series models to determine whether changes in labor market strength (and several other macro-level factors) were related to the number of the medical school applicants as reported by the American Medical College Application Service. Analyses were replicated across specific sex and race/ethnicity applicant pools. Two results surfaced in the analyses. First, the strength of the labor market was not influential in explaining changes in applicant pool sizes for all applicants, but was strongly influential in explaining changes for black and Hispanic males. Increases of $1,000 in prevailing median wages produced a 1.6% decrease in the white male applicant pool, while 1% increases in the unemployment rate were associated with 4.5% and 3.1% increases in, respectively, the black and Hispanic male applicant pools. Second, labor market strength was a more important determinant in applications from males than in applications from females. Although stakeholders cannot directly influence the overall economic market, they can plan and prepare for fewer applications from males, especially those who are black and Hispanic, when the labor market is strong.
Shuttle Data Center File-Processing Tool in Java
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Miller, Walter H.
2006-01-01
A Java-language computer program has been written to facilitate mining of data in files in the Shuttle Data Center (SDC) archives. This program can be executed on a variety of workstations or via Web-browser programs. This program is partly similar to prior C-language programs used for the same purpose, while differing from those programs in that it exploits the platform-neutrality of Java in implementing several features that are important for analysis of large sets of time-series data. The program supports regular expression queries of SDC archive files, reads the files, interleaves the time-stamped samples according to a chosen output, then transforms the results into that format. A user can choose among a variety of output file formats that are useful for diverse purposes, including plotting, Markov modeling, multivariate density estimation, and wavelet multiresolution analysis, as well as for playback of data in support of simulation and testing.
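The interleaving of time-stamped samples mentioned in this abstract can be illustrated with a short sketch. It is written in Python rather than Java, and it is not the SDC tool's code: the record layout `(timestamp, channel, value)`, the channel names and the sample values are all hypothetical.

```python
import heapq

def interleave(*streams):
    """Merge several time-sorted streams of (timestamp, channel, value) records.

    Each input must already be sorted by timestamp; heapq.merge then yields
    one combined stream in timestamp order without sorting everything again.
    """
    return list(heapq.merge(*streams, key=lambda record: record[0]))

# Hypothetical samples from two channels with different sampling times.
pressure = [(0.0, "P", 14.7), (2.0, "P", 14.6)]
temperature = [(1.0, "T", 70.1), (2.0, "T", 70.3), (3.0, "T", 70.2)]

merged = interleave(pressure, temperature)
for timestamp, channel, value in merged:
    print(f"{timestamp:6.1f}  {channel}  {value}")
```

A lazy merge of this kind suits large archives because it holds only one pending record per input stream; records with equal timestamps come out in the order the streams were passed in.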
Ichikawa, Nobuki; Homma, Shigenori; Yoshida, Tadashi; Ohno, Yosuke; Kawamura, Hideki; Wakizaka, Kazuki; Nakanishi, Kazuaki; Kazui, Keizo; Iijima, Hiroaki; Shomura, Hiroki; Funakoshi, Tohru; Nakano, Shiro; Taketomi, Akinobu
2017-12-01
We retrospectively assessed the efficacy of our mentor tutoring system for teaching laparoscopic colorectal surgical skills in a general hospital. A series of 55 laparoscopic colectomies performed by 1 trainee were evaluated. Next, the learning curves for high anterior resection performed by the trainee (n=20) were compared with those of a self-trained surgeon (n=19). Cumulative sum analysis and multivariate regression analyses showed that 38 completed cases were needed to reduce the operative time. In high anterior resection, the mean operative times were significantly shorter after the seventh average for the tutored surgeon compared with that for the self-trained surgeon. In cumulative sum charting, the curve reached a plateau by the seventh case for the tutored surgeon, but continued to increase for the self-trained surgeon. Mentor tutoring effectively teaches laparoscopic colorectal surgical skills in a general hospital setting.
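The cumulative sum (CUSUM) charting mentioned in this abstract can be illustrated with a minimal sketch. The operative times below are hypothetical, and the choice of the series mean as the target is an assumption; the abstract does not state which target the authors used.

```python
from itertools import accumulate

def cusum(operative_times, target=None):
    """Running sum of deviations from a target (default: the series mean)."""
    if target is None:
        target = sum(operative_times) / len(operative_times)
    return list(accumulate(t - target for t in operative_times))

# Hypothetical operative times in minutes for consecutive cases.
times = [250, 240, 230, 220, 180, 170, 160, 150]
curve = cusum(times)

# The curve rises while cases take longer than the target and falls afterwards;
# the case at which it peaks is a simple indicator of where the trend turns.
peak_case = max(range(len(curve)), key=curve.__getitem__) + 1
```

A curve that keeps rising, as reported for the self-trained surgeon, would have its peak at the last case.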
Data-driven Analysis and Prediction of Arctic Sea Ice
NASA Astrophysics Data System (ADS)
Kondrashov, D. A.; Chekroun, M.; Ghil, M.; Yuan, X.; Ting, M.
2015-12-01
We present results of data-driven predictive analyses of sea ice over the main Arctic regions. Our approach relies on the Multilayer Stochastic Modeling (MSM) framework of Kondrashov, Chekroun and Ghil [Physica D, 2015] and it leads to prognostic models of sea ice concentration (SIC) anomalies on seasonal time scales. This approach is applied to monthly time series of leading principal components from the multivariate Empirical Orthogonal Function decomposition of SIC and selected climate variables over the Arctic. We evaluate the predictive skill of MSM models by performing retrospective forecasts with "no-look ahead" for up to 6 months ahead. It will be shown in particular that the memory effects included in our non-Markovian linear MSM models improve predictions of large-amplitude SIC anomalies in certain Arctic regions. Further improvements allowed by the MSM framework will adopt a nonlinear formulation, as well as alternative data-adaptive decompositions.
Gatti, Giulia; Bianchi, Carlo Nike; Montefalcone, Monica; Venturini, Sara; Diviacco, Giovanni; Morri, Carla
2017-01-15
The dearth of long-time series hampers the measurement of the ecosystem change that followed the global marine climate shift of the 1980-90s. The sessile communities of Portofino Promontory reefs (Ligurian Sea, NW Mediterranean) have been discontinuously studied since the 1950s. Collating information from various sources, three periods of investigations have been distinguished: 1) 1950-70s; 2) 1980-90s; 3) 2000-10s. A cooler phase in time 1 was followed by a rapid warming in time 2, to stabilize at about 0.5°C higher in time 3. Human pressure grew impressively, especially after the establishment of a MPA in 1999. Multivariate analyses evidenced a major change of community composition in time 2. Some species disappeared or got rarer, many found refuge at depth, and among the newcomers there were recently introduced alien species. This study demonstrated the importance of descriptive historical data to understand magnitude and pattern of change in the long term evolution of marine ecosystems. Copyright © 2016 Elsevier Ltd. All rights reserved.
Contini, Erika W; Wardle, Susan G; Carlson, Thomas A
2017-10-01
Visual object recognition is a complex, dynamic process. Multivariate pattern analysis methods, such as decoding, have begun to reveal how the brain processes complex visual information. Recently, temporal decoding methods for EEG and MEG have offered the potential to evaluate the temporal dynamics of object recognition. Here we review the contribution of M/EEG time-series decoding methods to understanding visual object recognition in the human brain. Consistent with the current understanding of the visual processing hierarchy, low-level visual features dominate decodable object representations early in the time-course, with more abstract representations related to object category emerging later. A key finding is that the time-course of object processing is highly dynamic and rapidly evolving, with limited temporal generalisation of decodable information. Several studies have examined the emergence of object category structure, and we consider to what degree category decoding can be explained by sensitivity to low-level visual features. Finally, we evaluate recent work attempting to link human behaviour to the neural time-course of object processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
Estimating residential price elasticity of demand for water: A contingent valuation approach
NASA Astrophysics Data System (ADS)
Thomas, John F.; Syme, Geoffrey J.
1988-11-01
Residential households in Perth, Western Australia have access to privately extracted groundwater as well as a public mains water supply, which has been charged through a two-part block tariff. A contingent valuation approach is developed to estimate price elasticity of demand for public supply. Results are compared with those of a multivariate time series analysis. Validation tests for the contingent approach are proposed, based on a comparison of predicted behaviors following hypothesised price changes with relevant independent data. Properly conducted, the contingent approach appears to be reliable, applicable where the available data do not favor regression analysis, and a fruitful source of information about social, technical, and behavioral responses to change in the price of water.
Dynamics and complexity of body temperature in preterm infants nursed in incubators
Jost, Kerstin; Pramana, Isabelle; Delgado-Eckert, Edgar; Kumar, Nitin; Datta, Alexandre N.; Frey, Urs; Schulzke, Sven M.
2017-01-01
Background: Poor control of body temperature is associated with mortality and major morbidity in preterm infants. We aimed to quantify its dynamics and complexity to evaluate whether indices from fluctuation analyses of temperature time series obtained within the first five days of life are associated with gestational age (GA) and body size at birth, and presence and severity of typical comorbidities of preterm birth. Methods: We recorded 3h-time series of body temperature using a skin electrode in incubator-nursed preterm infants. We calculated mean and coefficient of variation of body temperature, scaling exponent alpha (Talpha) derived from detrended fluctuation analysis, and sample entropy (TSampEn) of temperature fluctuations. Data were analysed by multilevel multivariable linear regression. Results: Data of satisfactory technical quality were obtained from 285/357 measurements (80%) in 73/90 infants (81%) with a mean (range) GA of 30.1 (24.0–34.0) weeks. We found a positive association of Talpha with increasing levels of respiratory support after adjusting for GA and birth weight z-score (p<0.001; R² = 0.38). Conclusion: Dynamics and complexity of body temperature in incubator-nursed preterm infants show considerable associations with GA and respiratory morbidity. Talpha may be a useful marker of autonomic maturity and severity of disease in preterm infants. PMID:28448569
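Sample entropy, one of the indices named in this abstract, can be sketched as follows. This is a plain illustration of the standard definition, not the authors' implementation: the embedding dimension `m` and the absolute tolerance `r` are assumed parameters (in practice `r` is often set to a fraction of the series' standard deviation), and the abstract does not report the values used.

```python
import math

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy: -ln(A / B), where B counts pairs of templates of
    length m that match within tolerance r (Chebyshev distance) and A
    counts the pairs that still match at length m + 1. Self-matches are
    excluded. r is an absolute tolerance here."""
    n = len(x)

    def count_matches(length):
        count = 0
        # Use the same n - m starting points for both template lengths.
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)

# A perfectly periodic series is fully predictable; an irregular one is not.
regular = sample_entropy([0, 1] * 5, m=2, r=0.1)
irregular = sample_entropy([1, 2, 1, 3, 1, 2, 4, 1, 2, 1], m=1, r=0.5)
```

Lower values indicate a more regular signal; the quadratic loop is acceptable for short recordings but would need optimising for long series.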
Fenn, Daniel J; Porter, Mason A; McDonald, Mark; Williams, Stacy; Johnson, Neil F; Jones, Nick S
2009-09-01
We study the cluster dynamics of multichannel (multivariate) time series by representing their correlations as time-dependent networks and investigating the evolution of network communities. We employ a node-centric approach that allows us to track the effects of the community evolution on the functional roles of individual nodes without having to track entire communities. As an example, we consider a foreign exchange market network in which each node represents an exchange rate and each edge represents a time-dependent correlation between the rates. We study the period 2005-2008, which includes the recent credit and liquidity crisis. Using community detection, we find that exchange rates that are strongly attached to their community are persistently grouped with the same set of rates, whereas exchange rates that are important for the transfer of information tend to be positioned on the edges of communities. Our analysis successfully uncovers major trading changes that occurred in the market during the credit crisis.
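The idea of representing correlations as time-dependent networks can be sketched with a sliding window over the series. This is only an illustration under assumed choices: the window length, the step, the use of Pearson correlation, and the series names `A` and `B` are all hypothetical, and the community-detection step described in the abstract is not shown.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_networks(series, window, step):
    """Return one weighted edge list {(u, v): correlation} per time window.

    series maps a node name to its time series; all series share one length."""
    names = sorted(series)
    length = len(series[names[0]])
    networks = []
    for start in range(0, length - window + 1, step):
        edges = {}
        for i, u in enumerate(names):
            for v in names[i + 1:]:
                edges[(u, v)] = pearson(series[u][start:start + window],
                                        series[v][start:start + window])
        networks.append(edges)
    return networks

# Two hypothetical series whose relationship flips halfway through.
data = {"A": [1, 2, 3, 4, 4, 3, 2, 1], "B": [2, 4, 6, 8, 1, 2, 3, 4]}
nets = correlation_networks(data, window=4, step=4)
```

Each element of `nets` is a snapshot of the network; tracking how a node's edges change between snapshots is the kind of node-centric view the abstract describes.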
Bromuri, Stefano; Zufferey, Damien; Hennebert, Jean; Schumacher, Michael
2014-10-01
This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates to study multi-label medical time series. We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers in two real world datasets. Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as hamming loss, one error, coverage, ranking loss, and average precision. Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches. 
The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density. Copyright © 2014 Elsevier Inc. All rights reserved.
Oviedo de la Fuente, Manuel; Febrero-Bande, Manuel; Muñoz, María Pilar; Domínguez, Àngela
2018-01-01
This paper proposes a novel approach that uses meteorological information to predict the incidence of influenza in Galicia (Spain). It extends the Generalized Least Squares (GLS) methods in the multivariate framework to functional regression models with dependent errors. These kinds of models are useful when the recent history of the incidence of influenza is not readily available (for instance, because of delays in communication with health informants) and the prediction must be constructed by correcting the temporal dependence of the residuals and using more accessible variables. A simulation study shows that the GLS estimators yield better estimates of the regression model parameters than the classical models. They obtain extremely good results from the predictive point of view and are competitive with the classical time series approach for the incidence of influenza. An iterative version of the GLS estimator (called iGLS) was also proposed that can help to model complicated dependence structures. To construct the model, the distance correlation measure was employed to select relevant information for predicting the influenza rate from a mix of multivariate and functional variables. These kinds of models are extremely useful to health managers in allocating resources in advance to manage influenza epidemics.
Reprint of: Relationship between cataract severity and socioeconomic status.
Wesolosky, Jason D; Rudnisky, Christopher J
2015-06-01
To determine the relationship between cataract severity and socioeconomic status (SES). Retrospective, observational case series. A total of 1350 eyes underwent phacoemulsification cataract extraction by a single surgeon using an Alcon Infiniti system. Cataract severity was measured using phaco time in seconds. SES was measured using area-level aggregate census data: median income, education, proportion of common-law couples, and employment rate. Preoperative best corrected visual acuity was obtained and converted to logarithm of the minimum angle of resolution values. For patients undergoing bilateral surgery, the generalized estimating equation was used to account for the correlation between eyes. Univariate analyses were performed using simple regression, and multivariate analyses were performed to account for variables with significant relationships (p < 0.05) on univariate testing. Sensitivity analyses were performed to assess the effect of including patient age in the controlled analyses. Multivariate analyses demonstrated that cataracts were more severe when the median income was lower (p = 0.001) and the proportion of common-law couples living in a patient's community (p = 0.012) and the unemployment rate (p = 0.002) were higher. These associations persisted even when controlling for patient age. Patients of lower SES have more severe cataracts. Copyright © 2015. Published by Elsevier Inc.
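The conversion of visual acuity to logarithm of the minimum angle of resolution (logMAR) values follows a standard formula: for a Snellen fraction, logMAR is the base-10 logarithm of the denominator divided by the numerator. The function name below is hypothetical, and the abstract does not say how acuities such as counting fingers were handled.

```python
import math

def snellen_to_logmar(numerator, denominator):
    """logMAR for a Snellen fraction such as 20/40 or 6/12."""
    return math.log10(denominator / numerator)

normal_vision = snellen_to_logmar(20, 20)   # 20/20 vision
reduced_vision = snellen_to_logmar(20, 200)  # 20/200 vision
```

Higher logMAR values correspond to worse acuity, which makes the scale convenient for regression.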
Silva, A F; Sarraguça, M C; Fonteyne, M; Vercruysse, J; De Leersnyder, F; Vanhoorne, V; Bostijn, N; Verstraeten, M; Vervaet, C; Remon, J P; De Beer, T; Lopes, J A
2017-08-07
A multivariate statistical process control (MSPC) strategy was developed for the monitoring of the ConsiGma™-25 continuous tablet manufacturing line. Thirty-five logged variables encompassing three major units, being a twin screw high shear granulator, a fluid bed dryer and a product control unit, were used to monitor the process. The MSPC strategy was based on principal component analysis of data acquired under normal operating conditions using a series of four process runs. Runs with imposed disturbances in the dryer air flow and temperature, in the granulator barrel temperature, speed and liquid mass flow and in the powder dosing unit mass flow were utilized to evaluate the model's monitoring performance. The impact of the imposed deviations to the process continuity was also evaluated using Hotelling's T² and Q residuals statistics control charts. The influence of the individual process variables was assessed by analyzing contribution plots at specific time points. Results show that the imposed disturbances were all detected in both control charts. Overall, the MSPC strategy was successfully developed and applied. Additionally, deviations not associated with the imposed changes were detected, mainly in the granulator barrel temperature control. Copyright © 2017 Elsevier B.V. All rights reserved.
Multiscale entropy analysis of biological signals: a fundamental bi-scaling law
Gao, Jianbo; Hu, Jing; Liu, Feiyan; Cao, Yinhe
2015-01-01
Since introduced in early 2000, multiscale entropy (MSE) has found many applications in biosignal analysis, and been extended to multivariate MSE. So far, however, no analytic results for MSE or multivariate MSE have been reported. This has severely limited our basic understanding of MSE. For example, it has not been studied whether MSE estimated using default parameter values and short data set is meaningful or not. Nor is it known whether MSE has any relation with other complexity measures, such as the Hurst parameter, which characterizes the correlation structure of the data. To overcome this limitation, and more importantly, to guide more fruitful applications of MSE in various areas of life sciences, we derive a fundamental bi-scaling law for fractal time series, one for the scale in phase space, the other for the block size used for smoothing. We illustrate the usefulness of the approach by examining two types of physiological data. One is heart rate variability (HRV) data, for the purpose of distinguishing healthy subjects from patients with congestive heart failure, a life-threatening condition. The other is electroencephalogram (EEG) data, for the purpose of distinguishing epileptic seizure EEG from normal healthy EEG. PMID:26082711
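The "block size used for smoothing" refers to the coarse-graining step of multiscale entropy: the series is averaged over non-overlapping blocks, and an entropy estimate is then computed on each coarse-grained series. A minimal sketch of that step alone, with a hypothetical function name and toy data, might look like this:

```python
def coarse_grain(x, scale):
    """Average x over consecutive non-overlapping blocks of length `scale`.

    Trailing samples that do not fill a complete block are dropped."""
    n_blocks = len(x) // scale
    return [sum(x[j * scale:(j + 1) * scale]) / scale for j in range(n_blocks)]

signal = [1, 2, 3, 4, 5, 6, 7]
# One coarse-grained series per scale; an entropy estimate would be applied to each.
by_scale = {scale: coarse_grain(signal, scale) for scale in (1, 2, 3)}
```

Because the series shrinks by a factor of `scale`, entropy estimates at large scales rest on few points, which relates to the short-data concern raised in the abstract.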
Hackstadt, Amber J; Peng, Roger D
2014-11-01
Time series studies have suggested that air pollution can negatively impact health. These studies have typically focused on the total mass of fine particulate matter air pollution or the individual chemical constituents that contribute to it, and not source-specific contributions to air pollution. Source-specific contribution estimates are useful from a regulatory standpoint by allowing regulators to focus limited resources on reducing emissions from sources that are major contributors to air pollution and are also desired when estimating source-specific health effects. However, researchers often lack direct observations of the emissions at the source level. We propose a Bayesian multivariate receptor model to infer information about source contributions from ambient air pollution measurements. The proposed model incorporates information from national databases containing data on both the composition of source emissions and the amount of emissions from known sources of air pollution. The proposed model is used to perform source apportionment analyses for two distinct locations in the United States (Boston, Massachusetts and Phoenix, Arizona). Our results mirror previous source apportionment analyses that did not utilize the information from national databases and provide additional information about uncertainty that is relevant to the estimation of health effects.
Janik, M; Bossew, P; Kurihara, O
2018-07-15
Machine learning is a class of statistical techniques which has proven to be a powerful tool for modelling the behaviour of complex systems, in which response quantities depend on assumed controls or predictors in a complicated way. In this paper, as our first purpose, we propose the application of machine learning to reconstruct incomplete or irregularly sampled data of time series indoor radon (²²²Rn). The physical assumption underlying the modelling is that Rn concentration in the air is controlled by environmental variables such as air temperature and pressure. The algorithms "learn" from complete sections of multivariate series, derive a dependence model and apply it to sections where the controls are available, but not the response (Rn), and in this way complete the Rn series. Three machine learning techniques are applied in this study, namely random forest, its extension called the gradient boosting machine and deep learning. For a comparison, we apply the classical multiple regression in a generalized linear model version. Performance of the models is evaluated through different metrics. The performance of the gradient boosting machine is found to be superior to that of the other techniques. By applying learning machines, we show, as our second purpose, that missing data or periods of Rn series data can be reconstructed and resampled on a regular grid reasonably, if data of appropriate physical controls are available. The techniques also identify to which degree the assumed controls contribute to imputing missing Rn values. Our third purpose, though no less important from the viewpoint of physics, is identifying to which degree physical, in this case environmental variables, are relevant as Rn predictors, or in other words, which predictors explain most of the temporal variability of Rn. We show that the variables which contribute most to the Rn series reconstruction are temperature, relative humidity and day of the year. 
The first two are physical predictors, while "day of the year" is a statistical proxy or surrogate for missing or unknown predictors. Copyright © 2018 Elsevier B.V. All rights reserved.
Can Community Policing Help the Truly Disadvantaged?
ERIC Educational Resources Information Center
Reisig, Michael D.; Parks, Roger B.
2004-01-01
Community policing advocates argue that reforms designed to break down barriers between police and citizens can produce favorable outcomes. The authors test a series of related hypotheses in a multivariate context by using four independent data sources--community surveys, patrol officer interviews, Census Bureau, and police crime records--to…
NASA Technical Reports Server (NTRS)
Szuch, J. R.; Soeder, J. F.; Seldner, K.; Cwynar, D. S.
1977-01-01
The design, evaluation, and testing of a practical, multivariable, linear quadratic regulator control for the F100 turbofan engine were accomplished. NASA evaluation of the multivariable control logic and implementation are covered. The evaluation utilized a real time, hybrid computer simulation of the engine. Results of the evaluation are presented, and recommendations concerning future engine testing of the control are made. Results indicated that the engine testing of the control should be conducted as planned.
Wright, Stephen T; Hoy, Jennifer; Mulhall, Brian; O’Connor, Catherine C; Petoumenos, Kathy; Read, Timothy; Smith, Don; Woolley, Ian; Boyd, Mark A
2014-01-01
Background: Recent studies suggest higher cumulative HIV viraemia exposure measured as viraemia copy-years (VCY) is associated with increased all-cause mortality. The objectives of this study are to (a) report the association between VCY and all-cause mortality, and (b) assess associations between common patient characteristics and VCY. Methods: Analyses were based on patients recruited to the Australian HIV Observational Database (AHOD) who had received ≥ 24 weeks of antiretroviral therapy (ART). We established VCY after 1, 3, 5 and 10 years of ART by calculating the area under the plasma viral load time-series. We used survival methods to determine the association between high VCY and all-cause mortality. We used multivariable mixed-effect models to determine predictors of VCY. We compared a baseline information model with a time-updated model to evaluate discrimination of patients with high VCY. Results: Of the 3021 AHOD participants that initiated ART, 2073 (69%), 1667 (55%), 1267 (42%) and 638 (21%) were eligible for analysis at 1, 3, 5 and 10 years of ART respectively. The multivariable adjusted hazard ratio (HR) for the association between all-cause mortality and high VCY was statistically significant, HR 1.52 (1.09, 2.13), p-value=0.01. For predicting high VCY after one year of ART, the area under the sensitivity/specificity curve (AUC) for a time-updated model compared to a baseline information only model was 0.92 vs. 0.84; at 10 years of ART, the AUC was 0.87 vs. 0.61 respectively. Conclusion: A high cumulative measure of viral load after initiating ART is associated with increased risk of all-cause mortality. Identifying patients with high VCY is improved by incorporating time-updated information. PMID:24463783
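The abstract defines viraemia copy-years as the area under the plasma viral load time-series. A minimal sketch using the trapezoidal rule is shown below; the rule itself, the function name and the measurements are assumptions for illustration, and the abstract does not state how values below the assay detection limit were treated.

```python
def viraemia_copy_years(times_years, viral_loads):
    """Area under the viral load curve by the trapezoidal rule.

    times_years: measurement times in years since starting ART (increasing).
    viral_loads: plasma viral load in copies/mL at those times.
    Returns copy-years/mL."""
    if len(times_years) != len(viral_loads):
        raise ValueError("times and viral loads must have the same length")
    total = 0.0
    for i in range(1, len(times_years)):
        dt = times_years[i] - times_years[i - 1]
        total += 0.5 * (viral_loads[i] + viral_loads[i - 1]) * dt
    return total

# Hypothetical patient: viral load falls from 1000 to 200 copies/mL
# over the first six months and stays there until one year.
vcy_one_year = viraemia_copy_years([0.0, 0.5, 1.0], [1000, 200, 200])
```

Truncating the input at 1, 3, 5 or 10 years would give the cumulative measures compared in the study.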
Characteristics Associated with In-Hospital Death among Commercially Insured Decedents with Cancer.
Brooks, Gabriel A; Stuver, Sherri O; Zhang, Yichen; Gottsch, Stephanie; Fraile, Belen; McNiff, Kristen; Dodek, Anton; Jacobson, Joseph O
2017-01-01
A majority of patients with poor-prognosis cancer express a preference for in-home death; however, in-hospital deaths are common. We sought to identify characteristics associated with in-hospital death. Case series. Commercially insured patients with cancer who died between July 2010 and December 2013 and who had at least two outpatient visits at a tertiary cancer center during the last six months of life. Patient characteristics, healthcare utilization, and in-hospital death (primary outcome) were ascertained from institutional records and healthcare claims. Bivariate and multivariable analyses were used to evaluate the association of in-hospital death with patient characteristics and end-of-life outcome measures. We identified 904 decedents, with a median age of 59 years at death. In-hospital death was observed in 254 patients (28%), including 110 (12%) who died in an intensive care unit. Hematologic malignancy was associated with a 2.57 times increased risk of in-hospital death (95% confidence interval [CI] 1.91-3.45, p < 0.001), and nonenrollment in hospice was associated with a 14.5 times increased risk of in-hospital death (95% CI 9.81-21.4, p < 0.001). Time from cancer diagnosis to death was also associated with in-hospital death (p = 0.003), with the greatest risk among patients dying within six months of cancer diagnosis. All significant associations persisted in multivariable analyses that were adjusted for baseline characteristics. In-hospital deaths are common among commercially insured cancer patients. Patients with hematologic malignancy and patients who die without receiving hospice services have a substantially higher incidence of in-hospital death.
Heinsch, Stephen C.; Das, Siba R.; Smanski, Michael J.
2018-01-01
Increasing the final titer of a multi-gene metabolic pathway can be viewed as a multivariate optimization problem. While numerous multivariate optimization algorithms exist, few are specifically designed to accommodate the constraints posed by genetic engineering workflows. We present a strategy for optimizing expression levels across an arbitrary number of genes that requires few design-build-test iterations. We compare the performance of several optimization algorithms on a series of simulated expression landscapes. We show that optimal experimental design parameters depend on the degree of landscape ruggedness. This work provides a theoretical framework for designing and executing numerical optimization on multi-gene systems. PMID:29535690
Anwar, Abdul Rauf; Muthalib, Makii; Perrey, Stephane; Galka, Andreas; Granert, Oliver; Wolff, Stephan; Deuschl, Guenther; Raethjen, Jan; Heute, Ulrich; Muthuraman, Muthuraman
2013-01-01
Brain activity can be measured using different modalities. Since most of the modalities tend to complement each other, it seems promising to measure them simultaneously. In the research presented here, data recorded simultaneously from Functional Magnetic Resonance Imaging (fMRI) and Near Infrared Spectroscopy (NIRS) are subjected to causality analysis using time-resolved partial directed coherence (tPDC). Time-resolved partial directed coherence uses the principle of state space modelling to estimate Multivariate Autoregressive (MVAR) coefficients. This method is useful to visualize both frequency and time dynamics of causality between the time series. Afterwards, causality results from different modalities are compared by estimating the Spearman correlation. In the present study, we used directionality vectors to analyze correlation, rather than actual signal vectors. Results show that causality analysis of the fMRI correlates more closely to causality results of oxy-NIRS as compared to deoxy-NIRS in case of a finger sequencing task. However, in case of simple finger tapping, no clear difference between oxy-fMRI and deoxy-fMRI correlation is identified.
Decreasing triage time: effects of implementing a step-wise ESI algorithm in an EHR.
Villa, Stephen; Weber, Ellen J; Polevoi, Steven; Fee, Christopher; Maruoka, Andrew; Quon, Tina
2018-06-01
To determine if adapting a widely-used triage scale into a computerized algorithm in an electronic health record (EHR) shortens emergency department (ED) triage time. Before-and-after quasi-experimental study. Urban, tertiary care hospital ED. Consecutive adult patient visits between July 2011 and June 2013. A step-wise algorithm, based on the Emergency Severity Index (ESI-5) was programmed into the triage module of a commercial EHR. Duration of triage (triage interval) for all patients and change in percentage of high acuity patients (ESI 1 and 2) completing triage within 15 min, 12 months before-and-after implementation of the algorithm. Multivariable analysis adjusted for confounders; interrupted time series demonstrated effects over time. Secondary outcomes examined quality metrics and patient flow. About 32 546 patient visits before and 33 032 after the intervention were included. Post-intervention patients were slightly older, census was higher and admission rate slightly increased. Median triage interval was 5.92 min (interquartile ranges, IQR 4.2-8.73) before and 2.8 min (IQR 1.88-4.23) after the intervention (P < 0.001). Adjusted mean triage interval decreased 3.4 min (95% CI: -3.6, -3.2). The proportion of high acuity patients completing triage within 15 min increased from 63.9% (95% CI 62.5, 65.2%) to 75.0% (95% CI 73.8, 76.1). Monthly time series demonstrated immediate and sustained improvement following the intervention. Return visits within 72 h and door-to-balloon time were unchanged. Total length of stay was similar. The computerized triage scale improved speed of triage, allowing more high acuity patients to be seen within recommended timeframes, without notable impact on quality.
Forecasting paediatric malaria admissions on the Kenya Coast using rainfall.
Karuri, Stella Wanjugu; Snow, Robert W
2016-01-01
Malaria is a vector-borne disease which, despite recent scaled-up efforts to achieve control in Africa, continues to pose a major threat to child survival. The disease is caused by the protozoan parasite Plasmodium and requires mosquitoes and humans for transmission. Rainfall is a major factor in seasonal and secular patterns of malaria transmission along the East African coast. The goal of the study was to develop a model to reliably forecast incidences of paediatric malaria admissions to Kilifi District Hospital (KDH). In this article, we apply several statistical models to look at the temporal association between monthly paediatric malaria hospital admissions, rainfall, and Indian Ocean sea surface temperatures. Trend and seasonally adjusted, marginal and multivariate, time-series models for hospital admissions were applied to a unique data set to examine the role of climate, seasonality, and long-term anomalies in predicting malaria hospital admission rates and whether these might become more or less predictable with increasing vector control. The proportion of paediatric admissions to KDH that have malaria as a cause of admission can be forecast by a model which depends on the proportion of malaria admissions in the previous 2 months. This model is improved by incorporating either the previous month's Indian Ocean Dipole information or the previous 2 months' rainfall. Surveillance data can help build time-series prediction models which can be used to anticipate seasonal variations in clinical burdens of malaria in stable transmission areas and aid the timing of malaria vector control.
Temporal dynamics of physical activity and affect in depressed and nondepressed individuals.
Stavrakakis, Nikolaos; Booij, Sanne H; Roest, Annelieke M; de Jonge, Peter; Oldehinkel, Albertine J; Bos, Elisabeth H
2015-12-01
The association between physical activity and affect found in longitudinal observational studies is generally small to moderate. It is unknown how this association generalizes to individuals. The aim of the present study was to investigate interindividual differences in the bidirectional dynamic relationship between physical activity and affect, in depressed and nondepressed individuals, using time-series analysis. A pair-matched sample of 10 depressed and 10 nondepressed participants (mean age = 36.6, SD = 8.9, 30% males) wore accelerometers and completed electronic questionnaires 3 times a day for 30 days. Physical activity was operationalized as the total energy expenditure (EE) per day segment (i.e., 6 hr). The multivariate time series (T = 90) of every individual were analyzed using vector autoregressive modeling (VAR), with the aim to assess direct as well as lagged (i.e., over 1 day) effects of EE on positive and negative affect, and vice versa. Large interindividual differences in the strength, direction and temporal aspects of the relationship between physical activity and positive and negative affect were observed. An exception was the direct (but not the lagged) effect of physical activity on positive affect, which was positive in nearly all individuals. This study showed that the association between physical activity and affect varied considerably across individuals. Thus, while at the group level the effect of physical activity on affect may be small, in some individuals the effect may be clinically relevant. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
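To make the lagged part of a vector autoregressive model concrete, here is a minimal sketch of a two-variable VAR(1) fitted by least squares for one individual. It is not the authors' procedure: it omits the intercept, the contemporaneous ("direct") effects and any lag selection. The function names and the coefficients in `true_a` are made up for illustration.

```python
def fit_var1(series):
    """Least-squares VAR(1) for a list of (x, y) pairs, without intercept.

    Solves A = (sum x_{t+1} x_t^T) (sum x_t x_t^T)^{-1} for a 2x2 matrix A.
    """
    s = [[0.0, 0.0], [0.0, 0.0]]  # sum of x_t x_t^T
    c = [[0.0, 0.0], [0.0, 0.0]]  # sum of x_{t+1} x_t^T
    for prev, nxt in zip(series, series[1:]):
        for i in range(2):
            for j in range(2):
                s[i][j] += prev[i] * prev[j]
                c[i][j] += nxt[i] * prev[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if abs(det) < 1e-12:
        raise ValueError("lagged states are collinear; A is not identifiable")
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    return [[sum(c[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def simulate(a, start, steps):
    """Noise-free VAR(1) trajectory, used only to check the fit."""
    out = [start]
    for _ in range(steps):
        x, y = out[-1]
        out.append((a[0][0] * x + a[0][1] * y, a[1][0] * x + a[1][1] * y))
    return out


true_a = [[0.5, 0.2], [-0.1, 0.4]]  # made-up coefficients
a_hat = fit_var1(simulate(true_a, (1.0, 0.0), 10))
```

If the first variable were energy expenditure and the second positive affect, `a_hat[1][0]` would be the lagged effect of activity on affect and `a_hat[0][1]` the reverse direction. Fitting one such matrix per participant is what allows the individual differences reported above to be compared.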
Time-localized wavelet multiple regression and correlation
NASA Astrophysics Data System (ADS)
Fernández-Macho, Javier
2018-02-01
This paper extends wavelet methodology to handle comovement dynamics of multivariate time series via moving weighted regression on wavelet coefficients. The concept of wavelet local multiple correlation is used to produce one single set of multiscale correlations along time, in contrast with the large number of wavelet correlation maps that need to be compared when using standard pairwise wavelet correlations with rolling windows. Also, the spectral properties of weight functions are investigated and it is argued that some common time windows, such as the usual rectangular rolling window, are not satisfactory on these grounds. The method is illustrated with a multiscale analysis of the comovements of Eurozone stock markets during this century. It is shown how the evolution of the correlation structure in these markets has been far from homogeneous both along time and across timescales featuring an acute divide across timescales at about the quarterly scale. At longer scales, evidence from the long-term correlation structure can be interpreted as stable perfect integration among Euro stock markets. On the other hand, at intramonth and intraweek scales, the short-term correlation structure has been clearly evolving along time, experiencing a sharp increase during financial crises which may be interpreted as evidence of financial 'contagion'.
Facebook Usage Patterns and School Attitudes
ERIC Educational Resources Information Center
Koles, Bernadett; Nagy, Peter
2012-01-01
Purpose: The purpose of this paper is to explore teenagers' and young adults' use of social networking sites (SNS), in light of certain personal, social and educational outcomes and attitudes. Design/methodology/approach: Data were gathered on the basis of surveys, and were analyzed through a series of multivariate models. Findings: It was found…
Measurement of Physiologic Glucose Levels Using Raman Spectroscopy in a Rabbit Aqueous Humor Model
NASA Technical Reports Server (NTRS)
Lambert, J.; Storrie-Lombardi, M.; Borchert, M.
1998-01-01
We have elicited a reliable glucose signature in mammalian physiological ranges using near infrared Raman laser excitation at 785 nm and multivariate analysis. In a recent series of experiments we measured glucose levels in an artificial aqueous humor in the range from 0.5 to 13X normal values.
Processes Linking Social Class and Racial Socialization in African American Dual-Earner Families
ERIC Educational Resources Information Center
Crouter, Ann C.; Baril, Megan E.; Davis, Kelly D.; McHale, Susan M.
2008-01-01
We examined the links between social class, occupational self-direction, self-efficacy, and racial socialization in a sample of 128 two-parent African American couples raising adolescents. A series of multivariate, multilevel models revealed that mothers' SES was connected to self-efficacy via its association with occupational self-direction; in…
Computers and Student Learning: Interpreting the Multivariate Analysis of PISA 2000
ERIC Educational Resources Information Center
Bielefeldt, Talbot
2005-01-01
In November 2004, economists Thomas Fuchs and Ludger Woessmann published a statistical analysis of the relationship between technology and student achievement using year 2000 data from the Programme for International Student Assessment (PISA). The 2000 PISA was the first in a series of triennial assessments of 15-year-olds conducted by the…
Multivariate Approaches for Exploring the Evaluation of Deception in Television Advertising.
ERIC Educational Resources Information Center
Permut, Steven Eli
The objective of this study was to explore the semantic structure used by subjects in assessing (evaluating) a series of eight television commercials previously (but unofficially) rated for deceptiveness by FTC attorneys. Five local respondent groups were used: 158 undergraduate students enrolled in an introductory advertising course, 175…
NASA Astrophysics Data System (ADS)
Leauthaud, Crystele; Cappelaere, Bernard; Demarty, Jérôme; Guichard, Françoise; Velluet, Cécile; Kergoat, Laurent; Vischel, Théo; Grippa, Manuela; Mouhaimouni, Mohammed; Bouzou Moussa, Ibrahim; Mainassara, Ibrahim; Sultan, Benjamin
2017-04-01
The Sahel has experienced strong climate variability in the past decades. Understanding its implications for natural and cultivated ecosystems is pivotal in a context of high population growth and mainly agriculture-based livelihoods. However, efforts to model processes at the land-atmosphere interface are hindered, particularly when the multi-decadal timescale is targeted, as climatic data are scarce, largely incomplete and often unreliable. This study presents the generation of a long-term, high-temporal resolution, multivariate local climatic data set for Niamey, Central Sahel. The continuous series spans the period 1950-2009 at a 30-min timescale and includes ground station-based meteorological variables (precipitation, air temperature, relative and specific humidity, air pressure, wind speed, downwelling long- and short-wave radiation) as well as process-modelled surface fluxes (upwelling long- and short-wave radiation, latent, sensible and soil heat fluxes and surface temperature). A combination of complementary techniques (linear/spline regressions, a multivariate analogue method, artificial neural networks and recursive gap filling) was used to reconstruct missing meteorological data. The complete surface energy budget was then obtained for two dominant land cover types, fallow bush and millet, by applying the meteorological forcing data set to a finely field-calibrated land surface model. Uncertainty in reconstructed data was expressed by means of a stochastic ensemble of plausible historical time series. Climatological statistics were computed at sub-daily to decadal timescales and compared with local, regional and global data sets such as CRU and ERA-Interim. The reconstructed precipitation statistics, ~1°C increase in mean annual temperature from 1950 to 2009, and mean diurnal and annual cycles for all variables were in good agreement with previous studies.
The new data set, denoted NAD (Niamey Airport-derived set) and publicly available, can be used to investigate the water and energy cycles in Central Sahel, while the methodology can be applied to reconstruct series at other stations. The study has been published in Int. J. Climatol. (2016), DOI: 10.1002/joc.4874
Machine learning for neuroimaging with scikit-learn.
Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël
2014-01-01
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
Machine learning for neuroimaging with scikit-learn
Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël
2014-01-01
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain. PMID:24600388
Papadia, Andrea; Bellati, Filippo; Bogani, Giorgio; Ditto, Antonino; Martinelli, Fabio; Lorusso, Domenica; Donfrancesco, Cristina; Gasparri, Maria Luisa; Raspagliesi, Francesco
2015-12-01
The aim of this study was to identify clinical variables that may predict the need for adjuvant radiotherapy after neoadjuvant chemotherapy (NACT) and radical surgery in locally advanced cervical cancer patients. A retrospective series of cervical cancer patients with International Federation of Gynecology and Obstetrics (FIGO) stages IB2-IIB treated with NACT followed by radical surgery was analyzed. Clinical predictors of persistence of intermediate- and/or high-risk factors at final pathological analysis were investigated. Statistical analysis was performed using univariate and multivariate analysis and using a model based on artificial intelligence known as artificial neuronal network (ANN) analysis. Overall, 101 patients were available for the analyses. Fifty-two (51 %) patients were considered at high risk secondary to parametrial, resection margin and/or lymph node involvement. When disease was confined to the cervix, four (4 %) patients were considered at intermediate risk. At univariate analysis, FIGO grade 3, stage IIB disease at diagnosis and the presence of enlarged nodes before NACT predicted the presence of intermediate- and/or high-risk factors at final pathological analysis. At multivariate analysis, only FIGO grade 3 and tumor diameter maintained statistical significance. The specificity of ANN models in evaluating predictive variables was slightly superior to conventional multivariable models. FIGO grade, stage, tumor diameter, and histology are associated with persistence of pathological intermediate- and/or high-risk factors after NACT and radical surgery. This information is useful in counseling patients at the time of treatment planning with regard to the probability of being subjected to pelvic radiotherapy after completion of the initially planned treatment.
Multivariate and Multiscale Data Assimilation in Terrestrial Systems: A Review
Montzka, Carsten; Pauwels, Valentijn R. N.; Franssen, Harrie-Jan Hendricks; Han, Xujun; Vereecken, Harry
2012-01-01
More and more terrestrial observational networks are being established to monitor climatic, hydrological and land-use changes in different regions of the World. In these networks, time series of states and fluxes are recorded in an automated manner, often with a high temporal resolution. These data are important for the understanding of water, energy, and/or matter fluxes, as well as their biological and physical drivers and interactions with and within the terrestrial system. Similarly, the number and accuracy of variables, which can be observed by spaceborne sensors, are increasing. Data assimilation (DA) methods utilize these observations in terrestrial models in order to increase process knowledge as well as to improve forecasts for the system being studied. The widely implemented automation in observing environmental states and fluxes makes an operational computation more and more feasible, and it opens the perspective of short-time forecasts of the state of terrestrial systems. In this paper, we review the state of the art with respect to DA focusing on the joint assimilation of observational data precedents from different spatial scales and different data types. An introduction is given to different DA methods, such as the Ensemble Kalman Filter (EnKF), Particle Filter (PF) and variational methods (3/4D-VAR). In this review, we distinguish between four major DA approaches: (1) univariate single-scale DA (UVSS), which is the approach used in the majority of published DA applications, (2) univariate multiscale DA (UVMS) referring to a methodology which acknowledges that at least some of the assimilated data are measured at a different scale than the computational grid scale, (3) multivariate single-scale DA (MVSS) dealing with the assimilation of at least two different data types, and (4) combined multivariate multiscale DA (MVMS). 
Finally, we conclude with a discussion on the advantages and disadvantages of the assimilation of multiple data types in a simulation model. Existing approaches can be used to simultaneously update several model states and model parameters if applicable. In other words, the basic principles for multivariate data assimilation are already available. We argue that a better understanding of the measurement errors for different observation types, improved estimates of observation bias and improved multiscale assimilation methods for data which scale nonlinearly are important to properly weight them in multiscale multivariate data assimilation. In this context, improved cross-validation of different data types, and increased ground truth verification of remote sensing products are required. PMID:23443380
Multivariate and multiscale data assimilation in terrestrial systems: a review.
Montzka, Carsten; Pauwels, Valentijn R N; Franssen, Harrie-Jan Hendricks; Han, Xujun; Vereecken, Harry
2012-11-26
More and more terrestrial observational networks are being established to monitor climatic, hydrological and land-use changes in different regions of the World. In these networks, time series of states and fluxes are recorded in an automated manner, often with a high temporal resolution. These data are important for the understanding of water, energy, and/or matter fluxes, as well as their biological and physical drivers and interactions with and within the terrestrial system. Similarly, the number and accuracy of variables, which can be observed by spaceborne sensors, are increasing. Data assimilation (DA) methods utilize these observations in terrestrial models in order to increase process knowledge as well as to improve forecasts for the system being studied. The widely implemented automation in observing environmental states and fluxes makes an operational computation more and more feasible, and it opens the perspective of short-time forecasts of the state of terrestrial systems. In this paper, we review the state of the art with respect to DA focusing on the joint assimilation of observational data precedents from different spatial scales and different data types. An introduction is given to different DA methods, such as the Ensemble Kalman Filter (EnKF), Particle Filter (PF) and variational methods (3/4D-VAR). In this review, we distinguish between four major DA approaches: (1) univariate single-scale DA (UVSS), which is the approach used in the majority of published DA applications, (2) univariate multiscale DA (UVMS) referring to a methodology which acknowledges that at least some of the assimilated data are measured at a different scale than the computational grid scale, (3) multivariate single-scale DA (MVSS) dealing with the assimilation of at least two different data types, and (4) combined multivariate multiscale DA (MVMS). 
Finally, we conclude with a discussion on the advantages and disadvantages of the assimilation of multiple data types in a simulation model. Existing approaches can be used to simultaneously update several model states and model parameters if applicable. In other words, the basic principles for multivariate data assimilation are already available. We argue that a better understanding of the measurement errors for different observation types, improved estimates of observation bias and improved multiscale assimilation methods for data which scale nonlinearly are important to properly weight them in multiscale multivariate data assimilation. In this context, improved cross-validation of different data types, and increased ground truth verification of remote sensing products are required.
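As a rough illustration of the analysis step behind ensemble-based data assimilation such as the EnKF, the sketch below updates a scalar forecast ensemble with a single direct observation (the univariate single-scale case). It is a simplified assumption-laden example, not a method from the review: the function name `enkf_update`, the soil-moisture values and the observation error variance are all hypothetical, and the observation is not perturbed per member.

```python
from statistics import mean, variance


def enkf_update(ensemble, obs, obs_var):
    """Scalar ensemble Kalman analysis step with observation operator H = 1.

    Simplification: every member is nudged toward the same unperturbed
    observation, which keeps the example deterministic but shrinks the
    ensemble spread more than a perturbed-observation EnKF would.
    """
    prior_var = variance(ensemble)            # sample variance of the forecast
    gain = prior_var / (prior_var + obs_var)  # Kalman gain K = P / (P + R)
    return [x + gain * (obs - x) for x in ensemble], gain


# Hypothetical soil-moisture forecast ensemble and one observation.
forecast = [0.20, 0.24, 0.28, 0.32, 0.36]
analysis, gain = enkf_update(forecast, obs=0.22, obs_var=0.004)
analysis_mean = mean(analysis)
```

The gain shows how the weighting works: when forecast and observation error variances are equal, the analysis mean lands halfway between the forecast mean and the observation. In the multivariate case the same formula uses covariance matrices, so a mis-specified observation error for one data type distorts the weight given to all of them.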
Evaluation of an F100 multivariable control using a real-time engine simulation
NASA Technical Reports Server (NTRS)
Szuch, J. R.; Skira, C.; Soeder, J. F.
1977-01-01
A multivariable control design for the F100 turbofan engine was evaluated, as part of the F100 multivariable control synthesis (MVCS) program. The evaluation utilized a real-time, hybrid computer simulation of the engine and a digital computer implementation of the control. Significant results of the evaluation are presented and recommendations concerning future engine testing of the control are made.
Predicting Geomorphic and Hydrologic Risks after Wildfire Using Harmonic and Stochastic Analyses
NASA Astrophysics Data System (ADS)
Mikesell, J.; Kinoshita, A. M.; Florsheim, J. L.; Chin, A.; Nourbakhshbeidokhti, S.
2017-12-01
Wildfire is a landscape-scale disturbance that often alters hydrological processes and sediment flux during subsequent storms. Vegetation loss from wildfires induces changes to sediment supply such as channel erosion and sedimentation and streamflow magnitude or flooding. These changes enhance downstream hazards, threatening human populations and physical aquatic habitat over various time scales. Using Williams Canyon, a basin burned by the Waldo Canyon Fire (2012) as a case study, we utilize deterministic and statistical modeling methods (Fourier series and first order Markov chain) to assess pre- and post-fire geomorphic and hydrologic characteristics, including precipitation, enhanced vegetation index (EVI, a satellite-based proxy of vegetation biomass), streamflow, and sediment flux. Local precipitation, terrestrial Light Detection and Ranging (LiDAR) scanning, and satellite-based products are used for these time series analyses. We present a framework to assess variability of periodic and nonperiodic climatic and multivariate trends to inform development of a post-wildfire risk assessment methodology. To establish the extent to which a wildfire affects hydrologic and geomorphic patterns, a Fourier series was used to fit pre- and post-fire geomorphic and hydrologic characteristics to yearly temporal cycles and subcycles of 6, 4, 3, and 2.4 months. These cycles were analyzed using least-squares estimates of the harmonic coefficients or amplitudes of each sub-cycle's contribution to fit the overall behavior of a Fourier series. The stochastic variances of these characteristics were analyzed by composing first-order Markov models and probabilistic analysis through direct likelihood estimates. Preliminary results highlight an increased dependence of monthly post-fire hydrologic characteristics on 12 and 6-month temporal cycles.
This statistical and probabilistic analysis provides a basis to determine the impact of wildfires on the temporal dependence of geomorphic and hydrologic characteristics, which can be incorporated into post-fire mitigation, management, and recovery-based measures to protect and rehabilitate areas subject to influence from wildfires.
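The two techniques named in this abstract can be sketched briefly. The first function estimates the amplitude of the yearly cycle and its 6-, 4-, 3- and 2.4-month subcycles; the second estimates first-order Markov transition probabilities by counting. Both function names are hypothetical, the input data are synthetic, and discretizing a series into integer states for the Markov model is an assumption made here, not a detail taken from the study.

```python
import math


def harmonic_amplitudes(monthly, harmonics=(1, 2, 3, 4, 5)):
    """Amplitude of each harmonic of a 12-month cycle.

    For evenly spaced data covering whole years, projecting onto sines and
    cosines gives the same coefficients as a least-squares harmonic fit.
    Harmonics 1..5 correspond to periods of 12, 6, 4, 3 and 2.4 months.
    """
    n = len(monthly)
    if n % 12:
        raise ValueError("expected a whole number of years of monthly data")
    amps = {}
    for k in harmonics:
        a = 2 / n * sum(v * math.cos(2 * math.pi * k * t / 12)
                        for t, v in enumerate(monthly))
        b = 2 / n * sum(v * math.sin(2 * math.pi * k * t / 12)
                        for t, v in enumerate(monthly))
        amps[12 / k] = math.hypot(a, b)  # keyed by period in months
    return amps


def transition_matrix(states, n_states):
    """Maximum-likelihood first-order Markov transition probabilities."""
    counts = [[0] * n_states for _ in range(n_states)]
    for s, t in zip(states, states[1:]):
        counts[s][t] += 1
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]


# Synthetic two-year monthly series: mean 10, a 12-month cycle of amplitude 3
# and a 6-month cycle of amplitude 1.
series = [10 + 3 * math.cos(2 * math.pi * t / 12)
          + math.sin(2 * math.pi * 2 * t / 12) for t in range(24)]
amps = harmonic_amplitudes(series)
probs = transition_matrix([0, 0, 1, 0, 1, 1, 0], 2)
```

Comparing `amps` computed from pre-fire and post-fire records would show whether the 12- and 6-month cycles gain weight after the fire, and comparing the two transition matrices would show whether month-to-month persistence changes.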
Two dynamic regimes in the human gut microbiome
Smillie, Chris S.; Alm, Eric J.
2017-01-01
The gut microbiome is a dynamic system that changes with host development, health, behavior, diet, and microbe-microbe interactions. Prior work on gut microbial time series has largely focused on autoregressive models (e.g. Lotka-Volterra). However, we show that most of the variance in microbial time series is non-autoregressive. In addition, we show how community state-clustering is flawed when it comes to characterizing within-host dynamics and that more continuous methods are required. Most organisms exhibited stable, mean-reverting behavior suggestive of fixed carrying capacities and abundant taxa were largely shared across individuals. This mean-reverting behavior allowed us to apply sparse vector autoregression (sVAR)—a multivariate method developed for econometrics—to model the autoregressive component of gut community dynamics. We find a strong phylogenetic signal in the non-autoregressive co-variance from our sVAR model residuals, which suggests niche filtering. We show how changes in diet are also non-autoregressive and that Operational Taxonomic Units strongly correlated with dietary variables have much less of an autoregressive component to their variance, which suggests that diet is a major driver of microbial dynamics. Autoregressive variance appears to be driven by multi-day recovery from frequent facultative anaerobe blooms, which may be driven by fluctuations in luminal redox. Overall, we identify two dynamic regimes within the human gut microbiota: one likely driven by external environmental fluctuations, and the other by internal processes. PMID:28222117
NASA Astrophysics Data System (ADS)
Weiss, S. B.
2017-12-01
Impacts of climate change in the Great Basin will manifest through changes in the hydrologic cycle. Downscaled climate data and projections run through the Basin Characterization Model (BCM) produce time series of hydrologic response - recharge, runoff, actual evapotranspiration (AET), and climatic water deficit (CWD) - that directly affect water resources and vegetation. More than 50 climate projections from CMIP5 were screened using a cluster analysis of end-century (2077-2099) seasonal precipitation and annual temperature to produce a reduced subset of 12 climate futures that cover a wide range of macroclimate response. Importantly, variations among GCMs in summer precipitation produced by the SW monsoon are captured. Data were averaged within 84 HUC8 watersheds with widely varying climate, topography, and geology. Resultant time series allow for multivariate analysis of hydrologic response, especially partitioning between snowpack, recharge, runoff, and actual evapotranspiration. Because the bulk of snowpack accumulation is restricted to small areas of isolated mountain ranges, losses of snowpack can be extreme as snowline moves up the mountains with warming. Loss of snowpack also affects recharge and runoff rates, and importantly, the recharge/runoff ratio - as snowpacks fade, recharge tends to increase relative to runoff. Thresholds for regime shifts can be identified, but the unique topography and geology of each basin must be considered in assessing hydrologic response.
Two dynamic regimes in the human gut microbiome.
Gibbons, Sean M; Kearney, Sean M; Smillie, Chris S; Alm, Eric J
2017-02-01
The gut microbiome is a dynamic system that changes with host development, health, behavior, diet, and microbe-microbe interactions. Prior work on gut microbial time series has largely focused on autoregressive models (e.g. Lotka-Volterra). However, we show that most of the variance in microbial time series is non-autoregressive. In addition, we show how community state-clustering is flawed when it comes to characterizing within-host dynamics and that more continuous methods are required. Most organisms exhibited stable, mean-reverting behavior suggestive of fixed carrying capacities and abundant taxa were largely shared across individuals. This mean-reverting behavior allowed us to apply sparse vector autoregression (sVAR)-a multivariate method developed for econometrics-to model the autoregressive component of gut community dynamics. We find a strong phylogenetic signal in the non-autoregressive co-variance from our sVAR model residuals, which suggests niche filtering. We show how changes in diet are also non-autoregressive and that Operational Taxonomic Units strongly correlated with dietary variables have much less of an autoregressive component to their variance, which suggests that diet is a major driver of microbial dynamics. Autoregressive variance appears to be driven by multi-day recovery from frequent facultative anaerobe blooms, which may be driven by fluctuations in luminal redox. Overall, we identify two dynamic regimes within the human gut microbiota: one likely driven by external environmental fluctuations, and the other by internal processes.
NASA Astrophysics Data System (ADS)
Eroglu, Deniz; Marwan, Norbert
2017-04-01
The complex nature of a variety of phenomena in physical, biological, or earth sciences is driven by a large number of degrees of freedom which are strongly interconnected. Although the evolution of such systems is described by multivariate time series (MTS), so far research mostly focuses on analyzing these components one by one. Recurrence based analyses are powerful methods to understand the underlying dynamics of a dynamical system and have been used for many successful applications including examples from earth science, economics, or chemical reactions. The backbone of these techniques is creating the phase space of the system. However, increasing the dimension of a system requires increasing the length of the time series in order to get significant and reliable results. This requirement is one of the challenges in many disciplines, in particular in palaeoclimate; thus, it is not easy to create a phase space from measured MTS due to the limited number of available observations (samples). To overcome this problem, we suggest creating recurrence networks from each component of the system and combining them into a multiplex network structure, the multiplex recurrence network (MRN). We test the MRN by using prototypical mathematical models and demonstrate its use by studying high-dimensional palaeoclimate dynamics derived from pollen data from the Bear Lake (Utah, US). By using the MRN, we can distinguish typical climate transition events, e.g., those between Marine Isotope Stages.
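The construction described above, one recurrence network per component stacked into a multiplex, can be sketched as follows. This is a simplified illustration under stated assumptions: it compares scalar values directly instead of embedding each component in a reconstructed phase space, it uses one fixed threshold `eps` for all layers, and the `edge_overlap` similarity is just one simple way to compare layers, not necessarily the measure used by the authors. All function names and data are hypothetical.

```python
def recurrence_network(series, eps):
    """Adjacency matrix linking time points whose values differ by < eps.

    Uses the scalar values directly (no time-delay embedding) and drops
    self-loops, which is a simplification chosen for this sketch.
    """
    n = len(series)
    return [[1 if i != j and abs(series[i] - series[j]) < eps else 0
             for j in range(n)] for i in range(n)]


def multiplex(components, eps):
    """One recurrence-network layer per component of the multivariate series."""
    return [recurrence_network(c, eps) for c in components]


def edge_overlap(layer_a, layer_b):
    """Fraction of edges present in either layer that appear in both."""
    both = either = 0
    for row_a, row_b in zip(layer_a, layer_b):
        for a, b in zip(row_a, row_b):
            both += a and b
            either += a or b
    return both / either if either else 0.0


# Two made-up components observed at the same four time points.
components = [[0.0, 0.1, 1.0, 1.1], [0.0, 1.0, 0.05, 1.05]]
layers = multiplex(components, eps=0.2)
overlap = edge_overlap(layers[0], layers[1])
```

Because every layer shares the same nodes (the time points), each layer only needs as many samples as a univariate analysis, which is the point of the multiplex construction for short palaeoclimate records. Tracking a layer similarity such as `overlap` in sliding windows is one way transitions could be flagged.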
NASA Astrophysics Data System (ADS)
Murawski, Aline; Bürger, Gerd; Vorogushyn, Sergiy; Merz, Bruno
2016-04-01
The use of a weather pattern based approach for downscaling of coarse, gridded atmospheric data, as usually obtained from the output of general circulation models (GCM), allows for investigating the impact of anthropogenic greenhouse gas emissions on fluxes and state variables of the hydrological cycle, such as runoff in large river catchments. Here we aim at attributing changes in high flows in the Rhine catchment to anthropogenic climate change. Therefore we run an objective classification scheme (simulated annealing and diversified randomisation - SANDRA, available from the cost733 classification software) on ERA20C reanalysis data and apply the established classification to GCMs from the CMIP5 project. After deriving weather pattern time series from GCM runs using forcing from all greenhouse gases (All-Hist) and using natural greenhouse gas forcing only (Nat-Hist), a weather generator will be employed to obtain climate data time series for the hydrological model. The parameters of the weather pattern classification (i.e. spatial extent, number of patterns, classification variables) need to be selected in a way that allows for good stratification of the meteorological variables that are of interest for the hydrological modelling. We evaluate the skill of the classification in stratifying meteorological data using a multi-variable approach. This allows for estimating the stratification skill for all meteorological variables together, not separately as usually done in existing similar work. The advantage of the multi-variable approach is to properly account for situations where e.g. two patterns are associated with similar mean daily temperature, but one pattern is dry while the other one is related to considerable amounts of precipitation. Thus, the separation of these two patterns would not be justified when considering temperature only, but is perfectly reasonable when accounting for precipitation as well.
Besides that, the weather patterns derived from reanalyses data should be well represented in the All-Hist GCM runs in terms of e.g. frequency, seasonality, and persistence. In this contribution we show how to select the most appropriate weather pattern classification and how the classes derived from it are reflected in the GCMs.
Evaluating the impact of a mandatory pre-abortion ultrasound viewing law: A mixed methods study.
Upadhyay, Ushma D; Kimport, Katrina; Belusa, Elise K O; Johns, Nicole E; Laube, Douglas W; Roberts, Sarah C M
2017-01-01
Since mid-2013, Wisconsin abortion providers have been legally required to display and describe pre-abortion ultrasound images. We aimed to understand the impact of this law. We used a mixed-methods study design at an abortion facility in Wisconsin. We abstracted data from medical charts one year before the law to one year after and used multivariable models, mediation/moderation analysis, and interrupted time series to assess the impact of the law, viewing, and decision certainty on likelihood of continuing the pregnancy. We conducted in-depth interviews with women in the post-law period about their ultrasound experience and analyzed them using elaborative and modified grounded theory. A total of 5342 charts were abstracted; 8.7% continued their pregnancies pre-law and 11.2% post-law (p = 0.002). A multivariable model confirmed the law was associated with higher odds of continuing pregnancy (aOR = 1.23, 95% CI: 1.01-1.50). Decision certainty (aOR = 6.39, 95% CI: 4.72-8.64) and having to pay fully out of pocket (aOR = 4.98, 95% CI: 3.86-6.41) were most strongly associated with continuing pregnancy. Ultrasound viewing fully mediated the relationship between the law and continuing pregnancy. Interrupted time series analyses found no significant effect of the law but may have been underpowered to detect such a small effect. Nineteen of twenty-three women interviewed viewed their ultrasound image. Most reported no impact on their abortion decision; five reported a temporary emotional impact or increased certainty about choosing abortion. Two women reported that viewing helped them decide to continue the pregnancy; both also described preexisting decision uncertainty. This law caused an increase in viewing rates and a statistically significant but small increase in continuing pregnancy rates. However, the majority of women were certain of their abortion decision and the law did not change their decision. 
Other factors were more significant in women's decision-making, suggesting evaluations of restrictive laws should take account of the broader social environment.
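As a rough illustration of how the reported percentages relate to an odds ratio, the sketch below computes the crude (unadjusted) odds ratio from the pre-law and post-law proportions quoted in the abstract. This is not the authors' model: the reported aOR of 1.23 comes from a multivariable model with covariates, so the crude figure is expected to differ. The function names are hypothetical.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) into odds."""
    return p / (1.0 - p)


def crude_odds_ratio(p_exposed, p_reference):
    """Unadjusted odds ratio of two proportions; no covariate adjustment."""
    return odds(p_exposed) / odds(p_reference)


# Proportions continuing pregnancy, as quoted in the abstract.
pre_law = 0.087
post_law = 0.112

crude_or = crude_odds_ratio(post_law, pre_law)
print(round(crude_or, 2))  # crude OR, before any adjustment
```

The gap between the crude value and the adjusted one reflects the covariates in the multivariable model, which this sketch deliberately ignores.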
NASA Astrophysics Data System (ADS)
Binder, Kyle Edwin
The U.S. energy sector has undergone continuous change in the regulatory, technological, and market environments. These developments show no signs of slowing. Accordingly, it is imperative that energy market regulators and participants develop a strong comprehension of market dynamics and the potential implications of their actions. This dissertation contributes to a better understanding of the past, present, and future of U.S. energy market dynamics and interactions with policy. Advancements in multivariate time series analysis are employed in three related studies of the electric power sector. Overall, results suggest that regulatory changes have had and will continue to have important implications for the electric power sector. The sector, however, has exhibited adaptability to past regulatory changes and is projected to remain resilient in the future. Tests for constancy of the long run parameters in a vector error correction model are applied to determine whether relationships among coal inventories in the electric power sector, input prices, output prices, and opportunity costs have remained constant over the past 38 years. Two periods of instability are found, the first following railroad deregulation in the U.S. and the second corresponding to a number of major regulatory changes in the electric power and natural gas sectors. Relationships among Renewable Energy Credit prices, electricity prices, and natural gas prices are estimated using a vector error correction model. Results suggest that Renewable Energy Credit prices do not completely behave as previously theorized in the literature. Potential reasons for the divergence between theory and empirical evidence are the relative immaturity of current markets and continuous institutional intervention. Potential impacts of future CO2 emissions reductions under the Clean Power Plan on economic and energy sector activity are estimated. 
Conditional forecasts based on an outlined path for CO2 emissions are developed from a factor-augmented vector autoregressive model for a large dataset. Unconditional and conditional forecasts are compared for U.S. industrial production, real personal income, and estimated factors. Results suggest that economic growth will be slower under the Clean Power Plan than it would otherwise; however, CO2 emissions reductions and economic growth can be achieved simultaneously.
Evaluating the impact of a mandatory pre-abortion ultrasound viewing law: A mixed methods study
Kimport, Katrina; Belusa, Elise K. O.; Johns, Nicole E.; Laube, Douglas W.; Roberts, Sarah C. M.
2017-01-01
Background Since mid-2013, Wisconsin abortion providers have been legally required to display and describe pre-abortion ultrasound images. We aimed to understand the impact of this law. Methods We used a mixed-methods study design at an abortion facility in Wisconsin. We abstracted data from medical charts one year before the law to one year after and used multivariable models, mediation/moderation analysis, and interrupted time series to assess the impact of the law, viewing, and decision certainty on likelihood of continuing the pregnancy. We conducted in-depth interviews with women in the post-law period about their ultrasound experience and analyzed them using elaborative and modified grounded theory. Results A total of 5342 charts were abstracted; 8.7% continued their pregnancies pre-law and 11.2% post-law (p = 0.002). A multivariable model confirmed the law was associated with higher odds of continuing pregnancy (aOR = 1.23, 95% CI: 1.01–1.50). Decision certainty (aOR = 6.39, 95% CI: 4.72–8.64) and having to pay fully out of pocket (aOR = 4.98, 95% CI: 3.86–6.41) were most strongly associated with continuing pregnancy. Ultrasound viewing fully mediated the relationship between the law and continuing pregnancy. Interrupted time series analyses found no significant effect of the law but may have been underpowered to detect such a small effect. Nineteen of twenty-three women interviewed viewed their ultrasound image. Most reported no impact on their abortion decision; five reported a temporary emotional impact or increased certainty about choosing abortion. Two women reported that viewing helped them decide to continue the pregnancy; both also described preexisting decision uncertainty. Conclusions This law caused an increase in viewing rates and a statistically significant but small increase in continuing pregnancy rates. However, the majority of women were certain of their abortion decision and the law did not change their decision. 
Other factors were more significant in women’s decision-making, suggesting evaluations of restrictive laws should take account of the broader social environment. PMID:28746377
Rand, Cynthia M; Vincelli, Phyllis; Goldstein, Nicolas P N; Blumkin, Aaron; Szilagyi, Peter G
2017-01-01
To assess the effect of phone or text message reminders to parents of adolescents on human papillomavirus (HPV) vaccine series completion in Rochester, NY. We performed parallel randomized controlled trials of phone and text reminders for HPV vaccine for parents of 11- to 17-year olds in three urban primary care clinics. The main outcome measures were time to receipt of the third dose of HPV vaccine and HPV vaccination rates. We enrolled 178 phone intervention (180 control) and 191 text intervention (200 control) participants. In multivariate survival analysis controlling for gender, age, practice, insurance, race, and ethnicity, the time from enrollment to receipt of the third HPV dose for those receiving a phone reminder compared with controls was not significant overall (hazard ratio [HR] = 1.30, p = .12) but was for those enrolling at dose 1 (HR = 1.91, p = .007). There was a significant difference in those receiving a text reminder compared with controls (HR = 2.34, p < .0001; an average of 71 days earlier). At the end of the study, 48% of phone intervention versus 40% of phone control (p = .34), and 49% of text intervention versus 30% of text control (p = .001) adolescents had received 3 HPV vaccine doses. In this urban population of parents of adolescents, text message reminders for HPV vaccine completion for those who had already started the series were effective, whereas phone message reminders were only effective for those enrolled at dose 1. Copyright © 2016 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Kuo, Yi-Ming; Lin, Hsing-Juh
2010-01-01
We examined the environmental factors most responsible for the 8-year temporal dynamics of the intertidal seagrass Thalassia hemprichii in southern Taiwan. Dynamic factor analysis (DFA), a dimension-reduction technique, was applied to identify common trends in a multivariate time series and the relationships between this series and interacting environmental variables. The results of dynamic factor models (DFMs) showed that the leaf growth rate of the seagrass was mainly influenced by salinity (Sal), tidal range (TR), turbidity (K), and a common trend representing unexplained variability in the observed time series. Sal was the primary variable explaining the temporal dynamics of the leaf growth rate compared to TR and K. K and TR had larger influences on the leaf growth rate in low- than in high-elevation beds. In addition to K, TR, and Sal, UV-B radiation (UV-B), sediment depth (SD), and a common trend accounted for long-term temporal variations of the above-ground biomass. Thus, K, TR, Sal, UV-B, and SD are the predominant environmental variables that described temporal growth variations of the intertidal seagrass T. hemprichii in southern Taiwan. In addition to environmental variables, human activities may be contributing to negative impacts on the seagrass beds; this human interference may have been responsible for the unexplained common trend in the DFMs. The DFA proved successful in analyzing complicated ecological and environmental data in this study; the important environmental variables and the impacts of human activities along the coast should be taken into account when managing a coastal environment for the conservation of intertidal seagrass beds.
NASA Astrophysics Data System (ADS)
Vrac, Mathieu
2018-06-01
Climate simulations often suffer from statistical biases with respect to observations or reanalyses. It is therefore common to correct (or adjust) those simulations before using them as inputs into impact models. However, most bias correction (BC) methods are univariate and so do not account for the statistical dependences linking the different locations and/or physical variables of interest. In addition, they are often deterministic, and stochasticity is frequently needed to investigate climate uncertainty and to add constrained randomness to climate simulations that do not possess a realistic variability. This study presents a multivariate method of rank resampling for distributions and dependences (R2D2) bias correction allowing one to adjust not only the univariate distributions but also their inter-variable and inter-site dependence structures. Moreover, the proposed R2D2 method provides some stochasticity since it can generate as many multivariate corrected outputs as the number of statistical dimensions (i.e., number of grid cell × number of climate variables) of the simulations to be corrected. It is based on an assumption of stability in time of the dependence structure - making it possible to deal with a high number of statistical dimensions - that lets the climate model drive the temporal properties and their changes in time. R2D2 is applied on temperature and precipitation reanalysis time series with respect to high-resolution reference data over the southeast of France (1506 grid cell). Bivariate, 1506-dimensional and 3012-dimensional versions of R2D2 are tested over a historical period and compared to a univariate BC. How the different BC methods behave in a climate change context is also illustrated with an application to regional climate simulations over the 2071-2100 period. 
The results indicate that the 1d-BC basically reproduces the climate model multivariate properties, 2d-R2D2 is only satisfying in the inter-variable context, 1506d-R2D2 strongly improves inter-site properties and 3012d-R2D2 is able to account for both. Applications of the proposed R2D2 method to various climate datasets are relevant for many impact studies. The perspectives of improvements are numerous, such as introducing stochasticity in the dependence itself, questioning its stability assumption, and accounting for temporal properties adjustment while including more physics in the adjustment procedures.
Improving Calculus II and III through the Redistribution of Topics
ERIC Educational Resources Information Center
George, C. Yousuf; Koetz, Matt; Lewis, Heather A.
2016-01-01
Three years ago our mathematics department rearranged the topics in second and third semester calculus, moving multivariable calculus to the second semester and series to the third semester. This paper describes the new arrangement of topics, and how it could be adapted to calculus curricula at different schools. It also explains the benefits we…
NASA Astrophysics Data System (ADS)
Nicolae Lerma, A.; Bulteau, T.; Elineau, S.; Paris, F.; Pedreros, R.
2016-12-01
Marine submersion is an increasing concern for coastal cities as urban development reinforces their vulnerabilities while climate change is likely to foster the frequency and magnitude of submersions. Characterising the coastal flooding hazard is therefore of paramount importance to ensure the security of people living in such places and for coastal planning. A hazard is commonly defined as an adverse phenomenon, often represented by a magnitude of a variable of interest (e.g. flooded area), hereafter called response variable, associated with a probability of exceedance or, alternatively, a return period. Characterising the coastal flooding hazard consists in finding the correspondence between the magnitude and the return period. The difficulty lies in the fact that the assessment is usually performed using physical numerical models taking as inputs scenarios composed by multiple forcing conditions that are most of the time interdependent. Indeed, a time series of the response variable is usually not available so we have to deal instead with time series of forcing variables (e.g. water level, waves). Thus, the problem is twofold: on the one hand, the definition of scenarios is a multivariate matter; on the other hand, it is tricky and approximate to associate the resulting response, being the output of the physical numerical model, to the return period defined for the scenarios. In this study, we illustrate the problem on the district of Leucate, located in the French Mediterranean coast. A multivariate extreme value analysis of waves and water levels is performed offshore using a conditional extreme model, then two different methods are used to define and select 100-year scenarios of forcing variables: one based on joint exceedance probability contours, a method classically used in coastal risks studies, the other based on environmental contours, which are commonly used in the field of structure design engineering. 
We show that these two methods enable one to frame the true 100-year response variable. The selected scenarios are propagated to the shore through a high resolution flood modelling coupling overflowing and overtopping processes. Results in terms of inundated areas and inland water volumes are finally compared for the two methods, giving upper and lower bounds for the true response variables.
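The correspondence between a return period and a probability of exceedance mentioned above can be sketched in a few lines. This is a minimal illustration assuming independent, identically distributed years, which is a simplification and not part of the study; the function names are hypothetical.

```python
def annual_exceedance_probability(return_period_years):
    """Annual probability of exceedance for a given return period T: p = 1 / T."""
    return 1.0 / return_period_years


def prob_at_least_one_exceedance(return_period_years, horizon_years):
    """Probability of at least one exceedance within the horizon.

    Assumes independent years with a constant annual exceedance probability.
    """
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


p_100 = annual_exceedance_probability(100)          # 0.01 per year
risk_100 = prob_at_least_one_exceedance(100, 100)   # about 0.63 over a century
print(p_100, round(risk_100, 3))
```

Under these assumptions a 100-year event has roughly a 63% chance of occurring at least once in any 100-year window, which is why the return period should not be read as a waiting time.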
Scalable Joint Models for Reliable Uncertainty-Aware Event Prediction.
Soleimani, Hossein; Hensman, James; Saria, Suchi
2017-08-21
Missing data and noisy observations pose significant challenges for reliably predicting events from irregularly sampled multivariate time series (longitudinal) data. Imputation methods, which are typically used for completing the data prior to event prediction, lack a principled mechanism to account for the uncertainty due to missingness. Alternatively, state-of-the-art joint modeling techniques can be used for jointly modeling the longitudinal and event data and compute event probabilities conditioned on the longitudinal observations. These approaches, however, make strong parametric assumptions and do not easily scale to multivariate signals with many observations. Our proposed approach consists of several key innovations. First, we develop a flexible and scalable joint model based upon sparse multiple-output Gaussian processes. Unlike state-of-the-art joint models, the proposed model can explain highly challenging structure including non-Gaussian noise while scaling to large data. Second, we derive an optimal policy for predicting events using the distribution of the event occurrence estimated by the joint model. The derived policy trades-off the cost of a delayed detection versus incorrect assessments and abstains from making decisions when the estimated event probability does not satisfy the derived confidence criteria. Experiments on a large dataset show that the proposed framework significantly outperforms state-of-the-art techniques in event prediction.
Relationship between cataract severity and socioeconomic status.
Wesolosky, Jason D; Rudnisky, Christopher J
2013-12-01
To determine the relationship between cataract severity and socioeconomic status (SES). Retrospective, observational case series. A total of 1350 eyes underwent phacoemulsification cataract extraction by a single surgeon using an Alcon Infiniti system. Cataract severity was measured using phaco time in seconds. SES was measured using area-level aggregate census data: median income, education, proportion of common-law couples, and employment rate. Preoperative best corrected visual acuity was obtained and converted to logarithm of the minimum angle of resolution values. For patients undergoing bilateral surgery, the generalized estimating equation was used to account for the correlation between eyes. Univariate analyses were performed using simple regression, and multivariate analyses were performed to account for variables with significant relationships (p < 0.05) on univariate testing. Sensitivity analyses were performed to assess the effect of including patient age in the controlled analyses. Multivariate analyses demonstrated that cataracts were more severe when the median income was lower (p = 0.001) and the proportion of common-law couples living in a patient's community (p = 0.012) and the unemployment rate (p = 0.002) were higher. These associations persisted even when controlling for patient age. Patients of lower SES have more severe cataracts. Copyright © 2013 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
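The conversion of visual acuity to logarithm of the minimum angle of resolution (logMAR) values can be sketched as follows. The abstract does not state the notation in which acuity was recorded; the example assumes Snellen fractions such as 20/40, and the function name is hypothetical.

```python
import math


def snellen_to_logmar(numerator, denominator):
    """logMAR = log10(MAR), where MAR = denominator / numerator of the Snellen fraction."""
    return math.log10(denominator / numerator)


for num, den in [(20, 20), (20, 40), (20, 200)]:
    print(f"{num}/{den} -> logMAR {snellen_to_logmar(num, den):.2f}")
```

Higher logMAR values mean worse acuity, so 20/20 maps to 0.00 and 20/200 to 1.00.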
Insufficient sleep predicts clinical burnout.
Söderström, Marie; Jeding, Kerstin; Ekstedt, Mirjam; Perski, Aleksander; Akerstedt, Torbjörn
2012-04-01
The present prospective study aimed to identify risk factors for subsequent clinical burnout. Three hundred eighty-eight working individuals completed a baseline questionnaire regarding work stress, sleep, mood, health, and so forth. During a 2-year period, 15 subjects (7 women and 8 men) of the total sample were identified as "burnout cases," as they were assessed and referred to treatment for clinical burnout. Questionnaire data from the baseline measurement were used as independent variables in a series of logistic regression analyses to predict clinical burnout. The results identified "too little sleep (< 6 h)" as the main risk factor for burnout development, with adjustment for "work demands," "thoughts of work during leisure time," and "sleep quality." The first two factors were significant predictors in earlier steps of the multivariate regression. The results indicate that insufficient sleep, preoccupation with thoughts of work during leisure time, and high work demands are risk factors for subsequent burnout. The results suggest a chain of causation. PsycINFO Database Record (c) 2012 APA, all rights reserved.
Eastin, Matthew D.; Delmelle, Eric; Casas, Irene; Wexler, Joshua; Self, Cameron
2014-01-01
Dengue fever transmission results from complex interactions between the virus, human hosts, and mosquito vectors—all of which are influenced by environmental factors. Predictive models of dengue incidence rate, based on local weather and regional climate parameters, could benefit disease mitigation efforts. Time series of epidemiological and meteorological data for the urban environment of Cali, Colombia are analyzed from January of 2000 to December of 2011. Significant dengue outbreaks generally occur during warm-dry periods with extreme daily temperatures confined between 18°C and 32°C—the optimal range for mosquito survival and viral transmission. Two environment-based, multivariate, autoregressive forecast models are developed that allow dengue outbreaks to be anticipated from 2 weeks to 6 months in advance. These models have the potential to enhance existing dengue early warning systems, ultimately supporting public health decisions on the timing and scale of vector control efforts. PMID:24957546
Forecasting the stochastic demand for inpatient care: the case of the Greek national health system.
Boutsioli, Zoe
2010-08-01
The aim of this study is to estimate the unexpected demand of Greek public hospitals. A multivariate model with four explanatory variables is used: the weekend effect, the duty effect, the summer holiday and the official holiday. Ordinary least squares is used to estimate the impact of these variables on the daily hospital emergency admissions series. The forecasted residuals of hospital regressions for each year give the estimated stochastic demand. Daily emergency admissions decline during weekends, summer months and official holidays, and increase on duty hospital days. Stochastic hospital demand varies both among hospitals and over the five-year time period under investigation. Variations among hospitals are larger than time variations. Hospital managers and health policy-makers can benefit from forecasts of the future flows of emergency patients. The benefit can be both managerial and economic. More advanced models including additional daily variables such as weather forecasts could provide more accurate estimations.
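The regression described above can be written out as a worked equation. This is a sketch under assumptions: the symbols $A_t$, $W_t$, $D_t$, $S_t$ and $H_t$ are introduced here for illustration and are not taken from the paper, and coding the four effects as 0/1 dummy variables is an assumption.

```latex
\begin{align}
A_t &= \beta_0 + \beta_1 W_t + \beta_2 D_t + \beta_3 S_t + \beta_4 H_t + \varepsilon_t, \\
\hat{\varepsilon}_t &= A_t - \bigl(\hat{\beta}_0 + \hat{\beta}_1 W_t + \hat{\beta}_2 D_t + \hat{\beta}_3 S_t + \hat{\beta}_4 H_t\bigr).
\end{align}
```

Here $A_t$ is the number of emergency admissions on day $t$, and $W_t$, $D_t$, $S_t$, $H_t$ indicate weekend, duty day, summer holiday and official holiday. The residual $\hat{\varepsilon}_t$ plays the role of the stochastic (unexpected) demand, and the signs reported in the abstract correspond to $\hat{\beta}_1, \hat{\beta}_3, \hat{\beta}_4 < 0$ and $\hat{\beta}_2 > 0$.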
Characterization of spatial and temporal variability in hydrochemistry of Johor Straits, Malaysia.
Abdullah, Pauzi; Abdullah, Sharifah Mastura Syed; Jaafar, Othman; Mahmud, Mastura; Khalik, Wan Mohd Afiq Wan Mohd
2015-12-15
Changes in the hydrochemistry of the Johor Straits over 5 years of monitoring were characterized. Water quality data sets (27 stations and 19 parameters) collected in this area were interpreted using multivariate statistical analysis. Cluster analysis grouped all the stations into four clusters ((Dlink/Dmax) × 100 < 90) and two clusters ((Dlink/Dmax) × 100 < 80) for site and period similarities. Principal component analysis rendered six significant components (eigenvalue > 1) that explained 82.6% of the total variance of the data set. The classification matrix of the discriminant analysis assigned 88.9-92.6% and 83.3-100% correctness in spatial and temporal variability, respectively. Time series analysis then confirmed that only four parameters showed no significant change over time. Therefore, it is imperative that the environmental impact of reclamation and dredging works, municipal or industrial discharge, marine aquaculture and shipping activities in this area be effectively controlled and managed. Copyright © 2015 Elsevier Ltd. All rights reserved.
The demand for distilled spirits: an empirical investigation.
McCornac, D C; Filante, R W
1984-03-01
Economic and social factors that explain variations in the consumption of distilled spirits among political jurisdictions are examined. Particular emphasis is placed on the economic roles of price and the unemployment rate. Using multivariate regression analysis, equations are estimated for three separate time periods of 1970-1975. In addition, a pooled cross-sectional time-series analysis is undertaken for the entire time period. The dependent variable is the apparent per capita consumption of distilled spirits. The independent variables include price, availability and socioeconomic factors that determine consumption patterns. The results indicate that the demand for distilled spirits is price inelastic, which implies that a 1% change in price will result in a less than 1% change in the amount purchased, everything else being equal. A rise in price will increase total revenue. Thus, a tax increase on the commodity will generate an increase in tax revenue. The unemployment rate is shown to have a significant impact on the consumption of distilled spirits. The results suggest that further study into the relationship between unemployment and the consumption of distilled spirits is desirable.
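The step from inelastic demand to rising revenue can be made explicit with a short derivation. The notation ($p$ for price, $q(p)$ for quantity demanded, $R$ for total revenue, $\varepsilon$ for the price elasticity) is introduced here for illustration and does not come from the paper.

```latex
\begin{align}
R(p) &= p\,q(p), \qquad \varepsilon = \frac{dq}{dp}\,\frac{p}{q}, \\
\frac{dR}{dp} &= q + p\,\frac{dq}{dp} = q\,(1 + \varepsilon).
\end{align}
```

With inelastic demand, $-1 < \varepsilon < 0$, so $1 + \varepsilon > 0$ and $dR/dp > 0$: a price increase reduces the quantity purchased by proportionally less than the price rises, and total revenue goes up.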
The Gaussian copula model for the joint deficit index for droughts
NASA Astrophysics Data System (ADS)
Van de Vyver, H.; Van den Bergh, J.
2018-06-01
The characterization of droughts and their impacts is very dependent on the time scale that is involved. In order to obtain an overall drought assessment, the cumulative effects of water deficits over different times need to be examined together. For example, the recently developed joint deficit index (JDI) is based on multivariate probabilities of precipitation over various time scales from 1- to 12-months, and was constructed from empirical copulas. In this paper, we examine the Gaussian copula model for the JDI. We model the covariance across the temporal scales with a two-parameter function that is commonly used in the specific context of spatial statistics or geostatistics. The validity of the covariance models is demonstrated with long-term precipitation series. Bootstrap experiments indicate that the Gaussian copula model has advantages over the empirical copula method in the context of drought severity assessment: (i) it is able to quantify droughts outside the range of the empirical copula, (ii) provides adequate drought quantification, and (iii) provides a better understanding of the uncertainty in the estimation.
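For reference, the standard definition of a Gaussian copula is given below. The symbols are generic textbook notation rather than the authors' own, and the two-parameter covariance function used in the paper is not reproduced here because the abstract does not specify it.

```latex
\begin{equation}
C_{\Sigma}(u_1, \dots, u_d) = \Phi_{\Sigma}\bigl(\Phi^{-1}(u_1), \dots, \Phi^{-1}(u_d)\bigr)
\end{equation}
```

Here $\Phi^{-1}$ is the quantile function of the standard normal distribution, $\Phi_{\Sigma}$ is the joint distribution function of a zero-mean multivariate normal with correlation matrix $\Sigma$, and each $u_k \in [0, 1]$ would be the marginal probability of the precipitation total on the $k$-th time scale (for the 1- to 12-month scales described above, $d = 12$).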
The influence of economic business cycles on United States suicide rates.
Wasserman, I M
1984-01-01
A number of social science investigators have shown that a downturn in the economy leads to an increase in the suicide rate. However, the previous works on the subject are flawed by the fact that they employ years as their temporal unit of analysis. This time period is so large that it makes it difficult for investigators to precisely determine the length of the lag effect, while at the same time removing the autocorrelation effects. Also, although most works on suicide and the business cycle employ unemployment as a measure of a downturn in the business cycle, the average duration of unemployment represents a better measure for determining the social impact of an economic downturn. From 1947 to 1977 the average monthly duration of unemployment is statistically related to the suicide rate using multivariate time-series analysis. From 1910 to 1939 the Ayres business index, a surrogate measure for movement in the business cycle, is statistically related to the monthly suicide rate. An examination of the findings confirms that in most cases a downturn in the economy causes an increase in the suicide rate.
Quezada Loaiza, Carlos Andrés; Velázquez Martín, María Teresa; Jiménez López-Guarch, Carmen; Ruiz Cano, María José; Navas Tejedor, Paula; Carreira, Patricia Esmeralda; Flox Camacho, Ángela; de Pablo Gafas, Alicia; Delgado Jiménez, Juan Francisco; Gómez Sánchez, Miguel Ángel; Escribano Subías, Pilar
2017-11-01
Pulmonary arterial hypertension (PAH) is characterized by increased pulmonary vascular resistance, right ventricular dysfunction and death. Despite scientific advances, it is still associated with high morbidity and mortality. The aim is to describe the clinical approach and determine the prognostic factors of patients with PAH treated in a national reference center over 30 years. Three hundred and seventy-nine consecutive patients with PAH (January 1984 to December 2014) were studied. They were divided into 3 periods of time: before 2004, 2004-2009 and 2010-2014. Prognostic factors for clinical deterioration were analyzed by multivariate analysis. Median age was 44 years (68.6% women), and 72% of patients were in functional class III-IV. An increase in more complex etiologies, namely pulmonary veno-occlusive disease and portopulmonary hypertension, was observed in the last period. Upfront combination therapy increased significantly (5% before 2004 vs 27% after 2010; P < .05). In multivariate analysis, age, sex, etiology and combined clinical variables were independent predictors of clinical deterioration (P < .05). Survival free from death or transplantation at 1, 3 and 5 years was 92.2%, 80.6% and 68.5%, respectively. The median survival was 9 years (95% confidence interval, 7.532-11.959). Conclusions: PAH is a heterogeneous and complex disease; the median survival free from death or transplantation in our series is 9 years after diagnosis. The structure of a multidisciplinary PAH unit must adapt quickly to changes that occur over time, incorporating new diagnostic and therapeutic techniques. Copyright © 2017 Sociedad Española de Cardiología. Published by Elsevier España, S.L.U. All rights reserved.
A regressive methodology for estimating missing data in rainfall daily time series
NASA Astrophysics Data System (ADS)
Barca, E.; Passarella, G.
2009-04-01
The "presence" of gaps in environmental data time series represents a very common, but extremely critical problem, since it can produce biased results (Rubin, 1976). Missing data plagues almost all surveys. The problem is how to deal with missing data once it has been deemed impossible to recover the actual missing values. Apart from the amount of missing data, another issue which plays an important role in the choice of any recovery approach is the evaluation of "missingness" mechanisms. When data missing is conditioned by some other variable observed in the data set (Schafer, 1997) the mechanism is called MAR (Missing at Random). Otherwise, when the missingness mechanism depends on the actual value of the missing data, it is called NCAR (Not Missing at Random). This last is the most difficult condition to model. In the last decade interest arose in the estimation of missing data by using regression (single imputation). More recently multiple imputation has become also available, which returns a distribution of estimated values (Scheffer, 2002). In this paper an automatic methodology for estimating missing data is presented. In practice, given a gauging station affected by missing data (target station), the methodology checks the randomness of the missing data and classifies the "similarity" between the target station and the other gauging stations spread over the study area. Among different methods useful for defining the similarity degree, whose effectiveness strongly depends on the data distribution, the Spearman correlation coefficient was chosen. Once defined the similarity matrix, a suitable, nonparametric, univariate, and regressive method was applied in order to estimate missing data in the target station: the Theil method (Theil, 1950). Even though the methodology revealed to be rather reliable an improvement of the missing data estimation can be achieved by a generalization. 
A first possible improvement consists in extending the univariate technique to a multivariate approach. Another follows the paradigm of "multiple imputation" (Rubin, 1987; Rubin, 1988), which consists in using a set of "similar stations" rather than only the most similar one. In this way, a sort of estimation range can be determined, allowing the introduction of uncertainty. Finally, time series can be grouped on the basis of monthly rainfall rates defining classes of wetness (i.e., dry, moderately rainy and rainy), in order to perform the estimation on homogeneous data subsets. We expect that integrating the methodology with these enhancements will improve its reliability. The methodology was applied to the daily rainfall time series recorded in the Candelaro River Basin (Apulia, southern Italy) from 1970 to 2001. REFERENCES: Rubin, D.B., 1976. Inference and Missing Data. Biometrika 63, 581-592. Rubin, D.B., 1987. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley & Sons, Inc. Rubin, D.B., 1988. An overview of multiple imputation. In Survey Research Section, pp. 79-84, American Statistical Association. Schafer, J.L., 1997. Analysis of Incomplete Multivariate Data. Chapman & Hall. Scheffer, J., 2002. Dealing with Missing Data. Res. Lett. Inf. Math. Sci. 3, 153-160. Available online at http://www.massey.ac.nz/~wwiims/research/letters/ Theil, H., 1950. A rank-invariant method of linear and polynomial regression analysis. Indagationes Mathematicae 12, 85-91.
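The station-selection and regression steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the station names, the rainfall values, the assumption that neighbouring stations have no gaps, and the clipping of estimates at zero are all hypothetical choices made for illustration. The Theil estimator is taken here in its usual form (slope as the median of pairwise slopes, intercept as the median residual).

```python
from statistics import median


def ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def theil_fit(x, y):
    """Theil estimator: median of pairwise slopes, then median residual."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(len(x)) for j in range(i + 1, len(x))
              if x[j] != x[i]]
    slope = median(slopes)
    intercept = median(b - slope * a for a, b in zip(x, y))
    return slope, intercept


def fill_gaps(target, neighbours):
    """Fill None entries of `target` from the most Spearman-similar neighbour.

    Assumption: every neighbour series is complete (no None values).
    """
    def paired(series):
        # Keep only the days on which the target station has an observation.
        return [(s, t) for s, t in zip(series, target) if t is not None]

    best = max(neighbours,
               key=lambda name: spearman(*zip(*paired(neighbours[name]))))
    xs, ys = zip(*paired(neighbours[best]))
    slope, intercept = theil_fit(xs, ys)
    # Assumption: rainfall cannot be negative, so estimates are clipped at 0.
    filled = [t if t is not None else max(0.0, slope * s + intercept)
              for s, t in zip(neighbours[best], target)]
    return best, filled


# Hypothetical daily rainfall (mm); None marks a gap at the target station.
target = [0.0, 2.0, None, 8.0, 4.0, None, 10.0]
neighbours = {
    "A": [0.0, 1.0, 3.0, 4.0, 2.0, 1.5, 5.0],
    "B": [5.0, 1.0, 2.0, 0.0, 3.0, 4.0, 2.5],
}
best_station, filled_series = fill_gaps(target, neighbours)
```

Using the median of pairwise slopes rather than a least-squares fit keeps the estimate robust to a few extreme rainfall days, which is the usual motivation for a rank-based method.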
Identification of factors affecting birth rate in Czech Republic
NASA Astrophysics Data System (ADS)
Zámková, Martina; Blašková, Veronika
2013-10-01
This article identifies factors, primarily economic ones, that affect birth rates in the Czech Republic. To find the relationship between the variables, we used multivariate regression analysis on time series of annual values (1994-2011) of both economic and demographic indicators. Because of potential problems with spurious dependence, we first transformed all series obtained from the Czech Statistical Office by taking first differences. The final model, which meets all assumptions, shows a positive relationship between birth rates and the financial situation of households. We described the financial situation of households by GDP per capita, gross wages and the consumer price index. As expected, a positive dependence was found for GDP per capita and gross wages, and a negative dependence for the consumer price index. In addition to these economic variables, the model also included demographic characteristics of the workforce and the number of employed people. It can be stated that if the Czech Republic wants to support an increase in the birth rate, it needs to consider financial support for households with small children.
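The differencing step mentioned in this abstract can be sketched as follows. The two series are invented numbers, not Czech Statistical Office data; they only illustrate how two trending series can look strongly correlated in levels while their year-to-year changes are not.

```python
def first_differences(series):
    """Return the changes between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical annual values of two upward-trending indicators.
x = [10, 12, 13, 15, 18, 20]
y = [5, 6, 8, 9, 11, 12]

r_levels = pearson(x, y)  # dominated by the shared trend
r_changes = pearson(first_differences(x), first_differences(y))
```

Differencing shortens each series by one observation, which matters for short annual samples such as 1994-2011.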
NASA Astrophysics Data System (ADS)
Gualandi, A.; Serpelloni, E.; Belardinelli, M. E.
2014-12-01
A critical point in the analysis of ground displacement time series is the development of data-driven methods that make it possible to discern and characterize the different sources that generate the observed displacements. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the explained variance of the dataset. It reproduces the original data using a limited number of Principal Components, but it also shows some deficiencies. Indeed, PCA does not perform well in solving the so-called Blind Source Separation (BSS) problem, i.e. in recovering and separating the original sources that generated the observed data. This is mainly due to the assumptions on which PCA relies: it looks for a new Euclidean space where the projected data are uncorrelated. Usually, uncorrelatedness is not a strong enough condition, and it has been proven that the BSS problem can be tackled by requiring the components to be independent. Independent Component Analysis (ICA) is, in fact, another popular technique adopted to approach this problem, and it can be used in all those fields where PCA is also applied. An ICA approach enables us to explain the time series while imposing fewer constraints on the model, and to reveal anomalies in the data such as transient signals. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions. This technique allows for more flexibility in the description of the pdf of the sources, giving a more reliable estimate of them. Here we present the application of the vbICA technique to GPS position time series.
First, we use vbICA on synthetic data that simulate a seismic cycle (interseismic + coseismic + postseismic + seasonal + noise), and study the ability of the algorithm to recover the original (known) sources of deformation. Secondly, we apply vbICA to different tectonically active scenarios, such as earthquakes in central and northern Italy, as well as the study of slow slip events in Cascadia.
Prognostic Significance of POLE Proofreading Mutations in Endometrial Cancer
Church, David N.; Stelloo, Ellen; Nout, Remi A.; Valtcheva, Nadejda; Depreeuw, Jeroen; ter Haar, Natalja; Noske, Aurelia; Amant, Frederic; Wild, Peter J.; Lambrechts, Diether; Jürgenliemk-Schulz, Ina M.; Jobsen, Jan J.; Smit, Vincent T. H. B. M.; Creutzberg, Carien L.; Bosse, Tjalling
2015-01-01
Background: Current risk stratification in endometrial cancer (EC) results in frequent over- and underuse of adjuvant therapy, and may be improved by novel biomarkers. We examined whether POLE proofreading mutations, recently reported in about 7% of ECs, predict prognosis. Methods: We performed targeted POLE sequencing in ECs from the PORTEC-1 and -2 trials (n = 788), and analyzed clinical outcome according to POLE status. We combined these results with those from three additional series (n = 628) by meta-analysis to generate multivariable-adjusted, pooled hazard ratios (HRs) for recurrence-free survival (RFS) and cancer-specific survival (CSS) of POLE-mutant ECs. All statistical tests were two-sided. Results: POLE mutations were detected in 48 of 788 (6.1%) ECs from PORTEC-1 and -2 and were associated with high tumor grade (P < .001). Women with POLE-mutant ECs had fewer recurrences (6.2% vs 14.1%) and EC deaths (2.3% vs 9.7%), though, in the total PORTEC cohort, differences in RFS and CSS were not statistically significant (multivariable-adjusted HR = 0.43, 95% CI = 0.13 to 1.37, P = .15; HR = 0.19, 95% CI = 0.03 to 1.44, P = .11 respectively). However, of 109 grade 3 tumors, 0 of 15 POLE-mutant ECs recurred, compared with 29 of 94 (30.9%) POLE wild-type cancers; reflected in statistically significantly greater RFS (multivariable-adjusted HR = 0.11, 95% CI = 0.001 to 0.84, P = .03). In the additional series, there were no EC-related events in any of 33 POLE-mutant ECs, resulting in a multivariable-adjusted, pooled HR of 0.33 for RFS (95% CI = 0.12 to 0.91, P = .03) and 0.26 for CSS (95% CI = 0.06 to 1.08, P = .06). Conclusion: POLE proofreading mutations predict favorable EC prognosis, independently of other clinicopathological variables, with the greatest effect seen in high-grade tumors. This novel biomarker may help to reduce overtreatment in EC. PMID:25505230
Multivariate analysis applied to monthly rainfall over Rio de Janeiro state, Brazil
NASA Astrophysics Data System (ADS)
Brito, Thábata T.; Oliveira-Júnior, José F.; Lyra, Gustavo B.; Gois, Givanildo; Zeri, Marcelo
2017-10-01
Spatial and temporal patterns of rainfall were identified over the state of Rio de Janeiro, southeast Brazil. The proximity to the coast and the complex topography create great diversity of rainfall over space and time. The dataset consisted of time series (1967-2013) of monthly rainfall over 100 meteorological stations. Clustering analysis made it possible to divide the stations into six groups (G1, G2, G3, G4, G5 and G6) with similar rainfall spatio-temporal patterns. A linear regression model was applied to a time series and a reference. The reference series was calculated from the average rainfall within a group, using nearby stations with higher correlation (Pearson). Based on t-test ( p < 0.05) all stations had a linear spatiotemporal trend. According to the clustering analysis, the first group (G1) contains stations located over the coastal lowlands and also over the ocean facing area of Serra do Mar (Sea ridge), a 1500 km long mountain range over the coastal Southeastern Brazil. The second group (G2) contains stations over all the state, from Serra da Mantiqueira (Mantiqueira Mountains) and Costa Verde (Green coast), to the south, up to stations in the Northern parts of the state. Group 3 (G3) contains stations in the highlands over the state (Serrana region), while group 4 (G4) has stations over the northern areas and the continent-facing side of Serra do Mar. The last two groups were formed with stations around Paraíba River (G5) and the metropolitan area of the city of Rio de Janeiro (G6). The driest months in all regions were June, July and August, while November, December and January were the rainiest months. Sharp transitions occurred when considering monthly accumulated rainfall: from January to February, and from February to March, likely associated with episodes of "veranicos", i.e., periods of 4-15 days of duration with no rainfall.
NASA Astrophysics Data System (ADS)
Yuan, Naiming; Xoplaki, Elena; Zhu, Congwen; Luterbacher, Juerg
2016-06-01
In this paper, two new methods, Temporal evolution of Detrended Cross-Correlation Analysis (TDCCA) and Temporal evolution of Detrended Partial-Cross-Correlation Analysis (TDPCCA), are proposed by generalizing DCCA and DPCCA. Applying TDCCA/TDPCCA, it is possible to study correlations on multiple time scales and over different periods. To illustrate their properties, we used two climatological examples: i) Global Sea Level (GSL) versus North Atlantic Oscillation (NAO); and ii) Summer Rainfall over Yangtze River (SRYR) versus previous winter Pacific Decadal Oscillation (PDO). We find significant correlations between GSL and NAO on time scales of 60 to 140 years, but the correlations are non-significant between 1865 and 1875. As for SRYR and PDO, significant correlations are found on time scales of 30 to 35 years, but the correlations are more pronounced during the most recent 30 years. By combining TDCCA/TDPCCA and DCCA/DPCCA, we propose a new correlation-detection system which, compared to traditional methods, can objectively show how two time series are related (on which time scale, during which time period). These are important not only for the diagnosis of complex systems, but also for the better design of prediction models. Therefore, the new methods offer new opportunities for applications in the natural sciences and in fields such as ecology, economics, sociology and other research fields.
Time-frequency featured co-movement between the stock and prices of crude oil and gold
NASA Astrophysics Data System (ADS)
Huang, Shupei; An, Haizhong; Gao, Xiangyun; Huang, Xuan
2016-02-01
The nonlinear relationships among variables caused by hidden frequency information complicate time series analysis. To shed more light on this nonlinear issue, we examine the relationships in the joint time-frequency domain within a multivariate framework, with analyses in the time domain and in the frequency domain serving as comparisons. The daily Brent oil price, London gold fixing price and Shanghai Composite index from January 1991 to September 2014 are used as an example. First, from a holistic perspective, they have a long-term cointegration relationship in the time domain. Second, the Granger causality test results are heterogeneous across frequency bands. Finally, the comparison between results from wavelet coherence and multiple wavelet coherence in the joint time-frequency domain indicates that in the high (1-14 days) and medium (14-128 days) frequency bands, the combination of Brent and gold prices has a stronger correlation with the stock index. In the low frequency band (256-512 days), 2003 is the structural break point before which Brent oil and gold are ideal choices for hedging the risk of the stock market. Thus, this paper offers more detail on the relationship between the Chinese stock market and the commodity markets of crude oil and gold, and suggests that decisions for different times and frequencies should consider the corresponding benchmark information.
NASA Astrophysics Data System (ADS)
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regression is proposed for the General Expression of Nonlinear Autoregressive (GNAR) model, which converts the model order problem into variable selection for a multiple linear regression equation. The partial autocorrelation function is used to define the linear terms of the GNAR model. The result is taken as the initial model, and the nonlinear terms are then introduced gradually. Statistics are chosen to assess how both the newly introduced and the already existing variables improve the model characteristics, and these are used to decide which model variables to retain or eliminate. The optimal model is then obtained through a measure of goodness of fit or a significance test. Simulations and experiments on classic time-series data show that the proposed method is simple and reliable and can be applied in practical engineering.
Spatial estimation from remotely sensed data via empirical Bayes models
NASA Technical Reports Server (NTRS)
Hill, J. R.; Hinkley, D. V.; Kostal, H.; Morris, C. N.
1984-01-01
Multichannel satellite image data, available as LANDSAT imagery, are recorded as a multivariate time series (four channels, multiple passovers) in two spatial dimensions. The application of parametric empirical Bayes theory to classification of, and estimating the probability of, each crop type at each of a large number of pixels is considered. This theory involves both the probability distribution of imagery data, conditional on crop types, and the prior spatial distribution of crop types. For the latter Markov models indexed by estimable parameters are used. A broad outline of the general theory reveals several questions for further research. Some detailed results are given for the special case of two crop types when only a line transect is analyzed. Finally, the estimation of an underlying continuous process on the lattice is discussed which would be applicable to such quantities as crop yield.
Correlates of Illicit Drug Use Among Indigenous Peoples in Canada: A Test of Social Support Theory.
Cao, Liqun; Burton, Velmer S; Liu, Liu
2018-02-01
Relying on a national stratified random sample of Indigenous peoples aged 19 years old and above in Canada, this study investigates the correlates of illicit drug use among Indigenous peoples, paying special attention to the association between social support measures and illegal drug use. Results from multivariate logistic regression show that measures of social support, such as residential mobility, strength of ties within communities, and lack of timely counseling, are statistically significant correlates of illicit drug use. Those identifying as Christian are significantly less likely to use illegal drugs. This is the first nationwide analysis of the illicit drug usage of Indigenous peoples in Canada. The results are robust because we have controlled for a range of comorbidity variables as well as a series of sociodemographic variables. Policy implications from these findings are discussed.
Multivariate analysis in thoracic research.
Mengual-Macenlle, Noemí; Marcos, Pedro J; Golpe, Rafael; González-Rivas, Diego
2015-03-01
Multivariate analysis is based on the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Multivariate methods were developed to analyze large databases and increasingly complex data. Since modeling is the best way to represent knowledge of reality, we should use multivariate statistical methods. Multivariate methods are designed to analyze data sets simultaneously, i.e., to analyze different variables for each person or object studied. Keep in mind at all times that all variables must be treated so that they accurately reflect the reality of the problem addressed. There are different types of multivariate analysis, and each should be employed according to the type of variables to analyze: dependence, interdependence and structural methods. In conclusion, multivariate methods are ideal for analyzing large data sets and for finding cause-and-effect relationships between variables; a wide range of analysis types is available.
Time Series of Greenland Ice-Sheet Elevations and Mass Changes from ICESat 2003-2009
NASA Astrophysics Data System (ADS)
Zwally, H. J.; Li, J.; Medley, B.; Robbins, J. W.; Yi, D.
2015-12-01
We follow the repeat-track analysis (RTA) of ICESat surface-elevation data by a second stage that adjusts the measured elevations on repeat passes to the reference track taking into account the cross-track slope (αc), in order to construct elevation time series. αc are obtained from RTA simultaneous solutions for αc, dh/dt, and h0. The height measurements on repeat tracks are initially interpolated to uniform along-track reference points (every 172 m) and times (ti) giving the h(xi,ti) used in the RTA solutions. The xi are the cross-track spacings from the reference track and i is the laser campaign index. The adjusted elevation measurements at the along-track reference points are hr(ti) = h(xi,ti) - xi tan(αc) - h0. The hr(ti) time series are averaged over 50 km cells creating H(ti) series and further averaged (weighted by cell area) to H(t) time series over drainage systems (DS), elevation bands, regions, and the entire ice sheet. Temperature-driven changes in the rate of firn compaction, CT(t), are calculated for 50 km cells with our firn-compaction model giving I(t) = H(t) - CT(t) - B(t) where B(t) is the vertical motion of the bedrock. During 2003 to 2009, the average dCT(t)/dt in the accumulation zone is -5 cm/yr, which amounts to a -75 km3/yr correction to ice volume change estimates. The I(t) are especially useful for studying the seasonal cycle of mass gains and losses and interannual variations. The H(t) for the ablation zone are fitted with a multi-variate function with a linear component describing the upward component of ice flow plus winter accumulation (fall through spring) and a portion of a sine function describing the superimposed summer melting. During fall to spring the H(t) indicate that the upward motion of the ice flow is at a rate of 1 m/yr, giving an annual mass gain of 180 Gt/yr in the ablation zone. 
The summer loss from surface melting in the high-melt summer of 2005 is 350 Gt/yr, giving a net surface loss of 170 Gt/yr from the ablation zone for 2005. During 2003-2008, the H(t) for the ablation zone show accelerations of the mass losses in the northwest DS8 and in the west-central DS7 (including Jakobshavn glacier) and offsetting decelerations of the mass losses in the east-central DS3 and southeast DS4, much of which occurred in 2008, possibly due to an eastward shift in the surface mass balance.
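The cross-track slope adjustment hr(ti) = h(xi,ti) - xi tan(αc) - h0, the area-weighted averaging, and the 2005 ablation-zone balance quoted in this abstract can be written out as a short Python sketch. The numerical inputs for the elevation example (heights, cross-track offset, slope, cell areas) are hypothetical; only the 180 Gt/yr gain and 350 Gt/yr loss come from the abstract. Angles are assumed to be in radians and distances in metres.

```python
import math


def adjusted_elevation(h, x, alpha_c, h0):
    """Adjust a repeat-pass height to the reference track.

    h       measured height h(xi, ti) in metres
    x       cross-track offset xi from the reference track in metres
    alpha_c cross-track slope in radians
    h0      reference height in metres
    """
    return h - x * math.tan(alpha_c) - h0


def area_weighted_mean(values, areas):
    """Average cell values weighted by cell area."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)


# Hypothetical pass: 100 m off the reference track on a 1 % cross-track slope.
hr = adjusted_elevation(h=1501.5, x=100.0, alpha_c=math.atan(0.01), h0=1500.0)

# Hypothetical cell means (m) and cell areas (km^2) combined into one H value.
H = area_weighted_mean([1.0, 3.0], [1.0, 3.0])

# Ablation-zone figures quoted in the abstract for 2005 (Gt/yr).
gain_from_upward_flow = 180.0
summer_melt_loss = 350.0
net_surface_balance = gain_from_upward_flow - summer_melt_loss
```

The last line reproduces the net surface loss of 170 Gt/yr stated for 2005.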
Multivariate statistics as means of tracking atmospheric pollution trends in Western Poland.
Astel, Aleksander M; Walna, Barbara; Simeonov, Vasil; Kurzyca, Iwona
2008-02-15
This study was carried out over a period of 4 years (2002-2005) at 2 sites in western Poland that differ in the degree of human impact, by analysing the chemical composition of bulk precipitation. The aim of the study was to determine the sources of pollution, to assess their quantitative contribution to the bulk precipitation composition, and to analyse long-term changes in the chemical quality of precipitation. Based on this information, the possible transboundary impacts of pollution were also determined. The samples were characterized by determining pH, electrolytic conductivity and the concentrations of Cl⁻, F⁻, SO₄²⁻, NO₃⁻, Na⁺, K⁺, Mg²⁺, Ca²⁺ and NH₄⁺. The analytical measurements were combined with principal component regression (PCR) and time series analysis (TS). Based on the PCR results, three major sources of pollutants in the central part of Poland were identified and quantitatively assessed as follows: "combined" (Poznań - 31%, WNP - 32%), "soil-particulates" (Poznań - 2%, WNP - 26%), "anthropogenic-fossil fuels" (Poznań - 43%, WNP - 23%). Time series analysis revealed a 12-month cycle for NO₃⁻, NH₄⁺, Cl⁻, F⁻ and SO₄²⁻ in the average monthly concentrations in bulk precipitation collected in Wielkopolski National Park (WNP). Seasonal variation in the emission of precursors of NO₃⁻ and NH₄⁺ was caused by changes in the intensity of fertilizer application in agriculture and in automobile exhaust emissions. A decreasing trend was visible for sulphates, nitrates, chlorides and fluorides, which is an important indication of acid rain reduction in the ecologically protected area and in Poznań.
NASA Astrophysics Data System (ADS)
Karbon, Maria; Heinkelmann, Robert; Mora-Diaz, Julian; Xu, Minghui; Nilsson, Tobias; Schuh, Harald
2017-07-01
The radio sources within the most recent celestial reference frame (CRF) catalog ICRF2 are represented by a single, time-invariant coordinate pair. The datum sources were chosen mainly according to certain statistical properties of their position time series. Yet such statistics are not unconditionally applicable and are also ambiguous. Ignoring systematics in the source positions of the datum sources, however, inevitably leads to a degradation of the quality of the frame and, therefore, also of derived quantities such as the Earth orientation parameters. One possible approach to overcome these deficiencies is to extend the parametrization of the source positions, similarly to what is done for the station positions. We decided to use the multivariate adaptive regression splines algorithm to parametrize the source coordinates. It allows a great deal of automation by combining recursive partitioning and spline fitting in an optimal way. The algorithm autonomously finds the ideal knot positions for the splines and, thus, the best number of polynomial pieces to fit the data. With that we can correct the ICRF2 a priori coordinates for our analysis and eliminate the systematics in the position estimates. This allows us to also introduce special handling sources into the datum definition, leading to on average 30 % more sources in the datum. We find that not only can the CPO be improved by more than 10 % due to the improved geometry, but the station positions, especially in the early years of VLBI, can also benefit greatly.
Cardiorespiratory dynamic response to mental stress: a multivariate time-frequency analysis.
Widjaja, Devy; Orini, Michele; Vlemincx, Elke; Van Huffel, Sabine
2013-01-01
Mental stress is a growing problem in our society. In order to deal with this, it is important to understand the underlying stress mechanisms. In this study, we aim to determine how the cardiorespiratory interactions are affected by mental arithmetic stress and attention. We conduct cross time-frequency (TF) analyses to assess the cardiorespiratory coupling. In addition, we introduce partial TF spectra to separate variations in the RR interval series that are linearly related to respiration from RR interval variations (RRV) that are not related to respiration. The performance of partial spectra is evaluated in two simulation studies. Time-varying parameters, such as instantaneous powers and frequencies, are derived from the computed spectra. Statistical analysis is carried out continuously in time to evaluate the dynamic response to mental stress and attention. The results show an increased heart and respiratory rate during stress and attention, compared to a resting condition. Also a fast reduction in vagal activity is noted. The partial TF analysis reveals a faster reduction of RRV power related to (3 s) than unrelated to (30 s) respiration, demonstrating that the autonomic response to mental stress is driven by mechanisms characterized by different temporal scales.
Louis R Iverson; Anantha M. Prasad; Mark W. Schwartz
2005-01-01
We predict current distribution and abundance for tree species present in eastern North America, and subsequently estimate potential suitable habitat for those species under a changed climate with 2 x CO2. We used a series of statistical models (i.e., Regression Tree Analysis (RTA), Multivariate Adaptive Regression Splines (MARS), Bagging Trees (...
SELECTION AND TRAINING, A SURVEY OF IOWA MANUFACTURING FIRMS. MONOGRAPH SERIES NO. 4.
ERIC Educational Resources Information Center
SHERIFF, DON R.; AND OTHERS
INFORMATION ON EMPLOYEE SELECTION AND TRAINING ACTIVITIES WAS SECURED FROM QUESTIONNAIRES RETURNED BY 215 OF 283 FIRMS EMPLOYING AT LEAST 100 PERSONS. DATA FROM 207 SEPARATE ITEMS FOR EACH FIRM WERE KEY PUNCHED AND TABULATED INTO MULTIVARIATE CROSS-CLASSIFICATIONS. OVER 60 PERCENT OF THE FIRMS WERE IN CITIES HAVING OVER 25,000 POPULATION, 40…
Wang, Xiuquan; Huang, Guohe; Zhao, Shan; Guo, Junhong
2015-09-01
This paper presents an open-source software package, rSCA, which is developed based upon a stepwise cluster analysis method and serves as a statistical tool for modeling the relationships between multiple dependent and independent variables. The rSCA package is efficient in dealing with both continuous and discrete variables, as well as nonlinear relationships between the variables. It divides the sample sets of dependent variables into different subsets (or subclusters) through a series of cutting and merging operations based upon the theory of multivariate analysis of variance (MANOVA). The modeling results are given by a cluster tree, which includes both intermediate and leaf subclusters as well as the flow paths from the root of the tree to each leaf subcluster specified by a series of cutting and merging actions. The rSCA package is a handy and easy-to-use tool and is freely available at http://cran.r-project.org/package=rSCA . By applying the developed package to air quality management in an urban environment, we demonstrate its effectiveness in dealing with the complicated relationships among multiple variables in real-world problems.
Chambon, Stanislas; Galtier, Mathieu N; Arnal, Pierrick J; Wainrib, Gilles; Gramfort, Alexandre
2018-04-01
Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders. It is traditionally performed by a sleep expert who assigns a sleep stage to each 30 s of signal, based on the visual inspection of signals such as electroencephalograms (EEGs), electrooculograms (EOGs), electrocardiograms, and electromyograms (EMGs). We introduce here the first deep learning approach for sleep stage classification that learns end-to-end without computing spectrograms or extracting handcrafted features, that exploits all multivariate and multimodal polysomnography (PSG) signals (EEG, EMG, and EOG), and that can exploit the temporal context of each 30-s window of data. For each modality, the first layer learns linear spatial filters that exploit the array of sensors to increase the signal-to-noise ratio, and the last layer feeds the learnt representation to a softmax classifier. Our model is compared to alternative automatic approaches based on convolutional networks or decision trees. Results obtained on 61 publicly available PSG records with up to 20 EEG channels demonstrate that our network architecture yields state-of-the-art performance. Our study reveals a number of insights on the spatiotemporal distribution of the signal of interest: a good tradeoff for optimal classification performance measured with balanced accuracy is to use 6 EEG with 2 EOG (left and right) and 3 EMG chin channels. Also, exploiting 1 min of data before and after each data segment offers the strongest improvement when a limited number of channels are available. Like sleep experts, our system exploits the multivariate and multimodal nature of PSG signals in order to deliver state-of-the-art classification performance at a small computational cost.
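Balanced accuracy, the performance measure named in this abstract, is the mean of the per-class recalls. A minimal Python sketch follows; the stage labels and the toy predictions are hypothetical and are not taken from the study.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


def accuracy(y_true, y_pred):
    """Plain fraction of correctly labelled windows."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical 30-s windows: stage N2 is frequent, stage N1 is rare.
y_true = ["N2"] * 8 + ["N1"] * 2
# A classifier that always answers "N2" looks good on plain accuracy only.
y_pred = ["N2"] * 10

plain = accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred)
```

Because sleep stages are unevenly represented across a night, averaging recall per class keeps a rare stage from being ignored by the score.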
NASA Astrophysics Data System (ADS)
Wu, ShaoFei; Zhang, Xiang; She, DunXian
2017-06-01
Under the current condition of climate change, droughts and floods occur more frequently, and events in which flooding occurs after a prolonged drought or a drought occurs after an extreme flood may have a more severe impact on natural systems and human lives. This challenges the traditional approach wherein droughts and floods are considered separately, which may largely underestimate the risk of the disasters. In our study, the sudden alternation of droughts and flood events (ADFEs) between adjacent seasons is studied using the multivariate L-moments theory and the bivariate copula functions in the Huai River Basin (HRB) of China with monthly streamflow data at 32 hydrological stations from 1956 to 2012. The dry and wet conditions are characterized by the standardized streamflow index (SSI) at a 3-month time scale. The results show that: (1) The summer streamflow makes the largest contribution to the annual streamflow, followed by the autumn streamflow and spring streamflow. (2) The entire study area can be divided into five homogeneous sub-regions using the multivariate regional homogeneity test. The generalized logistic distribution (GLO) and log-normal distribution (LN3) are acceptable to be the optimal marginal distributions under most conditions, and the Frank copula is more appropriate for spring-summer and summer-autumn SSI series. Continuous flood events dominate at most sites both in spring-summer and summer-autumn (with an average frequency of 13.78% and 17.06%, respectively), while continuous drought events come second (with an average frequency of 11.27% and 13.79%, respectively). Moreover, seasonal ADFEs most probably occurred near the mainstream of HRB, and drought and flood events are more likely to occur in summer-autumn than in spring-summer.
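The joint probability of a drought in one season followed by a flood in the next can be sketched with the Frank copula mentioned in this abstract. The dependence parameter and the two marginal probability thresholds below are hypothetical values chosen for illustration, not estimates from the Huai River Basin data.

```python
import math


def frank_copula(u, v, theta):
    """Frank copula C(u, v) for a nonzero dependence parameter theta."""
    if theta == 0:
        raise ValueError("theta must be nonzero; theta -> 0 is independence")
    numerator = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(numerator / math.expm1(-theta)) / theta


def drought_then_flood_probability(u_drought, v_flood, theta):
    """P(U <= u_drought and V > v_flood) = u_drought - C(u_drought, v_flood).

    U and V are the marginal non-exceedance probabilities of the index in
    the first and second season, respectively.
    """
    return u_drought - frank_copula(u_drought, v_flood, theta)


# Hypothetical thresholds: drought = lowest 30 % of season 1,
# flood = highest 30 % of season 2 (non-exceedance probability 0.7).
theta = 3.0  # hypothetical positive dependence between adjacent seasons
p_alternation = drought_then_flood_probability(0.3, 0.7, theta)
p_if_independent = 0.3 * (1 - 0.7)
```

With positive dependence between adjacent seasons, the alternation probability falls below the value expected under independence, which is consistent with continuous events being more frequent than alternating ones.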
A coupled weather generator - rainfall-runoff approach on hourly time steps for flood risk analysis
NASA Astrophysics Data System (ADS)
Winter, Benjamin; Schneeberger, Klaus; Dung Nguyen, Viet; Vorogushyn, Sergiy; Huttenlau, Matthias; Merz, Bruno; Stötter, Johann
2017-04-01
The evaluation of potential monetary damage of flooding is an essential part of flood risk management. One possibility to estimate the monetary risk is to analyze long time series of observed flood events and their corresponding damages. In reality, however, only a few flood events are documented. This limitation can be overcome by generating a set of synthetic, physically and spatially plausible flood events and subsequently estimating the resulting monetary damages. In the present work, a set of synthetic flood events is generated by a continuous rainfall-runoff simulation in combination with a coupled weather generator and temporal disaggregation procedure for the study area of Vorarlberg (Austria). Most flood risk studies focus on daily time steps; however, the mesoscale alpine study area is characterized by short concentration times, leading to large differences between daily mean and daily maximum discharge. Accordingly, an hourly time step is needed for the simulations. The hourly meteorological input for the rainfall-runoff model is generated in a two-step approach. A synthetic daily dataset is generated by a multivariate and multisite weather generator and subsequently disaggregated to hourly time steps with a k-Nearest-Neighbor model. Following the event generation procedure, the negative consequences of flooding are analyzed. The corresponding flood damage for each synthetic event is estimated by combining the synthetic discharge at representative points of the river network with a loss probability relation for each community in the study area. The loss probability relation is based on exposure and susceptibility analyses on a single object basis (residential buildings) for certain return periods. For these impact analyses official inundation maps of the study area are used.
Finally, by analyzing the total event time series of damages, the expected annual damage or losses associated with a certain probability of occurrence can be estimated for the entire study area.
Multilingualism and fMRI: Longitudinal Study of Second Language Acquisition
Andrews, Edna; Frigau, Luca; Voyvodic-Casabo, Clara; Voyvodic, James; Wright, John
2013-01-01
BOLD fMRI is often used for the study of human language. However, there are still very few attempts to conduct longitudinal fMRI studies in the study of language acquisition by measuring auditory comprehension and reading. The following paper is the first in a series concerning a unique longitudinal study devoted to the analysis of bi- and multilingual subjects who are: (1) already proficient in at least two languages; or (2) are acquiring Russian as a second/third language. The focus of the current analysis is to present data from the auditory sections of a set of three scans acquired from April, 2011 through April, 2012 on a five-person subject pool who are learning Russian during the study. All subjects were scanned using the same protocol for auditory comprehension on the same General Electric LX 3T Signa scanner in Duke University Hospital. Using a multivariate analysis of covariance (MANCOVA) for statistical analysis, proficiency measurements are shown to correlate significantly with scan results in the Russian conditions over time. The importance of both the left and right hemispheres in language processing is discussed. Special attention is devoted to the importance of contextualizing imaging data with corresponding behavioral and empirical testing data using a multivariate analysis of variance. This is the only study to date that includes: (1) longitudinal fMRI data with subject-based proficiency and behavioral data acquired in the same time frame; and (2) statistical modeling that demonstrates the importance of covariate language proficiency data for understanding imaging results of language acquisition. PMID:24961428
Multilingualism and fMRI: Longitudinal Study of Second Language Acquisition.
Andrews, Edna; Frigau, Luca; Voyvodic-Casabo, Clara; Voyvodic, James; Wright, John
2013-05-28
BOLD fMRI is often used for the study of human language. However, there are still very few attempts to conduct longitudinal fMRI studies in the study of language acquisition by measuring auditory comprehension and reading. The following paper is the first in a series concerning a unique longitudinal study devoted to the analysis of bi- and multilingual subjects who are: (1) already proficient in at least two languages; or (2) are acquiring Russian as a second/third language. The focus of the current analysis is to present data from the auditory sections of a set of three scans acquired from April, 2011 through April, 2012 on a five-person subject pool who are learning Russian during the study. All subjects were scanned using the same protocol for auditory comprehension on the same General Electric LX 3T Signa scanner in Duke University Hospital. Using a multivariate analysis of covariance (MANCOVA) for statistical analysis, proficiency measurements are shown to correlate significantly with scan results in the Russian conditions over time. The importance of both the left and right hemispheres in language processing is discussed. Special attention is devoted to the importance of contextualizing imaging data with corresponding behavioral and empirical testing data using a multivariate analysis of variance. This is the only study to date that includes: (1) longitudinal fMRI data with subject-based proficiency and behavioral data acquired in the same time frame; and (2) statistical modeling that demonstrates the importance of covariate language proficiency data for understanding imaging results of language acquisition.
Hepatic resection for post-cholecystectomy bile duct injuries: a literature review.
Truant, Stéphanie; Boleslawski, Emmanuel; Lebuffe, Gilles; Sergent, Géraldine; Pruvot, François-René
2010-06-01
This study seeks to identify factors for hepatectomy in the management of post-cholecystectomy bile duct injury (BDI) and outcome via a systematic review of the literature. Relevant literature was found by searching the PubMed database and the bibliographies of extracted articles. To avoid selection bias, factors for hepatectomy were analysed in series reporting both patients undergoing hepatectomy and patients undergoing biliary repair without hepatectomy (bimodal treatment). Relevant variables were the presence or absence of additional hepatic artery and/or portal vein injury, the level of BDI, and a previous biliary repair. Among 460 potentially relevant publications, only 31 met the eligibility criteria. A total of 99 hepatectomies were reported among 1756 (5.6%) patients referred for post-cholecystectomy BDI. In eight series reporting bimodal treatment, including 232 patients, logistic regression multivariate analysis showed that hepatic arterial and Strasberg E4 and E5 injuries were independent factors associated with hepatectomy. Patients with combined arterial and Strasberg E4 or E5 injury were 43.3 times more likely to undergo hepatectomy (95% confidence interval 8.0-234.2) than patients without complex injury. Despite high postoperative morbidity, mortality rates were comparable with those of hepaticojejunostomy, except in urgent hepatectomies (within 2 weeks; four of nine patients died). Long-term outcome was satisfactory in 12 of 18 patients in the largest series. Hepatectomies were performed mainly in patients showing complex concurrent Strasberg E4 or E5 and hepatic arterial injury and provided satisfactory long-term outcomes despite high postoperative morbidity.
Hee, Siew Wan; Parsons, Nicholas; Stallard, Nigel
2018-03-01
The motivation for the work in this article is the setting in which a number of treatments are available for evaluation in phase II clinical trials and where it may be infeasible to try them concurrently because the intended population is small. This paper introduces an extension of previous work on decision-theoretic designs for a series of phase II trials. The program encompasses a series of sequential phase II trials with interim decision making and a single two-arm phase III trial. The design is based on a hybrid approach where the final analysis of the phase III data is based on a classical frequentist hypothesis test, whereas the trials are designed using a Bayesian decision-theoretic approach in which the unknown treatment effect is assumed to follow a known prior distribution. In addition, as treatments are intended for the same population it is not unrealistic to consider treatment effects to be correlated. Thus, the prior distribution will reflect this. Data from a randomized trial of severe arthritis of the hip are used to test the application of the design. We show that the design on average requires fewer patients in phase II than when the correlation is ignored. Correspondingly, the time required to recommend an efficacious treatment for phase III is shorter. © 2017 The Author. Biometrical Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
New Approach To Hour-By-Hour Weather Forecast
NASA Astrophysics Data System (ADS)
Liao, Q. Q.; Wang, B.
2017-12-01
Fine-grained hourly forecasts at single stations are required in many production and everyday applications. Most previous MOS (Model Output Statistics) schemes, which use a linear regression model, struggle with the nonlinear nature of weather prediction, and their forecast accuracy has not been sufficient at high temporal resolution. This study aims to predict meteorological elements, including temperature, precipitation, relative humidity and wind speed, in a local region over a relatively short period at hourly resolution. Using hour-to-hour NWP (Numerical Weather Prediction) meteorological fields from Forcastio (https://darksky.net/dev/docs/forecast) and real-time instrumental observations from 29 stations in Yunnan and 3 stations in Tianjin, China, from June to October 2016, hour-by-hour predictions are made 24 hours ahead. This study presents an ensemble approach that combines the information in the instrumental observations themselves with NWP. An autoregressive-moving-average (ARMA) model is used to predict future values of the observation time series. The newest NWP products are put into the equations derived from the multiple linear regression MOS technique. The residual series of the MOS outputs is handled with an autoregressive (AR) model to capture the linear structure present in the time series. Because of the complexity of the nonlinear behavior of atmospheric flow, a support vector machine (SVM) is also introduced. Basic data quality control and cross validation make it possible to optimize the model function parameters and to perform 24-hour-ahead residual reduction with the AR/SVM model. Results show that the AR model technique is better than the corresponding multivariate MOS regression method, especially in the first 4 hours, when the predictor is temperature. The MOS-AR combined model, which is comparable to the MOS-SVM model, outperforms MOS. For both, the root mean square error and correlation coefficient for 2 m temperature reach 1.6 degrees Celsius and 0.91, respectively.
The share of 24-hour forecasts deviating by no more than 2 degrees Celsius is 78.75% for the MOS-AR model and 81.23% for the AR model.
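The MOS-plus-AR step described in this abstract can be illustrated with a minimal sketch. It assumes synthetic data, a single NWP predictor, and an AR(1) residual model; the function names (fit_mos, fit_ar, forecast_ar) and all numbers are hypothetical and are not taken from the study, and the SVM variant is not shown.

```python
import numpy as np

def fit_mos(X, y):
    """Least-squares MOS: regress observations y on NWP predictors X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mos(coef, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def fit_ar(resid, p):
    """Fit AR(p) coefficients to a residual series by least squares."""
    # Each row holds the p previous residuals, most recent first.
    rows = [resid[k - p:k][::-1] for k in range(p, len(resid))]
    phi, *_ = np.linalg.lstsq(np.array(rows), resid[p:], rcond=None)
    return phi

def forecast_ar(resid, phi, steps):
    """Iterate the fitted AR model forward from the last known residuals."""
    hist = list(resid[-len(phi):])
    out = []
    for _ in range(steps):
        nxt = float(np.dot(phi, hist[::-1][:len(phi)]))
        out.append(nxt)
        hist.append(nxt)
    return np.array(out)

# Synthetic example (assumption): hourly NWP temperature and observations
# whose error relative to a linear MOS relation follows an AR(1) process.
rng = np.random.default_rng(0)
n, horizon = 500, 24
t = np.arange(n)
nwp_temp = 15 + 8 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 1, n)
err = np.zeros(n)
for i in range(1, n):
    err[i] = 0.8 * err[i - 1] + rng.normal(0, 0.5)
obs_temp = 1.5 + 0.9 * nwp_temp + err

train = slice(0, n - horizon)
holdout = slice(n - horizon, n)

coef = fit_mos(nwp_temp[train, None], obs_temp[train])
resid = obs_temp[train] - predict_mos(coef, nwp_temp[train, None])
phi = fit_ar(resid, p=1)

mos_fc = predict_mos(coef, nwp_temp[holdout, None])
combined_fc = mos_fc + forecast_ar(resid, phi, horizon)  # MOS + AR correction
```

Because the AR forecast of the residual decays toward zero with lead time, a correction of this kind mainly affects the first few forecast hours, which is consistent with the abstract's remark about the early hours.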
Muñoz-Carpena, R; Ritter, A; Li, Y C
2005-11-01
The extensive eastern boundary of Everglades National Park (ENP) in south Florida (USA) is subject to one of the most expensive and ambitious environmental restoration projects in history. Understanding and predicting the water quality interactions between the shallow aquifer and surface water is a key component in meeting current environmental regulations and fine-tuning ENP wetland restoration while still maintaining flood protection for the adjacent developed areas. Dynamic factor analysis (DFA), a recent technique for the study of multivariate non-stationary time-series, was applied to study fluctuations in groundwater quality in the area. More than two years of hydrological and water quality time series (rainfall; water table depth; and soil, ground and surface water concentrations of N-NO3-, N-NH4+, P-PO4(3-), Total P, F- and Cl-) from a small agricultural watershed adjacent to the ENP were selected for the study. The unexplained variability required for determining the concentration of each chemical in the 16 wells was greatly reduced by including in the analysis some of the observed time series as explanatory variables (rainfall, water table depth, and soil and canal water chemical concentration). DFA results showed that groundwater concentrations of three of the agrochemical species studied (N-NO3-, P-PO4(3-) and Total P) were affected by the same explanatory variables (water table depth, enriched topsoil, and occurrence of a leaching rainfall event, in order of decreasing relative importance). This indicates that leaching by rainfall is the main mechanism explaining concentration peaks in groundwater. In the case of N-NH4+, in addition to leaching, groundwater concentration is governed by lateral exchange with canals. F- and Cl- are mainly affected by periods of dilution by rainfall recharge, and by exchange with the canals.
The unstructured nature of the common trends found suggests that these are related to the complex spatially and temporally varying land use patterns in the watershed. The results indicate that peak concentrations of agrochemicals in groundwater could be reduced by improving fertilization practices (by splitting and modifying timing of applications) and by operating the regional canal system to maintain the water table low, especially during the rainy periods.
NASA Astrophysics Data System (ADS)
Muñoz-Carpena, R.; Ritter, A.; Li, Y. C.
2005-11-01
The extensive eastern boundary of Everglades National Park (ENP) in south Florida (USA) is subject to one of the most expensive and ambitious environmental restoration projects in history. Understanding and predicting the water quality interactions between the shallow aquifer and surface water is a key component in meeting current environmental regulations and fine-tuning ENP wetland restoration while still maintaining flood protection for the adjacent developed areas. Dynamic factor analysis (DFA), a recent technique for the study of multivariate non-stationary time-series, was applied to study fluctuations in groundwater quality in the area. More than two years of hydrological and water quality time series (rainfall; water table depth; and soil, ground and surface water concentrations of N-NO3-, N-NH4+, P-PO4(3-), Total P, F- and Cl-) from a small agricultural watershed adjacent to the ENP were selected for the study. The unexplained variability required for determining the concentration of each chemical in the 16 wells was greatly reduced by including in the analysis some of the observed time series as explanatory variables (rainfall, water table depth, and soil and canal water chemical concentration). DFA results showed that groundwater concentrations of three of the agrochemical species studied (N-NO3-, P-PO4(3-) and Total P) were affected by the same explanatory variables (water table depth, enriched topsoil, and occurrence of a leaching rainfall event, in order of decreasing relative importance). This indicates that leaching by rainfall is the main mechanism explaining concentration peaks in groundwater. In the case of N-NH4+, in addition to leaching, groundwater concentration is governed by lateral exchange with canals. F- and Cl- are mainly affected by periods of dilution by rainfall recharge, and by exchange with the canals.
The unstructured nature of the common trends found suggests that these are related to the complex spatially and temporally varying land use patterns in the watershed. The results indicate that peak concentrations of agrochemicals in groundwater could be reduced by improving fertilization practices (by splitting and modifying timing of applications) and by operating the regional canal system to maintain the water table low, especially during the rainy periods.
Applying the multivariate time-rescaling theorem to neural population models
Gerhard, Felipe; Haslinger, Robert; Pipa, Gordon
2011-01-01
Statistical models of neural activity are integral to modern neuroscience. Recently, interest has grown in modeling the spiking activity of populations of simultaneously recorded neurons to study the effects of correlations and functional connectivity on neural information processing. However, any statistical model must be validated by an appropriate goodness-of-fit test. Kolmogorov-Smirnov tests based upon the time-rescaling theorem have proven to be useful for evaluating point-process-based statistical models of single-neuron spike trains. Here we discuss the extension of the time-rescaling theorem to the multivariate (neural population) case. We show that even in the presence of strong correlations between spike trains, models which neglect couplings between neurons can be erroneously passed by the univariate time-rescaling test. We present the multivariate version of the time-rescaling theorem, and provide a practical step-by-step procedure for applying it towards testing the sufficiency of neural population models. Using several simple analytically tractable models and also more complex simulated and real data sets, we demonstrate that important features of the population activity can only be detected using the multivariate extension of the test. PMID:21395436
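As background for the abstract above, the univariate time-rescaling check it builds on can be sketched as follows. This is not the multivariate procedure of the paper; it assumes a single simulated spike train from an inhomogeneous Poisson process, and the rate function, its parameters, and the helper names are hypothetical. Under the correct model the rescaled inter-spike intervals are independent unit-rate exponentials, so their transform 1 - exp(-tau) should be uniform on [0, 1].

```python
import numpy as np

PERIOD = 10.0   # seconds; hypothetical slow rate modulation
T = 200.0       # recording length in seconds (assumption)

def rate(t):
    """True conditional intensity in spikes per second."""
    return 20.0 + 18.0 * np.sin(2 * np.pi * t / PERIOD)

def cum_rate(t):
    """Integral of rate() from 0 to t, in closed form."""
    return 20.0 * t + 18.0 * PERIOD / (2 * np.pi) * (1 - np.cos(2 * np.pi * t / PERIOD))

def simulate_by_thinning(rate_fn, rate_max, duration, rng):
    """Draw an inhomogeneous Poisson spike train by thinning."""
    n_cand = rng.poisson(rate_max * duration)
    cand = np.sort(rng.uniform(0, duration, n_cand))
    keep = rng.uniform(0, 1, n_cand) < rate_fn(cand) / rate_max
    return cand[keep]

def rescaled_uniforms(spikes, cum_intensity):
    """Rescale inter-spike intervals with the model's integrated intensity."""
    lam = cum_intensity(np.concatenate(([0.0], spikes)))
    tau = np.diff(lam)            # Exp(1) distributed if the model is right
    return 1.0 - np.exp(-tau)     # Uniform(0, 1) if the model is right

def ks_uniform(z):
    """Kolmogorov-Smirnov distance between sample z and Uniform(0, 1)."""
    z = np.sort(z)
    n = len(z)
    upper = np.arange(1, n + 1) / n - z
    lower = z - np.arange(0, n) / n
    return max(upper.max(), lower.max())

rng = np.random.default_rng(1)
spikes = simulate_by_thinning(rate, 38.0, T, rng)

# Correct model: the true time-varying intensity.
ks_true = ks_uniform(rescaled_uniforms(spikes, cum_rate))

# Misspecified model: a constant rate fitted to the spike count.
const = len(spikes) / T
ks_wrong = ks_uniform(rescaled_uniforms(spikes, lambda t: const * t))

print(f"KS distance, true model: {ks_true:.3f}; constant-rate model: {ks_wrong:.3f}")
```

The misspecified model should give the larger KS distance. The abstract's point is that a check of this univariate kind can still pass population models that ignore coupling between neurons.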
Blind source separation problem in GPS time series
NASA Astrophysics Data System (ADS)
Gualandi, A.; Serpelloni, E.; Belardinelli, M. E.
2016-04-01
A critical point in the analysis of ground displacement time series, such as those recorded by space geodetic techniques, is the development of data-driven methods that allow the different sources of deformation to be discerned and characterized in the space and time domains. Multivariate statistics includes several approaches that can be considered as a part of data-driven methods. A widely used technique is the principal component analysis (PCA), which allows us to reduce the dimensionality of the data space while maintaining most of the variance of the dataset explained. However, PCA does not perform well in finding the solution to the so-called blind source separation (BSS) problem, i.e., in recovering and separating the original sources that generate the observed data. This is mainly due to the fact that PCA minimizes the misfit calculated using an L2 norm (χ²), looking for a new Euclidean space where the projected data are uncorrelated. The independent component analysis (ICA) is a popular technique adopted to approach the BSS problem. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, we test the use of a modified variational Bayesian ICA (vbICA) method to recover the multiple sources of ground deformation even in the presence of missing data. The vbICA method models the probability density function (pdf) of each source signal using a mix of Gaussian distributions, allowing for more flexibility in the description of the pdf of the sources with respect to standard ICA, and giving a more reliable estimate of them. Here we present its application to synthetic global positioning system (GPS) position time series, generated by simulating deformation near an active fault, including inter-seismic, co-seismic, and post-seismic signals, plus seasonal signals and noise, and an additional time-dependent volcanic source.
We evaluate the ability of the PCA and ICA decomposition techniques in explaining the data and in recovering the original (known) sources. Using the same number of components, we find that the vbICA method fits the data almost as well as a PCA method, since the χ² increase is less than 10% of the value calculated using a PCA decomposition. Unlike PCA, the vbICA algorithm is found to correctly separate the sources if the correlation of the dataset is low (<0.67) and the geodetic network is sufficiently dense (ten continuous GPS stations within a box of side equal to two times the locking depth of a fault where an earthquake of Mw >6 occurred). We also provide a cookbook for the use of the vbICA algorithm in analyses of position time series for tectonic and non-tectonic applications.
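The PCA step that the abstract contrasts with vbICA can be sketched with a singular value decomposition. This is only an illustration on synthetic data: the two sources (a linear trend and an annual signal), the ten-station mixing matrix, and the noise level are assumptions, and the vbICA method itself is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "position time series": epochs x stations (all values assumed).
years = np.arange(3 * 365) / 365.0
sources = np.column_stack([
    years,                       # steady, inter-seismic-like trend
    np.sin(2 * np.pi * years),   # seasonal signal
])
mixing = rng.normal(size=(10, 2))            # response of 10 stations
data = sources @ mixing.T + rng.normal(0, 0.1, size=(len(years), 10))

# PCA via SVD of the column-centred data matrix.
centred = data - data.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)

explained = s**2 / np.sum(s**2)   # fraction of variance per component
scores = U * s                    # temporal functions (one column per PC)

# The leading temporal components are uncorrelated by construction ...
corr_12 = np.corrcoef(scores[:, 0], scores[:, 1])[0, 1]
# ... but each one is generally a mixture of the trend and the seasonal
# source, which is why PCA alone does not solve the BSS problem.
print(f"variance explained by first two PCs: {explained[:2].sum():.3f}")
print(f"correlation between PC1 and PC2 scores: {corr_12:.2e}")
```

Uncorrelated components are a weaker requirement than statistically independent ones, and that gap is what ICA-type methods such as the vbICA approach in the abstract try to close.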
Why didn't Box-Jenkins win (again)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pack, D.J.; Downing, D.J.
This paper focuses on the forecasting performance of the Box-Jenkins methodology applied to the 111 time series of the Makridakis competition. It considers the influence of the following factors: (1) time series length, (2) time-series information (autocorrelation) content, (3) time-series outliers or structural changes, (4) averaging results over time series, and (5) forecast time origin choice. It is found that the 111 time series contain substantial numbers of very short series, series with obvious structural change, and series whose histories are relatively uninformative. If these series are typical of those that one must face in practice, the real message of the competition is that univariate time series extrapolations will frequently fail regardless of the methodology employed to produce them.
Multifractal analysis of visibility graph-based Ito-related connectivity time series.
Czechowski, Zbigniew; Lovallo, Michele; Telesca, Luciano
2016-02-01
In this study, we investigate multifractal properties of connectivity time series resulting from the visibility graph applied to normally distributed time series generated by the Ito equations with multiplicative power-law noise. We show that multifractality of the connectivity time series (i.e., the series of numbers of links outgoing any node) increases with the exponent of the power-law noise. The multifractality of the connectivity time series could be due to the width of connectivity degree distribution that can be related to the exit time of the associated Ito time series. Furthermore, the connectivity time series are characterized by persistence, although the original Ito time series are random; this is due to the procedure of visibility graph that, connecting the values of the time series, generates persistence but destroys most of the nonlinear correlations. Moreover, the visibility graph is sensitive for detecting wide "depressions" in input time series.
Spatial and Temporal Variation in DeSoto Canyon Macrofaunal Community Structure
NASA Astrophysics Data System (ADS)
Baco-Taylor, A.; Shantharam, A. K.
2016-02-01
Sediment-dwelling macrofauna (polychaetes, bivalves, and assorted crustaceans ≥ 300 µm) have long served as biological indicators of ecosystem stress. As part of evaluating the 2010 impact from the Deepwater Horizon blowout, we sampled 12 sites along and transverse to the DeSoto Canyon axis, Gulf of Mexico, as well as 2 control sites outside the Canyon. Sites ranged in depth from 479-2310 m. Three of the sites (PCB06, S36, and XC4) were sampled annually from 2012-2014. We provide an overview of the macrofauna community structure of canyon and non-canyon sites, as well as trends in community structure and diversity at the time-series sites. Compositionally, polychaetes dominated the communities, followed by tanaid crustaceans and bivalves. The total number of individuals was not significantly correlated with depth, while the total number of taxa and species richness were. Rarefaction shows that the deepest station, XC4 (2310 m), had the lowest diversity while NT800 (a non-canyon control at 800 m) had the highest. Multivariate analysis shows that the canyon assemblages fall into eight clusters with the non-canyon stations forming a separate ninth cluster, indicating a detectable difference in canyon and non-canyon communities. Time series stations show an increase in diversity from 2012-2014 with a strong overlap in community structure in 2013 and 2014 samples. Environmental analysis, via BEST, using data from 10 canyon sites and the controls, indicated that depth in combination with latitude explains the most variation in macrofaunal community structure.
NASA Astrophysics Data System (ADS)
Goodwell, Allison E.; Kumar, Praveen
2017-07-01
Information theoretic measures can be used to identify nonlinear interactions between source and target variables through reductions in uncertainty. In information partitioning, multivariate mutual information is decomposed into synergistic, unique, and redundant components. Synergy is information shared only when sources influence a target together, uniqueness is information only provided by one source, and redundancy is overlapping shared information from multiple sources. While this partitioning has been applied to provide insights into complex dependencies, several proposed partitioning methods overestimate redundant information and omit a component of unique information because they do not account for source dependencies. Additionally, information partitioning has only been applied to time-series data in a limited context, using basic pdf estimation techniques or a Gaussian assumption. We develop a Rescaled Redundancy measure (Rs) to solve the source dependency issue, and present Gaussian, autoregressive, and chaotic test cases to demonstrate its advantages over existing techniques in the presence of noise, various source correlations, and different types of interactions. This study constitutes the first rigorous application of information partitioning to environmental time-series data, and addresses how noise, pdf estimation technique, or source dependencies can influence detected measures. We illustrate how our techniques can unravel the complex nature of forcing and feedback within an ecohydrologic system with an application to 1 min environmental signals of air temperature, relative humidity, and windspeed. The methods presented here are applicable to the study of a broad range of complex systems composed of interacting variables.
Mai, Qun; Aboagye-Sarfo, Patrick; Sanfilippo, Frank M; Preen, David B; Fatovich, Daniel M
2015-02-01
To predict the number of ED presentations in Western Australia (WA) in the next 5 years, stratified by place of treatment, age, triage and disposition. We conducted a population-based time series analysis of 7 year monthly WA statewide ED presentation data from the financial years 2006/07 to 2012/13 using univariate autoregressive integrated moving average (ARIMA) and multivariate vector-ARIMA techniques. ED presentations in WA were predicted to increase from 990,342 in 2012/13 to 1,250,991 (95% CI: 982,265-1,519,718) in 2017/18, an increase of 260,649 (or 26.3%). The majority of this increase would occur in metropolitan WA (84.2%). The compound annual growth rate (CAGR) in metropolitan WA in the next 5 years was predicted to be 6.5% compared with 2.0% in the non-metropolitan area. The greatest growth in metropolitan WA would be in ages 65 and over (CAGR, 6.9%), triage categories 2 and 3 (8.3% and 7.7%, respectively) and admitted (9.8%) cohorts. The only predicted decrease was triage category 5 (-5.3%). ED demand in WA will exceed population growth. The highest growth will be in patients with complex care needs. An integrated system-wide strategy is urgently required to ensure access, quality and sustainability of the health system. © 2015 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.
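The headline figures in this abstract can be checked with a few lines of arithmetic. The statewide compound annual growth rate computed below is derived here from the two reported totals and is not a figure reported in the abstract, which gives CAGR values only for metropolitan and non-metropolitan WA.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

baseline = 990_342      # ED presentations in 2012/13 (from the abstract)
forecast = 1_250_991    # predicted presentations in 2017/18 (from the abstract)

increase = forecast - baseline
pct_increase = 100 * increase / baseline
statewide_cagr = cagr(baseline, forecast, 5)   # derived, not reported

print(f"increase: {increase:,} ({pct_increase:.1f}%)")
print(f"implied statewide CAGR: {100 * statewide_cagr:.1f}% per year")
```

The derived statewide rate falls between the reported metropolitan (6.5%) and non-metropolitan (2.0%) rates, as expected for a combined total.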
Synchronous Motions Across the Instrumental Climate Record
NASA Astrophysics Data System (ADS)
Carl, Peter
The Earth's climate system bears a rich variety of feedback mechanisms that may give rise to complex, evolving modal structures under internal and external control. Various types of synchronization may be identified in the system's motion when looking at representative time series of the instrumental period through the glasses of an advanced technique of sparse data approximation, the Matching Pursuit (MP) approach. To disentangle the emerging network of oscillatory modes to the degree that climate dynamics turns out to be separable, a large dictionary of "Gaussian logons," i.e. frequency modulated (FM) Gabor atoms, is applied. Though the extracted modes make up linear decompositions, this flexible analyzing signal matches highly nonlinear waveforms. Univariate analyses over the period 1870-1997 are presented of a set of customary time series in annual resolution, comprising global and regional climate, central European synoptic systems, German precipitation, and runoff of the Elbe river near Dresden. All the evidence from this first-generation MP-FM study, obtained in subsequent multivariate syntheses, points to dynamically excited regimes of an organized yet complex climate system under permanent change—perhaps a (pre)chaotic one at centennial timescales, suggesting a "chaos control" perspective on global climate dynamics and change. Findings and conclusions include, among others, internal structure of reconstructed insolation, the episodic nature of global warming as reflected in multidecadal temperature modes, their swarm of "interdomain" companions across the whole system that unveils an unknown regime character of interannual climate dynamics, and the apparent onset early in the 1990s of the present thermal stagnation.
Hazardous indoor CO2 concentrations in volcanic environments.
Viveiros, Fátima; Gaspar, João L; Ferreira, Teresa; Silva, Catarina
2016-07-01
Carbon dioxide is one of the main soil gases released silently and permanently in diffuse degassing areas, both in volcanic and non-volcanic zones. In the volcanic islands of the Azores (Portugal) several villages are located over diffuse degassing areas. Lethal indoor CO2 concentrations (higher than 10 vol %) were measured in a shelter located at Furnas village, inside the caldera of the quiescent Furnas Volcano (S. Miguel Island). Hazardous CO2 concentrations were detected not only underground, but also at the ground floor level. Multivariate regression analysis was applied to the CO2 and environmental time series recorded between April 2008 and March 2010 at Furnas village. The results show that about 30% of the indoor CO2 variation is explained by environmental variables, namely barometric pressure, soil water content and wind speed. The highest indoor CO2 concentrations were recorded during bad weather conditions, characterized by low barometric pressure together with rainfall periods and high wind speed. In addition to the spike-like changes observed on the CO2 time series, long-term oscillations were also identified and appeared to represent seasonal variations. In fact, indoor CO2 concentrations were higher during the winter period when compared to the dry summer months. Considering the permanent emission of CO2 in various volcanic regions of the world, CO2 hazard maps are crucial and need to be taken into account by land-use planners and authorities. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Jaumann, Peter Josef
1995-01-01
Estimates of past natural climatic variability on long time scales (centuries to millennia) are crucial in testing climate models. The process of model validation takes advantage of long general circulation model (GCM) integrations, instrumental and satellite observations, and paleoclimatic records. Here I use paleoclimatic proxy records from central North America spanning the last 150 ka to characterize climatic variability on sub-orbital time scales. A terrestrial last interglacial (~130 to 75 kyr BP) pollen sequence from south-central Illinois, U.S.A., contains climatic variance in frequency bands between 1 cycle/10 kyr and 1 cycle/1 kyr. The temporal variance is best developed as alternating cycles of pollen assemblages indicative of wet and dry conditions. Spectral cross-correlations between selected pollen types and potential forcings (ETP (eccentricity, tilt, precession), SPECMAP δ18O) implicate oceanic and solar processes as possible mechanisms driving last interglacial vegetation and climate change in the Midwestern U.S. During the last glacial stage (LGS; 20 to 16 kyr BP) a lacustrine sequence from the central Mississippi River valley experienced major flooding events caused by intermittent melting of the Laurentide ice sheet. Rock-magnetic and grain size data confirm the physical record of flood clays. Correlation of the flood clays to the Greenland (GRIP) ice core is weak. However, the Laurentide melting events seem to fall temporally between the releases of minor LGS iceberg discharges into the North Atlantic. The GRIP δ18O and the Midwestern U.S. magnetic susceptibility time series indicate sub-Milankovitch climate variability modes. Mapping, multivariate, and time series analyses of Holocene (8 to 1 ka) pollen sequences from central North America suggest spatial patterns of vegetation and climate change on sub-orbital to millennial time scales.
The rate, magnitude, and spatial patterns of change varied considerably over the study region. Major climatic variance contained in several well-dated pollen time series ranges between 1 cycle/6 kyr and 1 cycle/0.6 kyr. Singular and cross-spectral analyses, again, suggest solar and oceanic forcing. Although it is difficult to attribute past climatic changes to specific forcings, the geologic record of past global change will prove invaluable in the assessment of long-term future climate change and prediction.
A Multivariate Model for the Meta-Analysis of Study Level Survival Data at Multiple Times
ERIC Educational Resources Information Center
Jackson, Dan; Rollins, Katie; Coughlin, Patrick
2014-01-01
Motivated by our meta-analytic dataset involving survival rates after treatment for critical leg ischemia, we develop and apply a new multivariate model for the meta-analysis of study level survival data at multiple times. Our data set involves 50 studies that provide mortality rates at up to seven time points, which we model simultaneously, and…
ERIC Educational Resources Information Center
Wilson, Mark
This study investigates the accuracy of the Woodruff-Causey technique for estimating sampling errors for complex statistics. The technique may be applied when data are collected by using multistage clustered samples. The technique was chosen for study because of its relevance to the correct use of multivariate analyses in educational survey…
Regenerating time series from ordinal networks.
McCullough, Michael; Sakellariou, Konstantinos; Stemler, Thomas; Small, Michael
2017-03-01
Recently proposed ordinal networks not only afford novel methods of nonlinear time series analysis but also constitute stochastic approximations of the deterministic flow time series from which the network models are constructed. In this paper, we construct ordinal networks from discrete sampled continuous chaotic time series and then regenerate new time series by taking random walks on the ordinal network. We then investigate the extent to which the dynamics of the original time series are encoded in the ordinal networks and retained through the process of regenerating new time series by using several distinct quantitative approaches. First, we use recurrence quantification analysis on traditional recurrence plots and order recurrence plots to compare the temporal structure of the original time series with random walk surrogate time series. Second, we estimate the largest Lyapunov exponent from the original time series and investigate the extent to which this invariant measure can be estimated from the surrogate time series. Finally, estimates of correlation dimension are computed to compare the topological properties of the original and surrogate time series dynamics. Our findings show that ordinal networks constructed from univariate time series data constitute stochastic models which approximate important dynamical properties of the original systems.
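The construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic map, the embedding dimension `m=3`, and the function names `ordinal_symbols`, `build_network`, and `random_walk` are all assumptions chosen for illustration. The walk below regenerates a sequence of ordinal patterns only; mapping patterns back to amplitudes is omitted.

```python
import random
from collections import defaultdict


def ordinal_symbols(series, m=3):
    """Map each length-m window to its ordinal pattern (indices sorted by value)."""
    symbols = []
    for i in range(len(series) - m + 1):
        window = series[i:i + m]
        pattern = tuple(sorted(range(m), key=lambda k: window[k]))
        symbols.append(pattern)
    return symbols


def build_network(symbols):
    """Count transitions between successive ordinal patterns (weighted directed edges)."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(symbols, symbols[1:]):
        counts[a][b] += 1
    return counts


def random_walk(counts, start, steps, rng):
    """Walk the network; each step picks a successor with probability
    proportional to its observed transition count."""
    walk = [start]
    node = start
    for _ in range(steps):
        successors = counts.get(node)
        if not successors:
            break  # dead end: node was never observed with a successor
        nodes = list(successors)
        weights = [successors[n] for n in nodes]
        node = rng.choices(nodes, weights=weights, k=1)[0]
        walk.append(node)
    return walk


# Hypothetical test signal: the fully chaotic logistic map (an assumption,
# standing in for the sampled chaotic flows used in the paper).
x = 0.4
series = []
for _ in range(2000):
    x = 4.0 * x * (1.0 - x)
    series.append(x)

symbols = ordinal_symbols(series, m=3)
network = build_network(symbols)
walk = random_walk(network, symbols[0], 500, random.Random(0))
```

Every step of `walk` follows an edge that was observed in the original symbol sequence, which is the sense in which the network acts as a stochastic model of the source dynamics.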
Perioperative surgical outcome of conventional and robot-assisted total laparoscopic hysterectomy.
van Weelden, W J; Gordon, B B M; Roovers, E A; Kraayenbrink, A A; Aalders, C I M; Hartog, F; Dijkhuizen, F P H L J
2017-01-01
To evaluate surgical outcome in a consecutive series of patients with conventional and robot assisted total laparoscopic hysterectomy. A retrospective cohort study was performed among patients with benign and malignant indications for a laparoscopic hysterectomy. Main surgical outcomes were operation room time and skin to skin operating time, complications, conversions, rehospitalisation and reoperation, estimated blood loss and length of hospital stay. A total of 294 patients were evaluated: 123 in the conventional total laparoscopic hysterectomy (TLH) group and 171 in the robot TLH group. After correction for differences in basic demographics with a multivariate linear regression analysis, the skin to skin operating time was significantly shorter, by 18 minutes, in robot assisted TLH compared to conventional TLH (robot assisted TLH 92 min, conventional TLH 110 min, p<0.001). The presence or absence of previous abdominal surgery had a significant influence on the skin to skin operating time as did the body mass index and the weight of the uterus. Complications were not significantly different. The robot TLH group had significantly less blood loss and lower rehospitalisation and reoperation rates. This study compares conventional TLH with robot assisted TLH and shows shorter operating times, less blood loss and lower rehospitalisation and reoperation rates in the robot TLH group.
Ulnar osteosarcoma in dogs: 30 cases (1992-2008).
Sivacolundhu, Ramesh K; Runge, Jeffrey J; Donovan, Taryn A; Barber, Lisa G; Saba, Corey F; Clifford, Craig A; de Lorimier, Louis-Philippe; Atwater, Stephen W; DiBernardi, Lisa; Freeman, Kim P; Bergman, Philip J
2013-07-01
To examine the biological behavior of ulnar osteosarcoma and evaluate predictors of survival time in dogs. Retrospective case series. 30 dogs with primary ulnar osteosarcoma. Medical records were reviewed. Variables recorded and examined to identify predictors of survival time were signalment, tumor location in the ulna, tumor length, serum alkaline phosphatase activity, surgery type, completeness of excision, tumor stage, tumor grade, histologic subtype, development of metastases, and use of chemotherapy. 30 cases were identified from 9 institutions. Eleven dogs were treated with partial ulnar ostectomy and 14 with amputation; in 5 dogs, a resection was not performed. Twenty-two dogs received chemotherapy. Median disease-free interval and survival time were 437 and 463 days, respectively. Negative prognostic factors for survival time determined via univariate analyses were histologic subtype and development of lung metastases. Telangiectatic or telangiectatic-mixed subtype (n = 5) was the only negative prognostic factor identified via multivariate analysis (median survival time, 208 days). Dogs with telangiectatic subtype were 6.99 times as likely to die of the disease. The prognosis for ulnar osteosarcoma in this population was no worse and may have been better than the prognosis for dogs with osteosarcoma involving other appendicular sites. Partial ulnar ostectomy was associated with a low complication rate and good to excellent function and did not compromise survival time. Telangiectatic or telangiectatic-mixed histologic subtype was a negative prognostic factor for survival time. The efficacy of chemotherapy requires further evaluation.
GPS Position Time Series @ JPL
NASA Technical Reports Server (NTRS)
Owen, Susan; Moore, Angelyn; Kedar, Sharon; Liu, Zhen; Webb, Frank; Heflin, Mike; Desai, Shailen
2013-01-01
Different flavors of GPS time series analysis at JPL:
- Use same GPS Precise Point Positioning Analysis raw time series
- Variations in time series analysis/post-processing driven by different users
  - JPL Global Time Series/Velocities: researchers studying reference frame, combining with VLBI/SLR/DORIS
  - JPL/SOPAC Combined Time Series/Velocities: crustal deformation for tectonic, volcanic, ground water studies
  - ARIA Time Series/Coseismic Data Products: hazard monitoring and response focused
- ARIA data system designed to integrate GPS and InSAR
  - GPS tropospheric delay used for correcting InSAR
  - Caltech's GIANT time series analysis uses GPS to correct orbital errors in InSAR
  - Zhen Liu's talking tomorrow on InSAR Time Series analysis
Prentice, Ross L; Zhao, Shanshan
2018-01-01
The Dabrowska (Ann Stat 16:1475-1489, 1988) product integral representation of the multivariate survivor function is extended, leading to a nonparametric survivor function estimator for an arbitrary number of failure time variates that has a simple recursive formula for its calculation. Empirical process methods are used to sketch proofs for this estimator's strong consistency and weak convergence properties. Summary measures of pairwise and higher-order dependencies are also defined and nonparametrically estimated. Simulation evaluation is given for the special case of three failure time variates.
Employment and Socioeconomic Factors Associated With Children's Up-to-Date Vaccination Status.
Chen, Weiwei; Elam-Evans, Laurie D; Hill, Holly A; Yankey, David
2017-04-01
This study examined whether additional information on parents' employment and household characteristics would help explain the differences in children's up-to-date (UTD) vaccination status using the 2008 National Immunization Survey and its associated Socioeconomic Status Module. After controlling for basic sociodemographic factors in multivariable analyses, parent's work schedules and ease of taking time off from work were not associated with UTD vaccination status among 19- to 35-month-old children. We also conducted a stratified analysis to test the heterogeneous effects of the factors among children at 3 age-restricted maternal education levels and found the benefit of paid sick leave had a significant association only among families where the mother had a college degree. Families who had moved since the child's birth, especially if the mother had high school or lower education, were less likely to have children UTD on the vaccine series.
Temporal variation in pelagic food chain length in response to environmental change
Ruiz-Cooley, Rocio I.; Gerrodette, Tim; Fiedler, Paul C.; Chivers, Susan J.; Danil, Kerri; Ballance, Lisa T.
2017-01-01
Climate variability alters nitrogen cycling, primary productivity, and dissolved oxygen concentration in marine ecosystems. We examined the role of this variability (as measured by six variables) on food chain length (FCL) in the California Current (CC) by reconstructing a time series of amino acid–specific δ15N values derived from common dolphins, an apex pelagic predator, and using two FCL proxies. Strong declines in FCL were observed after the 1997–1999 El Niño Southern Oscillation (ENSO) event. Bayesian models revealed longer FCLs under intermediate conditions for surface temperature, chlorophyll concentration, multivariate ENSO index, and total plankton volume but not for hypoxic depth and nitrate concentration. Our results challenge the prevalent paradigm that suggested long-term stability in the food web structure in the CC and, instead, reveal that pelagic food webs respond strongly to disturbances associated with ENSO events, local oceanography, and ongoing changes in climate. PMID:29057322
Study on connectivity between coherent central rhythm and electromyographic activities
NASA Astrophysics Data System (ADS)
Meng, Fei; Tong, Kai-yu; Chan, Suk-tak; Wong, Wan-wa; Lui, Ka-him; Tang, Kwok-wing; Gao, Xiaorong; Gao, Shangkai
2008-09-01
Whether afferent feedback contributes to the generation of cortico-muscular coherence (CMCoh) remains an open question. In the present study, a multivariate autoregressive (MVAR) model and partial directed coherence (PDC) were applied to investigate the causal influences between the central rhythm and electromyographic (EMG) signals in the process of CMCoh. The system modeling included activities from the contralateral and ipsilateral primary sensorimotor cortex (M1/S1), supplementary motor area (SMA) and the time series from extensor carpi radialis (ECR) muscles. The results showed that afferent sensory feedback could also play an important role for the generation of CMCoh. Meanwhile, significant coherence between the EMG signals and the activities in the SMA was found in two subjects out of five. Connectivity analysis revealed a significant descending information flow which possibly reflected direct recruitment on the motoneurons from the SMA to facilitate motor control.
Process analytical technology in the pharmaceutical industry: a toolkit for continuous improvement.
Scott, Bradley; Wilcock, Anne
2006-01-01
Process analytical technology (PAT) refers to a series of tools used to ensure that quality is built into products while at the same time improving the understanding of processes, increasing efficiency, and decreasing costs. It has not been widely adopted by the pharmaceutical industry. As the setting for this paper, the current pharmaceutical manufacturing paradigm and PAT guidance to date are discussed prior to the review of PAT principles and tools, benefits, and challenges. The PAT toolkit contains process analyzers, multivariate analysis tools, process control tools, and continuous improvement/knowledge management/information technology systems. The integration and implementation of these tools is complex, and has resulted in uncertainty with respect to both regulation and validation. The paucity of staff knowledgeable in this area may complicate adoption. Studies to quantitate the benefits resulting from the adoption of PAT within the pharmaceutical industry would be a valuable addition to the qualitative studies that are currently available.
Geospatial Resource Access Analysis In Hedaru, Tanzania
NASA Astrophysics Data System (ADS)
Clark, Dylan G.; Premkumar, Deepak; Mazur, Robert; Kisimbo, Elibariki
2013-12-01
Populations around the world are facing increased impacts of anthropogenic-induced environmental changes and rapid population movements. These environmental and social shifts are having an elevated impact on the livelihoods of agriculturalists and pastoralists in developing countries. This appraisal integrates various tools, usually used independently, to gain a comprehensive understanding of the regional livelihood constraints in the rural Hedaru Valley of northeastern Tanzania. The appraisal was conducted in three villages with different natural resources, using three primary methods: 1) participatory mapping of infrastructures; 2) administration of quantitative, spatially-tied surveys (n=80) and focus groups (n=14) that examined land use, household health, education, and demographics; 3) quantitative time series analysis of Landsat-based Normalized Difference Vegetation Index images. Through various geospatial and multivariate linear regression analyses, significant geospatial trends emerged. This research added to the academic understanding of the region while establishing pathways for climate change adaptation strategies.
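For readers unfamiliar with the index named in this abstract, the Normalized Difference Vegetation Index is computed per pixel as (NIR - Red) / (NIR + Red). The sketch below applies that standard formula to a short, made-up reflectance series and fits a linear trend; the reflectance values, years, and the helper names `ndvi` and `ols_slope` are hypothetical and do not come from the study.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); None where the denominator is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical surface reflectances for one pixel (illustrative values only).
years = [2000, 2003, 2006, 2009, 2012]
nir = [0.50, 0.48, 0.45, 0.44, 0.40]
red = [0.10, 0.11, 0.12, 0.14, 0.15]

series = [ndvi(n, r) for n, r in zip(nir, red)]
trend = ols_slope(years, series)  # NDVI change per year for this pixel
```

A negative `trend` for a pixel would indicate declining greenness over the period, which is the kind of signal a time series analysis of NDVI images is meant to surface.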
Feinberg, Mark E; Kim, Ji-Yeon; Greenberg, Mark T
2008-11-01
The predictors and correlates of positive functioning among community prevention teams have been examined in a number of research studies; however, the role of personality has been neglected. In this study, we examined whether team member and leader personality dimensions assessed at the time of team formation predicted local prevention team functioning 2.5-3.5 years later. Participants were 159 prevention team members in 14 communities participating in the PROSPER study of prevention program dissemination. Three aspects of personality, aggregated at the team level, were examined as predictors: Openness to Experience, Conscientiousness, and Agreeableness. A series of multivariate regression analyses were performed that accounted for the interdependency of five categories of team functioning. Results showed that average team member Openness was negatively, and Conscientiousness was positively linked to team functioning. The findings have implications for decisions about the level and nature of technical assistance support provided to community prevention teams.
Trends in the capture fisheries in Cuyo East Pass, Philippines
San Diego, Tee-Jay A.; Fisher, William L.
2014-01-01
Findings are presented of a comprehensive analysis of time series catch and effort data from 2000 to 2006 collected from a multi-species, multi-gear and two-sector (municipal and commercial) capture fisheries in Cuyo East Pass, Philippines. Multivariate techniques were used to determine temporal variation in species composition and gear selectivity that corresponded with annual trends in catch and effort. Distinct annual variation in species composition was found for five fisheries classified according to sector-gear combination, corresponding decline in catch diversity, noted shifts in gears used, and an erratic CPUE trend as a result of catch variation. These patterns and trends illustrate the occurrence of ecosystem overfishing for Cuyo East Pass. Our approach provided a holistic representation of the fishing situation, condition of the fisheries and corresponding implications to the ecosystem, fitting well within the context of the ecosystem approach to fisheries management.
Bayes linear covariance matrix adjustment
NASA Astrophysics Data System (ADS)
Wilkinson, Darren J.
1995-12-01
In this thesis, a Bayes linear methodology for the adjustment of covariance matrices is presented and discussed. A geometric framework for quantifying uncertainties about covariance matrices is set up, and an inner-product for spaces of random matrices is motivated and constructed. The inner-product on this space captures aspects of our beliefs about the relationship between covariance matrices of interest to us, providing a structure rich enough for us to adjust beliefs about unknown matrices in the light of data such as sample covariance matrices, exploiting second-order exchangeability and related specifications to obtain representations allowing analysis. Adjustment is associated with orthogonal projection, and illustrated with examples of adjustments for some common problems. The problem of adjusting the covariance matrices underlying exchangeable random vectors is tackled and discussed. Learning about the covariance matrices associated with multivariate time series dynamic linear models is shown to be amenable to a similar approach. Diagnostics for matrix adjustments are also discussed.
Xiao, H; Gao, L D; Li, X J; Lin, X L; Dai, X Y; Zhu, P J; Chen, B Y; Zhang, X X; Zhao, J; Tian, H Y
2013-09-01
The transmission of haemorrhagic fever with renal syndrome (HFRS) is influenced by climatic, reservoir and environmental variables. The epidemiology of the disease was studied over a 6-year period in Changsha. Variables relating to climate, environment, rodent host distribution and disease occurrence were collected monthly and analysed using a time-series adjusted Poisson regression model. It was found that the density of the rodent host and multivariate El Niño Southern Oscillation index had the greatest effect on the transmission of HFRS with lags of 2–6 months. However, a number of climatic and environmental factors played important roles in affecting the density and transmission potential of the rodent host population. It was concluded that the measurement of a number of these variables could be used in disease surveillance to give useful advance warning of potential disease epidemics.
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Lin; Gupta, Hoshin V.; Gao, Xiaogang; Sorooshian, Soroosh; Imam, Bisher
2002-12-01
Artificial neural networks (ANNs) can be useful in the prediction of hydrologic variables, such as streamflow, particularly when the underlying processes have complex nonlinear interrelationships. However, conventional ANN structures suffer from network training issues that significantly limit their widespread application. This paper presents a multivariate ANN procedure entitled self-organizing linear output map (SOLO), whose structure has been designed for rapid, precise, and inexpensive estimation of network structure/parameters and system outputs. More important, SOLO provides features that facilitate insight into the underlying processes, thereby extending its usefulness beyond forecast applications as a tool for scientific investigations. These characteristics are demonstrated using a classic rainfall-runoff forecasting problem. Various aspects of model performance are evaluated in comparison with other commonly used modeling approaches, including multilayer feedforward ANNs, linear time series modeling, and conceptual rainfall-runoff modeling.
Explaining cross-national differences in marriage, cohabitation, and divorce in Europe, 1990-2000.
Kalmijn, Matthijs
2007-11-01
European countries differ considerably in their marriage patterns. The study presented in this paper describes these differences for the 1990s and attempts to explain them from a macro-level perspective. We find that different indicators of marriage (i.e., marriage rate, age at marriage, divorce rate, and prevalence of unmarried cohabitation) cannot be seen as indicators of an underlying concept such as the 'strength of marriage'. Multivariate ordinary least squares (OLS) regression analyses are estimated with countries as units and panel regression models are estimated in which annual time series for multiple countries are pooled. Using these models, we find that popular explanations of trends in the indicators - explanations that focus on gender roles, secularization, unemployment, and educational expansion - are also important for understanding differences among countries. We also find evidence for the role of historical continuity and societal disintegration in understanding cross-national differences.
Lara, Juan A; Lizcano, David; Pérez, Aurora; Valente, Juan P
2014-10-01
There are now domains where information is recorded over a period of time, leading to sequences of data known as time series. In many domains, like medicine, time series analysis requires focusing on certain regions of interest, known as events, rather than analyzing the whole time series. In this paper, we propose a framework for knowledge discovery in both one-dimensional and multidimensional time series containing events. We show how our approach can be used to classify medical time series by means of a process that identifies events in time series, generates time series reference models of representative events and compares two time series by analyzing the events they have in common. We have applied our framework to time series generated in the areas of electroencephalography (EEG) and stabilometry. Framework performance was evaluated in terms of classification accuracy, and the results confirmed that the proposed schema has potential for classifying EEG and stabilometric signals. The proposed framework is useful for discovering knowledge from medical time series containing events, such as stabilometric and electroencephalographic time series. These results would be equally applicable to other medical domains generating iconographic time series, such as, for example, electrocardiography (ECG). Copyright © 2014 Elsevier Inc. All rights reserved.
A Review of Calibration Transfer Practices and Instrument Differences in Spectroscopy.
Workman, Jerome J
2018-03-01
Calibration transfer for use with spectroscopic instruments, particularly for near-infrared, infrared, and Raman analysis, has been the subject of multiple articles, research papers, book chapters, and technical reviews. There has been a myriad of approaches published and claims made for resolving the problems associated with transferring calibrations; however, the capability of attaining identical results over time from two or more instruments using an identical calibration still eludes technologists. Calibration transfer, in a precise definition, refers to a series of analytical approaches or chemometric techniques used to attempt to apply a single spectral database, and the calibration model developed using that database, for two or more instruments, with statistically retained accuracy and precision. Ideally, one would develop a single calibration for any particular application, and move it indiscriminately across instruments and achieve identical analysis or prediction results. There are many technical aspects involved in such precision calibration transfer, related to the measuring instrument reproducibility and repeatability, the reference chemical values used for the calibration, the multivariate mathematics used for calibration, and sample presentation repeatability and reproducibility. Ideally, a multivariate model developed on a single instrument would provide a statistically identical analysis when used on other instruments following transfer. This paper reviews common calibration transfer techniques, mostly related to instrument differences, and the mathematics of the uncertainty between instruments when making spectroscopic measurements of identical samples. It does not specifically address calibration maintenance or reference laboratory differences.
Brier, Matthew R; Mitra, Anish; McCarthy, John E; Ances, Beau M; Snyder, Abraham Z
2015-11-01
Functional connectivity refers to shared signals among brain regions and is typically assessed in a task free state. Functional connectivity commonly is quantified between signal pairs using Pearson correlation. However, resting-state fMRI is a multivariate process exhibiting a complicated covariance structure. Partial covariance assesses the unique variance shared between two brain regions excluding any widely shared variance, hence is appropriate for the analysis of multivariate fMRI datasets. However, calculation of partial covariance requires inversion of the covariance matrix, which, in most functional connectivity studies, is not invertible owing to rank deficiency. Here we apply Ledoit-Wolf shrinkage (L2 regularization) to invert the high dimensional BOLD covariance matrix. We investigate the network organization and brain-state dependence of partial covariance-based functional connectivity. Although RSNs are conventionally defined in terms of shared variance, removal of widely shared variance, surprisingly, improved the separation of RSNs in a spring embedded graphical model. This result suggests that pair-wise unique shared variance plays a heretofore unrecognized role in RSN covariance organization. In addition, application of partial correlation to fMRI data acquired in the eyes open vs. eyes closed states revealed focal changes in uniquely shared variance between the thalamus and visual cortices. This result suggests that partial correlation of resting state BOLD time series reflect functional processes in addition to structural connectivity. Copyright © 2015 Elsevier Inc. All rights reserved.
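The relationship between shrinkage, matrix inversion, and partial correlation described in this abstract can be sketched in plain Python. This is a simplified illustration, not the authors' pipeline: it uses a fixed, hand-picked shrinkage intensity `alpha` toward a scaled identity rather than the data-driven Ledoit-Wolf intensity, and the three synthetic signals are hypothetical. Partial correlations are read off the precision matrix P as -P_ij / sqrt(P_ii P_jj).

```python
import random


def covariance(data):
    """Sample covariance; rows are observations, columns are variables."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
             for j in range(p)] for i in range(p)]


def shrink(cov, alpha):
    """Blend cov with mu * I, where mu is the mean variance.
    alpha is fixed by hand here (an assumption, not the Ledoit-Wolf estimate)."""
    p = len(cov)
    mu = sum(cov[i][i] for i in range(p)) / p
    return [[(1 - alpha) * cov[i][j] + (alpha * mu if i == j else 0.0)
             for j in range(p)] for i in range(p)]


def invert(m):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlation(precision):
    """Partial correlation matrix from a precision (inverse covariance) matrix."""
    p = len(precision)
    return [[1.0 if i == j
             else -precision[i][j] / (precision[i][i] * precision[j][j]) ** 0.5
             for j in range(p)] for i in range(p)]


# Hypothetical signals: x and y share variance only through a common driver z.
rng = random.Random(0)
data = []
for _ in range(2000):
    z = rng.gauss(0, 1)
    data.append([z + 0.5 * rng.gauss(0, 1), z + 0.5 * rng.gauss(0, 1), z])

cov = covariance(data)
pearson_xy = cov[0][1] / (cov[0][0] * cov[1][1]) ** 0.5
partial = partial_correlation(invert(shrink(cov, alpha=0.01)))
```

In this toy case `pearson_xy` is large because both signals follow `z`, while `partial[0][1]` is close to zero once the shared driver is accounted for. The covariance here is already invertible, so the shrinkage step is only illustrative; it matters when there are more variables than independent observations.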
Effective network inference through multivariate information transfer estimation
NASA Astrophysics Data System (ADS)
Dahlqvist, Carl-Henrik; Gnabo, Jean-Yves
2018-06-01
Network representation has steadily gained in popularity over the past decades. In many disciplines such as finance, genetics, neuroscience or human travel, to cite a few, the network may not directly be observable and needs to be inferred from time-series data, leading to the issue of separating direct interactions between two entities forming the network from indirect interactions coming through its remaining part. Drawing on recent contributions proposing strategies to deal with this problem such as the so-called "global silencing" approach of Barzel and Barabasi or "network deconvolution" of Feizi et al. (2013), we propose a novel methodology to infer an effective network structure from multivariate conditional information transfers. Its core principle is to test the information transfer between two nodes through a step-wise approach by conditioning the transfer for each pair on a specific set of relevant nodes as identified by our algorithm from the rest of the network. The methodology is model free and can be applied to high-dimensional networks with both inter-lag and intra-lag relationships. It outperforms state-of-the-art approaches for eliminating the redundancies and more generally retrieving simulated artificial networks in our Monte-Carlo experiments. We apply the method to stock market data at different frequencies (15 min, 1 h, 1 day) to retrieve the network of the largest US financial institutions and then document how banks' centrality measurements relate to banks' systemic vulnerability.
Climate variability, weather and enteric disease incidence in New Zealand: time series analysis.
Lal, Aparna; Ikeda, Takayoshi; French, Nigel; Baker, Michael G; Hales, Simon
2013-01-01
Evaluating the influence of climate variability on enteric disease incidence may improve our ability to predict how climate change may affect these diseases. To examine the associations between regional climate variability and enteric disease incidence in New Zealand. Associations between monthly climate and enteric diseases (campylobacteriosis, salmonellosis, cryptosporidiosis, giardiasis) were investigated using Seasonal Auto Regressive Integrated Moving Average (SARIMA) models. No climatic factors were significantly associated with campylobacteriosis and giardiasis, with similar predictive power for univariate and multivariate models. Cryptosporidiosis was positively associated with average temperature of the previous month (β = 0.130, SE = 0.060, p <0.01) and inversely related to the Southern Oscillation Index (SOI) two months previously (β = -0.008, SE = 0.004, p <0.05). By contrast, salmonellosis was positively associated with temperature (β = 0.110, SE = 0.020, p<0.001) of the current month and SOI of the current (β = 0.005, SE = 0.002, p<0.050) and previous month (β = 0.005, SE = 0.002, p<0.05). Forecasting accuracy of the multivariate models for cryptosporidiosis and salmonellosis were significantly higher. Although spatial heterogeneity in the observed patterns could not be assessed, these results suggest that temporally lagged relationships between climate variables and national communicable disease incidence data can contribute to disease prediction models and early warning systems.
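The idea of a temporally lagged association between a climate variable and disease counts can be illustrated with a simple lagged Pearson correlation. This is a deliberately reduced sketch, not a SARIMA model and not the study's analysis: the synthetic monthly temperature and case series, the one-month lag built into them, and the helper names `pearson` and `lagged_correlation` are all assumptions.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def lagged_correlation(driver, response, lag):
    """Correlate response[t] with driver[t - lag] for lag >= 0."""
    if lag == 0:
        return pearson(driver, response)
    return pearson(driver[:-lag], response[lag:])


# Synthetic monthly data (hypothetical): cases follow last month's temperature.
rng = random.Random(42)
n_months = 240
temp_full = [10 + 5 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 2)
             for t in range(n_months + 1)]
temperature = temp_full[1:]                     # temperature[t]
cases = [3 * temp_full[t] + rng.gauss(0, 1)     # depends on temperature[t - 1]
         for t in range(n_months)]

correlations = {lag: lagged_correlation(temperature, cases, lag) for lag in range(4)}
best_lag = max(correlations, key=correlations.get)
```

Scanning a small range of lags like this is only an exploratory step; a seasonal ARIMA model with climate covariates, as used in the study, additionally accounts for autocorrelation and seasonality in the case series.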
Coburger, Jan; Wirtz, Christian R; König, Ralph W
2017-06-01
In patients with a glioblastoma (GBM), few unselected data exist using current standard adjuvant treatment and contemporary surgical techniques like iMRI. The aim of this study is to assess the impact of EoR and recurrent surgery on survival and outcome. We assessed a consecutive unselected series of 170 surgeries for GBM (2008-2014) applying intraoperative MRI (iMRI). All patients received adjuvant radio-chemo-therapy. Overall-survival (OS), progression free survival (PFS), complications and new permanent neurological deficits (nPND) were assessed. Uni- and multivariate-cox-regression-models were calculated. Mean follow-up was 40mo. GTR was intended in 82% and achieved in 77% of these cases. A nPND was found in 7% of patients. In multivariate cox-regression, GTR (HR:0.6, P<0.024) and absence of MGMT methylation (HR:1.6, P<0.042) were significantly associated with PFS. We found no difference in PFS after primary surgery and recurrent surgery. Concerning OS, in multivariate assessment an un-methylated MGMT-promotor (HR2.0, P<0.01) and presence of a complication (HR1.7, P<0.06) were negative prognosticators. Only GTR was significantly beneficial for OS (HR0.4, P<0.028) compared to a failed GTR and a STR. Repeated surgery for recurrent disease was positively associated with OS (HR0.6, P<0.06). Surgery in a contemporary setup using iMRI, brain mapping and modern adjuvant treatment has a higher OS and lower complication rates than previously published. A maximum but safe resection should be the goal of surgery since a perioperative complication significantly decreases OS. Recurrent surgery has a beneficial effect on OS without an increase of complications.
NASA Astrophysics Data System (ADS)
Guimarães Nobre, Gabriela; Arnbjerg-Nielsen, Karsten; Rosbjerg, Dan; Madsen, Henrik
2016-04-01
Traditionally, flood risk assessment studies have been carried out from a univariate frequency analysis perspective. However, statistical dependence between hydrological variables, such as extreme rainfall and extreme sea surge, plausibly exists, since both variables are to some extent driven by common meteorological conditions. To overcome this limitation, multivariate statistical techniques have the potential to combine different sources of flooding in the investigation. The aim of this study was to apply a range of statistical methodologies for analyzing combined extreme hydrological variables that can lead to coastal and urban flooding. The study area is the Elwood Catchment, a highly urbanized catchment located in the city of Port Phillip, Melbourne, Australia. The first part of the investigation dealt with the marginal extreme value distributions. Two approaches to extracting extreme value series were applied (Annual Maximum and Partial Duration Series), and different probability distribution functions were fitted to the observed sample. Results obtained by using the Generalized Pareto distribution demonstrate the ability of the Pareto family to model the extreme events. Advancing into multivariate extreme value analysis, an investigation of the asymptotic properties of extremal dependence was carried out first. As a weak positive asymptotic dependence between the bivariate extreme pairs was found, the Conditional method proposed by Heffernan and Tawn (2004) was chosen. This approach is suitable for modelling bivariate extreme values that are relatively unlikely to occur together. The results show that the probability of an extreme sea surge occurring during a one-hour intensity extreme precipitation event (or vice versa) can be twice as great as what would occur when assuming independent events. Therefore, presuming independence between these two variables would result in severe underestimation of the flooding risk in the study area.
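The comparison against independence reported in this abstract can be illustrated with a minimal sketch. It does not implement the Heffernan and Tawn conditional method; it only estimates an empirical joint exceedance probability on synthetic data and compares it with the product of the marginal exceedance probabilities. The function name `dependence_ratio`, the 0.95 quantile threshold and the synthetic "surge" and "rain" series are assumptions made for illustration, not data or code from the study.

```python
import random

def dependence_ratio(xs, ys, quantile):
    """Joint exceedance probability divided by the value expected under independence."""
    n = len(xs)
    # Hypothetical thresholds: the same empirical quantile for both variables.
    u = sorted(xs)[int(quantile * n)]
    v = sorted(ys)[int(quantile * n)]
    p_x = sum(x > u for x in xs) / n
    p_y = sum(y > v for y in ys) / n
    p_joint = sum(x > u and y > v for x, y in zip(xs, ys)) / n
    return p_joint / (p_x * p_y)

rng = random.Random(42)
n = 20000
# Synthetic data (assumption): a shared driver makes the two variables dependent.
driver = [rng.gauss(0.0, 1.0) for _ in range(n)]
surge = [d + rng.gauss(0.0, 1.0) for d in driver]
rain = [d + rng.gauss(0.0, 1.0) for d in driver]
# Independent counterpart for comparison.
rain_indep = [rng.gauss(0.0, 1.4) for _ in range(n)]

ratio_dependent = dependence_ratio(surge, rain, 0.95)
ratio_independent = dependence_ratio(surge, rain_indep, 0.95)
print(f"dependent: {ratio_dependent:.2f}, independent: {ratio_independent:.2f}")
```

A ratio well above 1 means that joint extremes occur more often than the independence assumption predicts, which is the kind of underestimation the abstract warns about.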
Climate, Water and Renewable Energy in the Nordic Countries
NASA Astrophysics Data System (ADS)
Snorrason, A.; Jonsdottir, J. F.
2004-05-01
Climate and Energy (CE) is a new Nordic research project with funding from Nordic Energy Research (NEFP) and the Nordic energy sector. The project has the objective of a comprehensive assessment of the impact of climate variability and change on Nordic renewable energy resources including hydropower, wind power, bio-fuels and solar energy. This will include assessment of the power production of the hydropower dominated Nordic energy system and its sensitivity and vulnerability to climate change on both temporal and spatial scales; assessment of the impacts of extremes including floods, droughts, storms, seasonal patterns and variability. Within the CE project several thematic groups work on specific issues of climatic change and their impacts on renewable energy. A primary aim of the CE climate group is to supply a standard set of common scenarios of climate change in northern Europe and Greenland, based on recent global and regional climate change experiments. The snow and ice group has chosen glaciers from Greenland, Iceland, Norway and Sweden for an analysis of the response of glaciers to climate changes. Mass balance and dynamical changes, corresponding to the common scenario for climate changes, will be modelled and effects on glacier hydrology will be estimated. Preliminary work with dynamic modelling and climate scenarios shows a dramatic response of glacial runoff to increased temperature and precipitation. The statistical analysis group has reported on the status of time series analysis in the Nordic countries. The group has selected and quality controlled time series of stream flow to be included in the Nordic component of the database FRIEND. Also the group will collect information on time series for other variables and these series will be systematically analysed with respect to trend and other long-term changes. 
Preliminary work using multivariate analysis on stream flow and climate variables shows strong linkages with the long term atmospheric circulation in the North Atlantic. The hydrological modelling group has already reported on "Climate change impacts on water resources in the Nordic countries - State of the art and discussion of principles". The group will compare different approaches of transferring the climate change signal into hydrological models and discuss uncertainties in models and climate scenarios. Furthermore, comprehensive assessment and mapping of impact of climate change will be produced for the whole Nordic region based on the scenarios from the CE-climate group.
Silva, Anabela G; Sa-Couto, Pedro; Queirós, Alexandra; Neto, Maritza; Rocha, Nelson P
2017-05-16
Studies exploring the association of physical activity, screen time and sleep with pain usually focus on a limited number of painful body sites. Nevertheless, pain at different body sites is likely to be of a different nature. Therefore, this study aims to explore and compare the association between time spent in self-reported physical activity, in screen based activities and sleeping and i) pain presence in the last 7 days for 9 different body sites; ii) pain intensity at 9 different body sites and iii) global disability. Nine hundred sixty nine students completed a questionnaire on pain, time spent in moderate and vigorous physical activity, screen based time watching TV/DVD, playing, using mobile phones and computers and sleeping hours. Univariate and multivariate associations between pain presence, pain intensity and disability and physical activity, screen based time and sleeping hours were investigated. Pain presence: sleeping remained in the multivariable model for the neck, mid back, wrists, knees and ankles/feet (OR 1.17 to 2.11); moderate physical activity remained in the multivariate model for the neck, shoulders, wrists, hips and ankles/feet (OR 1.06 to 1.08); vigorous physical activity remained in the multivariate model for mid back, knees and ankles/feet (OR 1.05 to 1.09) and screen time remained in the multivariate model for the low back (OR = 2.34). Pain intensity: screen time and moderate physical activity remained in the multivariable model for pain intensity at the neck, mid back, low back, shoulder, knees and ankles/feet (Rp² 0.02 to 0.04) and at the wrists (Rp² = 0.04), respectively. Disability showed no association with sleeping, screen time or physical activity. This study suggests both similarities and differences in the patterns of association between time spent in physical activity, sleeping and in screen based activities and pain presence at 8 different body sites. In addition, the results suggest that the factors associated with the presence of pain, pain intensity and pain associated disability are different.
Extensions to Multivariate Space Time Mixture Modeling of Small Area Cancer Data.
Carroll, Rachel; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Aregay, Mehreteab; Watjou, Kevin
2017-05-09
Oral cavity and pharynx cancer, even when considered together, is a fairly rare disease. Implementation of multivariate modeling with lung and bronchus cancer, as well as melanoma cancer of the skin, could lead to better inference for oral cavity and pharynx cancer. The multivariate structure of these models is accomplished via the use of shared random effects, as well as other multivariate prior distributions. The results in this paper indicate that care should be taken when executing these types of models, and that multivariate mixture models may not always be the ideal option, depending on the data of interest.
Detection of a sudden change of the field time series based on the Lorenz system.
Da, ChaoJiu; Li, Fang; Shen, BingLu; Yan, PengCheng; Song, Jian; Ma, DeShan
2017-01-01
We conducted an exploratory study of the detection of a sudden change of the field time series based on the numerical solution of the Lorenz system. First, the time when the Lorenz path jumped between the regions on the left and right of the equilibrium point of the Lorenz system was quantitatively marked and the sudden change time of the Lorenz system was obtained. Second, the numerical solution of the Lorenz system was regarded as a vector; thus, this solution could be considered as a vector time series. We transformed the vector time series into a scalar time series using the vector inner product, considering the geometric and topological features of the Lorenz system path. Third, the sudden change of the resulting time series was detected using the sliding t-test method. Comparing the test results with the quantitatively marked times indicated that the method could detect every sudden change of the Lorenz path, which suggests that the method is effective. Finally, we used the method to detect the sudden change of the pressure field time series and temperature field time series, and obtained good results for both series, which indicates that the method can be applied to high-dimensional vector time series. Mathematically, there is no essential difference between the field time series and vector time series; thus, we provide a new method for the detection of the sudden change of the field time series.
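The sliding t-test step mentioned in this abstract can be sketched as follows. This is a minimal illustration on a synthetic scalar series with one known step change, not the authors' implementation: the reduction of the Lorenz vector series to a scalar series via the inner product is omitted, and the function name `sliding_t_test`, the window length of 20 and the synthetic data are assumptions.

```python
import math
import random

def sliding_t_test(series, window):
    """Two-sample t statistic comparing `window` points before and after each index."""
    results = []
    for i in range(window, len(series) - window + 1):
        before = series[i - window:i]
        after = series[i:i + window]
        mean_b = sum(before) / window
        mean_a = sum(after) / window
        var_b = sum((x - mean_b) ** 2 for x in before) / (window - 1)
        var_a = sum((x - mean_a) ** 2 for x in after) / (window - 1)
        # Standard error of the mean difference for two equal-size samples.
        std_err = math.sqrt((var_b + var_a) / window)
        t = (mean_a - mean_b) / std_err if std_err > 0 else 0.0
        results.append((i, t))
    return results

rng = random.Random(0)
# Synthetic series (assumption): the mean jumps from 0 to 3 at index 100.
series = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(3.0, 1.0) for _ in range(100)]
scores = sliding_t_test(series, window=20)
change_index, change_t = max(scores, key=lambda pair: abs(pair[1]))
print(change_index, round(change_t, 2))
```

The index with the largest absolute t statistic is taken as the candidate change point; in practice the statistic would also be compared with a critical value for the chosen significance level.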
Volatility of linear and nonlinear time series
NASA Astrophysics Data System (ADS)
Kalisky, Tomer; Ashkenazy, Yosef; Havlin, Shlomo
2005-07-01
Previous studies indicated that nonlinear properties of Gaussian distributed time series with long-range correlations, u_i, can be detected and quantified by studying the correlations in the magnitude series |u_i|, the “volatility.” However, the origin of this empirical observation still remains unclear and the exact relation between the correlations in u_i and the correlations in |u_i| is still unknown. Here we develop analytical relations between the scaling exponent of the linear series u_i and its magnitude series |u_i|. Moreover, we find that nonlinear time series exhibit stronger (or the same) correlations in the magnitude time series compared with linear time series with the same two-point correlations. Based on these results we propose a simple model that generates multifractal time series by explicitly inserting long-range correlations in the magnitude series; the nonlinear multifractal time series is generated by multiplying a long-range correlated time series (that represents the magnitude series) with an uncorrelated time series [that represents the sign series sgn(u_i)]. We apply our techniques to daily deep ocean temperature records from the equatorial Pacific, the region of the El Niño phenomenon, and find: (i) long-range correlations from several days to several years with 1/f power spectrum, (ii) significant nonlinear behavior as expressed by long-range correlations of the volatility series, and (iii) broad multifractal spectrum.
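The magnitude-times-sign construction described in this abstract can be sketched as follows. This is only an illustration of the multiplication step: the correlated magnitude series is produced here by a truncated power-law moving average of Gaussian noise, which is a stand-in chosen for simplicity and not the generator used by the authors. The helper names, the kernel exponent 0.7 and the memory length 50 are all assumptions, and the sketch makes no claim about the multifractal spectrum of the result.

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var

def correlated_series(n, exponent, memory, rng):
    """Power-law weighted moving average of white noise (hypothetical stand-in)."""
    weights = [k ** -exponent for k in range(1, memory + 1)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + memory)]
    return [
        sum(w * noise[t + memory - k] for k, w in enumerate(weights, start=1))
        for t in range(n)
    ]

rng = random.Random(1)
# Magnitude series: absolute value of a correlated series.
magnitude = [abs(v) for v in correlated_series(4000, 0.7, 50, rng)]
# Sign series: uncorrelated +1 / -1 values.
signs = [rng.choice((-1.0, 1.0)) for _ in magnitude]
# Composite series u_i = magnitude_i * sign_i.
u = [m * s for m, s in zip(magnitude, signs)]
volatility = [abs(v) for v in u]

print(round(autocorr(u, 1), 3), round(autocorr(volatility, 1), 3))
```

By construction the composite series has almost no two-point correlation, while its volatility |u_i| inherits the correlations of the magnitude series.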
Dienstmann, R; Mason, M J; Sinicrope, F A; Phipps, A I; Tejpar, S; Nesbakken, A; Danielsen, S A; Sveen, A; Buchanan, D D; Clendenning, M; Rosty, C; Bot, B; Alberts, S R; Milburn Jessup, J; Lothe, R A; Delorenzi, M; Newcomb, P A; Sargent, D; Guinney, J
2017-05-01
TNM staging alone does not accurately predict outcome in colon cancer (CC) patients who may be eligible for adjuvant chemotherapy. It is unknown to what extent the molecular markers microsatellite instability (MSI) and mutations in BRAF or KRAS improve prognostic estimation in multivariable models that include detailed clinicopathological annotation. After imputation of missing-at-random data, a subset of patients accrued in two phase 3 trials with adjuvant chemotherapy, N0147 (NCT00079274) and PETACC3 (NCT00026273) (n = 3016), was aggregated to construct multivariable Cox models for 5-year overall survival that were subsequently validated internally in the remaining clinical trial samples (n = 1499), and also externally in different population cohorts of chemotherapy-treated (n = 949) or -untreated (n = 1080) CC patients, and an additional series without treatment annotation (n = 782). TNM staging, MSI and BRAFV600E mutation status remained independent prognostic factors in multivariable models across clinical trial cohorts and observational studies. Concordance indices increased from 0.61-0.68 in the TNM alone model to 0.63-0.71 in models with added molecular markers, 0.65-0.73 with clinicopathological features and 0.66-0.74 with all covariates. In validation cohorts with complete annotation, the integrated time-dependent AUC rose from 0.64 for the TNM alone model to 0.67 for models that included clinicopathological features, with or without molecular markers. In patient cohorts that received adjuvant chemotherapy, the relative proportion of variance explained (R²) by TNM, clinicopathological features and molecular markers was on average 65%, 25% and 10%, respectively. Incorporation of MSI, BRAFV600E and KRAS mutation status into overall survival models with TNM staging improves the ability to precisely prognosticate in stage II and III CC patients, but only modestly increases prediction accuracy in multivariable models that include clinicopathological features, particularly in chemotherapy-treated patients. © The Author 2017. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
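The concordance indices quoted in this abstract measure how well a model's risk scores rank patients by survival time. A minimal sketch of a Harrell-style concordance index is given below, assuming right-censored data; the function name `concordance_index` and the toy values are hypothetical and are not taken from the study, which may have used a different estimator.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs in which the higher-risk patient fails first.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the record is censored
    risks  : model risk scores (higher means worse predicted prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored record cannot be the earlier event of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable

# Toy data (assumption): four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
c_perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
c_reversed = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
c_constant = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
print(c_perfect, c_reversed, c_constant)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported range of 0.61 to 0.74 in context.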
Duality between Time Series and Networks
Campanharo, Andriana S. L. O.; Sirer, M. Irmak; Malmgren, R. Dean; Ramos, Fernando M.; Amaral, Luís A. Nunes.
2011-01-01
Studying the interaction between a system's components and the temporal evolution of the system are two common ways to uncover and characterize its internal workings. Recently, several maps from a time series to a network have been proposed with the intent of using network metrics to characterize time series. Although these maps demonstrate that different time series result in networks with distinct topological properties, it remains unclear how these topological properties relate to the original time series. Here, we propose a map from a time series to a network with an approximate inverse operation, making it possible to use network statistics to characterize time series and time series statistics to characterize networks. As a proof of concept, we generate an ensemble of time series ranging from periodic to random and confirm that application of the proposed map retains much of the information encoded in the original time series (or networks) after application of the map (or its inverse). Our results suggest that network analysis can be used to distinguish different dynamic regimes in time series and, perhaps more importantly, time series analysis can provide a powerful set of tools that augment the traditional network analysis toolkit to quantify networks in new and useful ways. PMID:21858093
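The abstract does not spell out how the map is built, so the following is only a sketch of one possible construction, assuming a quantile-based scheme: each value is assigned to a quantile bin (a node), transitions between consecutive bins define weighted directed edges, and a random walk on the resulting network serves as an approximate inverse. The function names `series_to_network` and `network_to_series`, the choice of four nodes and the sine-wave input are assumptions for illustration.

```python
import math
import random
from bisect import bisect_right

def series_to_network(series, n_nodes):
    """Map a series to a row-normalised transition matrix between quantile bins."""
    ordered = sorted(series)
    # Interior quantile boundaries separating the n_nodes bins.
    cuts = [ordered[(len(ordered) * q) // n_nodes] for q in range(1, n_nodes)]
    nodes = [bisect_right(cuts, x) for x in series]
    counts = [[0] * n_nodes for _ in range(n_nodes)]
    for a, b in zip(nodes, nodes[1:]):
        counts[a][b] += 1
    weights = []
    for row in counts:
        total = sum(row)
        weights.append([c / total if total else 0.0 for c in row])
    return weights

def network_to_series(weights, length, rng, start=0):
    """Approximate inverse: a random walk over the nodes of the network."""
    node = start
    path = [node]
    for _ in range(length - 1):
        row = weights[node]
        if sum(row) > 0:
            node = rng.choices(range(len(row)), weights=row)[0]
        else:
            node = rng.randrange(len(row))  # dead end: restart at a random node
        path.append(node)
    return path

# Periodic input series (assumption): a sine wave with period 20.
series = [math.sin(2 * math.pi * t / 20) for t in range(400)]
weights = series_to_network(series, n_nodes=4)
walk = network_to_series(weights, length=50, rng=random.Random(3))
print(weights)
print(walk[:10])
```

The walk yields a sequence of bin labels rather than the original values, so the inverse is approximate in the sense that only the coarse-grained dynamics are recovered.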
Adolescent Immunization Coverage and Implementation of New School Requirements in Michigan, 2010
DeVita, Stefanie F.; Vranesich, Patricia A.; Boulton, Matthew L.
2014-01-01
Objectives. We examined the effect of Michigan’s new school rules and vaccine coadministration on time to completion of all the school-required vaccine series, the individual adolescent vaccines newly required for sixth grade in 2010, and initiation of the human papillomavirus (HPV) vaccine series, which was recommended but not required for girls. Methods. Data were derived from the Michigan Care Improvement Registry, a statewide Immunization Information System. We assessed the immunization status of Michigan children enrolled in sixth grade in 2009 or 2010. We used univariable and multivariable Cox regression models to identify significant associations between each factor and school completeness. Results. Enrollment in sixth grade in 2010 and coadministration of adolescent vaccines at the first adolescent visit were significantly associated with completion of the vaccines required for Michigan’s sixth graders. Children enrolled in sixth grade in 2010 had higher coverage with the newly required adolescent vaccines by age 13 years than did sixth graders in 2009, but there was little difference in the rate of HPV vaccine initiation among girls. Conclusions. Education and outreach efforts, particularly regarding the importance and benefits of coadministration of all recommended vaccines in adolescents, should be directed toward health care providers, parents, and adolescents. PMID:24922144
Glucose time series complexity as a predictor of type 2 diabetes
Rodríguez de Castro, Carmen; Vargas, Borja; García Delgado, Emilio; García Carretero, Rafael; Ruiz‐Galiana, Julián; Varela, Manuel
2016-01-01
Background. Complexity analysis of glucose profile may provide valuable information about the gluco-regulatory system. We hypothesized that a complexity metric (detrended fluctuation analysis, DFA) may have a prognostic value for the development of type 2 diabetes in patients at risk. Methods. A total of 206 patients with any of the following risk factors: (1) essential hypertension, (2) obesity or (3) a first-degree relative with a diagnosis of diabetes, were included in a survival analysis study for a diagnosis of new onset type 2 diabetes. At inclusion, a glucometry by means of a Continuous Glucose Monitoring System was performed, and DFA was calculated for a 24-h glucose time series. Patients were then followed up every 6 months and checked for the development of diabetes. Results. In a median follow-up of 18 months, there were 18 new cases of diabetes (58.5 cases/1000 patient-years). DFA was a significant predictor for the development of diabetes, with ten events in the highest quartile versus one in the lowest (log-rank test χ² = 9, df = 1, p = 0.003), even after adjusting for other relevant clinical and biochemical variables. In a Cox model, the risk of diabetes development increased 2.8 times for every 0.1 DFA units. In a multivariate analysis, only fasting glucose, HbA1c and DFA emerged as significant factors. Conclusions. Detrended fluctuation analysis was a significant harbinger of type 2 diabetes development in a high-risk population. Complexity analysis may help in targeting patients who could be candidates for intensified treatment. Copyright © 2016 The Authors. Diabetes/Metabolism Research and Reviews Published by John Wiley & Sons Ltd. PMID:27253149
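The DFA metric used in this abstract can be sketched in a few lines. The version below is a first-order DFA with non-overlapping windows, applied to synthetic white noise and to a random walk rather than to glucose data; the function names, the window sizes and the series length are assumptions, and the study's exact DFA settings are not given in the abstract.

```python
import math
import random

def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def dfa_exponent(series, window_sizes):
    """Scaling exponent from first-order detrended fluctuation analysis."""
    mean = sum(series) / len(series)
    # Step 1: integrate the mean-subtracted series to obtain the profile.
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)
    log_sizes, log_fluct = [], []
    for size in window_sizes:
        xs = list(range(size))
        squared, count = 0.0, 0
        # Step 2: remove a linear trend in each non-overlapping window.
        for start in range(0, len(profile) - size + 1, size):
            segment = profile[start:start + size]
            slope, intercept = linear_fit(xs, segment)
            squared += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment))
            count += size
        # Step 3: root-mean-square fluctuation for this window size.
        log_sizes.append(math.log(size))
        log_fluct.append(math.log(math.sqrt(squared / count)))
    # Step 4: the exponent is the slope of log F(n) against log n.
    alpha, _ = linear_fit(log_sizes, log_fluct)
    return alpha

rng = random.Random(7)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
walk, position = [], 0.0
for step in noise:
    position += step
    walk.append(position)

sizes = [8, 16, 32, 64, 128]
alpha_noise = dfa_exponent(noise, sizes)
alpha_walk = dfa_exponent(walk, sizes)
print(round(alpha_noise, 2), round(alpha_walk, 2))
```

Uncorrelated noise is expected to give an exponent near 0.5 and a random walk one near 1.5, so the 0.1-unit steps in the reported Cox model correspond to fairly small shifts along this scale.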
Guiding Principles for a Pediatric Neurology ICU (neuroPICU) Bedside Multimodal Monitor
Grinspan, Zachary M.; Eldar, Yonina C.; Gopher, Daniel; Gottlieb, Amihai; Lammfromm, Rotem; Mangat, Halinder S; Peleg, Nimrod; Pon, Steven; Rozenberg, Igal; Schiff, Nicholas D; Stark, David E; Yan, Peter; Pratt, Hillel; Kosofsky, Barry E
2016-01-01
Background. Physicians caring for children with serious acute neurologic disease must process overwhelming amounts of physiological and medical information. Strategies to optimize real time display of this information are understudied. Objectives. Our goal was to engage clinical and engineering experts to develop guiding principles for creating a pediatric neurology intensive care unit (neuroPICU) monitor that integrates and displays data from multiple sources in an intuitive and informative manner. Methods. To accomplish this goal, an international group of physicians and engineers communicated regularly for one year. We integrated findings from clinical observations, interviews, a survey, signal processing, and visualization exercises to develop a concept for a neuroPICU display. Results. Key conclusions from our efforts include: (1) A neuroPICU display should support (a) rapid review of retrospective time series (i.e. cardiac, pulmonary, and neurologic physiology data), (b) rapidly modifiable formats for viewing that data according to the specialty of the reviewer, and (c) communication of the degree of risk of clinical decline. (2) Specialized visualizations of physiologic parameters can highlight abnormalities in multivariable temporal data. Examples include 3-D stacked spider plots and color coded time series plots. (3) Visual summaries of EEG with spectral tools (i.e. hemispheric asymmetry and median power) can highlight seizures via patient-specific “fingerprints.” (4) Intuitive displays should emphasize subsets of physiology and processed EEG data to provide a rapid gestalt of the current status and medical stability of a patient. Conclusions. A well-designed neuroPICU display must present multiple datasets in dynamic, flexible, and informative views to accommodate clinicians from multiple disciplines in a variety of clinical scenarios. PMID:27437048
An Interactive Tool For Semi-automated Statistical Prediction Using Earth Observations and Models
NASA Astrophysics Data System (ADS)
Zaitchik, B. F.; Berhane, F.; Tadesse, T.
2015-12-01
We developed a semi-automated statistical prediction tool applicable to concurrent analysis or seasonal prediction of any time series variable in any geographic location. The tool was developed using Shiny, JavaScript, HTML and CSS. A user can extract a predictand by drawing a polygon over a region of interest on the provided user interface (global map). The user can select the Climatic Research Unit (CRU) precipitation or Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) as predictand. They can also upload their own predictand time series. Predictors can be extracted from sea surface temperature, sea level pressure, winds at different pressure levels, air temperature at various pressure levels, and geopotential height at different pressure levels. By default, reanalysis fields are applied as predictors, but the user can also upload their own predictors, including a wide range of compatible satellite-derived datasets. The package generates correlations of the variables selected with the predictand. The user also has the option to generate composites of the variables based on the predictand. Next, the user can extract predictors by drawing polygons over the regions that show strong correlations (composites). Then, the user can select some or all of the statistical prediction models provided. Provided models include Linear Regression models (GLM, SGLM), Tree-based models (bagging, random forest, boosting), Artificial Neural Network, and other non-linear models such as Generalized Additive Model (GAM) and Multivariate Adaptive Regression Splines (MARS). Finally, the user can download the analysis steps they used, such as the region they selected, the time period they specified, the predictand and predictors they chose and preprocessing options they used, and the model results in PDF or HTML format. Key words: Semi-automated prediction, Shiny, R, GLM, ANN, RF, GAM, MARS
Grinspan, Zachary M; Eldar, Yonina C; Gopher, Daniel; Gottlieb, Amihai; Lammfromm, Rotem; Mangat, Halinder S; Peleg, Nimrod; Pon, Steven; Rozenberg, Igal; Schiff, Nicholas D; Stark, David E; Yan, Peter; Pratt, Hillel; Kosofsky, Barry E
2016-01-01
Physicians caring for children with serious acute neurologic disease must process overwhelming amounts of physiological and medical information. Strategies to optimize real time display of this information are understudied. Our goal was to engage clinical and engineering experts to develop guiding principles for creating a pediatric neurology intensive care unit (neuroPICU) monitor that integrates and displays data from multiple sources in an intuitive and informative manner. To accomplish this goal, an international group of physicians and engineers communicated regularly for one year. We integrated findings from clinical observations, interviews, a survey, signal processing, and visualization exercises to develop a concept for a neuroPICU display. Key conclusions from our efforts include: (1) A neuroPICU display should support (a) rapid review of retrospective time series (i.e. cardiac, pulmonary, and neurologic physiology data), (b) rapidly modifiable formats for viewing that data according to the specialty of the reviewer, and (c) communication of the degree of risk of clinical decline. (2) Specialized visualizations of physiologic parameters can highlight abnormalities in multivariable temporal data. Examples include 3-D stacked spider plots and color coded time series plots. (3) Visual summaries of EEG with spectral tools (i.e. hemispheric asymmetry and median power) can highlight seizures via patient-specific "fingerprints." (4) Intuitive displays should emphasize subsets of physiology and processed EEG data to provide a rapid gestalt of the current status and medical stability of a patient. A well-designed neuroPICU display must present multiple datasets in dynamic, flexible, and informative views to accommodate clinicians from multiple disciplines in a variety of clinical scenarios.