multivariate models based: Topics by Science.gov

Sample records for multivariate models based

Stochastic modelling of temperatures affecting the in situ performance of a solar-assisted heat pump: The multivariate approach and physical interpretation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loveday, D.L.; Craggs, C.

Box-Jenkins-based multivariate stochastic modeling is carried out using data recorded from a domestic heating system. The system comprises an air-source heat pump sited in the roof space of a house, solar assistance being provided by the conventional tile roof acting as a radiation absorber. Multivariate models are presented which illustrate the time-dependent relationships between three air temperatures - at external ambient, at entry to, and at exit from, the heat pump evaporator. Using a deterministic modeling approach, physical interpretations are placed on the results of the multivariate technique. It is concluded that the multivariate Box-Jenkins approach is a suitable technique for building thermal analysis. Application to multivariate model-based control is discussed, with particular reference to building energy management systems. It is further concluded that stochastic modeling of data drawn from a short monitoring period offers a means of retrofitting an advanced model-based control system in existing buildings, which could be used to optimize energy savings. An approach to system simulation is suggested.
An error bound for a discrete reduced order model of a linear multivariable system

NASA Technical Reports Server (NTRS)

Al-Saggaf, Ubaid M.; Franklin, Gene F.

1987-01-01

The design of feasible controllers for high dimension multivariable systems can be greatly aided by a method of model reduction. In order for the design based on the order reduction to include a guarantee of stability, it is sufficient to have a bound on the model error. Previous work has provided such a bound for continuous-time systems for algorithms based on balancing. In this note an L-infinity bound is derived for model error for a method of order reduction of discrete linear multivariable systems based on balancing.
Characterizing multivariate decoding models based on correlated EEG spectral features

PubMed Central

McFarland, Dennis J.

2013-01-01

Objective: Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods: Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results: The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features, interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions: Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance: While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated. PMID:23466267
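The interpretive problem described in this abstract can be illustrated with a small synthetic sketch. It does not use the study's EEG data or its AR spectral features. It assumes two hypothetical predictors, where x2 is nearly a copy of x1 and the target depends on x1 only, and compares how much the fitted weight on x1 varies across repeated data sets in a two-predictor regression versus a univariate regression. All function names, noise levels and sample sizes are illustrative assumptions.

```python
import random
import statistics


def univariate_slope(x, y):
    """Slope of the simple regression y ~ x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def bivariate_weights(x1, x2, y):
    """Least-squares weights for y ~ w1*x1 + w2*x2 (with intercept),
    found by solving the 2x2 normal equations on centered data."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c = [v - my for v in y]
    s11 = sum(u * u for u in a)
    s22 = sum(u * u for u in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * v for u, v in zip(a, c))
    s2y = sum(u * v for u, v in zip(b, c))
    det = s11 * s22 - s12 * s12  # close to zero when x1 and x2 are collinear
    w1 = (s22 * s1y - s12 * s2y) / det
    w2 = (s11 * s2y - s12 * s1y) / det
    return w1, w2


def weight_spread(n_trials=200, n=50, seed=0):
    """Standard deviation of the x1 weight across repeated synthetic data sets,
    returned as (two-predictor model, univariate model)."""
    rng = random.Random(seed)
    multi, uni = [], []
    for _ in range(n_trials):
        x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x2 = [v + rng.gauss(0.0, 0.05) for v in x1]  # assumed: nearly a copy of x1
        y = [v + rng.gauss(0.0, 0.5) for v in x1]    # assumed: target depends on x1 only
        multi.append(bivariate_weights(x1, x2, y)[0])
        uni.append(univariate_slope(x1, y))
    return statistics.stdev(multi), statistics.stdev(uni)


if __name__ == "__main__":
    multi_sd, uni_sd = weight_spread()
    print(f"spread of x1 weight, two-predictor model: {multi_sd:.3f}")
    print(f"spread of x1 weight, univariate model:    {uni_sd:.3f}")
```

In this toy setting the two-predictor weights fluctuate far more from sample to sample than the univariate slope, which is one way to see why weight patterns are hard to read when predictors are strongly correlated.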
Comparison of Multidimensional Item Response Models: Multivariate Normal Ability Distributions versus Multivariate Polytomous Ability Distributions. Research Report. ETS RR-08-45

ERIC Educational Resources Information Center

Haberman, Shelby J.; von Davier, Matthias; Lee, Yi-Hsuan

2008-01-01

Multidimensional item response models can be based on multivariate normal ability distributions or on multivariate polytomous ability distributions. For the case of simple structure in which each item corresponds to a unique dimension of the ability vector, some applications of the two-parameter logistic model to empirical data are employed to…
Bayesian inference on risk differences: an application to multivariate meta-analysis of adverse events in clinical trials.

PubMed

Chen, Yong; Luo, Sheng; Chu, Haitao; Wei, Peng

2013-05-01

Multivariate meta-analysis is useful in combining evidence from independent studies which involve several comparisons among groups based on a single outcome. For binary outcomes, the commonly used statistical models for multivariate meta-analysis are multivariate generalized linear mixed effects models which assume risks, after some transformation, follow a multivariate normal distribution with possible correlations. In this article, we consider an alternative model for multivariate meta-analysis where the risks are modeled by the multivariate beta distribution proposed by Sarmanov (1966). This model has several attractive features compared to the conventional multivariate generalized linear mixed effects models, including a simple likelihood function, no need to specify a link function, and a closed-form expression for the distribution functions of study-specific risk differences. We investigate the finite sample performance of this model by simulation studies and illustrate its use with an application to multivariate meta-analysis of adverse events of tricyclic antidepressants treatment in clinical trials.
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models

ERIC Educational Resources Information Center

Price, Larry R.

2012-01-01

The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Characterizing multivariate decoding models based on correlated EEG spectral features.

PubMed

McFarland, Dennis J

2013-07-01

Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
A multivariate model and statistical method for validating tree grade lumber yield equations

Treesearch

Donald W. Seegrist

1975-01-01

Lumber yields within lumber grades can be described by a multivariate linear model. A method for validating lumber yield prediction equations when there are several tree grades is presented. The method is based on multivariate simultaneous test procedures.
A new multivariate zero-adjusted Poisson model with applications to biomedicine.

PubMed

Liu, Yin; Tian, Guo-Liang; Tang, Man-Lai; Yuen, Kam Chuen

2018-05-25

Although advances have recently been made in modeling multivariate count data, existing models still have several limitations: (i) The multivariate Poisson log-normal model (Aitchison and Ho, ) cannot be used to fit multivariate count data with excess zero-vectors; (ii) The multivariate zero-inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero-truncated/deflated count data and it is difficult to apply to high-dimensional cases; (iii) The Type I multivariate zero-adjusted Poisson (ZAP) distribution (Tian et al., 2017) could only model multivariate count data with a special correlation structure for random components that are all positive or negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components; that is, some of the correlation coefficients can be positive while others can be negative. We then develop its important distributional properties, and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
On the Numerical Formulation of Parametric Linear Fractional Transformation (LFT) Uncertainty Models for Multivariate Matrix Polynomial Problems

NASA Technical Reports Server (NTRS)

Belcastro, Christine M.

1998-01-01

Robust control system analysis and design is based on an uncertainty description, called a linear fractional transformation (LFT), which separates the uncertain (or varying) part of the system from the nominal system. These models are also useful in the design of gain-scheduled control systems based on Linear Parameter Varying (LPV) methods. Low-order LFT models are difficult to form for problems involving nonlinear parameter variations. This paper presents a numerical computational method for constructing an LFT model for a given LPV model. The method is developed for multivariate polynomial problems, and uses simple matrix computations to obtain an exact low-order LFT representation of the given LPV system without the use of model reduction. Although the method is developed for multivariate polynomial problems, multivariate rational problems can also be solved using this method by reformulating the rational problem into a polynomial form.
Web-Based Tools for Modelling and Analysis of Multivariate Data: California Ozone Pollution Activity

ERIC Educational Resources Information Center

Dinov, Ivo D.; Christou, Nicolas

2011-01-01

This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting…
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China.

PubMed

Pei, Ling-Ling; Li, Qin; Wang, Zheng-Xin

2018-03-08

The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China's pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N )) model based on the nonlinear least square (NLS) method. The Gauss-Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N ) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N ) and the NLS-based TNGM (1, N ) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO₂ and dust, alongside GDP per capita in China during the period 1996-2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N ) model presents greater precision when forecasting WDPC, SO₂ emissions and dust emissions per capita, compared to the traditional GM (1, N ) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO₂ and dust reduce accordingly.
Application of multivariate Gaussian detection theory to known non-Gaussian probability density functions

NASA Astrophysics Data System (ADS)

Schwartz, Craig R.; Thelen, Brian J.; Kenton, Arthur C.

1995-06-01

A statistical parametric multispectral sensor performance model was developed by ERIM to support mine field detection studies, multispectral sensor design/performance trade-off studies, and target detection algorithm development. The model assumes target detection algorithms and their performance models which are based on data assumed to obey multivariate Gaussian probability distribution functions (PDFs). The applicability of these algorithms and performance models can be generalized to data having non-Gaussian PDFs through the use of transforms which convert non-Gaussian data to Gaussian (or near-Gaussian) data. An example of one such transform is the Box-Cox power law transform. In practice, such a transform can be applied to non-Gaussian data prior to the introduction of a detection algorithm that is formally based on the assumption of multivariate Gaussian data. This paper presents an extension of these techniques to the case where the joint multivariate probability density function of the non-Gaussian input data is known, and where the joint estimate of the multivariate Gaussian statistics, under the Box-Cox transform, is desired. The jointly estimated multivariate Gaussian statistics can then be used to predict the performance of a target detection algorithm which has an associated Gaussian performance model.
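The Box-Cox power law transform mentioned in this abstract can be sketched in a few lines. The sketch covers only the transform itself, applied to assumed synthetic log-normal data. It does not implement the paper's joint estimation of multivariate Gaussian statistics, and the parameter value, sample size and helper names are illustrative assumptions.

```python
import math
import random
import statistics


def box_cox(x, lam):
    """Box-Cox power transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return math.log(x)  # limiting case as lam -> 0
    return (x ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Inverse of box_cox for the same lam."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


def skewness(values):
    """Sample skewness (mean of cubed standardized deviations)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def demo(n=2000, seed=1):
    """Skewness before and after transforming assumed log-normal data with lam = 0."""
    rng = random.Random(seed)
    raw = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
    transformed = [box_cox(v, 0.0) for v in raw]
    return skewness(raw), skewness(transformed)


if __name__ == "__main__":
    before, after = demo()
    print(f"skewness before: {before:.2f}, after: {after:.2f}")
```

For log-normal input, lam = 0 is the natural choice because the logarithm of such data is Gaussian. For other distributions lam would have to be estimated, which this sketch leaves out.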
A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research.

PubMed

Meeker, Daniella; Jiang, Xiaoqian; Matheny, Michael E; Farcas, Claudiu; D'Arcy, Michel; Pearlman, Laura; Nookala, Lavanya; Day, Michele E; Kim, Katherine K; Kim, Hyeoneui; Boxwala, Aziz; El-Kareh, Robert; Kuo, Grace M; Resnic, Frederic S; Kesselman, Carl; Ohno-Machado, Lucila

2015-11-01

Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner. The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies. Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network. The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws. Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China

PubMed Central

Pei, Ling-Ling; Li, Qin

2018-01-01

The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China’s pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N)) model based on the nonlinear least square (NLS) method. The Gauss–Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N) and the NLS-based TNGM (1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO2 and dust, alongside GDP per capita in China during the period 1996–2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N) model presents greater precision when forecasting WDPC, SO2 emissions and dust emissions per capita, compared to the traditional GM (1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO2 and dust reduce accordingly. PMID:29517985
Can multivariate models based on MOAKS predict OA knee pain? Data from the Osteoarthritis Initiative

NASA Astrophysics Data System (ADS)

Luna-Gómez, Carlos D.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Galván-Tejada, Carlos E.; Celaya-Padilla, José M.

2017-03-01

Osteoarthritis is the most common rheumatic disease in the world. Knee pain is the most disabling symptom of the disease, and predicting pain is one of the targets of preventive medicine, since predictions can be applied to new therapies or treatments. Using magnetic resonance imaging and the grading scales, a multivariate model based on genetic algorithms is presented. A predictive model can be useful for associating minor structural changes in the joint with future knee pain. Results suggest that multivariate models can be predictive of future chronic knee pain. All models (T0, T1 and T2) were statistically significant: all p values were < 0.05 and all AUC > 0.60.
Applying the multivariate time-rescaling theorem to neural population models

PubMed Central

Gerhard, Felipe; Haslinger, Robert; Pipa, Gordon

2011-01-01

Statistical models of neural activity are integral to modern neuroscience. Recently, interest has grown in modeling the spiking activity of populations of simultaneously recorded neurons to study the effects of correlations and functional connectivity on neural information processing. However any statistical model must be validated by an appropriate goodness-of-fit test. Kolmogorov-Smirnov tests based upon the time-rescaling theorem have proven to be useful for evaluating point-process-based statistical models of single-neuron spike trains. Here we discuss the extension of the time-rescaling theorem to the multivariate (neural population) case. We show that even in the presence of strong correlations between spike trains, models which neglect couplings between neurons can be erroneously passed by the univariate time-rescaling test. We present the multivariate version of the time-rescaling theorem, and provide a practical step-by-step procedure for applying it towards testing the sufficiency of neural population models. Using several simple analytically tractable models and also more complex simulated and real data sets, we demonstrate that important features of the population activity can only be detected using the multivariate extension of the test. PMID:21395436
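As background for this abstract, the univariate time-rescaling check that the authors extend can be sketched as follows. This is only the single-neuron version under an assumed constant-intensity (homogeneous Poisson) model. It is not the multivariate procedure the paper presents, and the rates, spike counts and function names are illustrative assumptions.

```python
import math
import random


def rescaled_uniforms(spike_times, rate):
    """Time-rescaling for a model with constant intensity `rate`.

    Each inter-spike interval is mapped to tau = integral of the intensity over
    that interval. If the model is correct the tau values are Exp(1) draws,
    so z = 1 - exp(-tau) should look Uniform(0, 1).
    """
    z = []
    previous = 0.0
    for t in spike_times:
        tau = rate * (t - previous)  # integral of a constant intensity
        z.append(1.0 - math.exp(-tau))
        previous = t
    return z


def ks_statistic_uniform(z):
    """Kolmogorov-Smirnov distance between the sample z and Uniform(0, 1)."""
    zs = sorted(z)
    n = len(zs)
    return max(max((i + 1) / n - v, v - i / n) for i, v in enumerate(zs))


def simulate_poisson_spikes(rate, n_spikes, seed=0):
    """Spike times of a homogeneous Poisson process (synthetic data)."""
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n_spikes):
        t += rng.expovariate(rate)
        times.append(t)
    return times


if __name__ == "__main__":
    spikes = simulate_poisson_spikes(rate=5.0, n_spikes=500)
    d_correct = ks_statistic_uniform(rescaled_uniforms(spikes, 5.0))
    d_wrong = ks_statistic_uniform(rescaled_uniforms(spikes, 10.0))
    bound = 1.36 / math.sqrt(len(spikes))  # approximate 95% KS bound
    print(f"KS correct model: {d_correct:.3f}, wrong model: {d_wrong:.3f}, bound: {bound:.3f}")
```

A model with the wrong rate produces rescaled values that are clearly non-uniform. The abstract's point is that this per-neuron check can still pass a model that ignores coupling between neurons, which is what motivates the multivariate extension.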
Multivariable Parametric Cost Model for Ground Optical Telescope Assembly

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia

2005-01-01

A parametric cost model for ground-based telescopes is developed using multivariable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction-limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature are examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived.
Clinical risk assessment of patients with chronic kidney disease by using clinical data and multivariate models.

PubMed

Chen, Zewei; Zhang, Xin; Zhang, Zhuoyong

2016-12-01

Timely risk assessment of chronic kidney disease (CKD) and proper community-based CKD monitoring are important to prevent patients with potential risk from further kidney injuries. As many symptoms are associated with the progressive development of CKD, evaluating the risk of CKD through a set of clinical symptom data coupled with multivariate models can be considered a practical method for prevention of CKD and would be useful for community-based CKD monitoring. Three commonly used multivariate models, i.e., K-nearest neighbor (KNN), support vector machine (SVM), and soft independent modeling of class analogy (SIMCA), were used to evaluate the risk of 386 patients based on a series of clinical data taken from the UCI machine learning repository. Different types of composite data, in which proportional disturbances were added to simulate measurement deviations caused by environment and instrument noise, were also utilized to evaluate the feasibility and robustness of these models in risk assessment of CKD. For the original data set, the three multivariate models can differentiate patients with CKD from those without, with overall accuracies over 93 %. KNN and SVM perform better than SIMCA in this study. For the composite data set, the SVM model has the best ability to tolerate noise disturbance and is thus more robust than the other two models. Using a clinical data set on symptoms coupled with multivariate models has proved to be a feasible approach for assessing patients with potential CKD risk. The SVM model can be used as a useful and robust tool in this study.
Multivariable Parametric Cost Model for Ground Optical: Telescope Assembly

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia

2004-01-01

A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature were examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e. multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter were derived.

Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data

PubMed Central

Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian

2015-01-01

In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of the RNA expression levels on DNA copy number alterations and the dependence of gene expression on DNA methylation through multivariate regression models and utilize boosting-type method to handle the high dimensionality as well as model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies. PMID:26609213
Multivariate Radiological-Based Models for the Prediction of Future Knee Pain: Data from the OAI

PubMed Central

Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Treviño, Victor; Tamez-Peña, José G.

2015-01-01

In this work, the potential of X-ray based multivariate prognostic models to predict the onset of chronic knee pain is presented. Using quantitative X-ray image assessments of joint-space-width (JSW) and paired semiquantitative central X-ray scores from the Osteoarthritis Initiative (OAI), a case-control study is presented. The pain assessments of the right knee at the baseline and the 60-month visits were used to screen for case/control subjects. Scores were analyzed at the time of pain incidence (T-0), the year prior to incidence (T-1), and two years before pain incidence (T-2). Multivariate models were created by a cross-validated elastic-net regularized generalized linear models feature selection tool. Univariate differences between cases and controls were reported by AUC, C-statistics, and odds ratios. Univariate analysis indicated that the medial osteophytes were significantly more prevalent in cases than controls: C-stat 0.62, 0.62, and 0.61, at T-0, T-1, and T-2, respectively. The multivariate JSW models significantly predicted pain: AUC = 0.695, 0.623, and 0.620, at T-0, T-1, and T-2, respectively. Semiquantitative multivariate models predicted pain with C-stat = 0.671, 0.648, and 0.645 at T-0, T-1, and T-2, respectively. Multivariate models derived from plain X-ray radiography assessments may be used to predict subjects that are at risk of developing knee pain. PMID:26504490
Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

PubMed

Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

2018-06-29

A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
A Dynamic Intrusion Detection System Based on Multivariate Hotelling's T2 Statistics Approach for Network Environments

PubMed Central

Avalappampatty Sivasamy, Aneetha; Sundan, Bose

2015-01-01

The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T2 method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T2 statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668
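The T-square distance at the core of this approach can be sketched for two features. The sketch assumes hypothetical traffic features, synthetic "normal" training data, and a chi-square cut-off used as a large-sample approximation. It does not reproduce the paper's preprocessing, its central-limit-theorem threshold range, or the KDD Cup'99 evaluation, and it is restricted to two dimensions so the covariance matrix can be inverted by hand.

```python
import math
import random
import statistics


def fit_profile(samples):
    """Mean vector and inverse sample covariance of 2-D 'normal traffic' samples."""
    xs = [s[0] for s in samples]
    ys = [s[1] for s in samples]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(samples)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    inv_cov = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv_cov


def t_square(point, mean, inv_cov):
    """Hotelling-style T-square distance (x - mean)^T S^{-1} (x - mean)."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
            + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy))


# Assumed cut-off: 99% quantile of a chi-square distribution with 2 degrees
# of freedom, which has the closed form -2 * ln(1 - 0.99). Not the paper's threshold.
THRESHOLD = -2.0 * math.log(0.01)


def classify(point, mean, inv_cov, threshold=THRESHOLD):
    """Label an observation as 'normal' or 'attack' by its T-square distance."""
    return "attack" if t_square(point, mean, inv_cov) > threshold else "normal"


def synthetic_traffic(n=200, seed=0):
    """Hypothetical normal traffic: (packets per second, mean bytes per packet)."""
    rng = random.Random(seed)
    return [(rng.gauss(100.0, 10.0), rng.gauss(500.0, 50.0)) for _ in range(n)]


if __name__ == "__main__":
    mean, inv_cov = fit_profile(synthetic_traffic())
    for obs in [(102.0, 510.0), (200.0, 500.0)]:
        print(obs, round(t_square(obs, mean, inv_cov), 2), classify(obs, mean, inv_cov))
```

Because the distance is scaled by the inverse covariance, a deviation counts for more along directions in which normal traffic varies little.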
A Dynamic Intrusion Detection System Based on Multivariate Hotelling's T2 Statistics Approach for Network Environments.

PubMed

Sivasamy, Aneetha Avalappampatty; Sundan, Bose

2015-01-01

The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T(2) method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T(2) statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better.
Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy.

PubMed

Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L

2017-05-07

In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
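Several records in this listing compare models by the area under the ROC curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, counting ties as one half. The scores are made up for illustration and are not taken from any of the studies.

```python
def auc(scores_positive, scores_negative):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is the Mann-Whitney form of the
    AUC, written as a plain double loop for clarity rather than speed.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in scores_positive:
        for q in scores_negative:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Hypothetical predicted probabilities for patients with and without the outcome.
    with_outcome = [0.8, 0.4]
    without_outcome = [0.5, 0.1]
    print(auc(with_outcome, without_outcome))  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which gives context for reported values such as 0.79 to 0.84.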
A time domain frequency-selective multivariate Granger causality approach.

PubMed

Leistritz, Lutz; Witte, Herbert

2016-08-01

The investigation of effective connectivity is one of the major topics in computational neuroscience to understand the interaction between spatially distributed neuronal units of the brain. Thus, a wide variety of methods has been developed during the last decades to investigate functional and effective connectivity in multivariate systems. Their spectrum ranges from model-based to model-free approaches with a clear separation into time and frequency range methods. We present in this simulation study a novel time domain approach based on Granger's principle of predictability, which allows frequency-selective considerations of directed interactions. It is based on a comparison of prediction errors of multivariate autoregressive models fitted to systematically modified time series. These modifications are based on signal decompositions, which enable a targeted cancellation of specific signal components with specific spectral properties. Depending on the embedded signal decomposition method, a frequency-selective or data-driven signal-adaptive Granger Causality Index may be derived.
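The comparison of prediction errors described above can be illustrated with the classical time domain Granger causality index, ln(RSS_restricted / RSS_full). This is a minimal sketch under stated assumptions: first-order autoregressive models without intercept, zero-mean simulated signals, and hypothetical function names. It does not implement the signal decomposition step that makes the authors' index frequency-selective.

```python
import math
import random


def rss_restricted(y_now, y_lag):
    """Residual sum of squares of the least-squares fit y_now ~ a * y_lag."""
    a = sum(p * q for p, q in zip(y_now, y_lag)) / sum(q * q for q in y_lag)
    return sum((p - a * q) ** 2 for p, q in zip(y_now, y_lag))


def rss_full(y_now, y_lag, x_lag):
    """Residual sum of squares of y_now ~ a * y_lag + b * x_lag,
    solved through the 2x2 normal equations."""
    syy = sum(q * q for q in y_lag)
    sxx = sum(r * r for r in x_lag)
    sxy = sum(q * r for q, r in zip(y_lag, x_lag))
    sy = sum(p * q for p, q in zip(y_now, y_lag))
    sx = sum(p * r for p, r in zip(y_now, x_lag))
    det = syy * sxx - sxy * sxy
    a = (sy * sxx - sx * sxy) / det
    b = (sx * syy - sy * sxy) / det
    return sum((p - a * q - b * r) ** 2 for p, q, r in zip(y_now, y_lag, x_lag))


def granger_index(source, target):
    """ln(RSS without the source's past / RSS with the source's past)."""
    y_now, y_lag, x_lag = target[1:], target[:-1], source[:-1]
    return math.log(rss_restricted(y_now, y_lag) / rss_full(y_now, y_lag, x_lag))


# Simulated example: x drives y with a one-step delay, but not vice versa.
random.seed(0)
x, y = [0.0], [0.0]
for _ in range(2000):
    x_next = 0.5 * x[-1] + random.gauss(0.0, 1.0)
    y_next = 0.5 * y[-1] + 0.8 * x[-1] + random.gauss(0.0, 1.0)
    x.append(x_next)
    y.append(y_next)

print("x -> y:", granger_index(x, y))
print("y -> x:", granger_index(y, x))
```

In this simulation the index for x -> y should be clearly larger than the index for y -> x, which stays close to zero.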
Chemiluminescence-based multivariate sensing of local equivalence ratios in premixed atmospheric methane-air flames

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.

Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least squares regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%, whereas the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions (equivalence ratio > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
Decomposing biodiversity data using the Latent Dirichlet Allocation model, a probabilistic multivariate statistical method

Treesearch

Denis Valle; Benjamin Baiser; Christopher W. Woodall; Robin Chazdon; Jerome Chave

2014-01-01

We propose a novel multivariate method to analyse biodiversity data based on the Latent Dirichlet Allocation (LDA) model. LDA, a probabilistic model, reduces assemblages to sets of distinct component communities. It produces easily interpretable results, can represent abrupt and gradual changes in composition, accommodates missing data and allows for coherent estimates...
Multiple imputation for handling missing outcome data when estimating the relative risk.

PubMed

Sullivan, Thomas R; Lee, Katherine J; Ryan, Philip; Salter, Amy B

2017-09-06

Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. However, fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
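Whichever imputation method is used, the relative risk estimates from the imputed datasets have to be combined into one result. The abstract does not describe this step, so the following is only a sketch of the standard approach (Rubin's rules) applied on the log relative risk scale; the function name and all numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules.

    Returns the pooled estimate and its total variance
    T = W + (1 + 1/m) * B, where W is the mean within-imputation
    variance and B the between-imputation variance."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical log relative risks and their variances from m = 3 imputed datasets.
log_rr = [math.log(1.50), math.log(1.60), math.log(1.40)]
var_log_rr = [0.04, 0.05, 0.03]

pooled_log_rr, total_var = pool_rubin(log_rr, var_log_rr)
pooled_rr = math.exp(pooled_log_rr)
print("pooled RR:", pooled_rr, "SE of log RR:", math.sqrt(total_var))
```

Pooling on the log scale and exponentiating afterwards keeps the pooled relative risk positive; pooling cannot remove bias introduced by a misspecified imputation model.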
Preliminary Multivariable Cost Model for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip

2010-01-01

Parametric cost models are routinely used to plan missions, compare concepts and justify technology investments. Previously, the authors published two single variable cost models based on 19 flight missions. The current paper presents the development of a multi-variable space telescope cost model. The validity of previously published models is tested. Cost estimating relationships which are and are not significant cost drivers are identified, and interrelationships between variables are explored.
A Multivariate Descriptive Model of Motivation for Orthodontic Treatment.

ERIC Educational Resources Information Center

Hackett, Paul M. W.; And Others

1993-01-01

Motivation for receiving orthodontic treatment was studied among 109 young adults, and a multivariate model of the process is proposed. The combination of smallest space analysis and Partial Order Scalogram Analysis by base Coordinates (POSAC) illustrates an interesting methodology for health treatment studies and explores motivation for dental…
A Cyber-Attack Detection Model Based on Multivariate Analyses

NASA Astrophysics Data System (ADS)

Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi

In the present paper, we propose a novel cyber-attack detection model based on two multivariate-analysis methods applied to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and the cluster analysis method. We quantify the observed qualitative audit event sequences via quantification method IV, and collect similar audit event sequences into the same groups based on the cluster analysis. It is shown in simulation experiments that our model can improve the cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.
Multivariate Non-Symmetric Stochastic Models for Spatial Dependence Models

NASA Astrophysics Data System (ADS)

Haslauer, C. P.; Bárdossy, A.

2017-12-01

A copula based multivariate framework allows more flexibility in describing different kinds of dependence than is possible with models confined to the assumption of symmetric Gaussian dependence. Different quantiles can be modelled with a different degree of dependence, and it will be demonstrated how this can be expected given process understanding. Maximum likelihood based multivariate quantitative parameter estimation yields stable and reliable results. Not only are improved results obtained in cross-validation based measures of uncertainty, but also a more realistic spatial structure of uncertainty compared to second order models of dependence. As much information as is available is included in the parameter estimation: incorporation of censored measurements (e.g., below detection limit, or above the sensitive range of the measurement device) yields more realistic spatial models; the proportion of true zeros can be jointly estimated with and distinguished from censored measurements, which allows estimates about the age of a contaminant in the system; and secondary information (categorical and on the rational scale) has been used to improve the estimation of the primary variable. These copula based multivariate statistical techniques are demonstrated based on hydraulic conductivity observations at the Borden (Canada) site, the MADE site (USA), and a large regional groundwater quality data-set in south-west Germany. Fields of spatially distributed K were simulated with identical marginal simulation, identical second order spatial moments, yet substantially differing solute transport characteristics when numerical tracer tests were performed. A statistical methodology is shown that allows the delineation of a boundary layer separating homogeneous parts of a spatial data-set. The effects of this boundary layer (macro structure) and the spatial dependence of K (micro structure) on solute transport behaviour are shown.
MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.

PubMed

Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin

2015-04-01

Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype model. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Cross-country transferability of multi-variable damage models

NASA Astrophysics Data System (ADS)

Wagenaar, Dennis; Lüdtke, Stefan; Kreibich, Heidi; Bouwer, Laurens

2017-04-01

Flood damage assessment is often done with simple damage curves based only on flood water depth. Additionally, damage models are often transferred in space and time, e.g. from region to region or from one flood event to another. Validation has shown that depth-damage curve estimates are associated with high uncertainties, particularly when applied in regions outside the area where the data for curve development was collected. Recently, progress has been made with multi-variable damage models created with data-mining techniques, i.e. Bayesian Networks and random forest. However, it is still unknown to what extent and under which conditions model transfers are possible and reliable. Model validations in different countries will provide valuable insights into the transferability of multi-variable damage models. In this study we compare multi-variable models developed on the basis of flood damage datasets from Germany as well as from The Netherlands. Data from several German floods was collected using computer aided telephone interviews. Data from the 1993 Meuse flood in the Netherlands is available, based on compensations paid by the government. The Bayesian network and random forest based models are applied and validated in both countries on the basis of the individual datasets. A major challenge was the harmonization of the variables between both datasets due to factors like differences in variable definitions, and regional and temporal differences in flood hazard and exposure characteristics. Results of model validations and comparisons in both countries are discussed, particularly with respect to encountered challenges and possible solutions for an improvement of model transferability.
Multivariate regression model for partitioning tree volume of white oak into round-product classes

Treesearch

Daniel A. Yaussy; David L. Sonderman

1984-01-01

Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...
Optimal moment determination in POME-copula based hydrometeorological dependence modelling

NASA Astrophysics Data System (ADS)

Liu, Dengfeng; Wang, Dong; Singh, Vijay P.; Wang, Yuankun; Wu, Jichun; Wang, Lachun; Zou, Xinqing; Chen, Yuanfang; Chen, Xi

2017-07-01

Copula has been commonly applied in multivariate modelling in various fields where marginal distribution inference is a key element. To develop a flexible, unbiased mathematical inference framework in hydrometeorological multivariate applications, the principle of maximum entropy (POME) is being increasingly coupled with copula. However, in previous POME-based studies, determination of optimal moment constraints has generally not been considered. The main contribution of this study is the determination of optimal moments for POME for developing a coupled optimal moment-POME-copula framework to model hydrometeorological multivariate events. In this framework, margins (marginals, or marginal distributions) are derived with the use of POME, subject to optimal moment constraints. Then, various candidate copulas are constructed according to the derived margins, and finally the most probable one is determined, based on goodness-of-fit statistics. This optimal moment-POME-copula framework is applied to model the dependence patterns of three types of hydrometeorological events: (i) single-site streamflow-water level; (ii) multi-site streamflow; and (iii) multi-site precipitation, with data collected from Yichang and Hankou in the Yangtze River basin, China. Results indicate that the optimal-moment POME is more accurate in margin fitting and the corresponding copulas reflect a good statistical performance in correlation simulation. Also, the derived copulas, capturing more patterns which traditional correlation coefficients cannot reflect, provide an efficient way in other applied scenarios concerning hydrometeorological multivariate modelling.
A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression

PubMed Central

Jackson, Dan; White, Ian R; Riley, Richard D

2013-01-01

Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
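For orientation, the univariate DerSimonian and Laird method of moments, to which the proposed multivariate estimator reduces in a single dimension, can be sketched as follows. The function name and the example numbers are hypothetical; the matrix-based multivariate estimator itself is not reproduced here.

```python
def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird random effects meta-analysis.

    Returns (tau2, pooled): the method-of-moments estimate of the
    between-study variance, truncated at zero, and the resulting
    random effects pooled estimate."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]          # fixed effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q statistic around the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]  # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return tau2, pooled


# Hypothetical study effects with equal within-study variances.
tau2, pooled = dersimonian_laird([0.0, 2.0, 4.0], [1.0, 1.0, 1.0])
print("tau^2:", tau2, "pooled effect:", pooled)
```

With these inputs Q = 8 on 2 degrees of freedom and the scaling constant is 2, so the estimated between-study variance is (8 - 2) / 2 = 3.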

Multivariate missing data in hydrology - Review and applications

NASA Astrophysics Data System (ADS)

Ben Aissia, Mohamed-Aymen; Chebana, Fateh; Ouarda, Taha B. M. J.

2017-12-01

Water resources planning and management require complete data sets of a number of hydrological variables, such as flood peaks and volumes. However, hydrologists are often faced with the problem of missing data (MD) in hydrological databases. Several methods are used to deal with the imputation of MD. During the last decade, multivariate approaches have gained popularity in the field of hydrology, especially in hydrological frequency analysis (HFA). However, the treatment of MD remains neglected in the multivariate HFA literature, where the focus has been mainly on the modeling component. For a complete analysis and in order to optimize the use of data, MD should also be treated in the multivariate setting prior to modeling and inference. Imputation of MD in the multivariate hydrological framework can have direct implications on the quality of the estimation. Indeed, the dependence between the series represents important additional information that can be included in the imputation process. The objective of the present paper is to highlight the importance of treating MD in multivariate hydrological frequency analysis by reviewing and applying multivariate imputation methods and by comparing univariate and multivariate imputation methods. An application is carried out for multiple flood attributes on three sites in order to evaluate the performance of the different methods based on the leave-one-out procedure. The results indicate that the performance of imputation methods can be improved by adopting the multivariate setting, compared to mean substitution and interpolation methods, especially when using the copula-based approach.
Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

NASA Astrophysics Data System (ADS)

Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

2009-08-01

In this study, two techniques for modeling electricity consumption of the Jordanian industrial sector are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as a function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have a significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, a comparison based on the root mean squared error of the data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Centralized PI control for high dimensional multivariable systems based on equivalent transfer function.

PubMed

Luan, Xiaoli; Chen, Qiang; Liu, Fei

2014-09-01

This article presents a new scheme to design a full matrix controller for high dimensional multivariable processes based on the equivalent transfer function (ETF). Unlike existing ETF methods, the proposed ETF is derived directly by exploiting the relationship between the equivalent closed-loop transfer function and the inverse of the open-loop transfer function. Based on the obtained ETF, the full matrix controller is designed utilizing the existing PI tuning rules. The newly proposed ETF model can more accurately represent the original processes. Furthermore, the full matrix centralized controller design method proposed in this paper is applicable to high dimensional multivariable systems with satisfactory performance. Comparison with other multivariable controllers shows that the designed ETF based controller is superior with respect to design complexity and obtained performance. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Validation of cross-sectional time series and multivariate adaptive regression splines models for the prediction of energy expenditure in children and adolescents using doubly labeled water

USDA-ARS's Scientific Manuscript database

Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...
Pain, pain intensity and pain disability in high school students are differently associated with physical activity, screening hours and sleep.

PubMed

Silva, Anabela G; Sa-Couto, Pedro; Queirós, Alexandra; Neto, Maritza; Rocha, Nelson P

2017-05-16

Studies exploring the association between physical activity, screen time and sleep and pain usually focus on a limited number of painful body sites. Nevertheless, pain at different body sites is likely to be of different nature. Therefore, this study aims to explore and compare the association between time spent in self-reported physical activity, in screen based activities and sleeping and i) pain presence in the last 7 days for 9 different body sites; ii) pain intensity at 9 different body sites and iii) global disability. Nine hundred sixty-nine students completed a questionnaire on pain, time spent in moderate and vigorous physical activity, screen based time watching TV/DVD, playing, using mobile phones and computers and sleeping hours. Univariate and multivariate associations between pain presence, pain intensity and disability and physical activity, screen based time and sleeping hours were investigated. Pain presence: sleeping remained in the multivariable model for the neck, mid back, wrists, knees and ankles/feet (OR 1.17 to 2.11); moderate physical activity remained in the multivariate model for the neck, shoulders, wrists, hips and ankles/feet (OR 1.06 to 1.08); vigorous physical activity remained in the multivariate model for mid back, knees and ankles/feet (OR 1.05 to 1.09) and screen time remained in the multivariate model for the low back (OR = 2.34). Pain intensity: screen time and moderate physical activity remained in the multivariable model for pain intensity at the neck, mid back, low back, shoulder, knees and ankles/feet (Rp² 0.02 to 0.04) and at the wrists (Rp² = 0.04), respectively. Disability showed no association with sleeping, screen time or physical activity. This study suggests both similarities and differences in the patterns of association between time spent in physical activity, sleeping and in screen based activities and pain presence at 8 different body sites. In addition, the results suggest that the factors associated with the presence of pain, pain intensity and pain associated disability are different.
Multivariate optical computing using a digital micromirror device for fluorescence and Raman spectroscopy.

PubMed

Smith, Zachary J; Strombom, Sven; Wachsmann-Hogiu, Sebastian

2011-08-29

A multivariate optical computer has been constructed consisting of a spectrograph, digital micromirror device, and photomultiplier tube that is capable of determining absolute concentrations of individual components of a multivariate spectral model. We present experimental results on ternary mixtures, showing accurate quantification of chemical concentrations based on integrated intensities of fluorescence and Raman spectra measured with a single point detector. We additionally show in simulation that point measurements based on principal component spectra retain the ability to classify cancerous from noncancerous T cells.
From point process observations to collective neural dynamics: Nonlinear Hawkes process GLMs, low-dimensional dynamics and coarse graining

PubMed Central

Truccolo, Wilson

2017-01-01

This review presents a perspective on capturing collective dynamics in recorded neuronal ensembles based on multivariate point process models, inference of low-dimensional dynamics and coarse graining of spatiotemporal measurements. A general probabilistic framework for continuous time point processes is reviewed, with an emphasis on multivariate nonlinear Hawkes processes with exogenous inputs. A point process generalized linear model (PP-GLM) framework for the estimation of discrete time multivariate nonlinear Hawkes processes is described. The approach is illustrated with the modeling of collective dynamics in neocortical neuronal ensembles recorded in human and non-human primates, and prediction of single-neuron spiking. A complementary approach to capture collective dynamics based on low-dimensional dynamics (“order parameters”) inferred via latent state-space models with point process observations is presented. The approach is illustrated by inferring and decoding low-dimensional dynamics in primate motor cortex during naturalistic reach and grasp movements. Finally, we briefly review hypothesis tests based on conditional inference and spatiotemporal coarse graining for assessing collective dynamics in recorded neuronal ensembles. PMID:28336305
Multivariate exploration of non-intrusive load monitoring via spatiotemporal pattern network

DOE PAGES

Liu, Chao; Akintayo, Adedotun; Jiang, Zhanhong; ...

2017-12-18

Non-intrusive load monitoring (NILM) of electrical demand for the purpose of identifying load components has thus far mostly been studied using univariate data, e.g., using only whole building electricity consumption time series to identify a certain type of end-use such as lighting load. However, using additional variables in the form of multivariate time series data may provide more information in terms of extracting distinguishable features in the context of energy disaggregation. In this work, a novel probabilistic graphical modeling approach, namely the spatiotemporal pattern network (STPN), is proposed for energy disaggregation using multivariate time-series data. The STPN framework is shown to be capable of handling diverse types of multivariate time-series to improve the energy disaggregation performance. The technique outperforms the state-of-the-art factorial hidden Markov models (FHMM) and combinatorial optimization (CO) techniques in multiple real-life test cases. Furthermore, based on two homes' aggregate electric consumption data, a similarity metric is defined for the energy disaggregation of one home using a trained model based on the other home (i.e., out-of-sample case). The proposed similarity metric allows us to enhance scalability via learning supervised models for a few homes and deploying such models to many other similar but unmodeled homes with significantly high disaggregation accuracy.
Practical robustness measures in multivariable control system analysis. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Lehtomaki, N. A.

1981-01-01

The robustness of the stability of multivariable linear time invariant feedback control systems with respect to model uncertainty is considered using frequency domain criteria. Available robustness tests are unified under a common framework based on the nature and structure of model errors. These results are derived using a multivariable version of Nyquist's stability theorem in which the minimum singular value of the return difference transfer matrix is shown to be the multivariable generalization of the distance to the critical point on a single input, single output Nyquist diagram. Using the return difference transfer matrix, a very general robustness theorem is presented from which all of the robustness tests dealing with specific model errors may be derived. The robustness tests that explicitly utilize model error structure are able to guarantee feedback system stability in the face of model errors of larger magnitude than those robustness tests that do not. The robustness of linear quadratic Gaussian control systems is analyzed.
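The role of the minimum singular value of the return difference matrix I + L(jω) can be illustrated numerically. The sketch below is restricted to 2x2 systems so that it needs only the standard library; the loop transfer matrix `example_loop` and all function names are hypothetical and are not taken from the thesis.

```python
import math


def min_singular_value_2x2(m):
    """Smallest singular value of a 2x2 complex matrix [[a, b], [c, d]].

    Uses the eigenvalues of M^H M, whose trace is the sum of squared
    entry magnitudes and whose determinant is |det M|^2.
    """
    (a, b), (c, d) = m
    trace = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det_sq = abs(a * d - b * c) ** 2
    disc = max(trace ** 2 - 4.0 * det_sq, 0.0)
    return math.sqrt(max((trace - math.sqrt(disc)) / 2.0, 0.0))


def return_difference(loop_gain):
    """Form I + L for a 2x2 loop transfer matrix at one frequency."""
    return [[1 + loop_gain[0][0], loop_gain[0][1]],
            [loop_gain[1][0], 1 + loop_gain[1][1]]]


def smallest_margin(loop_tf, omegas):
    """Minimum over the frequency grid of sigma_min(I + L(jw))."""
    return min(min_singular_value_2x2(return_difference(loop_tf(w)))
               for w in omegas)


def example_loop(w):
    """Hypothetical loop transfer matrix L(jw) = diag(2, 1) / (jw + 1)."""
    s = complex(0.0, w)
    return [[2 / (s + 1), 0], [0, 1 / (s + 1)]]


grid = [0.01 * 1.2 ** k for k in range(60)]  # coarse logarithmic grid
print(smallest_margin(example_loop, grid))
```

A larger value of this minimum means the multivariable Nyquist locus stays farther from the critical point, in the same sense as the scalar distance on a single input, single output Nyquist diagram.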
Finding Groups Using Model-Based Cluster Analysis: Heterogeneous Emotional Self-Regulatory Processes and Heavy Alcohol Use Risk

ERIC Educational Resources Information Center

Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

2008-01-01

Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…
Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

NASA Astrophysics Data System (ADS)

Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

2018-03-01

This paper uses multivariate regression, as a mineral prospectivity mapping (MPM) method, to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran. The main target of this paper is to apply multivariate regression analysis to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R²) and adjusted determination coefficient (R²adj), the second regression model (which consists of multiple UIVs) fitted better than other models. The accuracy of the model was confirmed by the iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
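The two fit statistics used above to compare regression models can be computed as in the following sketch. The function names and the numbers in the usage line are hypothetical; only the standard definitions of R² and adjusted R² are assumed.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R^2, which penalizes additional predictors.

    Computed as 1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical values: R^2 of 0.8 from 11 samples and 2 predictors.
print(adjusted_r_squared(0.8, n_samples=11, n_predictors=2))
```

Because the adjustment grows with the number of predictors, a model with many image bands as predictors is only preferred when its raw R² gain outweighs the penalty.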
Multi-country health surveys: are the analyses misleading?

PubMed

Masood, Mohd; Reidpath, Daniel D

2014-05-01

The aim of this paper was to review the types of approaches currently utilized in the analysis of multi-country survey data, specifically focusing on design and modeling issues in analyses of significant multi-country surveys published in 2010. A systematic search strategy was used to identify the 10 multi-country surveys and the articles published from them in 2010. The surveys were selected to reflect diverse topics and foci, and to provide an insight into analytic approaches across research themes. The search identified 159 articles appropriate for full text review and data extraction. The analyses adopted in the multi-country surveys can be broadly classified as univariate/bivariate analyses and multivariate/multivariable analyses. Multivariate/multivariable analyses may be further divided into design- and model-based analyses. Of the 159 articles reviewed, 129 used model-based analyses and 30 used design-based analyses. Similar patterns could be seen in all the individual surveys. While there is general agreement among survey statisticians that complex surveys are most appropriately analyzed using design-based analyses, most researchers continued to use the more common model-based approaches. Recent developments in design-based multi-level analysis may be one approach to include all the survey design characteristics. This is a relatively new area, however, and both statistical and applied analytic research are still required. An important limitation of this study relates to the selection of the surveys used and the choice of year for the analysis, i.e., year 2010 only. There is, however, no strong reason to believe that analytic strategies have changed radically in the past few years, and 2010 provides a credible snapshot of current practice.
Linear regression analysis and its application to multivariate chromatographic calibration for the quantitative analysis of two-component mixtures.

PubMed

Dinç, Erdal; Ozdemir, Abdil

2005-01-01

A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on linear regression equations constructed from the relationship between concentration and peak area at a five-wavelength set. The algorithm of this calibration model, which has simple mathematical content, is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and for eliminating fluctuations arising from instrumental and experimental conditions. The multivariate chromatographic calibration involves reducing multivariate linear regression functions to a univariate data set. The model was validated by analyzing various synthetic binary mixtures and by using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The results were compared with those obtained by the classical HPLC method. It was observed that the proposed multivariate chromatographic calibration gives better results than classical HPLC.
Critical elements on fitting the Bayesian multivariate Poisson Lognormal model

NASA Astrophysics Data System (ADS)

Zamzuri, Zamira Hasanah binti

2015-10-01

Motivated by a problem of fitting multivariate models to traffic accident data, a detailed discussion of the Multivariate Poisson Lognormal (MPL) model is presented. This paper reveals three critical elements in fitting the MPL model: the setting of initial estimates, hyperparameters and tuning parameters. These issues have not been highlighted in the literature. Based on simulation studies, we show that when the Univariate Poisson Model (UPM) estimates are used as starting values, at least 20,000 iterations are needed to obtain reliable final estimates. We also illustrate the sensitivity of a specific hyperparameter, which, if not given extra attention, may affect the final estimates. The last issue concerns the tuning parameters, which depend on the acceptance rate. Finally, a heuristic algorithm to fit the MPL model is presented. This acts as a guide to ensure that the model works satisfactorily given any data set.
Lameness detection in dairy cattle: single predictor v. multivariate analysis of image-based posture processing and behaviour and performance sensing.

PubMed

Van Hertem, T; Bahr, C; Schlageter Tello, A; Viazzi, S; Steensels, M; Romanini, C E B; Lokhorst, C; Maltz, E; Halachmi, I; Berckmans, D

2016-09-01

The objective of this study was to evaluate if a multi-sensor system (milk, activity, body posture) was a better classifier for lameness than the single-sensor-based detection models. Between September 2013 and August 2014, 3629 cow observations were collected on a commercial dairy farm in Belgium. Human locomotion scoring was used as reference for the model development and evaluation. Cow behaviour and performance were measured with existing sensors that were already present at the farm. A prototype of a three-dimensional-based video recording system was used to automatically quantify the back posture of a cow. For the single predictor comparisons, a receiver operating characteristics curve was made. For the multivariate detection models, logistic regression and generalized linear mixed models (GLMM) were developed. The best lameness classification model was obtained by the multi-sensor analysis (area under the receiver operating characteristics curve (AUC)=0.757±0.029), containing a combination of milk and milking variables, activity and gait and posture variables from videos. Second, the multivariate video-based system (AUC=0.732±0.011) performed better than the multivariate milk sensors (AUC=0.604±0.026) and the multivariate behaviour sensors (AUC=0.633±0.018). The video-based system performed better than the combined behaviour and performance-based detection model (AUC=0.669±0.028), indicating that it is worthwhile to consider a video-based lameness detection system, regardless of the presence of other existing sensors in the farm. The results suggest that Θ2, the feature variable for the back curvature around the hip joints, with an AUC of 0.719 is the best single predictor variable for lameness detection based on locomotion scoring. In general, this study showed that the video-based back posture monitoring system outperforms the behaviour and performance sensing techniques for locomotion scoring-based lameness detection.
A GLMM with seven specific variables (walking speed, back posture measurement, daytime activity, milk yield, lactation stage, milk peak flow rate and milk peak conductivity) is the best combination of variables for lameness classification. The accuracy on four-level lameness classification was 60.3%. The accuracy improved to 79.8% for binary lameness classification. The binary GLMM obtained a sensitivity of 68.5% and a specificity of 87.6%, which both exceed the sensitivity (52.1%±4.7%) and specificity (83.2%±2.3%) of the multi-sensor logistic regression model. This shows that the repeated measures analysis in the GLMM, taking into account the individual history of the animal, outperforms the classification when thresholds based on herd level (a statistical population) are used.
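The evaluation metrics quoted in this record (AUC, sensitivity, specificity) can be computed as in the sketch below. The scores and labels are made-up illustrations, not data from the study, and the function names are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve.

    Computed as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary labels (1 = lame, 0 = not lame)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up classifier scores for lame and non-lame cows.
print(auc([0.9, 0.8, 0.4], [0.5, 0.3]))
print(sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

The pairwise form of the AUC is quadratic in the number of observations, which is acceptable for a few thousand records; a rank-based formula gives the same value more cheaply.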
Up-scaling of multi-variable flood loss models from objects to land use units at the meso-scale

NASA Astrophysics Data System (ADS)

Kreibich, Heidi; Schröter, Kai; Merz, Bruno

2016-05-01

Flood risk management increasingly relies on risk analyses, including loss modelling. Most of the flood loss models usually applied in standard practice have in common that complex damaging processes are described by simple approaches like stage-damage functions. Novel multi-variable models significantly improve loss estimation on the micro-scale and may also be advantageous for large-scale applications. However, more input parameters also introduce additional uncertainty, even more so in upscaling procedures for meso-scale applications, where the parameters need to be estimated on a regional area-wide basis. To gain more knowledge about challenges associated with the up-scaling of multi-variable flood loss models, the following approach is applied: single- and multi-variable micro-scale flood loss models are up-scaled and applied on the meso-scale, namely on the basis of ATKIS land-use units. Application and validation are undertaken in 19 municipalities that were affected during the 2002 flood by the River Mulde in Saxony, Germany, by comparison to official loss data provided by the Saxon Relief Bank (SAB). In the meso-scale case study based model validation, most multi-variable models show smaller errors than the uni-variable stage-damage functions. The results show the suitability of the up-scaling approach and, in accordance with micro-scale validation studies, that multi-variable models are an improvement in flood loss modelling also on the meso-scale. However, uncertainties remain high, stressing the importance of uncertainty quantification. Thus, the development of probabilistic loss models, like BT-FLEMO used in this study, which inherently provide uncertainty information, is the way forward.
Bayesian transformation cure frailty models with multivariate failure time data.

PubMed

Yin, Guosheng

2008-12-10

We propose a class of transformation cure frailty models to accommodate a survival fraction in multivariate failure time data. Established through a general power transformation, this family of cure frailty models includes the proportional hazards and the proportional odds modeling structures as two special cases. Within the Bayesian paradigm, we obtain the joint posterior distribution and the corresponding full conditional distributions of the model parameters for the implementation of Gibbs sampling. Model selection is based on the conditional predictive ordinate statistic and deviance information criterion. As an illustration, we apply the proposed method to a real data set from dentistry.
Multivariate Formation Pressure Prediction with Seismic-derived Petrophysical Properties from Prestack AVO inversion and Poststack Seismic Motion Inversion

NASA Astrophysics Data System (ADS)

Yu, H.; Gu, H.

2017-12-01

A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. 
The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then to calculate formation pressure with OBP. Application of the proposed methodology to a research area in the East China Sea has shown that the method can bridge the gap between seismic and well log pressure prediction and give predicted pressure values close to pressure measurements from well testing.

Design, evaluation and test of an electronic, multivariable control for the F100 turbofan engine

NASA Technical Reports Server (NTRS)

Skira, C. A.; Dehoff, R. L.; Hall, W. E., Jr.

1980-01-01

A digital, multivariable control design procedure for the F100 turbofan engine is described. The controller is based on locally linear synthesis techniques using linear, quadratic regulator design methods. The control structure uses an explicit model reference form with proportional and integral feedback near a nominal trajectory. Modeling issues, design procedures for the control law and the estimation of poorly measured variables are presented.
TATES: Efficient Multivariate Genotype-Phenotype Analysis for Genome-Wide Association Studies

PubMed Central

van der Sluis, Sophie; Posthuma, Danielle; Dolan, Conor V.

2013-01-01

To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor. PMID:23359524
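The p-value combination step can be illustrated with the plain Simes procedure, on which the extended version used by TATES builds. This is a simplified sketch: it omits the correlation correction described in the abstract, and the function name and p-values are hypothetical.

```python
def simes_combined_p(p_values):
    """Combine p-values with the plain Simes procedure.

    Returns the minimum over j of m * p_(j) / j, where
    p_(1) <= ... <= p_(m) are the sorted p-values.
    Simplification: the correlation correction that TATES applies
    between trait components is not implemented here.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)
    m = len(ordered)
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(combined, 1.0)


# Hypothetical univariate GWAS p-values for four components of one trait.
print(simes_combined_p([0.04, 0.01, 0.30, 0.02]))
```

Unlike a composite score, this combination lets a single strongly associated component drive the trait-based p-value, which is consistent with the abstract's point about variants specific to one phenotype.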
Linear models of coregionalization for multivariate lattice data: Order-dependent and order-free cMCARs.

PubMed

MacNab, Ying C

2016-08-01

This paper is concerned with multivariate conditional autoregressive models defined by linear combination of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressives is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.
Analysis of Forest Foliage Using a Multivariate Mixture Model

NASA Technical Reports Server (NTRS)

Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.

1997-01-01

Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption spectra at wavelengths which have been associated with nitrogen bonds.
[Monitoring method of extraction process for Schisandrae Chinensis Fructus based on near infrared spectroscopy and multivariate statistical process control].

PubMed

Xu, Min; Zhang, Lei; Yue, Hong-Shui; Pang, Hong-Wei; Ye, Zheng-Liang; Ding, Li

2017-10-01

To establish an on-line monitoring method for the extraction process of Schisandrae Chinensis Fructus, the formula medicinal material of Yiqi Fumai lyophilized injection, near infrared spectroscopy was combined with multi-variable data analysis technology. The multivariate statistical process control (MSPC) model was established based on 5 normal batches in production, and 2 test batches were monitored by PC scores, DModX and Hotelling T² control charts. The results showed that the MSPC model had a good monitoring ability for the extraction process. The application of the MSPC model to the actual production process could effectively achieve on-line monitoring of the extraction process of Schisandrae Chinensis Fructus, and can reflect the change of material properties in the production process in real time. This established process monitoring method could provide a reference for the application of process analysis technology in the process quality control of traditional Chinese medicine injections. Copyright© by the Chinese Pharmaceutical Association.
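The Hotelling T² statistic plotted on such a control chart can be sketched as follows. The sketch assumes that principal component scores are already available and mean-centered on the normal batches (the PCA step itself is not shown); the function name and the numbers are hypothetical.

```python
from statistics import variance


def hotelling_t2(new_scores, reference_scores):
    """Hotelling T^2 of one observation in principal component space.

    T^2 = sum over components a of t_a^2 / s_a^2, where s_a^2 is the
    sample variance of component a in the reference (normal) batches.
    Assumes the scores are mean-centered and mutually uncorrelated,
    as PCA scores of the training data are.
    """
    t2 = 0.0
    for a, score in enumerate(new_scores):
        column = [row[a] for row in reference_scores]
        t2 += score ** 2 / variance(column)
    return t2


# Hypothetical two-component scores from normal operation.
reference = [[-2.0, 1.0], [0.0, -1.0], [2.0, 1.0], [0.0, -1.0]]
print(hotelling_t2([2.0, 1.0], reference))
```

In monitoring, each new spectrum's T² would be compared against a control limit derived from the normal batches; how that limit is set is not specified in the abstract and is left out here.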
Visual Environment for Rich Data Interpretation (VERDI) program for environmental modeling systems

EPA Pesticide Factsheets

VERDI is a flexible, modular, Java-based program used for visualizing multivariate gridded meteorology, emissions and air quality modeling data created by environmental modeling systems such as the CMAQ model and WRF.
Identification of multivariable nonlinear systems in the presence of colored noises using iterative hierarchical least squares algorithm.

PubMed

Jafari, Masoumeh; Salimifard, Maryam; Dehghani, Maryam

2014-07-01

This paper presents an efficient method for identification of nonlinear Multi-Input Multi-Output (MIMO) systems in the presence of colored noises. The method studies the multivariable nonlinear Hammerstein and Wiener models, in which the nonlinear memory-less block is approximated based on arbitrary vector-based basis functions. The linear time-invariant (LTI) block is modeled by an autoregressive moving average with exogenous (ARMAX) model which can effectively describe the moving average noises as well as the autoregressive and the exogenous dynamics. According to the multivariable nature of the system, a pseudo-linear-in-the-parameter model is obtained which includes two different kinds of unknown parameters, a vector and a matrix. Therefore, the standard least squares algorithm cannot be applied directly. To overcome this problem, a Hierarchical Least Squares Iterative (HLSI) algorithm is used to simultaneously estimate the vector and the matrix of unknown parameters as well as the noises. The efficiency of the proposed identification approaches is investigated through three nonlinear MIMO case studies. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Seizure-Onset Mapping Based on Time-Variant Multivariate Functional Connectivity Analysis of High-Dimensional Intracranial EEG: A Kalman Filter Approach.

PubMed

Lie, Octavian V; van Mierlo, Pieter

2017-01-01

The visual interpretation of intracranial EEG (iEEG) is the standard method used in complex epilepsy surgery cases to map the regions of seizure onset targeted for resection. Still, visual iEEG analysis is labor-intensive and biased due to interpreter dependency. Multivariate parametric functional connectivity measures using adaptive autoregressive (AR) modeling of the iEEG signals based on the Kalman filter algorithm have been used successfully to localize the electrographic seizure onsets. Due to their high computational cost, these methods have been applied to a limited number of iEEG time-series (<60). The aim of this study was to test two Kalman filter implementations, a well-known multivariate adaptive AR model (Arnold et al. 1998) and a simplified, computationally efficient derivation of it, for their potential application to connectivity analysis of high-dimensional (up to 192 channels) iEEG data. When used on simulated seizures together with a multivariate connectivity estimator, the partial directed coherence, the two AR models were compared for their ability to reconstitute the designed seizure signal connections from noisy data. Next, focal seizures from iEEG recordings (73-113 channels) in three patients rendered seizure-free after surgery were mapped with the outdegree, a graph-theory index of outward directed connectivity. Simulation results indicated high levels of mapping accuracy for the two models in the presence of low-to-moderate noise cross-correlation. Accordingly, both AR models correctly mapped the real seizure onset to the resection volume. This study supports the possibility of conducting fully data-driven multivariate connectivity estimations on high-dimensional iEEG datasets using the Kalman filter approach.
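The outdegree index mentioned above can be sketched for a directed connectivity matrix such as one obtained from partial directed coherence. The binary thresholding, the matrix convention (column = source, row = target) and the values below are assumptions of this illustration, not details confirmed by the abstract.

```python
def outdegree(connectivity, threshold):
    """Number of outgoing connections per channel.

    connectivity[i][j] is taken to be the strength of the directed
    influence from channel j to channel i (assumed convention).
    Self-connections on the diagonal are ignored.
    """
    n = len(connectivity)
    return [
        sum(1 for i in range(n) if i != j and connectivity[i][j] > threshold)
        for j in range(n)
    ]


# Hypothetical 3-channel connectivity matrix.
conn = [
    [0.0, 0.1, 0.0],
    [0.8, 0.0, 0.2],
    [0.7, 0.6, 0.0],
]
degrees = outdegree(conn, threshold=0.5)
print(degrees, "-> channel with highest outdegree:", degrees.index(max(degrees)))
```

Under this reading, channels that drive many others early in a seizure would stand out as candidate onset sites.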
Atrial Electrogram Fractionation Distribution before and after Pulmonary Vein Isolation in Human Persistent Atrial Fibrillation-A Retrospective Multivariate Statistical Analysis.

PubMed

Almeida, Tiago P; Chu, Gavin S; Li, Xin; Dastagir, Nawshin; Tuan, Jiun H; Stafford, Peter J; Schlindwein, Fernando S; Ng, G André

2017-01-01

Purpose: Complex fractionated atrial electrograms (CFAE)-guided ablation after pulmonary vein isolation (PVI) has been used for persistent atrial fibrillation (persAF) therapy. This strategy has shown suboptimal outcomes due to, among other factors, undetected changes in the atrial tissue following PVI. In the present work, we investigate CFAE distribution before and after PVI in patients with persAF using a multivariate statistical model. Methods: 207 pairs of atrial electrograms (AEGs) were collected before and after PVI, respectively, from corresponding LA regions in 18 persAF patients. Twelve attributes were measured from the AEGs, before and after PVI. Statistical models based on multivariate analysis of variance (MANOVA) and linear discriminant analysis (LDA) were used to characterize the atrial regions and AEGs. Results: PVI significantly reduced CFAEs in the LA (70 vs. 40%; P < 0.0001). Four types of LA regions were identified, based on the AEG characteristics: (i) fractionated before PVI that remained fractionated after PVI (31% of the collected points); (ii) fractionated that converted to normal (39%); (iii) normal prior to PVI that became fractionated (9%); and (iv) normal that remained normal (21%). Individually, the attributes failed to distinguish these LA regions, but multivariate statistical models were effective in their discrimination (P < 0.0001). Conclusion: Our results show that there are LA regions resistant to PVI, while others are affected by it. Although traditional methods were unable to identify these different regions, the proposed multivariate statistical model discriminated LA regions resistant to PVI from those affected by it without prior ablation information.
Load compensation in a lean burn natural gas vehicle

NASA Astrophysics Data System (ADS)

Gangopadhyay, Anupam

A new multivariable PI tuning technique, intended primarily for regulation purposes, is developed in this research. Design guidelines are developed based on closed-loop stability. The new multivariable design is applied in a natural gas vehicle to combine idle and A/F ratio control loops. This results in better recovery during low idle operation of a vehicle under external step torques. A powertrain model of a natural gas engine is developed and validated for steady-state and transient operation. The nonlinear model has three states: engine speed, intake manifold pressure and fuel fraction in the intake manifold. The model includes the effect of fuel partial pressure in the intake manifold filling and emptying dynamics. Due to the inclusion of fuel fraction as a state, fuel flow rate into the cylinders is also accurately modeled. A linear system identification is performed on the nonlinear model. The linear model structure is predicted analytically from the nonlinear model, and the coefficients of the predicted transfer function are shown to be functions of key physical parameters in the plant. Simulations of linear system and model parameter identification are shown to converge to the predicted values of the model coefficients. The multivariable controller developed in this research can be designed in an algebraic fashion once the plant model is known. It is thus possible to implement the multivariable PI design in an adaptive fashion, combining the controller with the identified plant model on-line. This will result in a self-tuning regulator (STR) type controller where the underlying design criterion is the multivariable tuning technique designed in this research.
Multivariable speed synchronisation for a parallel hybrid electric vehicle drivetrain

NASA Astrophysics Data System (ADS)

Alt, B.; Antritter, F.; Svaricek, F.; Schultalbers, M.

2013-03-01

In this article, a new drivetrain configuration of a parallel hybrid electric vehicle is considered and a novel model-based control design strategy is given. In particular, the control design covers the speed synchronisation task during a restart of the internal combustion engine. The proposed multivariable synchronisation strategy is based on feedforward and decoupled feedback controllers. The performance and the robustness properties of the closed-loop system are illustrated by nonlinear simulation results.
Structural equation models based on multivariate diversity assessment of diploid and tetraploid hulled wheat species

USDA-ARS's Scientific Manuscript database

Hulled wheats are largely untapped genetic resources with >10,000 years of genetic memory and diversity that can be used for wheat quality improvement, development of healthy products, and adaptation to climate change. Multivariate diversity was assessed in the diploid Triticum monococcum L. var mon...
Generating Nonnormal Multivariate Data Using Copulas: Applications to SEM

ERIC Educational Resources Information Center

Mair, Patrick; Satorra, Albert; Bentler, Peter M.

2012-01-01

This article develops a procedure based on copulas to simulate multivariate nonnormal data that satisfy a prespecified variance-covariance matrix. The covariance matrix used can comply with a specific moment structure form (e.g., a factor analysis or a general structural equation model). Thus, the method is particularly useful for Monte Carlo…
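The copula idea in the abstract above can be sketched in a few lines. This is a minimal illustration of a Gaussian copula only, not the authors' procedure: the function names, the choice of Exponential(1) marginals, and the value of `rho` are all assumptions made for the example, and the sketch does not reproduce the article's step of matching a prespecified covariance matrix.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_exponential(n, rho, seed=0):
    """Draw n pairs with Exponential(1) marginals coupled by a Gaussian copula.

    Hypothetical helper for illustration; rho is the correlation of the
    underlying normal variables, not of the returned nonnormal variables.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Step 1: two standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pair = []
        for z in (z1, z2):
            # Step 2: the normal CDF maps each value to a uniform on (0, 1).
            u = min(std_normal.cdf(z), 1.0 - 1e-12)
            # Step 3: the inverse Exponential(1) CDF gives a skewed marginal.
            pair.append(-math.log1p(-u))
        pairs.append(tuple(pair))
    return pairs


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

The correlation of the generated variables generally differs from `rho`, which is why a method that must hit a target covariance matrix needs an additional calibration step beyond this sketch.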
Root Cause Analysis of Quality Defects Using HPLC-MS Fingerprint Knowledgebase for Batch-to-batch Quality Control of Herbal Drugs.

PubMed

Yan, Binjun; Fang, Zhonghua; Shen, Lijuan; Qu, Haibin

2015-01-01

The batch-to-batch quality consistency of herbal drugs has always been an important issue. This work proposes a methodology for batch-to-batch quality control based on HPLC-MS fingerprints and a process knowledgebase. The extraction process of Compound E-jiao Oral Liquid was taken as a case study. After establishing the HPLC-MS fingerprint analysis method, the fingerprints of the extract solutions produced under normal and abnormal operation conditions were obtained. Multivariate statistical models were built for fault detection and a discriminant analysis model was built using the probabilistic discriminant partial-least-squares method for fault diagnosis. Based on multivariate statistical analysis, process knowledge was acquired and the cause-effect relationship between process deviations and quality defects was revealed. The quality defects were detected successfully by multivariate statistical control charts and the types of process deviations were diagnosed correctly by discriminant analysis. This work has demonstrated the benefits of combining HPLC-MS fingerprints, process knowledge and multivariate analysis for the quality control of herbal drugs. Copyright © 2015 John Wiley & Sons, Ltd.
Multivariate Time Series Decomposition into Oscillation Components.

PubMed

Matsuda, Takeru; Komaki, Fumiyasu

2017-08-01

Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop Gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north-south sunspot number data, the proposed method reveals an interesting phase relationship.
Analyzing Multiple Outcomes in Clinical Research Using Multivariate Multilevel Models

PubMed Central

Baldwin, Scott A.; Imel, Zac E.; Braithwaite, Scott R.; Atkins, David C.

2014-01-01

Objective Multilevel models have become a standard data analysis approach in intervention research. Although the vast majority of intervention studies involve multiple outcome measures, few studies use multivariate analysis methods. The authors discuss multivariate extensions to the multilevel model that can be used by psychotherapy researchers. Method and Results Using simulated longitudinal treatment data, the authors show how multivariate models extend common univariate growth models and how the multivariate model can be used to examine multivariate hypotheses involving fixed effects (e.g., does the size of the treatment effect differ across outcomes?) and random effects (e.g., is change in one outcome related to change in the other?). An online supplemental appendix provides annotated computer code and simulated example data for implementing a multivariate model. Conclusions Multivariate multilevel models are flexible, powerful models that can enhance clinical research. PMID:24491071
Multivariate statistical process control (MSPC) using Raman spectroscopy for in-line culture cell monitoring considering time-varying batches synchronized with correlation optimized warping (COW).

PubMed

Liu, Ya-Juan; André, Silvère; Saint Cristau, Lydia; Lagresle, Sylvain; Hannas, Zahia; Calvosa, Éric; Devos, Olivier; Duponchel, Ludovic

2017-02-01

Multivariate statistical process control (MSPC) is increasingly popular as a response to the challenge posed by large multivariate datasets from analytical instruments such as Raman spectroscopy for the monitoring of complex cell cultures in the biopharmaceutical industry. However, Raman spectroscopy for in-line monitoring often produces unsynchronized data sets, resulting in time-varying batches. Moreover, unsynchronized data sets are common for cell culture monitoring because spectroscopic measurements are generally recorded in an alternating way, with more than one optical probe connected in parallel to the same spectrometer. Synchronized batches are a prerequisite for the application of multivariate analysis such as multi-way principal component analysis (MPCA) for MSPC monitoring. Correlation optimized warping (COW) is a popular method for data alignment with satisfactory performance; however, it has never before been applied to synchronize the acquisition time of spectroscopic datasets in an MSPC application. In this paper we propose, for the first time, to use COW to synchronize batches with varying durations analyzed with Raman spectroscopy. In a second step, we developed MPCA models at different time intervals based on the normal operation condition (NOC) batches synchronized by COW. New batches are finally projected onto the corresponding MPCA model. We monitored the evolution of the batches using two multivariate control charts based on Hotelling's T² and Q. As illustrated by the results, the MSPC model was able to identify abnormal operation conditions, including contaminated batches, which is of prime importance in cell culture monitoring. We proved that Raman-based MSPC monitoring can be used to diagnose batches deviating from the normal condition, with higher efficacy than traditional diagnosis, which would save time and money in the biopharmaceutical industry. Copyright © 2016 Elsevier B.V. All rights reserved.
Prediction of mortality rates using a model with stochastic parameters

NASA Astrophysics Data System (ADS)

Tan, Chon Sern; Pooi, Ah Hin

2016-10-01

Prediction of future mortality rates is crucial to insurance companies because they face longevity risks while providing retirement benefits to a population whose life expectancy is increasing. In the past literature, a time series model based on the multivariate power-normal distribution has been applied to mortality data from the United States for the years 1933 to 2000 to forecast the future mortality rates for the years 2001 to 2010. In this paper, a more dynamic approach based on the multivariate time series is proposed, in which the model uses stochastic parameters that vary with time. The resulting prediction intervals obtained using the model with stochastic parameters perform better because, apart from covering the observed future mortality rates well, they also tend to have distinctly shorter interval lengths.
Simultaneous calibration of ensemble river flow predictions over an entire range of lead times

NASA Astrophysics Data System (ADS)

Hemri, S.; Fundel, F.; Zappa, M.

2013-10-01

Probabilistic estimates of future water levels and river discharge are usually simulated with hydrologic models using ensemble weather forecasts as main inputs. As hydrologic models are imperfect and the meteorological ensembles tend to be biased and underdispersed, the ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, in order to achieve both reliable and sharp predictions, statistical postprocessing is required. In this work Bayesian model averaging (BMA) is applied to statistically postprocess raw ensemble runoff forecasts for a catchment in Switzerland, at lead times ranging from 1 to 240 h. The raw forecasts have been obtained using deterministic and ensemble forcing meteorological models with different forecast lead time ranges. First, BMA is applied based on mixtures of univariate normal distributions, subject to the assumption of independence between distinct lead times. Then, the independence assumption is relaxed in order to estimate multivariate runoff forecasts over the entire range of lead times simultaneously, based on a BMA version that uses multivariate normal distributions. Since river runoff is a highly skewed variable, Box-Cox transformations are applied in order to achieve approximate normality. Both univariate and multivariate BMA approaches are able to generate well calibrated probabilistic forecasts that are considerably sharper than climatological forecasts. Additionally, multivariate BMA provides a promising approach for incorporating temporal dependencies into the postprocessed forecasts. Its major advantage over univariate BMA is an increase in reliability when the forecast system is changing due to model availability.
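The Box-Cox step mentioned in the abstract above has a simple closed form, sketched here for a single positive runoff value. The function names and the example value of `lam` are assumptions for illustration; the abstract does not state which transformation parameter was used.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires a strictly positive input")
    if lam == 0:
        # The limit of (x**lam - 1) / lam as lam approaches 0 is log(x).
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value back to the original (runoff) scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)
```

A forecast distribution fitted on the transformed scale would be mapped back with `inverse_box_cox` before being compared with observed runoff.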
Predicting trauma patient mortality: ICD [or ICD-10-AM] versus AIS based approaches.

PubMed

Willis, Cameron D; Gabbe, Belinda J; Jolley, Damien; Harrison, James E; Cameron, Peter A

2010-11-01

The International Classification of Diseases Injury Severity Score (ICISS) has been proposed as an International Classification of Diseases (ICD)-10-based alternative to mortality prediction tools that use Abbreviated Injury Scale (AIS) data, including the Trauma and Injury Severity Score (TRISS). To date, studies have not examined the performance of ICISS using Australian trauma registry data. This study aimed to compare the performance of ICISS with other mortality prediction tools in an Australian trauma registry. This was a retrospective review of prospectively collected data from the Victorian State Trauma Registry. A training dataset was created for model development and a validation dataset for evaluation. The multiplicative ICISS model was compared with a worst injury ICISS approach, Victorian TRISS (V-TRISS, using local coefficients), maximum AIS severity and a multivariable model including ICD-10-AM codes as predictors. Models were investigated for discrimination (C-statistic) and calibration (Hosmer-Lemeshow statistic). The multivariable approach had the highest level of discrimination (C-statistic 0.90) and calibration (H-L 7.65, P = 0.468). Worst injury ICISS, V-TRISS and maximum AIS had similar performance. The multiplicative ICISS produced the lowest level of discrimination (C-statistic 0.80) and poorest calibration (H-L 50.23, P < 0.001). The performance of ICISS may be affected by the data used to develop estimates, the ICD version employed, the methods for deriving estimates and the inclusion of covariates. In this analysis, a multivariable approach using ICD-10-AM codes was the best-performing method. A multivariable ICISS approach may therefore be a useful alternative to AIS-based methods and may have comparable predictive performance to locally derived TRISS models. © 2010 The Authors. ANZ Journal of Surgery © 2010 Royal Australasian College of Surgeons.
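The C-statistic used above to compare discrimination can be computed directly from predicted probabilities and observed outcomes. The sketch below uses the pairwise-concordance definition; the function name and the toy inputs in the usage note are hypothetical and are not registry data.

```python
def c_statistic(probabilities, outcomes):
    """Fraction of (death, survivor) pairs ranked correctly by the model.

    probabilities: predicted probability of death for each patient
    outcomes: 1 if the patient died, 0 if the patient survived
    Tied predictions count as half a concordant pair.
    """
    died = [p for p, y in zip(probabilities, outcomes) if y == 1]
    survived = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not died or not survived:
        raise ValueError("need at least one patient in each outcome group")
    concordant = 0.0
    for p_died in died:
        for p_survived in survived:
            if p_died > p_survived:
                concordant += 1.0
            elif p_died == p_survived:
                concordant += 0.5
    return concordant / (len(died) * len(survived))
```

For example, `c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])` evaluates four pairs, three of which are ranked correctly, giving 0.75. A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation.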

Multivariate Granger causality: an estimation framework based on factorization of the spectral density matrix

PubMed Central

Wen, Xiaotong; Rangarajan, Govindan; Ding, Mingzhou

2013-01-01

Granger causality is increasingly being applied to multi-electrode neurophysiological and functional imaging data to characterize directional interactions between neurons and brain regions. For a multivariate dataset, one might be interested in different subsets of the recorded neurons or brain regions. According to the current estimation framework, for each subset, one conducts a separate autoregressive model fitting process, introducing the potential for unwanted variability and uncertainty. In this paper, we propose a multivariate framework for estimating Granger causality. It is based on spectral density matrix factorization and offers the advantage that the estimation of such a matrix needs to be done only once for the entire multivariate dataset. For any subset of recorded data, Granger causality can be calculated through factorizing the appropriate submatrix of the overall spectral density matrix. PMID:23858479
Sex Hormones and Sleep in Men and Women From the General Population: A Cross-Sectional Observational Study.

PubMed

Kische, Hanna; Ewert, Ralf; Fietze, Ingo; Gross, Stefan; Wallaschofski, Henri; Völzke, Henry; Dörr, Marcus; Nauck, Matthias; Obst, Anne; Stubbe, Beate; Penzel, Thomas; Haring, Robin

2016-11-01

Associations between sex hormones and sleep habits originate mainly from small and selected patient-based samples. We examined data from a population-based sample with various sleep characteristics and the major part of sex hormones measured by mass spectrometry. We used data from 204 men and 213 women of the cross-sectional Study of Health in Pomerania-TREND. Associations of total T (TT) and free T, androstenedione (ASD), estrone, estradiol (E2), dehydroepiandrosterone-sulphate, SHBG, and E2 to TT ratio with sleep measures (including total sleep time, sleep efficiency, wake after sleep onset, apnea-hypopnea index [AHI], Insomnia Severity Index, Epworth Sleepiness Scale, and Pittsburgh Sleep Quality Index) were assessed by sex-specific multivariable regression models. In men, age-adjusted associations of TT (odds ratio 0.62; 95% confidence interval (CI) 0.46-0.83), free T, and SHBG with AHI were rendered nonsignificant after multivariable adjustment. In multivariable analyses, ASD was associated with Epworth Sleepiness Scale (β-coefficient per SD increase in ASD: -0.71; 95% CI: -1.18 to -0.25). In women, multivariable analyses showed positive associations of dehydroepiandrosterone-sulphate with wake after sleep onset (β-coefficient: 0.16; 95% CI: 0.03 to 0.28) and of E2 and E2 to TT ratio with Epworth Sleepiness Scale. Additionally, free T and SHBG were associated with AHI in multivariable models among premenopausal women. The present cross-sectional, population-based study observed sex-specific associations of androgens, E2, and SHBG with sleep apnea and daytime sleepiness. However, multivariable-adjusted analyses confirmed the impact of body composition and health-related lifestyle on the association between sex hormones and sleep.
Augmented classical least squares multivariate spectral analysis

DOEpatents

Haaland, David M.; Melgaard, David K.

2004-02-03

A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
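For orientation, the classical least squares (CLS) step that the ACLS methods build on is commonly written as below. The notation is assumed here and is not taken from the patent: A is the n × p matrix of calibration spectra, C the n × m matrix of known concentrations, K the m × p matrix of pure-component spectra, E the spectral residuals, and a the p × 1 spectrum of an unknown sample.

```latex
% Assumed notation, standard CLS only (not the ACLS augmentation itself).
A = C K + E, \qquad
\hat{K} = (C^{\top} C)^{-1} C^{\top} A, \qquad
\hat{c} = (\hat{K} \hat{K}^{\top})^{-1} \hat{K}\, a
```

The residual term E is where unmodeled sources of spectral variation end up, which is consistent with the abstract's statement that ACLS draws on residuals from the CLS calibration to augment the model.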
Augmented Classical Least Squares Multivariate Spectral Analysis

DOEpatents

Haaland, David M.; Melgaard, David K.

2005-07-26

A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis

DOEpatents

Haaland, David M.; Melgaard, David K.

2005-01-11

A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Local polynomial estimation of heteroscedasticity in a multivariate linear regression model and its applications in economics.

PubMed

Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan

2012-01-01

Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
Multivariate calibration on NIR data: development of a model for the rapid evaluation of ethanol content in bakery products.

PubMed

Bello, Alessandra; Bianchi, Federica; Careri, Maria; Giannetto, Marco; Mori, Giovanni; Musci, Marilena

2007-11-05

A new NIR method based on multivariate calibration for determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as the reference method for the determination of the actual ethanol concentration of different samples of wholemeal bread with proper content of added ethanol, ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset, in order to reduce the number of original variables by selecting those that were able to discriminate between the samples of different ethanol concentrations. With the variables so selected, a multivariate calibration model was then obtained by multiple linear regression. The prediction power of the linear model was optimized by a new "leave one out" method, so that the number of original variables was further reduced.
Response Surface Modeling Using Multivariate Orthogonal Functions

NASA Technical Reports Server (NTRS)

Morelli, Eugene A.; DeLoach, Richard

2001-01-01

A nonlinear modeling technique was used to characterize response surfaces for non-dimensional longitudinal aerodynamic force and moment coefficients, based on wind tunnel data from a commercial jet transport model. Data were collected using two experimental procedures - one based on modern design of experiments (MDOE), and one using a classical one factor at a time (OFAT) approach. The nonlinear modeling technique used multivariate orthogonal functions generated from the independent variable data as modeling functions in a least squares context to characterize the response surfaces. Model terms were selected automatically using a prediction error metric. Prediction error bounds computed from the modeling data alone were found to be a good measure of actual prediction error for prediction points within the inference space. Root-mean-square model fit error and prediction error were less than 4 percent of the mean response value in all cases. Efficacy and prediction performance of the response surface models identified from both MDOE and OFAT experiments were investigated.
Multivariate Phylogenetic Comparative Methods: Evaluations, Comparisons, and Recommendations.

PubMed

Adams, Dean C; Collyer, Michael L

2018-01-01

Recent years have seen increased interest in phylogenetic comparative analyses of multivariate data sets, but to date the varied proposed approaches have not been extensively examined. Here we review the mathematical properties required of any multivariate method, and specifically evaluate existing multivariate phylogenetic comparative methods in this context. Phylogenetic comparative methods based on the full multivariate likelihood are robust to levels of covariation among trait dimensions and are insensitive to the orientation of the data set, but display increasing model misspecification as the number of trait dimensions increases. This is because the expected evolutionary covariance matrix (V) used in the likelihood calculations becomes more ill-conditioned as trait dimensionality increases, and as evolutionary models become more complex. Thus, these approaches are only appropriate for data sets with few traits and many species. Methods that summarize patterns across trait dimensions treated separately (e.g., SURFACE) incorrectly assume independence among trait dimensions, resulting in nearly a 100% model misspecification rate. Methods using pairwise composite likelihood are highly sensitive to levels of trait covariation, the orientation of the data set, and the number of trait dimensions. The consequences of these debilitating deficiencies are that a user can arrive at differing statistical conclusions, and therefore biological inferences, simply from a dataspace rotation, like principal component analysis. By contrast, algebraic generalizations of the standard phylogenetic comparative toolkit that use the trace of covariance matrices are insensitive to levels of trait covariation, the number of trait dimensions, and the orientation of the data set. Further, when appropriate permutation tests are used, these approaches display acceptable Type I error and statistical power. 
We conclude that methods summarizing information across trait dimensions, as well as pairwise composite likelihood methods should be avoided, whereas algebraic generalizations of the phylogenetic comparative toolkit provide a useful means of assessing macroevolutionary patterns in multivariate data. Finally, we discuss areas in which multivariate phylogenetic comparative methods are still in need of future development; namely highly multivariate Ornstein-Uhlenbeck models and approaches for multivariate evolutionary model comparisons. © The Author(s) 2017. Published by Oxford University Press on behalf of the Systematic Biology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Confounder summary scores when comparing the effects of multiple drug exposures.

PubMed

Cadarette, Suzanne M; Gagne, Joshua J; Solomon, Daniel H; Katz, Jeffrey N; Stürmer, Til

2010-01-01

Little information is available comparing methods to adjust for confounding when considering multiple drug exposures. We compared three analytic strategies to control for confounding based on measured variables: conventional multivariable, exposure propensity score (EPS), and disease risk score (DRS). Each method was applied to a dataset (2000-2006) recently used to examine the comparative effectiveness of four drugs. Risedronate, nasal calcitonin, and raloxifene were each compared to alendronate for relative effectiveness in preventing non-vertebral fracture. EPSs were derived both by using multinomial logistic regression (single model EPS) and by three separate logistic regression models (separate model EPS). DRSs were derived and event rates compared using Cox proportional hazard models. DRSs derived among the entire cohort (full cohort DRS) were compared to DRSs derived only among the referent alendronate (unexposed cohort DRS). Less than 8% deviation from the base estimate (conventional multivariable) was observed applying single model EPS, separate model EPS or full cohort DRS. Applying the unexposed cohort DRS when background risk for fracture differed between comparison drug exposure cohorts resulted in -7% to +13% deviation from our base estimate. With sufficient numbers of exposed and outcomes, either conventional multivariable, EPS or full cohort DRS may be used to adjust for confounding to compare the effects of multiple drug exposures. However, our data also suggest that unexposed cohort DRS may be problematic when background risks differ between referent and exposed groups. Further empirical and simulation studies will help to clarify the generalizability of our findings.
Multivariate model of female black bear habitat use for a Geographic Information System

USGS Publications Warehouse

Clark, Joseph D.; Dunn, James E.; Smith, Kimberly G.

1993-01-01

Simple univariate statistical techniques may not adequately assess the multidimensional nature of habitats used by wildlife. Thus, we developed a multivariate method to model habitat-use potential using a set of female black bear (Ursus americanus) radio locations and habitat data consisting of forest cover type, elevation, slope, aspect, distance to roads, distance to streams, and forest cover type diversity score in the Ozark Mountains of Arkansas. The model is based on the Mahalanobis distance statistic coupled with Geographic Information System (GIS) technology. That statistic is a measure of dissimilarity and represents a standardized squared distance between a set of sample variates and an ideal based on the mean of variates associated with animal observations. Calculations were made with the GIS to produce a map containing Mahalanobis distance values within each cell on a 60- × 60-m grid. The model identified areas of high habitat use potential that could not otherwise be identified by independent perusal of any single map layer. This technique avoids many pitfalls that commonly affect typical multivariate analyses of habitat use and is a useful tool for habitat manipulation or mitigation to favor terrestrial vertebrates that use habitats on a landscape scale.
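The Mahalanobis distance calculation described above can be sketched for two habitat variables. This is a simplified illustration, not the authors' GIS implementation: the function names are hypothetical, only two variables are used so that the covariance matrix can be inverted by hand, and the seven-variable model in the abstract would need a general matrix inverse.

```python
def mean_and_covariance(points):
    """Mean vector and sample covariance (n - 1 denominator) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(cell, centre, cov):
    """Squared Mahalanobis distance of one grid cell from the 'ideal' centre.

    cell, centre: (variable_1, variable_2) tuples, e.g. elevation and slope
    cov: (sxx, sxy, syy) as returned by mean_and_covariance
    """
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = cell[0] - centre[0]
    dy = cell[1] - centre[1]
    # d^T S^{-1} d, using the closed-form inverse of a 2 x 2 matrix.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
```

In use, the centre and covariance would be estimated from the habitat values at the animal locations, and `mahalanobis_sq` would then be evaluated for every grid cell; small values mark cells that resemble the habitat where animals were observed.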
A mixed model for the relationship between climate and human cranial form.

PubMed

Katz, David C; Grote, Mark N; Weaver, Timothy D

2016-08-01

We expand upon a multivariate mixed model from quantitative genetics in order to estimate the magnitude of climate effects in a global sample of recent human crania. In humans, genetic distances are correlated with distances based on cranial form, suggesting that population structure influences both genetic and quantitative trait variation. Studies controlling for this structure have demonstrated significant underlying associations of cranial distances with ecological distances derived from climate variables. However, to assess the biological importance of an ecological predictor, estimates of effect size and uncertainty in the original units of measurement are clearly preferable to significance claims based on units of distance. Unfortunately, the magnitudes of ecological effects are difficult to obtain with distance-based methods, while models that produce estimates of effect size generally do not scale to high-dimensional data like cranial shape and form. Using recent innovations that extend quantitative genetics mixed models to highly multivariate observations, we estimate morphological effects associated with a climate predictor for a subset of the Howells craniometric dataset. Several measurements, particularly those associated with cranial vault breadth, show a substantial linear association with climate, and the multivariate model incorporating a climate predictor is preferred in model comparison. Previous studies demonstrated the existence of a relationship between climate and cranial form. The mixed model quantifies this relationship concretely. Evolutionary questions that require population structure and phylogeny to be disentangled from potential drivers of selection may be particularly well addressed by mixed models. Am J Phys Anthropol 160:593-603, 2016. © 2015 Wiley Periodicals, Inc.
Multivariable model predictive control design of reactive distillation column for Dimethyl Ether production

NASA Astrophysics Data System (ADS)

Wahid, A.; Putra, I. G. E. P.

2018-03-01

Dimethyl ether (DME), as an alternative clean energy source, has attracted growing attention in recent years. DME production via reactive distillation has potential for capital cost and energy requirement savings. However, the combination of reaction and distillation in a single column makes the reactive distillation process a very complex multivariable system with highly non-linear behavior and strong interaction between process variables. This study investigates a multivariable model predictive control (MPC) scheme based on a two-point temperature control strategy for the DME reactive distillation column to maintain the purities of both product streams. The process model is estimated by a first order plus dead time model. The DME and water purities are maintained by controlling a stage temperature in the rectifying and stripping sections, respectively. The results show that the model predictive controller responded faster than a conventional PI controller, as shown by its smaller ISE values. In addition, the MPC controller is able to handle the loop interactions well.
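The abstract above relies on two standard ideas: a first order plus dead time (FOPDT) process model, G(s) = K·exp(-θs)/(τs + 1), and the integral of squared error (ISE) as a score for loop performance. The following is only a minimal sketch of both, not the authors' column model or their MPC. It simulates a single PI loop around one FOPDT process with a simple Euler scheme, and every numeric value (gain, time constant, dead time, tunings) is a hypothetical placeholder.

```python
from collections import deque


def simulate_pi_ise(K, tau, theta, Kc, Ti, setpoint=1.0, dt=0.01, t_end=50.0):
    """Euler simulation of a PI loop around a FOPDT process; returns the ISE."""
    delay_steps = max(1, round(theta / dt))
    # Queue of past controller outputs that models the dead time theta.
    u_queue = deque([0.0] * delay_steps, maxlen=delay_steps)
    y = 0.0          # process output
    integral = 0.0   # running integral of the error for the PI law
    ise = 0.0        # integral of squared error
    for _ in range(round(t_end / dt)):
        error = setpoint - y
        ise += error * error * dt
        integral += error * dt
        u = Kc * (error + integral / Ti)
        u_old = u_queue[0]      # controller output from theta seconds ago
        u_queue.append(u)       # maxlen drops the oldest entry automatically
        y += dt * (K * u_old - y) / tau   # first-order lag driven by delayed input
    return ise


# Hypothetical process and two hypothetical PI tunings.
ise_slow = simulate_pi_ise(K=1.0, tau=5.0, theta=1.0, Kc=0.5, Ti=5.0)
ise_fast = simulate_pi_ise(K=1.0, tau=5.0, theta=1.0, Kc=2.0, Ti=5.0)
print(f"ISE, slow tuning: {ise_slow:.2f}; ISE, fast tuning: {ise_fast:.2f}")
```

A smaller ISE indicates that the output tracked the setpoint more closely over the simulated horizon, which is the sense in which the abstract compares controllers.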
Self-tuning multivariable pole placement control of a multizone crystal growth furnace

NASA Technical Reports Server (NTRS)

Batur, C.; Sharpless, R. B.; Duval, W. M. B.; Rosenthal, B. N.

1992-01-01

This paper presents the design and implementation of a multivariable self-tuning temperature controller for the control of lead bromide crystal growth. The crystal grows inside a multizone transparent furnace. There are eight interacting heating zones shaping the axial temperature distribution inside the furnace. A multi-input, multi-output furnace model is identified on-line by a recursive least squares estimation algorithm. A multivariable pole placement controller based on this model is derived and implemented. Comparison between single-input, single-output and multi-input, multi-output self-tuning controllers demonstrates that the zone-to-zone interactions can be minimized better by a multi-input, multi-output controller design. This directly affects the quality of crystal grown.
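On-line identification by recursive least squares (RLS), as mentioned in the abstract above, can be sketched as follows. This is an illustrative single-input, single-output example, not the eight-zone furnace model from the paper: the first-order model y[k] = a·y[k-1] + b·u[k-1], the true parameter values, and the noise level are all assumptions chosen for the demonstration.

```python
import random


def rls_update(theta, P, phi, y, forgetting=1.0):
    """One recursive least squares step for the model y ≈ phi · theta."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    prediction_error = y - sum(phi[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * prediction_error for i in range(n)]
    # P <- (P - gain * phi^T P) / forgetting, using the symmetry of P.
    new_P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return new_theta, new_P


rng = random.Random(0)
a_true, b_true = 0.9, 0.5            # hypothetical plant parameters
theta = [0.0, 0.0]                   # initial estimates of [a, b]
P = [[1000.0, 0.0], [0.0, 1000.0]]   # large initial covariance: low confidence
y_prev, u_prev = 0.0, 0.0
for _ in range(500):
    y = a_true * y_prev + b_true * u_prev + rng.gauss(0.0, 0.01)
    theta, P = rls_update(theta, P, [y_prev, u_prev], y)
    u_prev = rng.choice([-1.0, 1.0])  # random binary excitation signal
    y_prev = y

print("estimated [a, b]:", [round(v, 3) for v in theta])
```

In a self-tuning scheme the controller would be redesigned from the current estimates at each step; a multi-input, multi-output version replaces the scalar output with a vector and the parameter vector with a matrix.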
Measures of precision for dissimilarity-based multivariate analysis of ecological communities

PubMed Central

Anderson, Marti J; Santana-Garcon, Julia

2015-01-01

Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. PMID:25438826
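As a rough illustration of a standard error built from sums of squared dissimilarities, the sketch below assumes the following form: SS = (1/n)·Σ d_ij² over all pairs i < j, pseudo variance V = SS/(n − 1), and MultSE = sqrt(V/n). This form is an assumption stated here for illustration; the published R functions should be consulted for the authors' exact procedure, and the double resampling step is not shown. The function name `mult_se` and the data are hypothetical.

```python
import math


def mult_se(d):
    """Pseudo multivariate standard error from a symmetric n x n dissimilarity matrix."""
    n = len(d)
    # Sum of squared dissimilarities over all pairs i < j, divided by n.
    ss = sum(d[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    pseudo_variance = ss / (n - 1)
    return math.sqrt(pseudo_variance / n)


# Sanity check on one variable with Euclidean distance: three samples at 0, 2 and 4.
points = [0.0, 2.0, 4.0]
d = [[abs(p - q) for q in points] for p in points]
print(mult_se(d))  # matches the classical standard error of the mean, 2 / sqrt(3)
```

With Euclidean distance on a single variable the quantity reduces to the ordinary standard error of the mean, which gives a convenient check on the arithmetic.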
Selective sensing of vapors of similar dielectric constants using peptide-capped gold nanoparticles on individual multivariable transducers.

PubMed

Nagraj, Nandini; Slocik, Joseph M; Phillips, David M; Kelley-Loughnane, Nancy; Naik, Rajesh R; Potyrailo, Radislav A

2013-08-07

Peptide-capped AYSSGAPPMPPF gold nanoparticles were demonstrated for highly selective chemical vapor sensing using individual multivariable inductor-capacitor-resistor (LCR) resonators. Their multivariable response was achieved by measuring their resonance impedance spectra followed by multivariate spectral analysis. Detection of model toxic vapors and chemical agent simulants, such as acetonitrile, dichloromethane and methyl salicylate, was performed. Dichloromethane (dielectric constant εr = 9.1) and methyl salicylate (εr = 9.0) were discriminated using a single sensor. These sensing materials coupled to multivariable transducers can provide numerous opportunities for tailoring the vapor response selectivity based on the diversity of the amino acid composition of the peptides, and by the modulation of the nature of peptide-nanoparticle interactions through designed combinations of hydrophobic and hydrophilic amino acids.
Exploring the Structure of Library and Information Science Web Space Based on Multivariate Analysis of Social Tags

ERIC Educational Resources Information Center

Joo, Soohyung; Kipp, Margaret E. I.

2015-01-01

Introduction: This study examines the structure of Web space in the field of library and information science using multivariate analysis of social tags from the Website, Delicious.com. A few studies have examined mathematical modelling of tags, mainly examining tagging in terms of tripartite graphs, pattern tracing and descriptive statistics. This…
Applications of modern statistical methods to analysis of data in physical science

NASA Astrophysics Data System (ADS)

Wicker, James Eric

Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values.
In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
Finding the multipath propagation of multivariable crude oil prices using a wavelet-based network approach

NASA Astrophysics Data System (ADS)

Jia, Xiaoliang; An, Haizhong; Sun, Xiaoqi; Huang, Xuan; Gao, Xiangyun

2016-04-01

The globalization and regionalization of crude oil trade inevitably give rise to the difference of crude oil prices. The understanding of the pattern of the crude oil prices' mutual propagation is essential for analyzing the development of global oil trade. Previous research has focused mainly on the fuzzy long- or short-term one-to-one propagation of bivariate oil prices, generally ignoring various patterns of periodical multivariate propagation. This study presents a wavelet-based network approach to help uncover the multipath propagation of multivariable crude oil prices in a joint time-frequency period. The weekly oil spot prices of the OPEC member states from June 1999 to March 2011 are adopted as the sample data. First, we used wavelet analysis to find different subseries based on an optimal decomposing scale to describe the periodical feature of the original oil price time series. Second, a complex network model was constructed based on an optimal threshold selection to describe the structural feature of multivariable oil prices. Third, Bayesian network analysis (BNA) was conducted to find the probability causal relationship based on periodical structural features to describe the various patterns of periodical multivariable propagation. Finally, the significance of the leading and intermediary oil prices is discussed. These findings are beneficial for the implementation of periodical target-oriented pricing policies and investment strategies.
Divergences and estimating tight bounds on Bayes error with applications to multivariate Gaussian copula and latent Gaussian copula

NASA Astrophysics Data System (ADS)

Thelen, Brian J.; Xique, Ismael J.; Burns, Joseph W.; Goley, G. Steven; Nolan, Adam R.; Benson, Jonathan W.

2017-04-01

In Bayesian decision theory, there has been a great amount of research into theoretical frameworks and information-theoretic quantities that can be used to provide lower and upper bounds for the Bayes error. These include well-known bounds such as Chernoff, Bhattacharyya, and J-divergence. Part of the challenge of utilizing these various metrics in practice is (i) whether they are "loose" or "tight" bounds, (ii) how they might be estimated via either parametric or non-parametric methods, and (iii) how accurate the estimates are for limited amounts of data. In general what is desired is a methodology for generating relatively tight lower and upper bounds, and then an approach to estimate these bounds efficiently from data. In this paper, we explore the so-called triangle divergence which has been around for a while, but was made more prominent in some recent research on non-parametric estimation of information metrics. Part of this work is motivated by applications for quantifying fundamental information content in SAR/LIDAR data, and to help in this, we have developed a flexible multivariate modeling framework based on multivariate Gaussian copula models which can be combined with the triangle divergence framework to quantify this information, and provide approximate bounds on Bayes error. In this paper we present an overview of the bounds, including those based on triangle divergence, and verify that under a number of multivariate models, the upper and lower bounds derived from triangle divergence are significantly tighter than the other common bounds, and oftentimes dramatically so. We also propose some simple but effective means for computing the triangle divergence using Monte Carlo methods, and then discuss estimation of the triangle divergence from empirical data based on Gaussian copula models.

Factorial Design Based Multivariate Modeling and Optimization of Tunable Bioresponsive Arginine Grafted Poly(cystaminebis(acrylamide)-diaminohexane) Polymeric Matrix Based Nanocarriers.

PubMed

Yang, Rongbing; Nam, Kihoon; Kim, Sung Wan; Turkson, James; Zou, Ye; Zuo, Yi Y; Haware, Rahul V; Chougule, Mahavir B

2017-01-03

The desired characteristics of nanocarriers are crucial to exploring their therapeutic potential. This investigation aimed to develop tunable bioresponsive nanocarriers based on a newly synthesized arginine grafted poly(cystaminebis(acrylamide)-diaminohexane) [ABP] polymeric matrix by using an L9 Taguchi factorial design, a desirability function, and a multivariate method. The selected formulation and process parameters were ABP concentration, acetone concentration, the volume ratio of acetone to ABP solution, and drug concentration. The measured nanocarrier characteristics were particle size, polydispersity index, zeta potential, and percentage drug loading. Experimental validation of the nanocarrier characteristics computed from the initially developed predictive model showed nonsignificant differences (p > 0.05). The cationic nanocarrier formulation of <100 nm optimized by multivariate modeling and loaded with hydrophilic acetaminophen was readapted for loading of hydrophobic etoposide without significant changes (p > 0.05) except for an improved loading percentage. This is the first study focusing on ABP polymeric matrix based nanocarrier development. Nanocarrier particle size was stable in PBS 7.4 for 48 h. The increase of zeta potential at the lower pH of 6.4, compared to the physiological pH, showed possible endosomal escape capability. The glutathione triggered release at physiological conditions indicated the competence of the bioresponsive nanocarriers for cytosolic targeted delivery of the loaded drug. In conclusion, this systematic approach provides rational evaluation and prediction of a tunable bioresponsive ABP based matrix nanocarrier, built on a limited number of selected smart experiments.
Generating Nonnormal Multivariate Data Using Copulas: Applications to SEM.

PubMed

Mair, Patrick; Satorra, Albert; Bentler, Peter M

2012-07-01

This article develops a procedure based on copulas to simulate multivariate nonnormal data that satisfy a prespecified variance-covariance matrix. The covariance matrix used can comply with a specific moment structure form (e.g., a factor analysis or a general structural equation model). Thus, the method is particularly useful for Monte Carlo evaluation of structural equation models within the context of nonnormal data. The new procedure for nonnormal data simulation is theoretically described and also implemented in the widely used R environment. The quality of the method is assessed by Monte Carlo simulations. A 1-sample test on the observed covariance matrix based on the copula methodology is proposed. This new test for evaluating the quality of a simulation is defined through a particular structural model specification and is robust against normality violations.
A methodology for computing uncertainty bounds of multivariable systems based on sector stability theory concepts

NASA Technical Reports Server (NTRS)

Waszak, Martin R.

1992-01-01

The application of a sector-based stability theory approach to the formulation of useful uncertainty descriptions for linear, time-invariant, multivariable systems is explored. A review of basic sector properties and sector-based approach are presented first. The sector-based approach is then applied to several general forms of parameter uncertainty to investigate its advantages and limitations. The results indicate that the sector uncertainty bound can be used effectively to evaluate the impact of parameter uncertainties on the frequency response of the design model. Inherent conservatism is a potential limitation of the sector-based approach, especially for highly dependent uncertain parameters. In addition, the representation of the system dynamics can affect the amount of conservatism reflected in the sector bound. Careful application of the model can help to reduce this conservatism, however, and the solution approach has some degrees of freedom that may be further exploited to reduce the conservatism.
Copula-based prediction of economic movements

NASA Astrophysics Data System (ADS)

García, J. E.; González-López, V. A.; Hirsh, I. D.

2016-06-01

In this paper we model the discretized returns of two paired time series, the BM&FBOVESPA Dividend Index and the BM&FBOVESPA Public Utilities Index, using multivariate Markov models. The discretization corresponds to three categories: high losses, high profits and the complementary periods of the series. In technical terms, the maximal memory that can be considered for a Markov model can be derived from the size of the alphabet and dataset. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the two marginal processes and the partition corresponding to the multivariate Markov chain. In order to estimate the transition probabilities, all the partitions are linked using a copula. In our application this strategy provides a significant improvement in the movement predictions.
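The exponential growth in parameters mentioned above can be made concrete with a short count. The sketch assumes a fully parameterized chain: with an alphabet of size A per series and d series, the joint alphabet has A^d symbols, an order-o chain has (A^d)^o possible histories, and each history needs A^d − 1 free transition probabilities. The function name is hypothetical; the values A = 3 and d = 2 mirror the three categories and two indices in the abstract.

```python
def markov_free_parameters(alphabet_size, dimension, order):
    """Free transition probabilities of a full discrete multivariate Markov chain."""
    joint_symbols = alphabet_size ** dimension   # size of the joint alphabet
    histories = joint_symbols ** order           # number of possible pasts
    return histories * (joint_symbols - 1)       # each row sums to 1


for order in (1, 2, 3):
    print(order, markov_free_parameters(alphabet_size=3, dimension=2, order=order))
# prints:
# 1 72
# 2 648
# 3 5832
```

Even at order 3 the full model already has several thousand free parameters, which indicates why a short series cannot support consistent estimation without some reduction of the state space.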
Probabilistic flood damage modelling at the meso-scale

NASA Astrophysics Data System (ADS)

Kreibich, Heidi; Botto, Anna; Schröter, Kai; Merz, Bruno

2014-05-01

Decisions on flood risk management and adaptation are usually based on risk analyses. Such analyses are associated with significant uncertainty, even more so if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention in recent years, they are still not standard practice for flood risk assessments. Most damage models have in common that complex damaging processes are described by simple, deterministic approaches like stage-damage functions. Novel probabilistic, multi-variate flood damage models have been developed and validated on the micro-scale using a data-mining approach, namely bagging decision trees (Merz et al. 2013). In this presentation we show how the model BT-FLEMO (Bagging decision Tree based Flood Loss Estimation MOdel) can be applied on the meso-scale, namely on the basis of ATKIS land-use units. The model is applied in 19 municipalities which were affected during the 2002 flood by the River Mulde in Saxony, Germany. The application of BT-FLEMO provides a probability distribution of estimated damage to residential buildings per municipality. Validation is undertaken on the one hand via a comparison with eight other damage models including stage-damage functions as well as multi-variate models. On the other hand the results are compared with official damage data provided by the Saxon Relief Bank (SAB). The results show that uncertainties of damage estimation remain high. Thus, the significant advantage of this probabilistic flood loss estimation model BT-FLEMO is that it inherently provides quantitative information about the uncertainty of the prediction. Reference: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64.
Preliminary Cost Model for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Prince, F. Andrew; Smart, Christian; Stephens, Kyle; Henrichs, Todd

2009-01-01

Parametric cost models are routinely used to plan missions, compare concepts and justify technology investments. However, great care is required. Some space telescope cost models, such as those based only on mass, lack sufficient detail to support such analysis and may lead to inaccurate conclusions. Similarly, using ground based telescope models which include the dome cost will also lead to inaccurate conclusions. This paper reviews current and historical models. Then, based on data from 22 different NASA space telescopes, this paper tests those models and presents preliminary analysis of single and multi-variable space telescope cost models.
Inferring Instantaneous, Multivariate and Nonlinear Sensitivities for the Analysis of Feedback Processes in a Dynamical System: Lorenz Model Case Study

NASA Technical Reports Server (NTRS)

Aires, Filipe; Rossow, William B.; Hansen, James E. (Technical Monitor)

2001-01-01

A new approach is presented for the analysis of feedback processes in a nonlinear dynamical system by observing its variations. The new methodology consists of statistical estimates of the sensitivities between all pairs of variables in the system based on a neural network modeling of the dynamical system. The model can then be used to estimate the instantaneous, multivariate and nonlinear sensitivities, which are shown to be essential for the analysis of the feedbacks processes involved in the dynamical system. The method is described and tested on synthetic data from the low-order Lorenz circulation model where the correct sensitivities can be evaluated analytically.
General Multivariate Linear Modeling of Surface Shapes Using SurfStat

PubMed Central

Chung, Moo K.; Worsley, Keith J.; Nacewicz, Brendon M.; Dalton, Kim M.; Davidson, Richard J.

2010-01-01

Although there are many imaging studies on traditional ROI-based amygdala volumetry, there are very few studies on modeling amygdala shape variations. This paper presents a unified computational and statistical framework for modeling amygdala shape variations in a clinical population. The weighted spherical harmonic representation is used to parameterize, to smooth out, and to normalize amygdala surfaces. The representation is subsequently used as an input for multivariate linear models accounting for nuisance covariates such as age and brain size difference using the SurfStat package, which completely avoids the complexity of specifying design matrices. The methodology has been applied for quantifying abnormal local amygdala shape variations in 22 high functioning autistic subjects. PMID:20620211
Methodological challenges to multivariate syndromic surveillance: a case study using Swiss animal health data.

PubMed

Vial, Flavie; Wei, Wei; Held, Leonhard

2016-12-20

In an era of ubiquitous electronic collection of animal health data, multivariate surveillance systems (which concurrently monitor several data streams) should have a greater probability of detecting disease events than univariate systems. However, despite their limitations, univariate aberration detection algorithms are used in most active syndromic surveillance (SyS) systems because of their ease of application and interpretation. On the other hand, a stochastic modelling-based approach to multivariate surveillance offers more flexibility, allowing for the retention of historical outbreaks, for overdispersion and for non-stationarity. While such methods are not new, they are yet to be applied to animal health surveillance data. We applied an example of such a stochastic model, Held and colleagues' two-component model, to two multivariate animal health datasets from Switzerland. In our first application, multivariate time series of the number of laboratory test requests were derived from Swiss animal diagnostic laboratories. We compared the performance of the two-component model to parallel monitoring using an improved Farrington algorithm and found that both methods yield a satisfactorily low false alarm rate. However, the calibration test of the two-component model on the one-step ahead predictions proved satisfactory, making such an approach suitable for outbreak prediction. In our second application, the two-component model was applied to the multivariate time series of the number of cattle abortions and the number of test requests for bovine viral diarrhea (a disease that often results in abortions). We found that there is a two-day lagged effect from the number of abortions to the number of test requests. We further compared the joint modelling and univariate modelling of the number of laboratory test requests time series. The joint modelling approach showed evidence of superiority in terms of forecasting abilities.
Stochastic modelling approaches offer the potential to address more realistic surveillance scenarios through, for example, the inclusion of time-series-specific parameters, or of covariates known to have an impact on syndrome counts. Nevertheless, many methodological challenges to multivariate surveillance of animal SyS data still remain. Deciding on the amount of corroboration among data streams that is required to escalate into an alert is not a trivial task given the sparse data on the events under consideration (e.g. disease outbreaks).
Bayesian multivariate hierarchical transformation models for ROC analysis.

PubMed

O'Malley, A James; Zou, Kelly H

2006-02-15

A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
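Two building blocks named in the abstract above, the Box-Cox transformation and the area under the ROC curve, can be sketched in a few lines. This is not the BMHTM itself: there is no hierarchy, no covariates and no Bayesian estimation here. The transformation parameter `lam`, the helper names, and the data values are hypothetical, and the AUC is the simple empirical (Mann-Whitney) estimate.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value: (y**lam - 1) / lam, or log(y) when lam == 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def empirical_auc(diseased, healthy):
    """Fraction of (diseased, healthy) pairs ranked correctly; ties count one half."""
    wins = sum((d > h) + 0.5 * (d == h) for d in diseased for h in healthy)
    return wins / (len(diseased) * len(healthy))


# Hypothetical test outcomes for a single cluster.
diseased = [4.0, 9.0, 16.0, 2.5]
healthy = [1.0, 2.0, 3.0, 6.0]

auc_raw = empirical_auc(diseased, healthy)
auc_transformed = empirical_auc([box_cox(v, 0.5) for v in diseased],
                                [box_cox(v, 0.5) for v in healthy])
print(auc_raw, auc_transformed)  # 0.8125 0.8125
```

Because a single Box-Cox transform is monotone increasing on positive values, it leaves this rank-based AUC unchanged. The transformation matters once outcomes from different clusters, each with its own transformation, are mapped to a common distributional family and modelled parametrically, as the abstract describes.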
Bayesian multivariate hierarchical transformation models for ROC analysis

PubMed Central

O'Malley, A. James; Zou, Kelly H.

2006-01-01

SUMMARY A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836
A multivariate cure model for left-censored and right-censored data with application to colorectal cancer screening patterns.

PubMed

Hagar, Yolanda C; Harvey, Danielle J; Beckett, Laurel A

2016-08-30

We develop a multivariate cure survival model to estimate lifetime patterns of colorectal cancer screening. Screening data cover long periods of time, with sparse observations for each person. Some events may occur before the study begins or after the study ends, so the data are both left-censored and right-censored, and some individuals are never screened (the 'cured' population). We propose a multivariate parametric cure model that can be used with left-censored and right-censored data. Our model allows for the estimation of the time to screening as well as the average number of times individuals will be screened. We calculate likelihood functions based on the observations for each subject using a distribution that accounts for within-subject correlation and estimate parameters using Markov chain Monte Carlo methods. We apply our methods to the estimation of lifetime colorectal cancer screening behavior in the SEER-Medicare data set. Copyright © 2016 John Wiley & Sons, Ltd.
Comparing lagged linear correlation, lagged regression, Granger causality, and vector autoregression for uncovering associations in EHR data.

PubMed

Levine, Matthew E; Albers, David J; Hripcsak, George

2016-01-01

Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
Probabilistic, meso-scale flood loss modelling

NASA Astrophysics Data System (ADS)

Kreibich, Heidi; Botto, Anna; Schröter, Kai; Merz, Bruno

2016-04-01

Flood risk analyses are an important basis for decisions on flood risk management and adaptation. However, such analyses are associated with significant uncertainty, even more so if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention in recent years, they are still not standard practice for flood risk assessments and even more for flood loss modelling. The state of the art in flood loss modelling is still the use of simple, deterministic approaches like stage-damage functions. Novel probabilistic, multi-variate flood loss models have been developed and validated on the micro-scale using a data-mining approach, namely bagging decision trees (Merz et al. 2013). In this presentation we demonstrate and evaluate the upscaling of the approach to the meso-scale, namely on the basis of land-use units. The model is applied in 19 municipalities which were affected during the 2002 flood by the River Mulde in Saxony, Germany (Botto et al. submitted). The application of bagging decision tree based loss models provides a probability distribution of estimated loss per municipality. Validation is undertaken on the one hand via a comparison with eight deterministic loss models including stage-damage functions as well as multi-variate models. On the other hand the results are compared with official loss data provided by the Saxon Relief Bank (SAB). The results show that uncertainties of loss estimation remain high. Thus, the significant advantage of this probabilistic flood loss estimation approach is that it inherently provides quantitative information about the uncertainty of the prediction. References: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64. Botto A, Kreibich H, Merz B, Schröter K (submitted) Probabilistic, multi-variable flood loss modelling on the meso-scale with BT-FLEMO. Risk Analysis.
Gain-scheduling multivariable LPV control of an irrigation canal system.

PubMed

Bolea, Yolanda; Puig, Vicenç

2016-07-01

The purpose of this paper is to present a multivariable linear parameter varying (LPV) controller with a gain scheduling Smith Predictor (SP) scheme applicable to open-flow canal systems. This LPV controller based on SP is designed taking into account the uncertainty in the estimation of delay and the variation of plant parameters according to the operating point. This new methodology can be applied to a class of delay systems that can be represented by a set of models that can be factorized into a rational multivariable model in series with left/right diagonal (multiple) delays, as is the case for irrigation canals. A multiple pool canal system is used to test and validate the proposed control approach. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Modeling and Simulation of Upset-Inducing Disturbances for Digital Systems in an Electromagnetic Reverberation Chamber

NASA Technical Reports Server (NTRS)

Torres-Pomales, Wilfredo

2014-01-01

This report describes a modeling and simulation approach for disturbance patterns representative of the environment experienced by a digital system in an electromagnetic reverberation chamber. The disturbance is modeled by a multi-variate statistical distribution based on empirical observations. Extended versions of the Rejection Sampling and Inverse Transform Sampling techniques are developed to generate multi-variate random samples of the disturbance. The results show that Inverse Transform Sampling returns samples with higher fidelity relative to the empirical distribution. This work is part of an ongoing effort to develop a resilience assessment methodology for complex safety-critical distributed systems.
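As a rough illustration of the Inverse Transform Sampling idea mentioned in this abstract, the sketch below draws samples by inverting an empirical CDF. It is a univariate simplification and not the report's extended multi-variate method; the function name and the observed values are hypothetical.

```python
import random


def empirical_inverse_transform_sampler(observations, rng):
    """Return a function that draws samples by inverting the empirical CDF."""
    ordered = sorted(observations)
    n = len(ordered)

    def draw():
        u = rng.random()                 # uniform on [0, 1)
        index = min(int(u * n), n - 1)   # smallest i with (i + 1) / n > u
        return ordered[index]

    return draw


# Hypothetical observations, for illustration only.
rng = random.Random(0)
observed = [0, 0, 0, 1, 1, 2, 5]
draw = empirical_inverse_transform_sampler(observed, rng)
samples = [draw() for _ in range(10_000)]

# The sampled share of zeros should be close to the empirical share, 3/7.
share_of_zeros = samples.count(0) / len(samples)
```

Because every draw is one of the observed values, the samples reproduce the empirical distribution up to Monte Carlo noise.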
The role of area-level deprivation and gender in participation in population-based faecal immunochemical test (FIT) colorectal cancer screening.

PubMed

Clarke, Nicholas; McNamara, Deirdre; Kearney, Patricia M; O'Morain, Colm A; Shearer, Nikki; Sharp, Linda

2016-12-01

This study aimed to investigate the effects of sex and deprivation on participation in a population-based faecal immunochemical test (FIT) colorectal cancer screening programme. The study population included 9785 individuals invited to participate in two rounds of a population-based biennial FIT-based screening programme, in a relatively deprived area of Dublin, Ireland. Explanatory variables included in the analysis were sex, deprivation category of area of residence and age (at end of screening). The primary outcome variable modelled was participation status in both rounds combined (with "participation" defined as having taken part in either or both rounds of screening). Poisson regression with a log link and robust error variance was used to estimate relative risks (RR) for participation. As a sensitivity analysis, data were stratified by screening round. In both the univariable and multivariable models deprivation was strongly associated with participation. Increasing affluence was associated with higher participation; participation was 26% higher in people resident in the most affluent compared to the most deprived areas (multivariable RR=1.26; 95% CI 1.21-1.30). Participation was significantly lower in males (multivariable RR=0.96; 95% CI 0.95-0.97) and generally increased with increasing age (trend per age group, multivariable RR=1.02; 95% CI 1.01-1.02). No significant interactions between the explanatory variables were found. The effects of deprivation and sex were similar by screening round. Deprivation and male gender are independently associated with lower uptake of population-based FIT colorectal cancer screening, even in a relatively deprived setting. Development of evidence-based interventions to increase uptake in these disadvantaged groups is urgently required. Copyright © 2016. Published by Elsevier Inc.
Multivariate Strategies in Functional Magnetic Resonance Imaging

ERIC Educational Resources Information Center

Hansen, Lars Kai

2007-01-01

We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a "mind reading" predictive multivariate fMRI model.
Investigating College and Graduate Students' Multivariable Reasoning in Computational Modeling

ERIC Educational Resources Information Center

Wu, Hsin-Kai; Wu, Pai-Hsing; Zhang, Wen-Xin; Hsu, Ying-Shao

2013-01-01

Drawing upon the literature in computational modeling, multivariable reasoning, and causal attribution, this study aims at characterizing multivariable reasoning practices in computational modeling and revealing the nature of understanding about multivariable causality. We recruited two freshmen, two sophomores, two juniors, two seniors, four…
A Multivariate Model for the Study of Parental Acceptance-Rejection and Child Abuse.

ERIC Educational Resources Information Center

Rohner, Ronald P.; Rohner, Evelyn C.

This paper proposes a multivariate strategy for the study of parental acceptance-rejection and child abuse and describes a research study on parental rejection and child abuse which illustrates the advantages of using a multivariate (rather than a simple-model) approach. The multivariate model is a combination of three simple models used to study…

Speciation of adsorbates on surface of solids by infrared spectroscopy and chemometrics.

PubMed

Vilmin, Franck; Bazin, Philippe; Thibault-Starzyk, Frédéric; Travert, Arnaud

2015-09-03

Speciation, i.e. identification and quantification, of surface species on heterogeneous surfaces by infrared spectroscopy is important in many fields but remains a challenging task when facing strongly overlapped spectra of multiple adspecies. Here, we propose a new methodology, combining state of the art instrumental developments for quantitative infrared spectroscopy of adspecies and chemometrics tools, mainly a novel data processing algorithm, called SORB-MCR (SOft modeling by Recursive Based-Multivariate Curve Resolution) and multivariate calibration. After formal transposition of the general linear mixture model to adsorption spectral data, the main issues, i.e. validity of Beer-Lambert law and rank deficiency problems, are theoretically discussed. Then, the methodology is exposed through application to two case studies, each of them characterized by a specific type of rank deficiency: (i) speciation of physisorbed water species over a hydrated silica surface, and (ii) speciation (chemisorption and physisorption) of a silane probe molecule over a dehydrated silica surface. In both cases, we demonstrate the relevance of this approach which leads to a thorough surface speciation based on comprehensive and fully interpretable multivariate quantitative models. Limitations and drawbacks of the methodology are also underlined. Copyright © 2015 Elsevier B.V. All rights reserved.
Web-based tools for modelling and analysis of multivariate data: California ozone pollution activity

PubMed Central

Dinov, Ivo D.; Christou, Nicolas

2014-01-01

This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting and statistical inference on these data are presented. All components of this case study (data, tools, activity) are freely available online at: http://wiki.stat.ucla.edu/socr/index.php/SOCR_MotionCharts_CAOzoneData. Several types of exploratory (motion charts, box-and-whisker plots, spider charts) and quantitative (inference, regression, analysis of variance (ANOVA)) data analyses tools are demonstrated. Two specific human health related questions (temporal and geographic effects of ozone pollution) are discussed as motivational challenges. PMID:24465054
Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

NASA Astrophysics Data System (ADS)

Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

2017-06-01

The recent popularity of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but these almost always rely on asymptotic distributions and are consequently not adequate for small sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may even be used in small sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.
Web-based tools for modelling and analysis of multivariate data: California ozone pollution activity.

PubMed

Dinov, Ivo D; Christou, Nicolas

2011-09-01

This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting and statistical inference on these data are presented. All components of this case study (data, tools, activity) are freely available online at: http://wiki.stat.ucla.edu/socr/index.php/SOCR_MotionCharts_CAOzoneData. Several types of exploratory (motion charts, box-and-whisker plots, spider charts) and quantitative (inference, regression, analysis of variance (ANOVA)) data analyses tools are demonstrated. Two specific human health related questions (temporal and geographic effects of ozone pollution) are discussed as motivational challenges.
The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

PubMed

Warton, David I; Thibaut, Loïc; Wang, Yi Alice

2017-01-01

Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping); common examples include logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
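The resampling idea described in this abstract can be sketched roughly as follows, assuming Poisson marginals: compute a randomised PIT-residual for each count, resample whole rows of residuals, then map them back through the fitted marginal. This is a minimal illustration and not the authors' implementation; the counts, the fitted means and all function names are hypothetical.

```python
import math
import random


def poisson_cdf(k, mean):
    """P(Y <= k) for a Poisson variable, by direct summation."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)
    total = term
    for j in range(1, k + 1):
        term *= mean / j
        total += term
    return total


def poisson_quantile(u, mean):
    """Smallest k whose cumulative probability exceeds u."""
    k = 0
    term = math.exp(-mean)
    total = term
    while total <= u and term > 0.0:
        k += 1
        term *= mean / k
        total += term
    return k


def pit_residual(y, mean, rng):
    """Randomised PIT-residual: uniform between F(y - 1) and F(y)."""
    lower = poisson_cdf(y - 1, mean)
    upper = poisson_cdf(y, mean)
    return lower + rng.random() * (upper - lower)


def pit_trap_resample(counts, fitted_means, rng):
    """Resample rows of PIT-residuals, then back-transform to counts."""
    residuals = [
        [pit_residual(y, m, rng) for y, m in zip(row, means)]
        for row, means in zip(counts, fitted_means)
    ]
    n = len(residuals)
    # Whole rows are resampled, so correlation across columns is kept.
    picked = [residuals[rng.randrange(n)] for _ in range(n)]
    return [
        [poisson_quantile(u, m) for u, m in zip(row, means)]
        for row, means in zip(picked, fitted_means)
    ]


# Hypothetical abundance counts (rows = sites, columns = species) and
# hypothetical fitted means from some previously fitted regression model.
rng = random.Random(1)
counts = [[0, 3], [2, 5], [1, 1], [4, 7]]
fitted_means = [[1.0, 3.0], [2.0, 4.0], [1.5, 2.0], [3.0, 6.0]]
bootstrap_counts = pit_trap_resample(counts, fitted_means, rng)
```

Resampling whole rows is what lets correlation between the responses carry over to the bootstrap data without being modelled explicitly.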
The PIT-trap—A “model-free” bootstrap procedure for inference about regression models with discrete, multivariate responses

PubMed Central

Warton, David I.; Thibaut, Loïc; Wang, Yi Alice

2017-01-01

Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)—common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of “model-free bootstrap”, adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071
What matters? Assessing and developing inquiry and multivariable reasoning skills in high school chemistry

NASA Astrophysics Data System (ADS)

Daftedar Abdelhadi, Raghda Mohamed

Although the Next Generation Science Standards (NGSS) present a detailed set of Science and Engineering Practices, a finer grained representation of the underlying skills is lacking in the standards document. Therefore, it has been reported that teachers are facing challenges deciphering and effectively implementing the standards, especially with regards to the Practices. This analytical study assessed the development of high school chemistry students' (N = 41) inquiry, multivariable causal reasoning skills, and metacognition as a mediator for their development. Inquiry tasks based on concepts of element properties of the periodic table as well as reaction kinetics required students to conduct controlled thought experiments, make inferences, and declare predictions of the level of the outcome variable by coordinating the effects of multiple variables. An embedded mixed methods design was utilized for depth and breadth of understanding. Various sources of data were collected including students' written artifacts, audio recordings of in-depth observational groups and interviews. Data analysis was informed by a conceptual framework formulated around the concepts of coordinating theory and evidence, metacognition, and mental models of multivariable causal reasoning. Results of the study indicated positive change towards conducting controlled experimentation, making valid inferences and justifications. Additionally, significant positive correlation between metastrategic and metacognitive competencies, and sophistication of experimental strategies, signified the central role metacognition played. Finally, lack of consistency in indicating effective variables during the multivariable prediction task pointed towards the fragile mental models of multivariable causal reasoning the students had. Implications for teacher education, science education policy as well as classroom research methods are discussed. Finally, recommendations for developing reform-based chemistry curricula based on the Practices are presented.
Lateralization of temporal lobe epilepsy by multimodal multinomial hippocampal response-driven models.

PubMed

Nazem-Zadeh, Mohammad-Reza; Elisevich, Kost V; Schwalb, Jason M; Bagher-Ebadian, Hassan; Mahmoudi, Fariborz; Soltanian-Zadeh, Hamid

2014-12-15

Multiple modalities are used in determining laterality in mesial temporal lobe epilepsy (mTLE). It is unclear how much different imaging modalities should be weighted in decision-making. The purpose of this study is to develop response-driven multimodal multinomial models for lateralization of epileptogenicity in mTLE patients based upon imaging features in order to maximize the accuracy of noninvasive studies. The volumes, means and standard deviations of FLAIR intensity and means of normalized ictal-interictal SPECT intensity of the left and right hippocampi were extracted from preoperative images of a retrospective cohort of 45 mTLE patients with Engel class I surgical outcomes, as well as images of a cohort of 20 control, nonepileptic subjects. Using multinomial logistic function regression, the parameters of various univariate and multivariate models were estimated. Based on the Bayesian model averaging (BMA) theorem, response models were developed as compositions of independent univariate models. A BMA model composed of posterior probabilities of univariate response models of hippocampal volumes, means and standard deviations of FLAIR intensity, and means of SPECT intensity with the estimated weighting coefficients of 0.28, 0.32, 0.09, and 0.31, respectively, as well as a multivariate response model incorporating all mentioned attributes, demonstrated complete reliability by achieving a probability of detection of one with no false alarms to establish proper laterality in all mTLE patients. The proposed multinomial multivariate response-driven model provides a reliable lateralization of mesial temporal epileptogenicity including those patients who require phase II assessment. Copyright © 2014 Elsevier B.V. All rights reserved.
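To make the model-averaging step concrete, the sketch below combines the posterior probabilities of four univariate response models using the weighting coefficients reported in the abstract (0.28, 0.32, 0.09, 0.31). The class labels and the per-model probabilities for the example patient are hypothetical, and the weighted sum is only the generic form of Bayesian model averaging, not the authors' code.

```python
# Weights as reported in the abstract; the dictionary keys are illustrative.
BMA_WEIGHTS = {
    "hippocampal_volume": 0.28,
    "flair_mean": 0.32,
    "flair_sd": 0.09,
    "spect_mean": 0.31,
}


def bma_posterior(univariate_posteriors, weights=BMA_WEIGHTS):
    """Weighted average of per-model class probabilities."""
    classes = next(iter(univariate_posteriors.values())).keys()
    return {
        label: sum(
            weights[model] * probs[label]
            for model, probs in univariate_posteriors.items()
        )
        for label in classes
    }


# Hypothetical posterior probabilities for one patient.
patient = {
    "hippocampal_volume": {"left": 0.7, "right": 0.3},
    "flair_mean": {"left": 0.6, "right": 0.4},
    "flair_sd": {"left": 0.4, "right": 0.6},
    "spect_mean": {"left": 0.8, "right": 0.2},
}

combined = bma_posterior(patient)
predicted_side = max(combined, key=combined.get)
```

Since the weights sum to one and each univariate model yields a proper probability distribution, the combined values also sum to one.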
Multivariable normal-tissue complication modeling of acute esophageal toxicity in advanced stage non-small cell lung cancer patients treated with intensity-modulated (chemo-)radiotherapy.

PubMed

Wijsman, Robin; Dankers, Frank; Troost, Esther G C; Hoffmann, Aswin L; van der Heijden, Erik H F M; de Geus-Oei, Lioe-Fee; Bussink, Johan

2015-10-01

The majority of normal-tissue complication probability (NTCP) models for acute esophageal toxicity (AET) in advanced stage non-small cell lung cancer (AS-NSCLC) patients treated with (chemo-)radiotherapy are based on three-dimensional conformal radiotherapy (3D-CRT). Due to distinct dosimetric characteristics of intensity-modulated radiation therapy (IMRT), 3D-CRT based models need revision. We established a multivariable NTCP model for AET in 149 AS-NSCLC patients undergoing IMRT. An established model selection procedure was used to develop an NTCP model for Grade ⩾2 AET (53 patients) including clinical and esophageal dose-volume histogram parameters. The NTCP model predicted an increased risk of Grade ⩾2 AET in case of: concurrent chemoradiotherapy (CCR) [adjusted odds ratio (OR) 14.08, 95% confidence interval (CI) 4.70-42.19; p<0.001], increasing mean esophageal dose [Dmean; OR 1.12 per Gy increase, 95% CI 1.06-1.19; p<0.001], female patients (OR 3.33, 95% CI 1.36-8.17; p=0.008), and ⩾cT3 (OR 2.7, 95% CI 1.12-6.50; p=0.026). The AUC was 0.82 and the model showed good calibration. A multivariable NTCP model including CCR, Dmean, clinical tumor stage and gender predicts Grade ⩾2 AET after IMRT for AS-NSCLC. Prior to clinical introduction, the model needs validation in an independent patient cohort. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
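The reported odds ratios can be read as multiplicative effects on the odds of Grade ⩾2 AET in a logistic model. The sketch below shows that reading, assuming a standard logistic form. The intercept is not given in the abstract, so the value used here is purely hypothetical and the absolute probabilities are not meaningful; only the odds ratios between patients are.

```python
import math

# Adjusted odds ratios as reported in the abstract.
OR_CONCURRENT_CHEMO = 14.08
OR_PER_GY_MEAN_DOSE = 1.12
OR_FEMALE = 3.33
OR_CT3_OR_HIGHER = 2.7

# Hypothetical intercept: NOT reported in the abstract.
HYPOTHETICAL_INTERCEPT = -4.0


def ntcp(concurrent_chemo, mean_dose_gy, female, ct3_or_higher,
         intercept=HYPOTHETICAL_INTERCEPT):
    """Logistic probability built from log odds ratios."""
    logit = (
        intercept
        + concurrent_chemo * math.log(OR_CONCURRENT_CHEMO)
        + mean_dose_gy * math.log(OR_PER_GY_MEAN_DOSE)
        + female * math.log(OR_FEMALE)
        + ct3_or_higher * math.log(OR_CT3_OR_HIGHER)
    )
    return 1.0 / (1.0 + math.exp(-logit))


def odds(probability):
    return probability / (1.0 - probability)


# Two otherwise identical hypothetical patients, with and without CCR.
with_ccr = ntcp(1, 20.0, 0, 0)
without_ccr = ntcp(0, 20.0, 0, 0)
ccr_odds_ratio = odds(with_ccr) / odds(without_ccr)

# Effect of a 10 Gy higher mean esophageal dose: 1.12 ** 10, roughly 3.1.
dose_odds_ratio = odds(ntcp(0, 30.0, 0, 0)) / odds(ntcp(0, 20.0, 0, 0))
```

Whatever intercept is chosen, the ratio of odds between the two patients equals the reported odds ratio, which is why the abstract can report these effects without it.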
Association of educational status with cardiovascular disease: Teheran Lipid and Glucose Study.

PubMed

Hajsheikholeslami, Farhad; Hatami, Masumeh; Hadaegh, Farzad; Ghanbarian, Arash; Azizi, Fereidoun

2011-06-01

The aim of this study was to evaluate the associations between educational level and cardiovascular disease (CVD) in an older Iranian population. To estimate the odds ratio (OR) of educational level in a cross-sectional study, logistic regression analysis was used on 1,788 men and 2,204 women (222 men and 204 women positive based on their CVD status) aged ≥ 45 years. In men, educational levels of college degree and literacy level below diploma were inversely associated with CVD in the multivariate model [0.52 (0.28-0.94), 0.61 (0.40-0.92), respectively], but diploma level did not show any significant association with CVD, neither in the crude model nor in the multivariate model. In women, increase in educational level was inversely associated with risk of CVD in the crude model, but in the multivariate adjusted model, literacy level below diploma decreased risk of CVD by 39%, compared with illiteracy. Our findings support those of developed countries that, along with other CVD risk factors, educational status has an inverse association with CVD among a representative Iranian population of older men and women.
Multivariate longitudinal data analysis with censored and intermittent missing responses.

PubMed

Lin, Tsung-I; Lachos, Victor H; Wang, Wan-Lun

2018-05-08

The multivariate linear mixed model (MLMM) has emerged as an important analytical tool for longitudinal data with multiple outcomes. However, the analysis of multivariate longitudinal data could be complicated by the presence of censored measurements because of a detection limit of the assay in combination with unavoidable missing values arising when subjects miss some of their scheduled visits intermittently. This paper presents a generalization of the MLMM approach, called the MLMM-CM, for a joint analysis of the multivariate longitudinal data with censored and intermittent missing responses. A computationally feasible expectation maximization-based procedure is developed to carry out maximum likelihood estimation within the MLMM-CM framework. Moreover, the asymptotic standard errors of fixed effects are explicitly obtained via the information-based method. We illustrate our methodology by using simulated data and a case study from an AIDS clinical trial. Experimental results reveal that the proposed method is able to provide more satisfactory performance as compared with the traditional MLMM approach. Copyright © 2018 John Wiley & Sons, Ltd.
Measures of precision for dissimilarity-based multivariate analysis of ecological communities.

PubMed

Anderson, Marti J; Santana-Garcon, Julia

2015-01-01

Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. © 2014 The Authors. Ecology Letters published by John Wiley & Sons Ltd and CNRS.
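A minimal sketch of the single-group MultSE calculation follows. It assumes the commonly cited form, in which the pseudo variance is the sum of squared pairwise dissimilarities divided by n(n - 1) and MultSE is the square root of that variance over n. The formula is not quoted in the abstract, so treat it as an assumption. The function names and data are hypothetical, and the double resampling step is not shown.

```python
import math
import statistics
from itertools import combinations


def mult_se(samples, dissimilarity):
    """Pseudo multivariate standard error from pairwise dissimilarities."""
    n = len(samples)
    sum_of_squares = sum(
        dissimilarity(a, b) ** 2 for a, b in combinations(samples, 2)
    ) / n
    pseudo_variance = sum_of_squares / (n - 1)
    return math.sqrt(pseudo_variance / n)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical abundance data: rows are sampling units, columns are species.
community = [(10, 0, 3), (8, 1, 4), (0, 5, 2), (9, 0, 5), (2, 6, 1)]
se_bray = mult_se(community, bray_curtis)

# With one variable and Euclidean distance, the quantity reduces to the
# classical standard error of the mean.
values = [2.0, 4.0, 4.0, 7.0, 9.0]
se_euclid = mult_se([(v,) for v in values], math.dist)
classical_se = statistics.stdev(values) / math.sqrt(len(values))
```

The univariate check at the end is a quick sanity test that the dissimilarity-based quantity generalises the familiar standard error.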
Ground-Based Telescope Parametric Cost Model

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Rowell, Ginger Holmes

2004-01-01

A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis. The model includes both engineering and performance parameters. While diameter continues to be the dominant cost driver, other significant factors include primary mirror radius of curvature and diffraction limited wavelength. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived. This analysis indicates that recent mirror technology advances have indeed reduced the historical telescope cost curve.
A model-based approach to wildland fire reconstruction using sediment charcoal records

USGS Publications Warehouse

Itter, Malcolm S.; Finley, Andrew O.; Hooten, Mevin B.; Higuera, Philip E.; Marlon, Jennifer R.; Kelly, Ryan; McLachlan, Jason S.

2017-01-01

Lake sediment charcoal records are used in paleoecological analyses to reconstruct fire history, including the identification of past wildland fires. One challenge of applying sediment charcoal records to infer fire history is the separation of charcoal associated with local fire occurrence and charcoal originating from regional fire activity. Despite a variety of methods to identify local fires from sediment charcoal records, an integrated statistical framework for fire reconstruction is lacking. We develop a Bayesian point process model to estimate the probability of fire associated with charcoal counts from individual-lake sediments and estimate mean fire return intervals. A multivariate extension of the model combines records from multiple lakes to reduce uncertainty in local fire identification and estimate a regional mean fire return interval. The univariate and multivariate models are applied to 13 lakes in the Yukon Flats region of Alaska. Both models resulted in similar mean fire return intervals (100–350 years) with reduced uncertainty under the multivariate model due to improved estimation of regional charcoal deposition. The point process model offers an integrated statistical framework for paleofire reconstruction and extends existing methods to infer regional fire history from multiple lake records with uncertainty following directly from posterior distributions.
Multivariate analysis in thoracic research.

PubMed

Mengual-Macenlle, Noemí; Marcos, Pedro J; Golpe, Rafael; González-Rivas, Diego

2015-03-01

Multivariate analysis is based on the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Multivariate methods emerged to analyze large databases and increasingly complex data. Since modeling is the best way to represent knowledge of reality, we should use multivariate statistical methods. Multivariate methods are designed to analyze data sets simultaneously, i.e., to analyze different variables for each person or object studied. Keep in mind at all times that all variables must be treated so that they accurately reflect the reality of the problem addressed. There are different types of multivariate analysis, and each should be employed according to the type of variables to analyze: dependence, interdependence and structural methods. In conclusion, multivariate methods are ideal for analyzing large data sets and for finding cause-and-effect relationships between variables; a wide range of analysis types is available.
Family-Based Rare Variant Association Analysis: A Fast and Efficient Method of Multivariate Phenotype Association Analysis.

PubMed

Wang, Longfei; Lee, Sungyoung; Gim, Jungsoo; Qiao, Dandi; Cho, Michael; Elston, Robert C; Silverman, Edwin K; Won, Sungho

2016-09-01

Family-based designs have been repeatedly shown to be powerful in detecting the significant rare variants associated with human diseases. Furthermore, human diseases are often defined by the outcomes of multiple phenotypes, and thus we expect multivariate family-based analyses may be very efficient in detecting associations with rare variants. However, few statistical methods implementing this strategy have been developed for family-based designs. In this report, we describe one such implementation: the multivariate family-based rare variant association tool (mFARVAT). mFARVAT is a quasi-likelihood-based score test for rare variant association analysis with multiple phenotypes, and tests both homogeneous and heterogeneous effects of each variant on multiple phenotypes. Simulation results show that the proposed method is generally robust and efficient for various disease models, and we identify some promising candidate genes associated with chronic obstructive pulmonary disease. The software of mFARVAT is freely available at http://healthstat.snu.ac.kr/software/mfarvat/, implemented in C++ and supported on Linux and MS Windows. © 2016 WILEY PERIODICALS, INC.
Extensions to Multivariate Space Time Mixture Modeling of Small Area Cancer Data.

PubMed

Carroll, Rachel; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Aregay, Mehreteab; Watjou, Kevin

2017-05-09

Oral cavity and pharynx cancer, even when considered together, is a fairly rare disease. Implementation of multivariate modeling with lung and bronchus cancer, as well as melanoma cancer of the skin, could lead to better inference for oral cavity and pharynx cancer. The multivariate structure of these models is accomplished via the use of shared random effects, as well as other multivariate prior distributions. The results in this paper indicate that care should be taken when executing these types of models, and that multivariate mixture models may not always be the ideal option, depending on the data of interest.
Southeast Atlantic Cloud Properties in a Multivariate Statistical Model - How Relevant is Air Mass History for Local Cloud Properties?

NASA Astrophysics Data System (ADS)

Fuchs, Julia; Cermak, Jan; Andersen, Hendrik

2017-04-01

This study aims at untangling the impacts of external dynamics and local conditions on cloud properties in the Southeast Atlantic (SEA) by combining satellite and reanalysis data using multivariate statistics. The understanding of clouds and their determinants at different scales is important for constraining the Earth's radiative budget, and thus prominent in climate-system research. In this study, SEA stratocumulus cloud properties are observed not only as the result of local environmental conditions but also as affected by external dynamics and spatial origins of air masses entering the study area. In order to assess to what extent cloud properties are impacted by aerosol concentration, air mass history, and meteorology, a multivariate approach is conducted using satellite observations of aerosol and cloud properties (MODIS, SEVIRI), information on aerosol species composition (MACC) and meteorological context (ERA-Interim reanalysis). To account for the often-neglected but important role of air mass origin, information on air mass history based on HYSPLIT modeling is included in the statistical model. This multivariate approach is intended to lead to a better understanding of the physical processes behind observed stratocumulus cloud properties in the SEA.
Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

NASA Astrophysics Data System (ADS)

Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

2018-05-01

The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. It involves two distributions: the prior and the posterior. The posterior distribution is influenced by the choice of prior distribution. Jeffreys’ prior is a type of non-informative prior distribution, used when no information about the parameters is available. The non-informative Jeffreys’ prior is combined with the sample information to yield the posterior distribution, which is then used to estimate the parameters. The purpose of this research is to estimate the parameters of the multivariate regression model using the Bayesian method with a non-informative Jeffreys’ prior distribution. Based on the results and discussion, the estimates of β and Σ are obtained as the expected values of their marginal posterior distributions. The marginal posterior distributions for β and Σ are multivariate normal and inverse Wishart, respectively. However, calculating the expected values involves integrals that are difficult to evaluate. Therefore, an approximation is needed, obtained by generating random samples from the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.
BN-FLEMOps pluvial - A probabilistic multi-variable loss estimation model for pluvial floods

NASA Astrophysics Data System (ADS)

Roezer, V.; Kreibich, H.; Schroeter, K.; Doss-Gollin, J.; Lall, U.; Merz, B.

2017-12-01

Pluvial flood events, such as in Copenhagen (Denmark) in 2011, Beijing (China) in 2012 or Houston (USA) in 2016, have caused severe losses to urban dwellings in recent years. These floods are caused by storm events with high rainfall rates well above the design levels of urban drainage systems, which lead to inundation of streets and buildings. A projected increase in frequency and intensity of heavy rainfall events in many areas and an ongoing urbanization may increase pluvial flood losses in the future. For an efficient risk assessment and adaptation to pluvial floods, a quantification of the flood risk is needed. Few loss models have been developed particularly for pluvial floods. These models usually use simple waterlevel- or rainfall-loss functions and come with very high uncertainties. To account for these uncertainties and improve the loss estimation, we present a probabilistic multi-variable loss estimation model for pluvial floods based on empirical data. The model was developed in a two-step process using a machine learning approach and a comprehensive database comprising 783 records of direct building and content damage of private households. The data was gathered through surveys after four different pluvial flood events in Germany between 2005 and 2014. In a first step, linear and non-linear machine learning algorithms, such as tree-based and penalized regression models were used to identify the most important loss influencing factors among a set of 55 candidate variables. These variables comprise hydrological and hydraulic aspects, early warning, precaution, building characteristics and the socio-economic status of the household. In a second step, the most important loss influencing variables were used to derive a probabilistic multi-variable pluvial flood loss estimation model based on Bayesian Networks. Two different networks were tested: a score-based network learned from the data and a network based on expert knowledge. 
Loss predictions are made through Bayesian inference using Markov chain Monte Carlo (MCMC) sampling. With the ability to cope with incomplete information and use expert knowledge, as well as inherently providing quantitative uncertainty information, it is shown that loss models based on BNs are superior to deterministic approaches for pluvial flood risk assessment.

Estimating a graphical intra-class correlation coefficient (GICC) using multivariate probit-linear mixed models.

PubMed

Yue, Chen; Chen, Shaojie; Sair, Haris I; Airan, Raag; Caffo, Brian S

2015-09-01

Data reproducibility is a critical issue in all scientific experiments. In this manuscript, the problem of quantifying the reproducibility of graphical measurements is considered. The image intra-class correlation coefficient (I2C2) is generalized and the graphical intra-class correlation coefficient (GICC) is proposed for such purpose. The concept for GICC is based on multivariate probit-linear mixed effect models. A Markov Chain Monte Carlo EM (mcmcEM) algorithm is used for estimating the GICC. Simulation results with varied settings are demonstrated and our method is applied to the KIRBY21 test-retest dataset.
Firefly algorithm versus genetic algorithm as powerful variable selection tools and their effect on different multivariate calibration models in spectroscopy: A comparative study

NASA Astrophysics Data System (ADS)

Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed

2017-01-01

For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration.
Reparametrization-based estimation of genetic parameters in multi-trait animal model using Integrated Nested Laplace Approximation.

PubMed

Mathew, Boby; Holand, Anna Marie; Koistinen, Petri; Léon, Jens; Sillanpää, Mikko J

2016-02-01

A novel reparametrization-based INLA approach as a fast alternative to MCMC for the Bayesian estimation of genetic parameters in multivariate animal model is presented. Multi-trait genetic parameter estimation is a relevant topic in animal and plant breeding programs because multi-trait analysis can take into account the genetic correlation between different traits and that significantly improves the accuracy of the genetic parameter estimates. Generally, multi-trait analysis is computationally demanding and requires initial estimates of genetic and residual correlations among the traits, while those are difficult to obtain. In this study, we illustrate how to reparametrize covariance matrices of a multivariate animal model/animal models using modified Cholesky decompositions. This reparametrization-based approach is used in the Integrated Nested Laplace Approximation (INLA) methodology to estimate genetic parameters of multivariate animal model. Immediate benefits are: (1) to avoid difficulties of finding good starting values for analysis which can be a problem, for example in Restricted Maximum Likelihood (REML); (2) Bayesian estimation of (co)variance components using INLA is faster to execute than using Markov Chain Monte Carlo (MCMC) especially when realized relationship matrices are dense. The slight drawback is that priors for covariance matrices are assigned for elements of the Cholesky factor but not directly to the covariance matrix elements as in MCMC. Additionally, we illustrate the concordance of the INLA results with the traditional methods like MCMC and REML approaches. We also present results obtained from simulated data sets with replicates and field data in rice.
A climate-based multivariate extreme emulator of met-ocean-hydrological events for coastal flooding

NASA Astrophysics Data System (ADS)

Camus, Paula; Rueda, Ana; Mendez, Fernando J.; Tomas, Antonio; Del Jesus, Manuel; Losada, Iñigo J.

2015-04-01

Atmosphere-ocean general circulation models (AOGCMs) are useful to analyze large-scale climate variability (long-term historical periods, future climate projections). However, applications such as coastal flood modeling require climate information at finer scale. Besides, flooding events depend on multiple climate conditions: waves, surge levels from the open ocean and river discharge caused by precipitation. Therefore, a multivariate statistical downscaling approach is adopted, both to reproduce relationships between variables and because of its low computational cost. The proposed method can be considered a hybrid approach which combines a probabilistic weather type downscaling model with a stochastic weather generator component. Predictand distributions are reproduced by modeling the relationship with AOGCM predictors based on a physical division in weather types (Camus et al., 2012). The multivariate dependence structure of the predictand (extreme events) is introduced by linking the independent marginal distributions of the variables through a probabilistic copula regression (Ben Ayala et al., 2014). This hybrid approach is applied for the downscaling of AOGCM data to daily precipitation and maximum significant wave height and storm surge in different locations along the Spanish coast. Reanalysis data is used to assess the proposed method. A common predictor for the three variables involved is classified using a regression-guided clustering algorithm. The most appropriate statistical model (generalized extreme value distribution, Pareto distribution) for daily conditions is fitted. Stochastic simulation of the present climate is performed, obtaining the set of hydraulic boundary conditions needed for high resolution coastal flood modeling. References: Camus, P., Menéndez, M., Méndez, F.J., Izaguirre, C., Espejo, A., Cánovas, V., Pérez, J., Rueda, A., Losada, I.J., Medina, R. (2014b). A weather-type statistical downscaling framework for ocean wave climate. 
Journal of Geophysical Research, doi: 10.1002/2014JC010141. Ben Ayala, M.A., Chebana, F., Ouarda, T.B.M.J. (2014). Probabilistic Gaussian Copula Regression Model for Multisite and Multivariable Downscaling, Journal of Climate, 27, 3331-3347.
Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

PubMed

Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

2016-01-01

Reporting of surgical complications is common, but few reports provide information about the severity of complications or estimate their risk factors, and those that do often lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; the 11 significant variables that entered multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, $p = \exp(\sum B_i X_i) / (1 + \exp(\sum B_i X_i))$, a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: $p = 1/(1 + e^{4.810 - 1.287X_1 - 0.504X_2 - 0.500X_3 - 0.474X_4 - 0.405X_5 - 0.318X_6 - 0.316X_7 - 0.305X_8 - 0.278X_9 - 0.255X_{10} - 0.138X_{11}})$. The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
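The risk equation quoted in this abstract can be evaluated directly. The following is a minimal sketch, not the authors' implementation: the intercept and coefficients are copied from the abstract, but the function name, the ordering of X1..X11 and the 0/1 coding of each factor are assumptions, since the abstract does not say how the variables are coded.

```python
import math

# Values copied from the abstract's equation:
# p = 1 / (1 + e^(4.810 - 1.287*X1 - ... - 0.138*X11))
INTERCEPT = 4.810
COEFFS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
          0.316, 0.305, 0.278, 0.255, 0.138]


def complication_risk(x):
    """Return the predicted probability p for one patient.

    `x` holds the 11 risk-factor values X1..X11. Treating them as 0/1
    indicators is an assumption made for illustration only.
    """
    if len(x) != len(COEFFS):
        raise ValueError("expected 11 risk-factor values")
    exponent = INTERCEPT - sum(b * xi for b, xi in zip(COEFFS, x))
    return 1.0 / (1.0 + math.exp(exponent))


# Hypothetical patients: no risk factors present, and only X1 present.
baseline = complication_risk([0] * 11)
with_x1 = complication_risk([1] + [0] * 10)
```

Because every coefficient enters the exponent with a negative sign, setting any factor to 1 can only raise the predicted probability.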
Data-based virtual unmodeled dynamics driven multivariable nonlinear adaptive switching control.

PubMed

Chai, Tianyou; Zhang, Yajun; Wang, Hong; Su, Chun-Yi; Sun, Jing

2011-12-01

For a complex industrial system, its multivariable and nonlinear nature generally make it very difficult, if not impossible, to obtain an accurate model, especially when the model structure is unknown. The control of this class of complex systems is difficult to handle by the traditional controller designs around their operating points. This paper, however, explores the concepts of controller-driven model and virtual unmodeled dynamics to propose a new design framework. The design consists of two controllers with distinct functions. First, using input and output data, a self-tuning controller is constructed based on a linear controller-driven model. Then the output signals of the controller-driven model are compared with the true outputs of the system to produce so-called virtual unmodeled dynamics. Based on the compensator of the virtual unmodeled dynamics, the second controller based on a nonlinear controller-driven model is proposed. Those two controllers are integrated by an adaptive switching control algorithm to take advantage of their complementary features: one offers stabilization function and another provides improved performance. The conditions on the stability and convergence of the closed-loop system are analyzed. Both simulation and experimental tests on a heavily coupled nonlinear twin-tank system are carried out to confirm the effectiveness of the proposed method.
Climate Model Diagnostic Analyzer

NASA Technical Reports Server (NTRS)

Lee, Seungwon; Pan, Lei; Zhai, Chengxing; Tang, Benyang; Kubar, Terry; Zhang, Zia; Wang, Wei

2015-01-01

The comprehensive and innovative evaluation of climate models with newly available global observations is critically needed for the improvement of climate model current-state representation and future-state predictability. A climate model diagnostic evaluation process requires physics-based multi-variable analyses that typically involve large-volume and heterogeneous datasets, making them both computation- and data-intensive. With an exploratory nature of climate data analyses and an explosive growth of datasets and service tools, scientists are struggling to keep track of their datasets, tools, and execution/study history, let alone sharing them with others. In response, we have developed a cloud-enabled, provenance-supported, web-service system called Climate Model Diagnostic Analyzer (CMDA). CMDA enables the physics-based, multivariable model performance evaluations and diagnoses through the comprehensive and synergistic use of multiple observational data, reanalysis data, and model outputs. At the same time, CMDA provides a crowd-sourcing space where scientists can organize their work efficiently and share their work with others. CMDA is empowered by many current state-of-the-art software packages in web service, provenance, and semantic search.
A randomised approach for NARX model identification based on a multivariate Bernoulli distribution

NASA Astrophysics Data System (ADS)

Bianchi, F.; Falsone, A.; Prandini, M.; Piroddi, L.

2017-04-01

The identification of polynomial NARX models is typically performed by incremental model building techniques. These methods assess the importance of each regressor based on the evaluation of partial individual models, which may ultimately lead to erroneous model selections. A more robust assessment of the significance of a specific model term can be obtained by considering ensembles of models, as done by the RaMSS algorithm. In that context, the identification task is formulated in a probabilistic fashion and a Bernoulli distribution is employed to represent the probability that a regressor belongs to the target model. Then, samples of the model distribution are collected to gather reliable information to update it, until convergence to a specific model. The basic RaMSS algorithm employs multiple independent univariate Bernoulli distributions associated to the different candidate model terms, thus overlooking the correlations between different terms, which are typically important in the selection process. Here, a multivariate Bernoulli distribution is employed, in which the sampling of a given term is conditioned by the sampling of the others. The added complexity inherent in considering the regressor correlation properties is more than compensated by the achievable improvements in terms of accuracy of the model selection process.
Multivariate Meta-Analysis of Preference-Based Quality of Life Values in Coronary Heart Disease.

PubMed

Stevanović, Jelena; Pechlivanoglou, Petros; Kampinga, Marthe A; Krabbe, Paul F M; Postma, Maarten J

2016-01-01

There are numerous health-related quality of life (HRQoL) measurements used in coronary heart disease (CHD) in the literature. However, only values assessed with preference-based instruments can be directly applied in a cost-utility analysis (CUA). To summarize and synthesize instrument-specific preference-based values in CHD and the underlying disease-subgroups, stable angina and post-acute coronary syndrome (post-ACS), for developed countries, while accounting for study-level characteristics, and within- and between-study correlation. A systematic review was conducted to identify studies reporting preference-based values in CHD. A multivariate meta-analysis was applied to synthesize the HRQoL values. Meta-regression analyses examined the effect of study level covariates age, publication year, prevalence of diabetes and gender. A total of 40 studies providing preference-based values were detected. Synthesized estimates of HRQoL in post-ACS ranged from 0.64 (Quality of Well-Being) to 0.92 (EuroQol European "tariff"), while in stable angina they ranged from 0.64 (Short form 6D) to 0.89 (Standard Gamble). Similar findings were observed in estimates applying to general CHD. No significant improvement in model fit was found after adjusting for study-level covariates. Large between-study heterogeneity was observed in all the models investigated. The main finding of our study is the presence of large heterogeneity both within and between instrument-specific HRQoL values. Current economic models in CHD ignore this between-study heterogeneity. Multivariate meta-analysis can quantify this heterogeneity and offers the means for uncertainty around HRQoL values to be translated to uncertainty in CUAs.
Wind Turbine Load Mitigation based on Multivariable Robust Control and Blade Root Sensors

NASA Astrophysics Data System (ADS)

Díaz de Corcuera, A.; Pujana-Arrese, A.; Ezquerra, J. M.; Segurola, E.; Landaluze, J.

2014-12-01

This paper presents two H∞ multivariable robust controllers based on blade root sensors' information for individual pitch angle control. The 5 MW wind turbine defined in the Upwind European project is the reference non-linear model used in this research work, which has been modelled in the GH Bladed 4.0 software package. The main objective of these controllers is load mitigation in different components of wind turbines during power production in the above rated control zone. The first proposed multi-input multi-output (MIMO) individual pitch H∞ controller mitigates the wind effect on the tower side-to-side acceleration and reduces the asymmetrical loads which appear in the rotor due to its misalignment. The second individual pitch H∞ multivariable controller mitigates the loads on the three blades, reducing the wind effect on the flapwise and edgewise bending moments in the blades. The designed H∞ controllers have been validated in GH Bladed and an exhaustive analysis has been carried out to calculate fatigue load reduction on wind turbine components, as well as to analyze load mitigation in some extreme cases.
Uni- and multi-variable modelling of flood losses: experiences gained from the Secchia river inundation event.

NASA Astrophysics Data System (ADS)

Carisi, Francesca; Domeneghetti, Alessio; Kreibich, Heidi; Schröter, Kai; Castellarin, Attilio

2017-04-01

Flood risk is a function of flood hazard and vulnerability; its accurate assessment therefore depends on a reliable quantification of both factors. The scientific literature proposes a number of objective and reliable methods for assessing flood hazard, yet it highlights a limited understanding of the fundamental damage processes. Loss modelling is associated with large uncertainty which is, among other factors, due to a lack of standard procedures; for instance, flood losses are often estimated based on damage models derived in completely different contexts (i.e. different countries or geographical regions) without checking their applicability, or by considering only one explanatory variable (typically water depth). We consider the Secchia river flood event of January 2014, when a sudden levee breach caused the inundation of nearly 200 km² in Northern Italy. In the aftermath of this event, local authorities collected flood loss data, together with additional information on affected private households and industrial activities (e.g. building surface area and economic value, number of company employees and others). Based on these data we implemented and compared a quadratic-regression damage function, with water depth as the only explanatory variable, and a multi-variable model that combines multiple regression trees and considers several explanatory variables (i.e. bagging decision trees). Our results show the importance of data collection, revealing that (1) a simple quadratic regression damage function based on empirical data from the study area can be significantly more accurate than literature damage models derived for a different context and (2) multi-variable modelling may outperform the uni-variable approach, yet it is more difficult to develop and apply due to a much higher demand for detailed data.
Non-parametric identification of multivariable systems: A local rational modeling approach with application to a vibration isolation benchmark

NASA Astrophysics Data System (ADS)

Voorhoeve, Robbert; van der Maas, Annemiek; Oomen, Tom

2018-05-01

Frequency response function (FRF) identification is often used as a basis for control systems design and as a starting point for subsequent parametric system identification. The aim of this paper is to develop a multiple-input multiple-output (MIMO) local parametric modeling approach for FRF identification of lightly damped mechanical systems with improved speed and accuracy. The proposed method is based on local rational models, which can efficiently handle the lightly-damped resonant dynamics. A key aspect herein is the freedom in the multivariable rational model parametrizations. Several choices for such multivariable rational model parametrizations are proposed and investigated. For systems with many inputs and outputs the required number of model parameters can rapidly increase, adversely affecting the performance of the local modeling approach. Therefore, low-order model structures are investigated. The structure of these low-order parametrizations leads to an undesired directionality in the identification problem. To address this, an iterative local rational modeling algorithm is proposed. As a special case recently developed SISO algorithms are recovered. The proposed approach is successfully demonstrated on simulations and on an active vibration isolation system benchmark, confirming good performance of the method using significantly less parameters compared with alternative approaches.
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.

PubMed

Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P

2015-11-01

To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
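The "3.7 times" figure in this abstract is consistent with the two reported accuracies, as a short check shows. This sketch assumes the comparison is simply the ratio of the two misclassification rates (one minus accuracy); the function name is hypothetical.

```python
def misclassification_ratio(reference_accuracy, best_accuracy):
    """Ratio of error rates, where error rate = 1 - accuracy."""
    return (1.0 - reference_accuracy) / (1.0 - best_accuracy)


# Accuracies reported in the abstract: 78% (reference) and 94% (best model).
ratio = misclassification_ratio(0.78, 0.94)
```

Under that assumption the error rates are 22% and 6%, and their ratio rounds to 3.7.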
Geometric Model of Induction Heating Process of Iron-Based Sintered Materials

NASA Astrophysics Data System (ADS)

Semagina, Yu V.; Egorova, M. A.

2018-03-01

The article studies the issue of building multivariable dependences based on the experimental data. A constructive method for solving the issue is presented in the form of equations of (n-1) – surface compartments of the extended Euclidean space E+n. The dimension of space is taken to be equal to the sum of the number of parameters and factors of the model of the system being studied. The basis for building multivariable dependencies is the generalized approach to n-space used for the surface compartments of 3D space. The surface is designed on the basis of the kinematic method, moving one geometric object along a certain trajectory. The proposed approach simplifies the process aimed at building the multifactorial empirical dependencies which describe the process being investigated.
Multivariate Prediction Equations for HbA1c Lowering, Weight Change, and Hypoglycemic Events Associated with Insulin Rescue Medication in Type 2 Diabetes Mellitus: Informing Economic Modeling.

PubMed

Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt

2017-03-01

Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. To estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics and treatment effects and adverse event rates were extracted and the data used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R²) for both outcomes was generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. 
All rights reserved.
A trust region approach with multivariate Padé model for optimal circuit design

NASA Astrophysics Data System (ADS)

Abdel-Malek, Hany L.; Ebid, Shaimaa E. K.; Mohamed, Ahmed S. A.

2017-11-01

Since the optimization process requires a significant number of consecutive function evaluations, it is recommended to replace the function by an easily evaluated approximation model during the optimization process. The model suggested in this article is based on a multivariate Padé approximation. This model is constructed using data points of ?, where ? is the number of parameters. The model is updated over a sequence of trust regions. This model avoids the slow convergence of linear models of ? and has features of quadratic models that need interpolation data points of ?. The proposed approach is tested by applying it to several benchmark problems. Yield optimization using such a direct method is applied to some practical circuit examples. Minimax solution leads to a suitable initial point to carry out the yield optimization process. The yield is optimized by the proposed derivative-free method for active and passive filter examples.
Community-Based Addiction Treatment Staff Attitudes about the Usefulness of Evidence-Based Addiction Treatment and CBO Organizational Linkages to Research Institutions

ERIC Educational Resources Information Center

Lundgren, Lena; Krull, Ivy; Zerden, Lisa de Saxe; McCarty, Dennis

2011-01-01

This national study of community-based addiction-treatment organizations' (CBOs) implementation of evidence-based practices explored CBO Program Directors' (n = 296) and clinical staff (n = 518) attitudes about the usefulness of science-based addiction treatment. Through multivariable regression modeling, the study identified that identical…
Functional MRI and Multivariate Autoregressive Models

PubMed Central

Rogers, Baxter P.; Katwal, Santosh B.; Morgan, Victoria L.; Asplund, Christopher L.; Gore, John C.

2010-01-01

Connectivity refers to the relationships that exist between different regions of the brain. In the context of functional magnetic resonance imaging (fMRI), it implies a quantifiable relationship between hemodynamic signals from different regions. One aspect of this relationship is the existence of small timing differences in the signals in different regions. Delays of 100 ms or less may be measured with fMRI, and these may reflect important aspects of the manner in which brain circuits respond as well as the overall functional organization of the brain. The multivariate autoregressive time series model has features to recommend it for measuring these delays, and is straightforward to apply to hemodynamic data. In this review, we describe the current usage of the multivariate autoregressive model for fMRI, discuss the issues that arise when it is applied to hemodynamic time series, and consider several extensions. Connectivity measures like Granger causality that are based on the autoregressive model do not always reflect true neuronal connectivity; however, we conclude that careful experimental design could make this methodology quite useful in extending the information obtainable using fMRI. PMID:20444566
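To make the multivariate autoregressive idea concrete, here is a minimal sketch of a first-order, two-signal model x_t = A x_{t-1} + e_t fitted by ordinary least squares. It is a toy illustration on synthetic Gaussian data, not hemodynamic data and not the procedure of the review; the function names, the coefficient matrix and the noise settings are all assumptions.

```python
import random


def simulate_var1(a, n_steps, noise_sd=1.0, seed=0):
    """Simulate x_t = A x_{t-1} + e_t for two signals with Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    series = [x]
    for _ in range(n_steps):
        e = [rng.gauss(0.0, noise_sd), rng.gauss(0.0, noise_sd)]
        x = [a[0][0] * x[0] + a[0][1] * x[1] + e[0],
             a[1][0] * x[0] + a[1][1] * x[1] + e[1]]
        series.append(x)
    return series


def fit_var1(series):
    """Least-squares estimate A_hat = S10 * inverse(S00).

    S10 sums x_t x_{t-1}^T and S00 sums x_{t-1} x_{t-1}^T over time.
    """
    s10 = [[0.0, 0.0], [0.0, 0.0]]
    s00 = [[0.0, 0.0], [0.0, 0.0]]
    for prev, cur in zip(series, series[1:]):
        for i in range(2):
            for j in range(2):
                s10[i][j] += cur[i] * prev[j]
                s00[i][j] += prev[i] * prev[j]
    det = s00[0][0] * s00[1][1] - s00[0][1] * s00[1][0]
    inv = [[s00[1][1] / det, -s00[0][1] / det],
           [-s00[1][0] / det, s00[0][0] / det]]
    return [[sum(s10[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Signal 0 drives signal 1 with a one-step lag (a[1][0] = 0.4),
# while signal 1 has no influence on signal 0 (a[0][1] = 0.0).
true_a = [[0.5, 0.0], [0.4, 0.5]]
a_hat = fit_var1(simulate_var1(true_a, n_steps=5000))
```

In this toy setting, an estimated a_hat[1][0] clearly different from zero alongside an a_hat[0][1] near zero is the kind of asymmetry that Granger-style measures build on; as the abstract cautions, such asymmetry in real fMRI data need not reflect true neuronal connectivity.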
Combined Prediction Model of Death Toll for Road Traffic Accidents Based on Independent and Dependent Variables

PubMed Central

Zhong-xiang, Feng; Shi-sheng, Lu; Wei-hua, Zhang; Nan-nan, Zhang

2014-01-01

In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability. PMID:25610454
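The abstract does not say how the Shapley value is turned into weight coefficients, so the following is only a sketch of one common scheme from the combination-forecasting literature, stated as an assumption: the "value" of a set of models is the mean of their errors, each model's Shapley value is its share of the overall error, and a model's weight shrinks as that share grows. The error figures are made up for illustration.

```python
from itertools import permutations


def shapley_error_shares(errors):
    """Shapley share of the total error for each model.

    Assumed characteristic function: the value of a coalition is the mean
    error of its members (0 for the empty coalition).
    """
    n = len(errors)

    def value(members):
        return sum(errors[i] for i in members) / len(members) if members else 0.0

    shares = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        joined = []
        for player in order:
            before = value(joined)
            joined.append(player)
            shares[player] += value(joined) - before
    return [s / len(orders) for s in shares]


def combination_weights(errors):
    """Assumed weighting rule: w_i = (E - share_i) / ((n - 1) * E)."""
    n = len(errors)
    total = sum(errors) / n
    shares = shapley_error_shares(errors)
    return [(total - s) / ((n - 1) * total) for s in shares]


# Hypothetical mean relative errors for two models (not from the paper).
weights = combination_weights([0.04, 0.08])
```

With these made-up errors the more accurate model receives the larger weight and the weights sum to one; a combined forecast would then be the weighted sum of the individual forecasts.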
Combined prediction model of death toll for road traffic accidents based on independent and dependent variables.

PubMed

Feng, Zhong-xiang; Lu, Shi-sheng; Zhang, Wei-hua; Zhang, Nan-nan

2014-01-01

In order to build a combined model which can meet the variation rule of death toll data for road traffic accidents and can reflect the influence of multiple factors on traffic accidents and improve prediction accuracy for accidents, the Verhulst model was built based on the number of death tolls for road traffic accidents in China from 2002 to 2011; and car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors to build the death toll multivariate linear regression model. Then the two models were combined to be a combined prediction model which has weight coefficient. Shapley value method was applied to calculate the weight coefficient by assessing contributions. Finally, the combined model was used to recalculate the number of death tolls from 2002 to 2011, and the combined model was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data characteristics but also quantify the degree of influence to the death toll by each influencing factor and had high accuracy as well as strong practicability.

Modeling the Drift Towards Sex Role Deviance.

ERIC Educational Resources Information Center

James, Jennifer; Vitaliano, Peter Paul

The interrelationships of deviant life experiences and current status, i.e., prostitution versus non-prostitution, were investigated by the application of multivariate analyses. Variables were studied involving early home life, pregnancy history, sexual history, and criminal involvement. Based on the analyses, three models were developed that…
A Bayesian Multi-Level Factor Analytic Model of Consumer Price Sensitivities across Categories

ERIC Educational Resources Information Center

Duvvuri, Sri Devi; Gruca, Thomas S.

2010-01-01

Identifying price sensitive consumers is an important problem in marketing. We develop a Bayesian multi-level factor analytic model of the covariation among household-level price sensitivities across product categories that are substitutes. Based on a multivariate probit model of category incidence, this framework also allows the researcher to…
Quantitative monitoring of sucrose, reducing sugar and total sugar dynamics for phenotyping of water-deficit stress tolerance in rice through spectroscopy and chemometrics

NASA Astrophysics Data System (ADS)

Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini

2018-03-01

In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in near infrared (NIR) range viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI) sensitive to sucrose, reducing sugar and total sugar content were identified which were subsequently calibrated and validated. The RSI and NDSI models had R² values of 0.65, 0.71 and 0.67; RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively for validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be MARS, ANN and MARS, respectively with respect to RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress as this technique is fast, economic, and noninvasive.
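The abstract above names the indices and the RPD statistic without giving formulas. The sketch below uses the conventional definitions, which are assumed here to match the study: RSI as a ratio of reflectances at two wavelengths, NDSI as their normalised difference, and RPD as the standard deviation of the reference values divided by the prediction RMSE. The function names and sample numbers are illustrative only.

```python
import math
import statistics

def rsi(r_i, r_j):
    """Ratio spectral index of reflectances at two wavelengths."""
    return r_i / r_j

def ndsi(r_i, r_j):
    """Normalised difference spectral index of the same two reflectances."""
    return (r_i - r_j) / (r_i + r_j)

def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations / RMSE."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(observed))
    return statistics.stdev(observed) / rmse

# Hypothetical reflectances and a hypothetical validation set
example_rsi = rsi(0.6, 0.2)
example_ndsi = ndsi(0.6, 0.2)
example_rpd = rpd([1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 1.5, 3.5, 3.5, 5.5])
```

A larger RPD means the prediction error is small relative to the natural spread of the measured values, which is why the abstract ranks models by it.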
Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

NASA Astrophysics Data System (ADS)

Pradhan, Biswajeet

2010-05-01

This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases in which the logistic regression coefficients were applied in the same study area, Selangor based on the Selangor coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases in which coefficients were cross-applied to the other two areas, Selangor based on the Cameron coefficients showed the highest prediction accuracy (90%), whereas Penang based on the Selangor coefficients showed the lowest (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

NASA Technical Reports Server (NTRS)

McKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

2005-01-01

Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
Copula-based analysis of rhythm

NASA Astrophysics Data System (ADS)

García, J. E.; González-López, V. A.; Viola, M. L. Lanfredi

2016-06-01

In this paper we establish stochastic profiles of the rhythm for three languages: English, Japanese and Spanish. We model the increase or decrease of the acoustical energy, collected into three bands coming from the acoustic signal. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the three marginal processes, one for each band of energy, and the partition coming from the multivariate Markov chain. Then, all the partitions are linked using a copula, in order to estimate the transition probabilities.
An effective drift correction for dynamical downscaling of decadal global climate predictions

NASA Astrophysics Data System (ADS)

Paeth, Heiko; Li, Jingmin; Pollinger, Felix; Müller, Wolfgang A.; Pohlmann, Holger; Feldmann, Hendrik; Panitz, Hans-Jürgen

2018-04-01

Initialized decadal climate predictions with coupled climate models are often marked by substantial climate drifts that emanate from a mismatch between the climatology of the coupled model system and the data set used for initialization. While such drifts may be easily removed from the prediction system when analyzing individual variables, a major problem prevails for multivariate issues and, especially, when the output of the global prediction system shall be used for dynamical downscaling. In this study, we present a statistical approach to remove climate drifts in a multivariate context and demonstrate the effect of this drift correction on regional climate model simulations over the Euro-Atlantic sector. The statistical approach is based on an empirical orthogonal function (EOF) analysis adapted to a very large data matrix. The climate drift emerges as a dramatic cooling trend in North Atlantic sea surface temperatures (SSTs) and is captured by the leading EOF of the multivariate output from the global prediction system, accounting for 7.7% of total variability. The SST cooling pattern also imposes drifts in various atmospheric variables and levels. The removal of the first EOF effectuates the drift correction while retaining other components of intra-annual, inter-annual and decadal variability. In the regional climate model, the multivariate drift correction of the input data removes the cooling trends in most western European land regions and systematically reduces the discrepancy between the output of the regional climate model and observational data. In contrast, removing the drift only in the SST field from the global model has hardly any positive effect on the regional climate model.
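The drift correction described above amounts to subtracting the leading EOF mode from a multivariate data matrix. A minimal sketch of that single step, using a singular value decomposition, follows. The function name `remove_leading_eof`, the toy data, and the choice to treat rows as time steps and columns as variables are assumptions for illustration; the study's adaptation to a very large data matrix is not reproduced here.

```python
import numpy as np

def remove_leading_eof(data):
    """Subtract the leading EOF mode from a (n_times, n_features) array.

    Returns the corrected data and the fraction of anomaly variance
    that the removed mode explained.
    """
    anomalies = data - data.mean(axis=0)             # remove each column's time mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    leading_mode = s[0] * np.outer(u[:, 0], vt[0])   # rank-1 reconstruction of EOF 1
    explained = s[0] ** 2 / np.sum(s ** 2)
    return data - leading_mode, explained

# Toy data (hypothetical): one shared linear drift plus small noise in 3 variables.
rng = np.random.default_rng(0)
time = np.linspace(0.0, 1.0, 50)
pattern = np.array([1.0, -2.0, 0.5])
data = 10.0 * np.outer(time, pattern) + rng.normal(0.0, 0.1, size=(50, 3))

corrected, explained = remove_leading_eof(data)
```

Because the removed mode is built from anomalies, the time mean of every variable is unchanged and only the shared drift pattern is taken out. In the toy data the drift dominates the variance, whereas in the study the leading EOF accounted for 7.7% of total variability.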
A multivariate time series approach to modeling and forecasting demand in the emergency department.

PubMed

Jones, Spencer S; Evans, R Scott; Allen, Todd L; Thomas, Alun; Haug, Peter J; Welch, Shari J; Snow, Gregory L

2009-02-01

The goals of this investigation were to study the temporal relationships between the demands for key resources in the emergency department (ED) and the inpatient hospital, and to develop multivariate forecasting models. Hourly data were collected from three diverse hospitals for the year 2006. Descriptive analysis and model fitting were carried out using graphical and multivariate time series methods. Multivariate models were compared to a univariate benchmark model in terms of their ability to provide out-of-sample forecasts of ED census and the demands for diagnostic resources. Descriptive analyses revealed little temporal interaction between the demand for inpatient resources and the demand for ED resources at the facilities considered. Multivariate models provided more accurate forecasts of ED census and of the demands for diagnostic resources. Our results suggest that multivariate time series models can be used to reliably forecast ED patient census; however, forecasts of the demands for diagnostic resources were not sufficiently reliable to be useful in the clinical setting.
Firefly algorithm versus genetic algorithm as powerful variable selection tools and their effect on different multivariate calibration models in spectroscopy: A comparative study.

PubMed

Attia, Khalid A M; Nassar, Mohammed W I; El-Zeiny, Mohamed B; Serag, Ahmed

2017-01-05

For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration. Copyright © 2016 Elsevier B.V. All rights reserved.
Applying Multivariate Discrete Distributions to Genetically Informative Count Data.

PubMed

Kirkpatrick, Robert M; Neale, Michael C

2016-03-01

We present a novel method of conducting biometric analysis of twin data when the phenotypes are integer-valued counts, which often show an L-shaped distribution. Monte Carlo simulation is used to compare five likelihood-based approaches to modeling: our multivariate discrete method, when its distributional assumptions are correct, when they are incorrect, and three other methods in common use. With data simulated from a skewed discrete distribution, recovery of twin correlations and proportions of additive genetic and common environment variance was generally poor for the Normal, Lognormal and Ordinal models, but good for the two discrete models. Sex-separate applications to substance-use data from twins in the Minnesota Twin Family Study showed superior performance of two discrete models. The new methods are implemented using R and OpenMx and are freely available.
Supporting inquiry learning by promoting normative understanding of multivariable causality

NASA Astrophysics Data System (ADS)

Keselman, Alla

2003-11-01

Early adolescents may lack the cognitive and metacognitive skills necessary for effective inquiry learning. In particular, they are likely to have a nonnormative mental model of multivariable causality in which effects of individual variables are neither additive nor consistent. Described here is a software-based intervention designed to facilitate students' metalevel and performance-level inquiry skills by enhancing their understanding of multivariable causality. Relative to an exploration-only group, sixth graders who practiced predicting an outcome (earthquake risk) based on multiple factors demonstrated increased attention to evidence, improved metalevel appreciation of effective strategies, and a trend toward consistent use of a controlled comparison strategy. Sixth graders who also received explicit instruction in making predictions based on multiple factors showed additional improvement in their ability to compare multiple instances as a basis for inferences and constructed the most accurate knowledge of the system. Gains were maintained in transfer tasks. The cognitive skills and metalevel understanding examined here are essential to inquiry learning.
An alternative derivation of the stationary distribution of the multivariate neutral Wright-Fisher model for low mutation rates with a view to mutation rate estimation from site frequency data.

PubMed

Schrempf, Dominik; Hobolth, Asger

2017-04-01

Recently, Burden and Tang (2016) provided an analytical expression for the stationary distribution of the multivariate neutral Wright-Fisher model with low mutation rates. In this paper we present a simple, alternative derivation that illustrates the approximation. Our proof is based on the discrete multivariate boundary mutation model which has three key ingredients. First, the decoupled Moran model is used to describe genetic drift. Second, low mutation rates are assumed by limiting mutations to monomorphic states. Third, the mutation rate matrix is separated into a time-reversible part and a flux part, as suggested by Burden and Tang (2016). An application of our result to data from several great apes reveals that the assumption of stationarity may be inadequate or that other evolutionary forces like selection or biased gene conversion are acting. Furthermore we find that the model with a reversible mutation rate matrix provides a reasonably good fit to the data compared to the one with a non-reversible mutation rate matrix. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
Multivariable Time Series Prediction for the Icing Process on Overhead Power Transmission Line

PubMed Central

Li, Peng; Zhao, Na; Zhou, Donghua; Cao, Min; Li, Jingjie; Shi, Xinling

2014-01-01

The design of monitoring and predictive alarm systems is necessary for dealing successfully with overhead power transmission line icing. Given the characteristics of complexity, nonlinearity, and fitfulness in the line icing process, a model based on a multivariable time series is presented here to predict the icing load of a transmission line. In this model, the time effects of micrometeorology parameters for the icing process have been analyzed. The phase-space reconstruction theory and machine learning method were then applied to establish the prediction model, which fully utilized the history of multivariable time series data in local monitoring systems to represent the mapping relationship between icing load and micrometeorology factors. Relevant to the characteristic of fitfulness in line icing, the simulations were carried out within the same icing process or across different processes to test the model's prediction precision and robustness. According to the simulation results for the Tao-Luo-Xiong Transmission Line, this model demonstrates good prediction accuracy across different processes if the prediction length is less than two hours, and would be helpful for power grid departments when deciding to take action in advance to address potential icing disasters. PMID:25136653
A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

PubMed

Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

2016-04-01

Existing evidence suggests that ambient ultrafine particles (UFPs) (<0.1 µm) may contribute to acute cardiorespiratory morbidity. However, few studies have examined the long-term health effects of these pollutants owing in part to a need for exposure surfaces that can be applied in large population-based studies. To address this need, we developed a land use regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R² = 0.58 vs. 0.55) or a cross-validation procedure (R² = 0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
Multivariate Bias Correction Procedures for Improving Water Quality Predictions from the SWAT Model

NASA Astrophysics Data System (ADS)

Arumugam, S.; Libera, D.

2017-12-01

Water quality observations are usually not available on a continuous basis for longer than 1-2 years at a time over a decadal period given the labor requirements, which makes calibrating and validating mechanistic models difficult. Further, any physical model predictions inherently have bias (i.e., under/over estimation) and require post-simulation techniques to preserve the long-term mean monthly attributes. This study suggests a multivariate bias-correction technique and compares it to a common technique in improving the performance of the SWAT model in predicting daily streamflow and TN loads across the southeast based on split-sample validation. The approach is a dimension reduction technique, canonical correlation analysis (CCA), that regresses the observed multivariate attributes with the SWAT model simulated values. The common approach is a regression-based technique that uses an ordinary least squares regression to adjust model values. The observed cross-correlation between loadings and streamflow is better preserved when using canonical correlation while simultaneously reducing individual biases. Additionally, canonical correlation analysis does a better job in preserving the observed joint likelihood of observed streamflow and loadings. These procedures were applied to 3 watersheds chosen from the Water Quality Network in the Southeast Region; specifically, watersheds with sufficiently large drainage areas and number of observed data points. The performance of these two approaches is compared for the observed period and over a multi-decadal period using loading estimates from the USGS LOADEST model. Lastly, the CCA technique is applied in a forecasting sense by using 1-month ahead forecasts of P & T from ECHAM4.5 as forcings in the SWAT model. Skill in using the SWAT model for forecasting loadings and streamflow at the monthly and seasonal timescale is also discussed.
Model based multivariable controller for large scale compression stations. Design and experimental validation on the LHC 18KW cryorefrigerator

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bonne, François; Bonnay, Patrick; Alamir, Mazen

2014-01-29

In this paper, a multivariable model-based non-linear controller for Warm Compression Stations (WCS) is proposed. The strategy is to replace all the PID loops controlling the WCS with an optimally designed model-based multivariable loop. This new strategy leads to high stability and fast disturbance rejection such as those induced by a turbine or a compressor stop, a key-aspect in the case of large scale cryogenic refrigeration. The proposed control scheme can be used to have precise control of every pressure in normal operation or to stabilize and control the cryoplant under high variation of thermal loads (such as a pulsed heat load expected to take place in future fusion reactors such as those expected in the cryogenic cooling systems of the International Thermonuclear Experimental Reactor ITER or the Japan Torus-60 Super Advanced fusion experiment JT-60SA). The paper details how to set the WCS model up to synthesize the Linear Quadratic Optimal feedback gain and how to use it. After preliminary tuning at CEA-Grenoble on the 400W@1.8K helium test facility, the controller has been implemented on a Schneider PLC and fully tested first on the CERN's real-time simulator. Then, it was experimentally validated on a real CERN cryoplant. The efficiency of the solution is experimentally assessed using a reasonable operating scenario of start and stop of compressors and cryogenic turbines. This work is partially supported through the European Fusion Development Agreement (EFDA) Goal Oriented Training Program, task agreement WP10-GOT-GIRO.
Model based multivariable controller for large scale compression stations. Design and experimental validation on the LHC 18KW cryorefrigerator

NASA Astrophysics Data System (ADS)

Bonne, François; Alamir, Mazen; Bonnay, Patrick; Bradu, Benjamin

2014-01-01

In this paper, a multivariable model-based non-linear controller for Warm Compression Stations (WCS) is proposed. The strategy is to replace all the PID loops controlling the WCS with an optimally designed model-based multivariable loop. This new strategy leads to high stability and fast disturbance rejection such as those induced by a turbine or a compressor stop, a key-aspect in the case of large scale cryogenic refrigeration. The proposed control scheme can be used to have precise control of every pressure in normal operation or to stabilize and control the cryoplant under high variation of thermal loads (such as a pulsed heat load expected to take place in future fusion reactors such as those expected in the cryogenic cooling systems of the International Thermonuclear Experimental Reactor ITER or the Japan Torus-60 Super Advanced fusion experiment JT-60SA). The paper details how to set the WCS model up to synthesize the Linear Quadratic Optimal feedback gain and how to use it. After preliminary tuning at CEA-Grenoble on the 400W@1.8K helium test facility, the controller has been implemented on a Schneider PLC and fully tested first on the CERN's real-time simulator. Then, it was experimentally validated on a real CERN cryoplant. The efficiency of the solution is experimentally assessed using a reasonable operating scenario of start and stop of compressors and cryogenic turbines. This work is partially supported through the European Fusion Development Agreement (EFDA) Goal Oriented Training Program, task agreement WP10-GOT-GIRO.
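For readers unfamiliar with how a Linear Quadratic optimal feedback gain is synthesized, the sketch below iterates the discrete-time Riccati recursion on a generic linear model x[t+1] = A x[t] + B u[t]. The discrete-time formulation, the double-integrator toy plant, the weights, and the name `dlqr_gain` are all assumptions for illustration; the abstract does not give the WCS model or state which formulation the authors used.

```python
import numpy as np

def dlqr_gain(a, b, q, r, n_iter=500):
    """Discrete-time LQ feedback gain K (control law u = -K x).

    Iterates P <- Q + A'P(A - BK) with K = (R + B'PB)^-1 B'PA,
    which is the Riccati recursion written in terms of the gain.
    """
    p = q.copy()
    for _ in range(n_iter):
        k = np.linalg.solve(r + b.T @ p @ b, b.T @ p @ a)
        p = q + a.T @ p @ (a - b @ k)
    return k

# Hypothetical plant: a double integrator sampled at 0.1 s
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)            # state weighting (assumed)
R = np.array([[1.0]])    # input weighting (assumed)

K = dlqr_gain(A, B, Q, R)
closed_loop_poles = np.linalg.eigvals(A - B @ K)
```

A single gain matrix `K` acts on the whole state vector at once, which is the sense in which one multivariable loop can replace several independent PID loops.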
Use of collateral information to improve LANDSAT classification accuracies

NASA Technical Reports Server (NTRS)

Strahler, A. H. (Principal Investigator)

1981-01-01

Methods to improve LANDSAT classification accuracies were investigated including: (1) the use of prior probabilities in maximum likelihood classification as a methodology to integrate discrete collateral data with continuously measured image density variables; (2) the use of the logit classifier as an alternative to multivariate normal classification that permits mixing both continuous and categorical variables in a single model and fits empirical distributions of observations more closely than the multivariate normal density function; and (3) the use of collateral data in a geographic information system as exercised to model a desired output information layer as a function of input layers of raster format collateral and image data base layers.
Iterative procedures for space shuttle main engine performance models

NASA Technical Reports Server (NTRS)

Santi, L. Michael

1989-01-01

Performance models of the Space Shuttle Main Engine (SSME) contain iterative strategies for determining approximate solutions to nonlinear equations reflecting fundamental mass, energy, and pressure balances within engine flow systems. Both univariate and multivariate Newton-Raphson algorithms are employed in the current version of the engine Test Information Program (TIP). Computational efficiency and reliability of these procedures is examined. A modified trust region form of the multivariate Newton-Raphson method is implemented and shown to be superior for off nominal engine performance predictions. A heuristic form of Broyden's Rank One method is also tested and favorable results based on this algorithm are presented.
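The multivariate Newton-Raphson iteration mentioned above can be sketched on a small nonlinear system. The two equations below are a hypothetical stand-in, not the SSME balance equations, and the helper names are invented for this example. The trust-region modification and Broyden's rank-one update discussed in the report are not shown.

```python
def residuals(x, y):
    """Toy system: x^2 + y^2 = 4 and x*y = 1, written as f(x, y) = 0."""
    return x * x + y * y - 4.0, x * y - 1.0

def jacobian(x, y):
    """Partial derivatives of the two residuals with respect to x and y."""
    return (2.0 * x, 2.0 * y), (y, x)

def newton_2d(f, jac, x0, y0, tol=1e-12, max_iter=50):
    """Plain Newton-Raphson for two equations in two unknowns."""
    x, y = x0, y0
    for _ in range(max_iter):
        f1, f2 = f(x, y)
        if max(abs(f1), abs(f2)) < tol:
            return x, y
        (a, b), (c, d) = jac(x, y)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Solve J @ (dx, dy) = -(f1, f2) with Cramer's rule
        dx = (-f1 * d + f2 * b) / det
        dy = (-f2 * a + f1 * c) / det
        x, y = x + dx, y + dy
    raise RuntimeError("Newton-Raphson did not converge")

root_x, root_y = newton_2d(residuals, jacobian, 2.0, 0.5)
```

Plain Newton steps like these can overshoot when the starting point is far from the solution, which is the situation a trust-region variant is meant to handle for off-nominal predictions.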
An assessment on the use of bivariate, multivariate and soft computing techniques for collapse susceptibility in GIS environ

NASA Astrophysics Data System (ADS)

Yilmaz, Işik; Marschalko, Marian; Bednarik, Martin

2013-04-01

The paper presented herein compares and discusses the use of bivariate, multivariate and soft computing techniques for collapse susceptibility modelling. Conditional probability (CP), logistic regression (LR) and artificial neural networks (ANN) models representing the bivariate, multivariate and soft computing techniques were used in GIS based collapse susceptibility mapping in an area from Sivas basin (Turkey). Collapse-related factors, directly or indirectly related to the causes of collapse occurrence, such as distance from faults, slope angle and aspect, topographical elevation, distance from drainage, topographic wetness index (TWI), stream power index (SPI), Normalized Difference Vegetation Index (NDVI) by means of vegetation cover, distance from roads and settlements were used in the collapse susceptibility analyses. In the last stage of the analyses, collapse susceptibility maps were produced from the models, and they were then compared by means of their validations. Although the Area Under Curve (AUC) values obtained from all three models showed that the map obtained from the soft computing (ANN) model appears more accurate than those from the other models, the accuracies of all three models can be considered relatively similar. The results also showed that the conditional probability is an essential method in preparation of collapse susceptibility map and highly compatible with GIS operating features.

Fasting Glucose, Obesity, and Coronary Artery Calcification in Community-Based People Without Diabetes

PubMed Central

Rutter, Martin K.; Massaro, Joseph M.; Hoffmann, Udo; O’Donnell, Christopher J.; Fox, Caroline S.

2012-01-01

OBJECTIVE Our objective was to assess whether impaired fasting glucose (IFG) and obesity are independently related to coronary artery calcification (CAC) in a community-based population. RESEARCH DESIGN AND METHODS We assessed CAC using multidetector computed tomography in 3,054 Framingham Heart Study participants (mean [SD] age was 50 [10] years, 49% were women, 29% had IFG, and 25% were obese) free from known vascular disease or diabetes. We tested the hypothesis that IFG (5.6–6.9 mmol/L) and obesity (BMI ≥30 kg/m²) were independently associated with high CAC (>90th percentile for age and sex) after adjusting for hypertension, lipids, smoking, and medication. RESULTS High CAC was significantly related to IFG in an age- and sex-adjusted model (odds ratio 1.4 [95% CI 1.1–1.7], P = 0.002; referent: normal fasting glucose) and after further adjustment for obesity (1.3 [1.0–1.6], P = 0.045). However, IFG was not associated with high CAC in multivariable-adjusted models before (1.2 [0.9–1.4], P = 0.20) or after adjustment for obesity. Obesity was associated with high CAC in age- and sex-adjusted models (1.6 [1.3–2.0], P < 0.001) and in multivariable models that included IFG (1.4 [1.1–1.7], P = 0.005). Multivariable-adjusted spline regression models suggested nonlinear relationships linking high CAC with BMI (J-shaped), waist circumference (J-shaped), and fasting glucose. CONCLUSIONS In this community-based cohort, CAC was associated with obesity, but not IFG, after adjusting for important confounders. With the increasing worldwide prevalence of obesity and nondiabetic hyperglycemia, these data underscore the importance of obesity in the pathogenesis of CAC. PMID:22773705
Fasting glucose, obesity, and coronary artery calcification in community-based people without diabetes.

PubMed

Rutter, Martin K; Massaro, Joseph M; Hoffmann, Udo; O'Donnell, Christopher J; Fox, Caroline S

2012-09-01

Our objective was to assess whether impaired fasting glucose (IFG) and obesity are independently related to coronary artery calcification (CAC) in a community-based population. We assessed CAC using multidetector computed tomography in 3,054 Framingham Heart Study participants (mean [SD] age was 50 [10] years, 49% were women, 29% had IFG, and 25% were obese) free from known vascular disease or diabetes. We tested the hypothesis that IFG (5.6-6.9 mmol/L) and obesity (BMI ≥30 kg/m(2)) were independently associated with high CAC (>90th percentile for age and sex) after adjusting for hypertension, lipids, smoking, and medication. High CAC was significantly related to IFG in an age- and sex-adjusted model (odds ratio 1.4 [95% CI 1.1-1.7], P = 0.002; referent: normal fasting glucose) and after further adjustment for obesity (1.3 [1.0-1.6], P = 0.045). However, IFG was not associated with high CAC in multivariable-adjusted models before (1.2 [0.9-1.4], P = 0.20) or after adjustment for obesity. Obesity was associated with high CAC in age- and sex-adjusted models (1.6 [1.3-2.0], P < 0.001) and in multivariable models that included IFG (1.4 [1.1-1.7], P = 0.005). Multivariable-adjusted spline regression models suggested nonlinear relationships linking high CAC with BMI (J-shaped), waist circumference (J-shaped), and fasting glucose. In this community-based cohort, CAC was associated with obesity, but not IFG, after adjusting for important confounders. With the increasing worldwide prevalence of obesity and nondiabetic hyperglycemia, these data underscore the importance of obesity in the pathogenesis of CAC.
Non-proportional odds multivariate logistic regression of ordinal family data.

PubMed

Zaloumis, Sophie G; Scurrah, Katrina J; Harrap, Stephen B; Ellis, Justine A; Gurrin, Lyle C

2015-03-01

Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution

PubMed Central

Lo, Kenneth

2011-01-01

Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components. PMID:22125375
Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: an alternative to the skew-t distribution.

PubMed

Lo, Kenneth; Gottardo, Raphael

2012-01-01

Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.
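The Box-Cox transformation that the abstract above uses to introduce skewness can be illustrated with a minimal sketch. It assumes the classical form of the transform for strictly positive data; the paper may use a modified variant, and the function names and sample values here are hypothetical.

```python
import math

def box_cox(x, lam):
    """Classical Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("the classical Box-Cox transform needs x > 0")
    if lam == 0:
        return math.log(x)            # limiting case as lam -> 0
    return (x ** lam - 1.0) / lam

def inverse_box_cox(y, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)

# Hypothetical right-skewed sample: each value doubles the previous one.
skewed = [0.5, 1.0, 2.0, 4.0, 8.0]
# With lam = 0 the transform is the logarithm, so the values become evenly spaced.
transformed = [box_cox(v, 0.0) for v in skewed]
```

In a mixture model of this kind, the transformation parameter would be estimated together with the component parameters rather than fixed in advance as it is here.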
Implementation of the Iterative Proportion Fitting Algorithm for Geostatistical Facies Modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li Yupeng, E-mail: yupeng@ualberta.ca; Deutsch, Clayton V.

2012-06-15

In geostatistics, most stochastic algorithms for simulation of categorical variables such as facies or rock types require a conditional probability distribution. The multivariate probability distribution of all the grouped locations including the unsampled location permits calculation of the conditional probability directly based on its definition. In this article, the iterative proportion fitting (IPF) algorithm is implemented to infer this multivariate probability. Using the IPF algorithm, the multivariate probability is obtained by iterative modification to an initial estimated multivariate probability using lower order bivariate probabilities as constraints. The imposed bivariate marginal probabilities are inferred from profiles along drill holes or wells. In the IPF process, a sparse matrix is used to calculate the marginal probabilities from the multivariate probability, which makes the iterative fitting more tractable and practical. This algorithm can be extended to higher order marginal probability constraints as used in multiple point statistics. The theoretical framework is developed and illustrated with estimation and simulation examples.
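The iterative fitting described in the abstract above can be sketched for a small table. This is a minimal illustration, not the authors' implementation: it stores the joint probability in a plain dictionary instead of a sparse matrix, starts from a uniform estimate, and uses made-up probabilities for three binary variables. The names `marginal`, `ipf`, `truth` and `targets` are hypothetical.

```python
from itertools import product

def marginal(joint, i, j):
    """Bivariate marginal of variables i and j implied by a joint table."""
    out = {}
    for cell, p in joint.items():
        key = (cell[i], cell[j])
        out[key] = out.get(key, 0.0) + p
    return out

def ipf(pairwise, levels, n_iter=500, tol=1e-12):
    """Fit a joint table to bivariate marginal constraints.

    pairwise maps a pair of variable indices (i, j) to a dict
    {(value_i, value_j): probability}; levels gives the number of
    categories of each variable.
    """
    cells = list(product(*[range(k) for k in levels]))
    joint = {c: 1.0 / len(cells) for c in cells}   # uniform initial estimate
    for _ in range(n_iter):
        max_change = 0.0
        for (i, j), target in pairwise.items():
            current = marginal(joint, i, j)
            for c in cells:
                key = (c[i], c[j])
                if current[key] > 0.0:
                    # Rescale the cell so this marginal matches its target.
                    new = joint[c] * target[key] / current[key]
                    max_change = max(max_change, abs(new - joint[c]))
                    joint[c] = new
        if max_change < tol:
            break
    return joint

# Made-up joint distribution, used only to derive consistent constraints.
truth = dict(zip(product(range(2), repeat=3),
                 [0.10, 0.05, 0.15, 0.10, 0.05, 0.20, 0.10, 0.25]))
targets = {pair: marginal(truth, *pair) for pair in [(0, 1), (0, 2), (1, 2)]}
fitted = ipf(targets, levels=(2, 2, 2))
```

The fitted table reproduces the three bivariate marginals but need not equal `truth`, because bivariate constraints alone do not determine the full joint distribution.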
Influence of professional preparation and class structure on sexuality topics taught in middle and high schools.

PubMed

Rhodes, Darson L; Kirchofer, Gregg; Hammig, Bart J; Ogletree, Roberta J

2013-05-01

This study examined the impact of professional preparation and class structure on sexuality topics taught and use of practice-based instructional strategies in US middle and high school health classes. Data from the classroom-level file of the 2006 School Health Policies and Programs were used. A series of multivariable logistic regression models were employed to determine if sexuality content taught was dependent on professional preparation and /or class structure (HE only versus HE/another subject combined). Additional multivariable logistic regression models were employed to determine if use of practice-based instructional strategies was dependent upon professional preparation and/or class structure. Years of teaching health topics and size of the school district were included as covariates in the multivariable logistic regression models. Findings indicated professionally prepared health educators were significantly more likely to teach 7 of the 13 sexuality topics as compared to nonprofessionally prepared health educators. There was no statistically significant difference in the instructional strategies used by professionally prepared and nonprofessionally prepared health educators. Exclusively health education classes versus combined classes were significantly more likely to have included 6 of the 13 topics and to have incorporated practice-based instructional strategies in the curricula. This study indicated professional preparation and class structure impacted sexuality content taught. Class structure also impacted whether opportunities for students to practice skills were made available. Results support the need for continued advocacy for professionally prepared health educators and health only courses. © 2013, American School Health Association.
Novel high-resolution computed tomography-based radiomic classifier for screen-identified pulmonary nodules in the National Lung Screening Trial.

PubMed

Peikert, Tobias; Duan, Fenghai; Rajagopalan, Srinivasan; Karwoski, Ronald A; Clay, Ryan; Robb, Richard A; Qin, Ziling; Sicks, JoRean; Bartholmai, Brian J; Maldonado, Fabien

2018-01-01

Optimization of the clinical management of screen-detected lung nodules is needed to avoid unnecessary diagnostic interventions. Herein we demonstrate the potential value of a novel radiomics-based approach for the classification of screen-detected indeterminate nodules. Independent quantitative variables assessing various radiologic nodule features such as sphericity, flatness, elongation, spiculation, lobulation and curvature were developed from the NLST dataset using 726 indeterminate nodules (all ≥ 7 mm, benign, n = 318 and malignant, n = 408). Multivariate analysis was performed using the least absolute shrinkage and selection operator (LASSO) method for variable selection and regularization in order to enhance the prediction accuracy and interpretability of the multivariate model. The bootstrapping method was then applied for the internal validation and the optimism-corrected AUC was reported for the final model. Eight of the originally considered 57 quantitative radiologic features were selected by LASSO multivariate modeling. These 8 features include variables capturing Location: vertical location (Offset carina centroid z), Size: volume estimate (Minimum enclosing brick), Shape: flatness, Density: texture analysis (Score Indicative of Lesion/Lung Aggression/Abnormality (SILA) texture), and surface characteristics: surface complexity (Maximum shape index and Average shape index), and estimates of surface curvature (Average positive mean curvature and Minimum mean curvature), all with P < 0.01. The optimism-corrected AUC for these 8 features is 0.939. Our novel radiomic LDCT-based approach for indeterminate screen-detected nodule characterization appears extremely promising; however, independent external validation is needed.
Organizational Change, Absenteeism, and Welfare Dependency

ERIC Educational Resources Information Center

Roed, Knut; Fevang, Elisabeth

2007-01-01

Based on Norwegian register data, we set up a multivariate mixed proportional hazard model (MMPH) to analyze nurses' pattern of work, sickness absence, nonemployment, and social insurance dependency from 1992 to 2000, and how that pattern was affected by workplace characteristics. The model is estimated by means of the nonparametric…
Forecasting of municipal solid waste quantity in a developing country using multivariate grey models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Intharathirat, Rotchana, E-mail: rotchana.in@gmail.com; Abdul Salam, P., E-mail: salam@ait.ac.th; Kumar, S., E-mail: kumar@ait.ac.th

Highlights: • Grey model can be used to forecast MSW quantity accurately with limited data. • Prediction interval overcomes the uncertainty of MSW forecast effectively. • A multivariate model gives accuracy associated with factors affecting MSW quantity. • Population, urbanization, employment and household size play a role in MSW quantity. - Abstract: In order to plan, manage and use municipal solid waste (MSW) in a sustainable way, accurate forecasting of MSW generation and composition plays a key role. It is difficult to produce reliable estimates using the existing models due to the limited data available in developing countries. This study aims to forecast MSW collected in Thailand, with a prediction interval, over the long term by using an optimized multivariate grey model, which is a mathematical approach. For multivariate models, the representative factors of residential and commercial sectors affecting waste collected are identified, classified and quantified based on statistics and mathematics of grey system theory. Results show that GMC (1, 5), the grey model with convolution integral, is the most accurate with the least error of 1.16% MAPE. MSW collected would increase 1.40% per year from 43,435–44,994 tonnes per day in 2013 to 55,177–56,735 tonnes per day in 2030. This model also illustrates that population density is the most important factor affecting MSW collected, followed by urbanization, proportion of employment and household size, respectively. This means that the representative factors of the commercial sector may affect MSW collected more than those of the residential sector. Results can help decision makers to develop measures and policies for long-term waste management.
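To show how a grey model forecasts from a short series and how MAPE is computed, here is a minimal sketch of the basic univariate GM(1,1) model. It is a simpler relative of the multivariate GMC(1, 5) model reported above, not the study's model, and the series is synthetic (1.4% growth per step, chosen only to echo the growth rate quoted in the abstract). The function names are hypothetical.

```python
import math

def fit_gm11(x0):
    """Estimate the GM(1,1) parameters (a, b) from a positive series x0."""
    x1, total = [], 0.0
    for v in x0:                      # accumulated generating series
        total += v
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(z)
    s_z, s_y = sum(z), sum(y)
    s_zz = sum(v * v for v in z)
    s_zy = sum(zv * yv for zv, yv in zip(z, y))
    # Ordinary least squares for y = -a * z + b.
    slope = (m * s_zy - s_z * s_y) / (m * s_zz - s_z ** 2)
    b = (s_y - slope * s_z) / m
    return -slope, b

def predict_gm11(x0, a, b, horizon=0):
    """Fitted values for the observed steps plus `horizon` forecasts (needs a != 0)."""
    def x1_hat(k):
        return (x0[0] - b / a) * math.exp(-a * k) + b / a
    n = len(x0) + horizon
    return [x0[0]] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, n)]

def mape(actual, fitted):
    """Mean absolute percentage error over the observed steps."""
    return 100.0 * sum(abs(o - f) / abs(o) for o, f in zip(actual, fitted)) / len(actual)

# Synthetic series, not data from the study.
waste = [100.0 * 1.014 ** k for k in range(8)]
a, b = fit_gm11(waste)
fitted = predict_gm11(waste, a, b, horizon=3)
error = mape(waste, fitted)
```

A multivariate grey model adds the driving factors (here, population density, urbanization and so on) as extra terms in the regression step; that extension is not shown.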
New consensus multivariate models based on PLS and ANN studies of sigma-1 receptor antagonists.

PubMed

Oliveira, Aline A; Lipinski, Célio F; Pereira, Estevão B; Honorio, Kathia M; Oliveira, Patrícia R; Weber, Karen C; Romero, Roseli A F; de Sousa, Alexsandro G; da Silva, Albérico B F

2017-10-02

The treatment of neuropathic pain is very complex and there are few drugs approved for this purpose. Among the compounds studied in the literature, sigma-1 receptor antagonists have been shown to be promising. In order to develop QSAR studies applied to 1-arylpyrazole derivatives, multivariate analyses have been performed in this work using partial least squares (PLS) and artificial neural network (ANN) methods. A PLS model has been obtained and validated with 45 compounds in the training set and 13 compounds in the test set (r² training = 0.761, q² = 0.656, r² test = 0.746, MSE test = 0.132 and MAE test = 0.258). Additionally, multi-layer perceptron ANNs (MLP-ANNs) were employed in order to propose non-linear models trained by gradient descent with momentum backpropagation function. Based on MSE test values, the best MLP-ANN models were combined in an MLP-ANN consensus model (MLP-ANN-CM; r² test = 0.824, MSE test = 0.088 and MAE test = 0.197). In the end, a general consensus model (GCM) has been obtained using PLS and MLP-ANN-CM models (r² test = 0.811, MSE test = 0.100 and MAE test = 0.218). In addition, the selected descriptors (GGI6, Mor23m, SRW06, H7m, MLOGP, and μ) revealed important features that should be considered when one is planning new compounds of the 1-arylpyrazole class. The multivariate models proposed in this work are definitely a powerful tool for the rational drug design of new compounds for neuropathic pain treatment. Graphical abstract: Main scaffold of the 1-arylpyrazole derivatives and the selected descriptors.
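The test-set statistics quoted above (r², MSE, MAE) and the idea of a consensus model can be illustrated with a short sketch. It rests on two assumptions not confirmed by the abstract: that the consensus prediction is the plain average of the individual model predictions, and that r² is the coefficient of determination. All numbers are invented and deliberately contrived so that the two models' errors cancel.

```python
def consensus(*model_predictions):
    """Average the predictions of several models, compound by compound."""
    return [sum(vals) / len(vals) for vals in zip(*model_predictions)]

def mse(y, y_hat):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(y, y_hat)) / len(y)

def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(y, y_hat)) / len(y)

def r2(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((o - p) ** 2 for o, p in zip(y, y_hat))
    ss_tot = sum((o - mean_y) ** 2 for o in y)
    return 1.0 - ss_res / ss_tot

# Invented activities and predictions for four hypothetical test compounds.
observed = [1.0, 2.0, 3.0, 4.0]
pls_like = [1.2, 1.8, 3.2, 3.8]
ann_like = [0.8, 2.2, 2.8, 4.2]
combined = consensus(pls_like, ann_like)
```

Averaging helps only to the extent that the individual models make different errors; with real data the gain is usually much smaller than in this toy case.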
Multivariate functions for predicting the sorption of 2,4,6-trinitrotoluene (TNT) and 1,3,5-trinitro-1,3,5-tricyclohexane (RDX) among taxonomically distinct soils.

PubMed

Katseanes, Chelsea K; Chappell, Mark A; Hopkins, Bryan G; Durham, Brian D; Price, Cynthia L; Porter, Beth E; Miller, Lesley F

2016-11-01

After nearly a century of use in numerous munition platforms, TNT and RDX contamination has turned up largely in the environment due to ammunition manufacturing or as part of releases from low-order detonations during training activities. Although the basic knowledge governing the environmental fate of TNT and RDX is known, accurate predictions of TNT and RDX persistence in soil remain elusive, particularly given the universal heterogeneity of pedomorphic soil types. In this work, we proposed a new solution for modeling the sorption and persistence of these munition constituents as multivariate mathematical functions correlating soil attribute data over a variety of taxonomically distinct soil types to contaminant behavior, instead of a single constant or parameter of a specific absolute value. To test this idea, we conducted experiments measuring the sorption of TNT and RDX on taxonomically different soil types that were extensively physically and chemically characterized. Statistical decomposition of the log-transformed and auto-scaled soil characterization data using the dimension-reduction technique PCA (principal component analysis) revealed a strong latent structure based in the multiple pairwise correlations among the soil properties. TNT and RDX sorption partitioning coefficients (KD-TNT and KD-RDX) were regressed against this latent structure using partial least squares regression (PLSR), generating 3-factor multivariate linear functions. Here, PLSR models predicted KD-TNT and KD-RDX values based on attributes contributing to endogenous alkaline/calcareous and soil fertility criteria, respectively, exhibited among the different soil types. We hypothesized that the latent structure arising from the strong covariance of the full multivariate geochemical matrix describing taxonomically distinguished soil types may provide the means for potentially predicting complex phenomena in soils.
The development of predictive multivariate models tuned to a local soil's taxonomic designation would have direct benefit to military range managers seeking to anticipate the environmental risks of training activities on impact sites. Published by Elsevier Ltd.
A Personalized Predictive Framework for Multivariate Clinical Time Series via Adaptive Model Selection.

PubMed

Liu, Zitao; Hauskrecht, Milos

2017-11-01

Building an accurate predictive model of clinical time series for a patient is critical for understanding the patient's condition, its dynamics, and optimal patient management. Unfortunately, this process is not straightforward. First, patient-specific variations are typically large and population-based models derived or learned from many different patients are often unable to support accurate predictions for each individual patient. Moreover, time series observed for one patient at any point in time may be too short and insufficient to learn a high-quality patient-specific model just from the patient's own data. To address these problems we propose, develop and experiment with a new adaptive forecasting framework for building multivariate clinical time series models for a patient and for supporting patient-specific predictions. The framework relies on an adaptive model switching approach that at any point in time selects the most promising time series model out of a pool of many possible models and, consequently, combines the advantages of population, patient-specific and short-term individualized predictive models. We demonstrate that the adaptive model switching framework is a very promising approach to support personalized time series prediction, and that it is able to outperform predictions based on pure population and patient-specific models, as well as other patient-specific model adaptation strategies.
Multivariate methods for evaluating the efficiency of electrodialytic removal of heavy metals from polluted harbour sediments.

PubMed

Pedersen, Kristine Bondo; Kirkelund, Gunvor M; Ottosen, Lisbeth M; Jensen, Pernille E; Lejon, Tore

2015-01-01

Chemometrics was used to develop a multivariate model based on 46 previously reported electrodialytic remediation experiments (EDR) of five different harbour sediments. The model predicted final concentrations of Cd, Cu, Pb and Zn as a function of current density, remediation time, stirring rate, dry/wet sediment, cell set-up as well as sediment properties. Evaluation of the model showed that remediation time and current density had the highest comparative influence on the clean-up levels. Individual models for each heavy metal showed variance in the variable importance, indicating that the targeted heavy metals were bound to different sediment fractions. Based on the results, a PLS model was used to design five new EDR experiments of a sixth sediment to achieve specified clean-up levels of Cu and Pb. The removal efficiencies were up to 82% for Cu and 87% for Pb and the targeted clean-up levels were met in four out of five experiments. The clean-up levels were better than predicted by the model, which could hence be used for predicting an approximate remediation strategy; the modelling power will however improve with more data included. Copyright © 2014 Elsevier B.V. All rights reserved.
Development of multivariate NTCP models for radiation-induced hypothyroidism: a comparative analysis.

PubMed

Cella, Laura; Liuzzi, Raffaele; Conson, Manuel; D'Avino, Vittoria; Salvatore, Marco; Pacelli, Roberto

2012-12-27

Hypothyroidism is a frequent late side effect of radiation therapy of the cervical region. The purpose of this work is to develop multivariate normal tissue complication probability (NTCP) models for radiation-induced hypothyroidism (RHT) and to compare them with already existing NTCP models for RHT. Fifty-three patients treated with sequential chemo-radiotherapy for Hodgkin's lymphoma (HL) were retrospectively reviewed for RHT events. Clinical information along with thyroid gland dose distribution parameters were collected and their correlation to RHT was analyzed by Spearman's rank correlation coefficient (Rs). A multivariate logistic regression method using resampling methods (bootstrapping) was applied to select model order and parameters for NTCP modeling. Model performance was evaluated through the area under the receiver operating characteristic curve (AUC). Models were tested against external published data on RHT and compared with other published NTCP models. If we express the thyroid volume exceeding X Gy as a percentage (Vx(%)), a two-variable NTCP model including V30(%) and gender proved to be the optimal predictive model for RHT (Rs = 0.615, p < 0.001; AUC = 0.87). Conversely, if the absolute thyroid volume exceeding X Gy (Vx(cc)) was analyzed, an NTCP model based on 3 variables including V30(cc), thyroid gland volume and gender was selected as the most predictive model (Rs = 0.630, p < 0.001; AUC = 0.85). The three-variable model performs better when tested on an external cohort characterized by large inter-individual variation in thyroid volumes (AUC = 0.914, 95% CI 0.760-0.984). A comparable performance was found between our model and that proposed in the literature based on thyroid gland mean dose and volume (p = 0.264).
The absolute volume of thyroid gland exceeding 30 Gy in combination with thyroid gland volume and gender provide an NTCP model for RHT with improved prediction capability not only within our patient population but also in an external cohort.
Prediction of UT1-UTC, LOD and AAM χ3 by combination of least-squares and multivariate stochastic methods

NASA Astrophysics Data System (ADS)

Niedzielski, Tomasz; Kosek, Wiesław

2008-02-01

This article presents the application of a multivariate prediction technique for predicting universal time (UT1-UTC), length of day (LOD) and the axial component of atmospheric angular momentum (AAM χ3). The multivariate predictions of LOD and UT1-UTC are generated by means of the combination of (1) least-squares (LS) extrapolation of models for annual, semiannual, 18.6-year, 9.3-year oscillations and for the linear trend, and (2) multivariate autoregressive (MAR) stochastic prediction of LS residuals (LS + MAR). The MAR technique enables the use of the AAM χ3 time-series as the explanatory variable for the computation of LOD or UT1-UTC predictions. In order to evaluate the performance of this approach, two other prediction schemes are also applied: (1) LS extrapolation, (2) combination of LS extrapolation and univariate autoregressive (AR) prediction of LS residuals (LS + AR). The multivariate predictions of AAM χ3 data, however, are computed as a combination of the extrapolation of the LS model for annual and semiannual oscillations and the LS + MAR. The AAM χ3 predictions are also compared with LS extrapolation and LS + AR prediction. It is shown that the predictions of LOD and UT1-UTC based on LS + MAR taking into account the axial component of AAM are more accurate than the predictions of LOD and UT1-UTC based on LS extrapolation or on LS + AR. In particular, the UT1-UTC predictions based on LS + MAR during El Niño/La Niña events exhibit considerably smaller prediction errors than those calculated by means of LS or LS + AR. The AAM χ3 time-series is predicted using LS + MAR with higher accuracy than applying LS extrapolation itself in the case of medium-term predictions (up to 100 days in the future). However, the predictions of AAM χ3 reveal the best accuracy for LS + AR.
Multivariate Longitudinal Analysis with Bivariate Correlation Test.

PubMed

Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory

2016-01-01

In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model's parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated.
Clustering Multivariate Time Series Using Hidden Markov Models

PubMed Central

Ghassempour, Shima; Girosi, Federico; Maeder, Anthony

2014-01-01

In this paper we describe an algorithm for clustering multivariate time series with variables taking both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs), where we first map each trajectory into an HMM, then define a suitable distance between HMMs and finally proceed to cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals of age 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab and may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables using tools that do not require advanced statistic knowledge, and therefore are accessible to a wide range of researchers. PMID:24662996
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging

PubMed Central

Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos

2015-01-01

Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning a high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is a lot less conservative as compared to weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913
Cost Modeling for Space Telescope

NASA Technical Reports Server (NTRS)

Stahl, H. Philip

2011-01-01

Parametric cost models are an important tool for planning missions, comparing concepts and justifying technology investments. This paper presents ongoing efforts to develop single-variable and multi-variable cost models for the space telescope optical telescope assembly (OTA). These models are based on data collected from historical space telescope missions. Standard statistical methods are used to derive cost estimating relationships (CERs) for OTA cost versus aperture diameter and mass. The results are compared with previously published models.

Microcomputer-based classification of environmental data in municipal areas

NASA Astrophysics Data System (ADS)

Thiergärtner, H.

1995-10-01

Multivariate data-processing methods used in mineral resource identification can be used to classify urban regions. Using elements of expert systems, geographical information systems, as well as known classification and prognosis systems, it is possible to outline a single model that consists of resistant and of temporary parts of a knowledge base including graphical input and output treatment and of resistant and temporary elements of a bank of methods and algorithms. Whereas decision rules created by experts will be stored in expert systems directly, powerful classification rules in form of resistant but latent (implicit) decision algorithms may be implemented in the suggested model. The latent functions will be transformed into temporary explicit decision rules by learning processes depending on the actual task(s), parameter set(s), pixels selection(s), and expert control(s). This takes place both at supervised and nonsupervised classification of multivariately described pixel sets representing municipal subareas. The model is outlined briefly and illustrated by results obtained in a target area covering a part of the city of Berlin (Germany).
Multivariate statistical model for 3D image segmentation with application to medical images.

PubMed

John, Nigel M; Kabuka, Mansur R; Ibrahim, Mohamed O

2003-12-01

In this article we describe a statistical model that was developed to segment brain magnetic resonance images. The statistical segmentation algorithm was applied after a pre-processing stage involving the use of a 3D anisotropic filter along with histogram equalization techniques. The segmentation algorithm makes use of prior knowledge and a probability-based multivariate model designed to semi-automate the process of segmentation. The algorithm was applied to images obtained from the Center for Morphometric Analysis at Massachusetts General Hospital as part of the Internet Brain Segmentation Repository (IBSR). The developed algorithm showed improved accuracy over the k-means, adaptive Maximum Apriori Probability (MAP), biased MAP, and other algorithms. Experimental results showing the segmentation and the results of comparisons with other algorithms are provided. Results are based on an overlap criterion against expertly segmented images from the IBSR. The algorithm produced average results of approximately 80% overlap with the expertly segmented images (compared with 85% for manual segmentation and 55% for other algorithms).
Bayesian multivariate Poisson abundance models for T-cell receptor data.

PubMed

Greene, Joshua; Birtwistle, Marc R; Ignatowicz, Leszek; Rempala, Grzegorz A

2013-06-07

A major feature of an adaptive immune system is its ability to generate B- and T-cell clones capable of recognizing and neutralizing specific antigens. These clones recognize antigens with the help of the surface molecules, called antigen receptors, acquired individually during the clonal development process. In order to ensure a response to a broad range of antigens, the number of different receptor molecules is extremely large, resulting in a huge clonal diversity of both B- and T-cell receptor populations and making their experimental comparisons statistically challenging. To facilitate such comparisons, we propose a flexible parametric model of multivariate count data and illustrate its use in a simultaneous analysis of multiple antigen receptor populations derived from mammalian T-cells. The model relies on a representation of the observed receptor counts as a multivariate Poisson abundance mixture (mPAM). A Bayesian parameter fitting procedure is proposed, based on the complete posterior likelihood, rather than the conditional one used typically in similar settings. The new procedure is shown to be considerably more efficient than its conditional counterpart (as measured by the Fisher information) in the regions of mPAM parameter space relevant to model T-cell data. Copyright © 2013 Elsevier Ltd. All rights reserved.
Model Based Predictive Control of Multivariable Hammerstein Processes with Fuzzy Logic Hypercube Interpolated Models

PubMed Central

Coelho, Antonio Augusto Rodrigues

2016-01-01

This paper introduces the Fuzzy Logic Hypercube Interpolator (FLHI) and demonstrates applications in control of multiple-input single-output (MISO) and multiple-input multiple-output (MIMO) processes with Hammerstein nonlinearities. FLHI consists of a Takagi-Sugeno fuzzy inference system where membership functions act as kernel functions of an interpolator. Conjunction of membership functions in a unitary hypercube space enables multivariable interpolation of N-dimensions. Membership functions act as interpolation kernels, such that choice of membership functions determines interpolation characteristics, allowing FLHI to behave as a nearest-neighbor, linear, cubic, spline or Lanczos interpolator, to name a few. The proposed interpolator is presented as a solution to the modeling problem of static nonlinearities since it is capable of modeling both a function and its inverse function. Three study cases from literature are presented, a single-input single-output (SISO) system, a MISO and a MIMO system. Good results are obtained regarding performance metrics such as set-point tracking, control variation and robustness. Results demonstrate applicability of the proposed method in modeling Hammerstein nonlinearities and their inverse functions for implementation of an output compensator with Model Based Predictive Control (MBPC), in particular Dynamic Matrix Control (DMC). PMID:27657723
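
The idea that membership functions can act as interpolation kernels can be sketched in one dimension. The example below is an assumption-laden illustration, not the FLHI implementation: it uses triangular membership functions centred on grid nodes and constant (zero-order Takagi-Sugeno) consequents, in which case the weighted-average output coincides with ordinary linear interpolation. The function names and the grid values are hypothetical.

```python
import numpy as np

def triangular_memberships(x, nodes):
    """Membership of scalar x in triangular fuzzy sets centred on sorted, distinct nodes."""
    nodes = np.asarray(nodes, dtype=float)
    mu = np.zeros(len(nodes))
    for i, centre in enumerate(nodes):
        if x == centre:
            mu[i] = 1.0
        elif i > 0 and nodes[i - 1] < x < centre:
            mu[i] = (x - nodes[i - 1]) / (centre - nodes[i - 1])
        elif i < len(nodes) - 1 and centre < x < nodes[i + 1]:
            mu[i] = (nodes[i + 1] - x) / (nodes[i + 1] - centre)
    return mu

def fuzzy_interpolate(x, nodes, values):
    """Zero-order Takagi-Sugeno output: membership-weighted average of node values."""
    x = float(np.clip(x, nodes[0], nodes[-1]))  # stay inside the grid
    mu = triangular_memberships(x, nodes)
    return float(mu @ np.asarray(values, dtype=float) / mu.sum())

# Hypothetical static nonlinearity sampled on a coarse grid: y = x**2.
nodes = [0.0, 1.0, 2.0]
values = [0.0, 1.0, 4.0]
y = fuzzy_interpolate(0.25, nodes, values)
# For a monotonic map, swapping the roles of nodes and values approximates the inverse.
x_back = fuzzy_interpolate(y, values, nodes)
```

Other kernel shapes in place of the triangles would give other interpolation characteristics, which is the design freedom the abstract describes.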
Novel active contour model based on multi-variate local Gaussian distribution for local segmentation of MR brain images

NASA Astrophysics Data System (ADS)

Zheng, Qiang; Li, Honglun; Fan, Baode; Wu, Shuanhu; Xu, Jindong

2017-12-01

Active contour model (ACM) has been one of the most widely utilized methods in magnetic resonance (MR) brain image segmentation because of its ability of capturing topology changes. However, most of the existing ACMs only consider single-slice information in MR brain image data, i.e., the information used in ACMs based segmentation method is extracted only from one slice of MR brain image, which cannot take full advantage of the adjacent slice images' information, and cannot satisfy the local segmentation of MR brain images. In this paper, a novel ACM is proposed to solve the problem discussed above, which is based on multi-variate local Gaussian distribution and combines the adjacent slice images' information in MR brain image data to satisfy segmentation. The segmentation is finally achieved through maximizing the likelihood estimation. Experiments demonstrate the advantages of the proposed ACM over the single-slice ACM in local segmentation of MR brain image series.
A revision of chiggers of the minuta species-group (Acari: Trombiculidae: Neotrombicula Hirst, 1925) using multivariate morphometrics.

PubMed

Stekolnikov, Alexandr A; Klimov, Pavel B

2010-09-01

We revise chiggers belonging to the minuta-species group (genus Neotrombicula Hirst, 1925) from the Palaearctic using size-free multivariate morphometrics. This approach allowed us to resolve several diagnostic problems. We show that the widely distributed Neotrombicula scrupulosa Kudryashova, 1993 forms three spatially and ecologically isolated groups different from each other in size or shape (morphometric property) only: specimens from the Caucasus are distinct from those from Asia in shape, whereas the Asian specimens from plains and mountains are different from each other in size. We developed a multivariate classification model to separate three closely related species: N. scrupulosa, N. lubrica Kudryashova, 1993 and N. minuta Schluger, 1966. This model is based on five shape variables selected from an initial 17 variables by a best subset analysis using a custom size-correction subroutine. The variable selection procedure slightly improved the predictive power of the model, suggesting that it not only removed redundancy but also reduced 'noise' in the dataset. The overall classification accuracy of this model is 96.2, 96.2 and 95.5%, as estimated by internal validation, external validation and jackknife statistics, respectively. Our analyses resulted in one new synonymy: N. dimidiata Stekolnikov, 1995 is considered to be a synonym of N. lubrica. Both N. scrupulosa and N. lubrica are recorded from new localities. A key to species of the minuta-group incorporating results from our multivariate analyses is presented.
High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics

PubMed Central

Carvalho, Carlos M.; Chang, Jeffrey; Lucas, Joseph E.; Nevins, Joseph R.; Wang, Quanli; West, Mike

2010-01-01

We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived “factors” as representing biological “subpathway” structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology. PMID:21218139
Estimation of value at risk in currency exchange rate portfolio using asymmetric GJR-GARCH Copula

NASA Astrophysics Data System (ADS)

Nurrahmat, Mohamad Husein; Noviyanti, Lienda; Bachrudin, Achmad

2017-03-01

In this study, we discuss the problem of measuring portfolio risk based on value at risk (VaR) using an asymmetric GJR-GARCH copula model. The approach is based on the consideration that the assumption of normally distributed returns over time cannot be fulfilled, and that there is non-linear correlation in the dependence structure among the variables, which makes the estimated VaR inaccurate. Moreover, the leverage effect causes an asymmetric effect in the dynamic variance and exposes the weakness of standard GARCH models, whose effect on the conditional variance is symmetric. Asymmetric GJR-GARCH models are used to filter the margins, while copulas are used to link them together into a multivariate distribution. We then use copulas to construct flexible multivariate distributions with different marginal and dependence structures, so that the joint distribution of the portfolio does not depend on the assumptions of normality and linear correlation. The VaR obtained by the analysis at the 95% confidence level is 0.005586. This VaR is derived from the best copula model, a Student's t copula with t-distributed margins.
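
The asymmetry mentioned above can be illustrated with the textbook GJR-GARCH(1,1) variance recursion, in which a negative shock adds an extra term `gamma` to the response of next-period variance. This is a sketch of the standard recursion only; the parameter values are hypothetical, and the copula step and the authors' estimation procedure are not reproduced.

```python
import numpy as np

def gjr_garch_variance(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variance path of a GJR-GARCH(1,1) model.

    sigma2[t+1] = omega + (alpha + gamma * 1[eps[t] < 0]) * eps[t]**2 + beta * sigma2[t]
    """
    eps = np.asarray(residuals, dtype=float)
    sigma2 = np.empty(len(eps) + 1)
    sigma2[0] = initial_variance
    for t in range(len(eps)):
        leverage = gamma if eps[t] < 0 else 0.0  # extra weight on negative shocks
        sigma2[t + 1] = omega + (alpha + leverage) * eps[t] ** 2 + beta * sigma2[t]
    return sigma2

# Hypothetical parameters, chosen only to show the asymmetry.
params = dict(omega=1e-6, alpha=0.05, gamma=0.10, beta=0.85, initial_variance=1e-4)
after_gain = gjr_garch_variance([0.01], **params)[1]
after_loss = gjr_garch_variance([-0.01], **params)[1]
```

A shock of the same size raises the next-period variance more when it is a loss than when it is a gain, which a symmetric GARCH(1,1) model (`gamma = 0`) cannot capture.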
A multivariable model for predicting the frictional behaviour and hydration of the human skin.

PubMed

Veijgen, N K; van der Heide, E; Masen, M A

2013-08-01

The frictional characteristics of skin-object interactions are important when handling objects, in the assessment of perception and comfort of products and materials and in the origins and prevention of skin injuries. In this study, based on statistical methods, a quantitative model is developed that describes the friction behaviour of human skin as a function of the subject characteristics, contact conditions, the properties of the counter material as well as environmental conditions. Although the frictional behaviour of human skin is a multivariable problem, in the literature the variables that are associated with skin friction have been studied using univariable methods. In this work, multivariable models for the static and dynamic coefficients of friction as well as for the hydration of the skin are presented. A total of 634 skin-friction measurements were performed using a recently developed tribometer. Using a statistical analysis, previously defined potential influential variables were linked to the static and dynamic coefficient of friction and to the hydration of the skin, resulting in three predictive quantitative models that describe the friction behaviour and the hydration of human skin, respectively. Increased dynamic coefficients of friction were obtained from older subjects, on the index finger, with materials with a higher surface energy at higher room temperatures, whereas lower dynamic coefficients of friction were obtained at lower skin temperatures, on the temple with rougher contact materials. The static coefficient of friction increased with higher skin hydration, increasing age, on the index finger, with materials with a higher surface energy and at higher ambient temperatures. The hydration of the skin was associated with the skin temperature, anatomical location, presence of hair on the skin and the relative air humidity. Predictive models have been derived for the static and dynamic coefficient of friction using a multivariable approach. These two coefficients of friction show a strong correlation. Consequently, the two multivariable models resemble each other, with the static coefficient of friction being on average 18% lower than the dynamic coefficient of friction. The multivariable models in this study can be used to describe the data set that was the basis for this study. Care should be taken when generalising these results. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Applying quantitative adiposity feature analysis models to predict benefit of bevacizumab-based chemotherapy in ovarian cancer patients

NASA Astrophysics Data System (ADS)

Wang, Yunzhi; Qiu, Yuchen; Thai, Theresa; More, Kathleen; Ding, Kai; Liu, Hong; Zheng, Bin

2016-03-01

How to rationally identify epithelial ovarian cancer (EOC) patients who will benefit from bevacizumab or other antiangiogenic therapies is a critical issue in EOC treatments. The motivation of this study is to quantitatively measure adiposity features from CT images and investigate the feasibility of predicting potential benefit of EOC patients with or without receiving bevacizumab-based chemotherapy treatment using multivariate statistical models built based on quantitative adiposity image features. A dataset involving CT images from 59 advanced EOC patients was included. Among them, 32 patients received maintenance bevacizumab after primary chemotherapy and the remaining 27 patients did not. We developed a computer-aided detection (CAD) scheme to automatically segment subcutaneous fat areas (SFA) and visceral fat areas (VFA) and then extracted 7 adiposity-related quantitative features. Three multivariate data analysis models (linear regression, logistic regression and Cox proportional hazards regression) were applied to investigate the potential association between the model-generated prediction results and the patients' progression-free survival (PFS) and overall survival (OS). The results show that using all 3 statistical models, a statistically significant association was detected between the model-generated results and both of the two clinical outcomes in the group of patients receiving maintenance bevacizumab (p<0.01), while there was no significant association for either PFS or OS in the group of patients not receiving maintenance bevacizumab. Therefore, this study demonstrated the feasibility of using statistical prediction models based on quantitative adiposity-related CT image features to generate a new clinical marker and predict the clinical outcome of EOC patients receiving maintenance bevacizumab-based chemotherapy.
A multivariate regression model for detection of fumonisins content in maize from near infrared spectra.

PubMed

Giacomo, Della Riccia; Stefania, Del Zotto

2013-12-15

Fumonisins are mycotoxins produced by Fusarium species that commonly live in maize. Whereas the fungi damage plants, fumonisins cause disease in both cattle and human beings. Legal limits set the tolerable daily intake of fumonisins for several maize-based feeds and foods. Chemical techniques assure the most reliable and accurate measurements, but they are expensive and time consuming. A method based on near infrared spectroscopy and multivariate statistical regression is described as a simpler, cheaper and faster alternative. We apply Partial Least Squares with full cross validation. Two models are described, having high correlation of calibration (0.995, 0.998) and of validation (0.908, 0.909), respectively. The description of the observed phenomenon is accurate and overfitting is avoided. Screening of contaminated maize against the European legal limit of 4 mg kg⁻¹ should thus be assured. Copyright © 2013 Elsevier Ltd. All rights reserved.
Influence assessment in censored mixed-effects models using the multivariate Student’s-t distribution

PubMed Central

Matos, Larissa A.; Bandyopadhyay, Dipankar; Castro, Luis M.; Lachos, Victor H.

2015-01-01

In biomedical studies on HIV RNA dynamics, viral loads generate repeated measures that are often subjected to upper and lower detection limits, and hence these responses are either left- or right-censored. Linear and non-linear mixed-effects censored (LMEC/NLMEC) models are routinely used to analyse these longitudinal data, with normality assumptions for the random effects and residual errors. However, the derived inference may not be robust when these underlying normality assumptions are questionable, especially in the presence of outliers and thick tails. Motivated by this, Matos et al. (2013b) recently proposed an exact EM-type algorithm for LMEC/NLMEC models using a multivariate Student’s-t distribution, with closed-form expressions at the E-step. In this paper, we develop influence diagnostics for LMEC/NLMEC models using the multivariate Student’s-t density, based on the conditional expectation of the complete data log-likelihood. This partially eliminates the complexity associated with the approach of Cook (1977, 1986) for censored mixed-effects models. The new methodology is illustrated via an application to a longitudinal HIV dataset. In addition, a simulation study explores the accuracy of the proposed measures in detecting possible influential observations for heavy-tailed censored data under different perturbation and censoring schemes. PMID:26190871
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

PubMed

Ma, Yan; Mazumdar, Madhu

2011-10-30

Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
Parametric Cost Models for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Henrichs, Todd; Dollinger, Courtney

2010-01-01

Multivariable parametric cost models for space telescopes provide several benefits to designers and space system project managers. They identify major architectural cost drivers and allow high-level design trades. They enable cost-benefit analysis for technology development investment. And they provide a basis for estimating total project cost. A survey of historical models found that there is no definitive space telescope cost model. In fact, published models vary greatly [1]. Thus, there is a need for parametric space telescope cost models. An effort is underway to develop single-variable [2] and multi-variable [3] parametric space telescope cost models based on the latest available data and applying rigorous analytical techniques. Specific cost estimating relationships (CERs) have been developed which show that aperture diameter is the primary cost driver for large space telescopes; technology development as a function of time reduces cost at the rate of 50% per 17 years; it costs less per square meter of collecting aperture to build a large telescope than a small telescope; and increasing mass reduces cost.
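
Two of the stated findings can be turned into short worked relations. Both rest on assumptions that the abstract does not confirm: that the 50% reduction per 17 years acts as a constant exponential decline, and that the CER is a power law in aperture diameter D with exponent b. The symbols C, C_0, t_0, A and b are introduced here for illustration.

```latex
% Assumption: constant exponential decline, with cost halving every 17 years.
C(t) = C_0 \cdot 2^{-(t - t_0)/17}
\quad\Rightarrow\quad
1 - 2^{-1/17} \approx 0.040 \ \text{(about 4\% per year)}

% Assumption: power-law CER in aperture diameter D, with collecting area A \propto D^2.
C \propto D^{b}
\quad\Rightarrow\quad
\frac{C}{A} \propto D^{\,b-2},
\ \text{which decreases with } D \text{ exactly when } b < 2 .
```

Under these assumptions, the statement that a large telescope costs less per square meter than a small one corresponds to a diameter exponent below 2.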
Is a multivariate consensus representation of genetic relationships among populations always meaningful?

PubMed Central

Moazami-Goudarzi, K; Laloë, D

2002-01-01

To determine the relationships among closely related populations or species, two methods are commonly used in the literature: phylogenetic reconstruction or multivariate analysis. The aim of this article is to assess the reliability of multivariate analysis. We describe a method that is based on principal component analysis and Mantel correlations, using a two-step process: The first step consists of a single-marker analysis and the second step tests if each marker reveals the same typology concerning population differentiation. We conclude that if single markers are not congruent, the compromise structure is not meaningful. Our model is not based on any particular mutation process and it can be applied to most of the commonly used genetic markers. This method is also useful to determine the contribution of each marker to the typology of populations. We test whether our method is efficient with two real data sets based on microsatellite markers. Our analysis suggests that for closely related populations, it is not always possible to accept the hypothesis that an increase in the number of markers will increase the reliability of the typology analysis. PMID:12242255
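
The congruence test between markers relies on Mantel correlations. A minimal sketch of a Mantel test between two population distance matrices is given below, assuming each matrix comes from a different marker. The function name, the permutation count and the toy matrices are hypothetical, and the principal component step of the method is not reproduced.

```python
import numpy as np

def mantel_test(dist_a, dist_b, n_permutations=999, seed=0):
    """Mantel correlation between two symmetric distance matrices, with a
    one-sided permutation p-value obtained by relabelling the rows and
    columns of the first matrix."""
    dist_a = np.asarray(dist_a, dtype=float)
    dist_b = np.asarray(dist_b, dtype=float)
    upper = np.triu_indices_from(dist_a, k=1)  # each pair of populations once
    r_observed = np.corrcoef(dist_a[upper], dist_b[upper])[0, 1]
    rng = np.random.default_rng(seed)
    n = dist_a.shape[0]
    at_least_as_large = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        permuted = dist_a[perm][:, perm]
        if np.corrcoef(permuted[upper], dist_b[upper])[0, 1] >= r_observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return float(r_observed), p_value

# Toy example: four populations placed on a line; marker B doubles every distance.
positions = np.array([0.0, 1.0, 3.0, 6.0])
marker_a = np.abs(positions[:, None] - positions[None, :])
marker_b = 2.0 * marker_a
r, p = mantel_test(marker_a, marker_b)
```

A low correlation between the matrices obtained from two markers would indicate that they do not support the same typology of populations, which is the situation in which the abstract warns against a consensus representation.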
Nurses' decision making in heart failure management based on heart failure certification status.

PubMed

Albert, Nancy M; Bena, James F; Buxbaum, Denise; Martensen, Linda; Morrison, Shannon L; Prasun, Marilyn A; Stamp, Kelly D

Research findings on the value of nurse certification were based on subjective perceptions or biased by correlations of certification status and global clinical factors. In heart failure, the value of certification is unknown. The objective was to examine the value of certification based on nurses' decision-making. This was a cross-sectional study of nurses who completed heart failure clinical vignettes that reflected decision-making in clinical heart failure scenarios. Statistical tests included multivariable linear, logistic and proportional odds logistic regression models. Of nurses (N = 605), 29.1% were heart failure certified, 35.0% were certified in another specialty/job role and 35.9% were not certified. In multivariable modeling, nurses certified in heart failure (versus not heart failure certified) had higher clinical vignette scores (p = 0.002), reflecting higher evidence-based decision making; nurses with another specialty/role certification (versus no certification) did not (p = 0.62). Heart failure certification, but not certification in other specialty/job roles, was associated with decisions that reflected delivery of high-quality care. Copyright © 2018 Elsevier Inc. All rights reserved.
Multivariate Models of Parent-Late Adolescent Gender Dyads: The Importance of Parenting Processes in Predicting Adjustment

ERIC Educational Resources Information Center

McKinney, Cliff; Renk, Kimberly

2008-01-01

Although parent-adolescent interactions have been examined, relevant variables have not been integrated into a multivariate model. As a result, this study examined a multivariate model of parent-late adolescent gender dyads in an attempt to capture important predictors in late adolescents' important and unique transition to adulthood. The sample…
Cellulose I crystallinity determination using FT-Raman spectroscopy : univariate and multivariate methods

Treesearch

Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph

2010-01-01

Two new methods based on FT-Raman spectroscopy, one simple, based on band intensity ratio, and the other using a partial least squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in cellulose I samples was determined based on univariate regression that was first developed using the Raman band...
Comparison between splines and fractional polynomials for multivariable model building with continuous covariates: a simulation study with continuous response.

PubMed

Binder, Harald; Sauerbrei, Willi; Royston, Patrick

2013-06-15

In observational studies, many continuous or categorical covariates may be related to an outcome. Various spline-based procedures or the multivariable fractional polynomial (MFP) procedure can be used to identify important variables and functional forms for continuous covariates. This is the main aim of an explanatory model, as opposed to a model only for prediction. The type of analysis often guides the complexity of the final model. Spline-based procedures and MFP have tuning parameters for choosing the required complexity. To compare model selection approaches, we perform a simulation study in the linear regression context based on a data structure intended to reflect realistic biomedical data. We vary the sample size, variance explained and complexity parameters for model selection. We consider 15 variables. A sample size of 200 (1000) and R² = 0.2 (0.8) is the scenario with the smallest (largest) amount of information. For assessing performance, we consider prediction error, correct and incorrect inclusion of covariates, qualitative measures for judging selected functional forms and further novel criteria. From limited information, a suitable explanatory model cannot be obtained. Prediction performance from all types of models is similar. With a medium amount of information, MFP performs better than splines on several criteria. MFP better recovers simpler functions, whereas splines better recover more complex functions. For a large amount of information and no local structure, MFP and the spline procedures often select similar explanatory models. Copyright © 2012 John Wiley & Sons, Ltd.
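
To make the fractional polynomial idea concrete, the sketch below searches the conventional set of first-degree fractional polynomial (FP1) powers for a single positive covariate and keeps the power with the smallest residual sum of squares. It covers only this power search; the full MFP procedure also involves a function selection test and backward elimination across covariates, which are not shown. The function names and the synthetic data are assumptions made for illustration.

```python
import numpy as np

# Conventional FP power set; a power of 0 denotes the natural logarithm.
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_transform(x, power):
    """Fractional polynomial transform of a strictly positive covariate."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if power == 0 else x ** power

def best_fp1(x, y):
    """Return (power, rss, coefficients) of the best-fitting FP1 linear model."""
    y = np.asarray(y, dtype=float)
    best = None
    for power in FP_POWERS:
        design = np.column_stack([np.ones(len(y)), fp_transform(x, power)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ coef) ** 2))
        if best is None or rss < best[1]:
            best = (power, rss, coef)
    return best

# Synthetic, noise-free data generated from a logarithmic relationship.
x = np.linspace(1.0, 10.0, 50)
y = 2.0 + 3.0 * np.log(x)
power, rss, coef = best_fp1(x, y)
```

On noisy data the differences in fit between neighbouring powers are often small, which is one reason a formal selection step is needed rather than a bare minimum-RSS choice.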
Nonparametric Bayesian Segmentation of a Multivariate Inhomogeneous Space-Time Poisson Process.

PubMed

Ding, Mingtao; He, Lihan; Dunson, David; Carin, Lawrence

2012-12-01

A nonparametric Bayesian model is proposed for segmenting time-evolving multivariate spatial point process data. An inhomogeneous Poisson process is assumed, with a logistic stick-breaking process (LSBP) used to encourage piecewise-constant spatial Poisson intensities. The LSBP explicitly favors spatially contiguous segments, and infers the number of segments based on the observed data. The temporal dynamics of the segmentation and of the Poisson intensities are modeled with exponential correlation in time, implemented in the form of a first-order autoregressive model for uniformly sampled discrete data, and via a Gaussian process with an exponential kernel for general temporal sampling. We consider and compare two different inference techniques: a Markov chain Monte Carlo sampler, which has relatively high computational complexity; and an approximate and efficient variational Bayesian analysis. The model is demonstrated with a simulated example and a real example of space-time crime events in Cincinnati, Ohio, USA.

A constrained multinomial Probit route choice model in the metro network: Formulation, estimation and application

PubMed Central

Zhang, Yongsheng; Wei, Heng; Zheng, Kangning

2017-01-01

Considering that metro network expansion provides more alternative routes, it is attractive to integrate the impacts of the route set and the interdependency among alternative routes on route choice probability into route choice modeling. Therefore, the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model in the metro network are carried out in this paper. The utility function is formulated as three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impacts of the route set on utility; and the error component follows a multivariate normal distribution whose covariance is structured into three parts, representing the correlation among routes, the transfer variance of a route, and the unobserved variance, respectively. Because the choice probabilities involve multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten in hierarchical Bayes form, and a Markov chain Monte Carlo approach based on the Metropolis-Hastings (M-H) sampling algorithm is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are obtained. Furthermore, the proposed CMNP model also shows good forecasting performance for the calculation of route choice probabilities and good application performance for the prediction of transfer flow volume. PMID:28591188
Multivariate estimation of the limit of detection by orthogonal partial least squares in temperature-modulated MOX sensors.

PubMed

Burgués, Javier; Marco, Santiago

2018-08-17

Metal oxide semiconductor (MOX) sensors are usually temperature-modulated and calibrated with multivariate models such as partial least squares (PLS) to improve the inherently low selectivity of this technology. The multivariate sensor response patterns exhibit heteroscedastic and correlated noise, which suggests that maximum likelihood methods should outperform PLS. One contribution of this paper is the comparison between PLS and maximum likelihood principal components regression (MLPCR) in MOX sensors. PLS is often criticized for its lack of interpretability when the model complexity increases beyond the chemical rank of the problem. This happens in MOX sensors due to cross-sensitivities to interferences, such as temperature or humidity, and non-linearity. Additionally, the estimation of fundamental figures of merit, such as the limit of detection (LOD), is still not standardized in multivariate models. Orthogonalization methods, such as orthogonal projection to latent structures (O-PLS), have been successfully applied in other fields to reduce the complexity of PLS models. In this work, we propose a LOD estimation method based on applying the well-accepted univariate LOD formulas to the scores of the first component of an orthogonal PLS model. The resulting LOD is compared to the multivariate LOD range derived from error-propagation. The methodology is applied to data extracted from temperature-modulated MOX sensors (FIS SB-500-12 and Figaro TGS 3870-A04), aiming at the detection of low concentrations of carbon monoxide in the presence of uncontrolled humidity (chemical noise). We found that PLS models were simpler and more accurate than MLPCR models. Average LOD values of 0.79 ppm (FIS) and 1.06 ppm (Figaro) were found using the approach described in this paper. These values were contained within the LOD ranges obtained with the error-propagation approach. The mean LOD increased to 1.13 ppm (FIS) and 1.59 ppm (Figaro) when considering validation samples collected two weeks after calibration, which represents a 43% and 46% degradation, respectively. The orthogonal score-plot was a very convenient tool to visualize MOX sensor data and to validate the LOD estimates. Copyright © 2018 Elsevier B.V. All rights reserved.
Preoperative predictive model of recovery of urinary continence after radical prostatectomy.

PubMed

Matsushita, Kazuhito; Kent, Matthew T; Vickers, Andrew J; von Bodman, Christian; Bernstein, Melanie; Touijer, Karim A; Coleman, Jonathan A; Laudone, Vincent T; Scardino, Peter T; Eastham, James A; Akin, Oguz; Sandhu, Jaspreet S

2015-10-01

To build a predictive model of urinary continence recovery after radical prostatectomy (RP) that incorporates magnetic resonance imaging (MRI) parameters and clinical data. We conducted a retrospective review of data from 2,849 patients who underwent pelvic staging MRI before RP from November 2001 to June 2010. We used logistic regression to evaluate the association between each MRI variable and continence at 6 or 12 months, adjusting for age, body mass index (BMI) and American Society of Anesthesiologists (ASA) score, and then used multivariable logistic regression to create our model. A nomogram was constructed using the multivariable logistic regression models. In all, 68% (1,742/2,559) and 82% (2,205/2,689) regained function at 6 and 12 months, respectively. In the base model, age, BMI and ASA score were significant predictors of continence at 6 or 12 months on univariate analysis (P < 0.005). Among the preoperative MRI measurements, membranous urethral length, which showed great significance, was incorporated into the base model to create the full model. For continence recovery at 6 months, the addition of membranous urethral length increased the area under the curve (AUC) to 0.664 for the validation set, an increase of 0.064 over the base model. For continence recovery at 12 months, the AUC was 0.674, an increase of 0.085 over the base model. Using our model, the likelihood of continence recovery increases with membranous urethral length and decreases with age, BMI and ASA score. This model could be used for patient counselling and for the identification of patients at high risk for urinary incontinence in whom to study changes in operative technique that improve urinary function after RP. © 2015 The Authors BJU International © 2015 BJU International Published by John Wiley & Sons Ltd.
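
The AUC figures quoted above measure how well predicted probabilities rank patients who regained continence above those who did not. A minimal sketch of that quantity, computed as the fraction of correctly ordered positive/negative pairs (ties counted as one half), is given below. The function name and inputs are hypothetical; the study's own software is not described in the abstract.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : predicted probabilities or risk scores
    labels : 1 for the event (e.g. continence regained), 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Each (positive, negative) pair contributes 1 if ranked correctly,
    # 0.5 if tied, 0 otherwise.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Read this way, the reported increments imply base-model AUCs of roughly 0.600 at 6 months and 0.589 at 12 months.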
Forecasting of municipal solid waste quantity in a developing country using multivariate grey models.

PubMed

Intharathirat, Rotchana; Abdul Salam, P; Kumar, S; Untong, Akarapong

2015-05-01

In order to plan, manage and use municipal solid waste (MSW) in a sustainable way, accurate forecasting of MSW generation and composition plays a key role. It is difficult to obtain reliable estimates using existing models because of the limited data available in developing countries. This study aims to forecast the MSW collected in Thailand, with prediction intervals, over the long term by using an optimized multivariate grey model, a mathematical approach. For the multivariate models, the representative factors of the residential and commercial sectors affecting the waste collected are identified, classified and quantified based on the statistics and mathematics of grey system theory. Results show that GMC (1, 5), the grey model with convolution integral, is the most accurate, with the lowest error of 1.16% MAPE. MSW collected would increase by 1.40% per year, from 43,435-44,994 tonnes per day in 2013 to 55,177-56,735 tonnes per day in 2030. The model also shows that population density is the most important factor affecting MSW collected, followed by urbanization, proportion of employment and household size, in that order. This suggests that the representative factors of the commercial sector may affect the MSW collected more than those of the residential sector. Results can help decision makers develop long-term waste management measures and policies. Copyright © 2015 Elsevier Ltd. All rights reserved.
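
The accuracy figure above is a mean absolute percentage error (MAPE). A minimal sketch of that metric follows; the function name is illustrative and any numbers passed to it would be placeholders, not the Thai MSW series.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    actual   : observed values (must be non-zero)
    forecast : model predictions, same length as `actual`
    """
    return 100.0 * sum(abs((a - f) / a)
                       for a, f in zip(actual, forecast)) / len(actual)
```

Because each error is divided by the observed value, MAPE is scale-free, which is why it can be compared across candidate grey models fitted to the same series.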
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

PubMed Central

Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.

2017-01-01

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

PubMed

Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

2017-02-14

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.
Development of an accelerometer-based multivariate model to predict free-living energy expenditure in a large military cohort.

PubMed

Horner, Fleur; Bilzon, James L; Rayson, Mark; Blacker, Sam; Richmond, Victoria; Carter, James; Wright, Anthony; Nevill, Alan

2013-01-01

This study developed a multivariate model to predict free-living energy expenditure (EE) in independent military cohorts. Two hundred and eighty-eight individuals (20.6 ± 3.9 years, 67.9 ± 12.0 kg, 1.71 ± 0.10 m) from 10 cohorts wore accelerometers during observation periods of 7 or 10 days. Accelerometer counts (PAC) were recorded at 1-minute epochs. Total energy expenditure (TEE) and physical activity energy expenditure (PAEE) were derived using the doubly labelled water technique. Data were reduced to n = 155 based on wear-time. Associations between PAC and EE were assessed using allometric modelling. Models were derived using multiple log-linear regression analysis and gender differences assessed using analysis of covariance. In all models PAC, height and body mass were related to TEE (P < 0.01). For models predicting TEE (r² = 0.65, SE = 462 kcal·d⁻¹ (13.0%)), PAC explained 4% of the variance. For models predicting PAEE (r² = 0.41, SE = 490 kcal·d⁻¹ (32.0%)), PAC accounted for 6% of the variance. Accelerometry increases the accuracy of EE estimation in military populations. However, the unique nature of military life means accurate prediction of individual free-living EE is highly dependent on anthropometric measurements.
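
Allometric models of the form y = a·x^b become linear after taking logarithms, which is what "log-linear regression" refers to above. The sketch below fits a single-predictor version by ordinary least squares in log space. It is a simplification: the study combined accelerometer counts, height and body mass, and the function name and any data are hypothetical.

```python
import math

def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on ln(y) = ln(a) + b * ln(x).

    x, y : positive values (e.g. body mass and energy expenditure)
    Returns (a, b).
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = math.exp(my - b * mx)  # back-transform the intercept
    return a, b
```

Extending this to several predictors means regressing ln(y) on ln of each predictor simultaneously, which needs a multiple-regression solver rather than the closed form used here.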
Multivariate Longitudinal Analysis with Bivariate Correlation Test

PubMed Central

Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory

2016-01-01

In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model’s parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated. PMID:27537692
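
The likelihood ratio test mentioned above compares the maximized log-likelihoods of a model with and without the parameter of interest. A minimal sketch, assuming a single tested parameter and a chi-square reference distribution with one degree of freedom, is shown below. The abstract does not give the paper's reference distribution or degrees of freedom, so both are assumptions, as is the function name.

```python
import math

def lrt_pvalue_df1(loglik_null, loglik_alt):
    """Likelihood ratio test for one extra parameter.

    loglik_null : maximized log-likelihood with the parameter fixed (e.g. zero
                  correlation between two random effects)
    loglik_alt  : maximized log-likelihood with the parameter free
    Returns (statistic, p_value) using a chi-square(1) reference (assumption).
    """
    stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A small p-value would suggest the random effects of the two outcomes are correlated, which is the situation in which joint modelling is worth the extra complexity.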
Multivariate spatial models of excess crash frequency at area level: case of Costa Rica.

PubMed

Aguero-Valverde, Jonathan

2013-10-01

Recently, areal models of crash frequency have been used in the analysis of various area-wide factors affecting road crashes. On the other hand, disease mapping methods are commonly used in epidemiology to assess the relative risk of the population at different spatial units. A natural next step is to combine these two approaches to estimate the excess crash frequency at area level as a measure of absolute crash risk. Furthermore, multivariate spatial models of crash severity are explored in order to account for both frequency and severity of crashes and control for the spatial correlation frequently found in crash data. This paper aims to extend the concept of safety performance functions to be used in areal models of crash frequency. A multivariate spatial model is used for that purpose and compared to its univariate counterpart. A full Bayes hierarchical approach is used to estimate the models of crash frequency at canton level for Costa Rica. An intrinsic multivariate conditional autoregressive model is used for modeling spatial random effects. The results show that the multivariate spatial model performs better than its univariate counterpart in terms of the penalized goodness-of-fit measure Deviance Information Criterion. Additionally, the effects of the spatial smoothing due to the multivariate spatial random effects are evident in the estimation of excess equivalent property damage only crashes. Copyright © 2013 Elsevier Ltd. All rights reserved.
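
The Deviance Information Criterion used above to compare the multivariate and univariate models is conventionally computed from MCMC output as the posterior mean deviance plus an effective number of parameters. A minimal sketch of that standard definition follows; the function name and inputs are illustrative, and the paper's actual software is not named in the abstract.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Deviance Information Criterion from posterior draws.

    deviance_samples           : deviance evaluated at each posterior draw
    deviance_at_posterior_mean : deviance evaluated at the posterior mean
                                 of the parameters
    Returns (DIC, pD), where pD is the effective number of parameters.
    """
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean  # complexity penalty
    return d_bar + p_d, p_d
```

Lower DIC indicates a better trade-off between fit and complexity, so the model with the smaller value is preferred.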
Predicting crash frequency for multi-vehicle collision types using multivariate Poisson-lognormal spatial model: A comparative analysis.

PubMed

Hosseinpour, Mehdi; Sahebi, Sina; Zamzuri, Zamira Hasanah; Yahaya, Ahmad Shukri; Ismail, Noriszura

2018-06-01

According to crash configuration and pre-crash conditions, traffic crashes are classified into different collision types. Based on the literature, multi-vehicle crashes, such as head-on, rear-end, and angle crashes, are more frequent than single-vehicle crashes, and most often result in serious consequences. From a methodological point of view, the majority of prior studies focused on multi-vehicle collisions have employed univariate count models to estimate crash counts separately by collision type. However, univariate models fail to account for correlations which may exist between different collision types. Among others, the multivariate Poisson lognormal (MVPLN) model with spatial correlation is a promising multivariate specification because it allows not only for unobserved heterogeneity (extra-Poisson variation) and dependencies between collision types, but also for spatial correlation between adjacent sites. However, the MVPLN spatial model has rarely been applied in previous research for simultaneously modelling crash counts by collision type. Therefore, this study aims at utilizing an MVPLN spatial model to estimate crash counts for four different multi-vehicle collision types, including head-on, rear-end, angle, and sideswipe collisions. To investigate the performance of the MVPLN spatial model, a two-stage model and a univariate Poisson lognormal (UNPLN) spatial model were also developed in this study. Detailed information on roadway characteristics, traffic volume, and crash history was collected on 407 homogeneous segments from Malaysian federal roads. The results indicate that the MVPLN spatial model outperforms the other models compared in terms of goodness-of-fit measures. The results also show that the inclusion of spatial heterogeneity in the multivariate model significantly improves the model fit, as indicated by the Deviance Information Criterion (DIC). 
The correlation between crash types is high and positive, implying that the occurrence of a specific collision type is highly associated with the occurrence of other crash types on the same road segment. These results support the utilization of the MVPLN spatial model when predicting crash counts by collision manner. In terms of contributing factors, the results show that distinct crash types are attributed to different subsets of explanatory variables. Copyright © 2018 Elsevier Ltd. All rights reserved.
Mortality Prediction Model of Septic Shock Patients Based on Routinely Recorded Data

PubMed Central

Carrara, Marta; Baselli, Giuseppe; Ferrario, Manuela

2015-01-01

We studied the problem of mortality prediction in two datasets, the first composed of 23 septic shock patients and the second composed of 73 septic subjects selected from the public database MIMIC-II. For each patient we derived hemodynamic variables, laboratory results, and clinical information of the first 48 hours after shock onset and we performed univariate and multivariate analyses to predict mortality in the following 7 days. The results show interesting features that individually identify significant differences between survivors and nonsurvivors and features which gain importance only when considered together with the others in a multivariate regression model. This preliminary study on two small septic shock populations represents a novel contribution towards new personalized models for an integration of multiparameter patient information to improve critical care management of shock patients. PMID:26557154
Neural network-based nonlinear model predictive control vs. linear quadratic gaussian control

USGS Publications Warehouse

Cho, C.; Vance, R.; Mardi, N.; Qian, Z.; Prisbrey, K.

1997-01-01

One problem with the application of neural networks to the multivariable control of mineral and extractive processes is determining whether and how to use them. The objective of this investigation was to compare neural network control to more conventional strategies and to determine if there are any advantages in using neural network control in terms of set-point tracking, rise time, settling time, disturbance rejection and other criteria. The procedure involved developing neural network controllers using both historical plant data and simulation models. Various control patterns were tried, including both inverse and direct neural network plant models. These were compared to state space controllers that are, by nature, linear. For grinding and leaching circuits, a nonlinear neural network-based model predictive control strategy was superior to a state space-based linear quadratic gaussian controller. The investigation pointed out the importance of incorporating state space into neural networks by making them recurrent, i.e., feeding certain output state variables into input nodes in the neural network. It was concluded that neural network controllers can have better disturbance rejection, set-point tracking, rise time, settling time and lower set-point overshoot, and it was also concluded that neural network controllers can be more reliable and easy to implement in complex, multivariable plants.
Heuristic-driven graph wavelet modeling of complex terrain

NASA Astrophysics Data System (ADS)

Cioacǎ, Teodor; Dumitrescu, Bogdan; Stupariu, Mihai-Sorin; Pǎtru-Stupariu, Ileana; Nǎpǎrus, Magdalena; Stoicescu, Ioana; Peringer, Alexander; Buttler, Alexandre; Golay, François

2015-03-01

We present a novel method for building a multi-resolution representation of large digital surface models. The surface points coincide with the nodes of a planar graph which can be processed using a critically sampled, invertible lifting scheme. To drive the lazy wavelet node partitioning, we employ an attribute aware cost function based on the generalized quadric error metric. The resulting algorithm can be applied to multivariate data by storing additional attributes at the graph's nodes. We discuss how the cost computation mechanism can be coupled with the lifting scheme and examine the results by evaluating the root mean square error. The algorithm is experimentally tested using two multivariate LiDAR sets representing terrain surface and vegetation structure with different sampling densities.
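
The lifting scheme referred to above proceeds by splitting samples into two sets (the lazy wavelet), predicting one set from the other, and updating the remainder. The sketch below shows those steps for a one-dimensional, even-length signal using Haar-style predict and update rules. It is a simplified illustration only: the paper operates on graph nodes with a quadric-error cost, which is not reproduced here, and the function names are hypothetical.

```python
def lift_forward(signal):
    """One level of a Haar-style lifting transform (even-length input)."""
    even, odd = signal[0::2], signal[1::2]               # split (lazy wavelet)
    detail = [o - e for e, o in zip(even, odd)]          # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update: pairwise mean
    return approx, detail

def lift_inverse(approx, detail):
    """Undo lift_forward by reversing the steps in opposite order."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    out = []
    for e, o in zip(even, odd):
        out += [e, o]
    return out
```

The transform is critically sampled in the sense that the approximation and detail coefficients together number exactly as many values as the input, and it is invertible because each lifting step can be undone exactly.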
Selecting an Informative/Discriminating Multivariate Response for Inverse Prediction

DOE PAGES

Thomas, Edward V.; Lewis, John R.; Anderson-Cook, Christine M.; ...

2017-11-21

Inverse prediction is important in a wide variety of scientific and engineering contexts. One might use inverse prediction to predict fundamental properties/characteristics of an object using measurements obtained from it. This can be accomplished by “inverting” parameterized forward models that relate the measurements (responses) to the properties/characteristics of interest. Sometimes forward models are science based; but often, forward models are empirically based, using the results of experimentation. For empirically-based forward models, it is important that the experiments provide a sound basis to develop accurate forward models in terms of the properties/characteristics (factors). While nature dictates the causal relationship between factors and responses, experimenters can influence control of the type, accuracy, and precision of forward models that can be constructed via selection of factors, factor levels, and the set of trials that are performed. Whether the forward models are based on science, experiments or both, researchers can influence the ability to perform inverse prediction by selecting informative response variables. By using an errors-in-variables framework for inverse prediction, this paper shows via simple analysis and examples how the capability of a multivariate response (with respect to being informative and discriminating) can vary depending on how well the various responses complement one another over the range of the factor-space of interest. Insights derived from this analysis could be useful for selecting a set of response variables among candidates in cases where the number of response variables that can be acquired is limited by difficulty, expense, and/or availability of material.
Selecting an Informative/Discriminating Multivariate Response for Inverse Prediction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thomas, Edward V.; Lewis, John R.; Anderson-Cook, Christine M.

Inverse prediction is important in a wide variety of scientific and engineering contexts. One might use inverse prediction to predict fundamental properties/characteristics of an object using measurements obtained from it. This can be accomplished by “inverting” parameterized forward models that relate the measurements (responses) to the properties/characteristics of interest. Sometimes forward models are science based; but often, forward models are empirically based, using the results of experimentation. For empirically-based forward models, it is important that the experiments provide a sound basis to develop accurate forward models in terms of the properties/characteristics (factors). While nature dictates the causal relationship between factors and responses, experimenters can influence control of the type, accuracy, and precision of forward models that can be constructed via selection of factors, factor levels, and the set of trials that are performed. Whether the forward models are based on science, experiments or both, researchers can influence the ability to perform inverse prediction by selecting informative response variables. By using an errors-in-variables framework for inverse prediction, this paper shows via simple analysis and examples how the capability of a multivariate response (with respect to being informative and discriminating) can vary depending on how well the various responses complement one another over the range of the factor-space of interest. Insights derived from this analysis could be useful for selecting a set of response variables among candidates in cases where the number of response variables that can be acquired is limited by difficulty, expense, and/or availability of material.
DEFINITION OF MULTIVARIATE GEOCHEMICAL ASSOCIATIONS WITH POLYMETALLIC MINERAL OCCURRENCES USING A SPATIALLY DEPENDENT CLUSTERING TECHNIQUE AND RASTERIZED STREAM SEDIMENT DATA - AN ALASKAN EXAMPLE.

USGS Publications Warehouse

Jenson, Susan K.; Trautwein, C.M.

1984-01-01

The application of an unsupervised, spatially dependent clustering technique (AMOEBA) to interpolated raster arrays of stream sediment data has been found to provide useful multivariate geochemical associations for modeling regional polymetallic resource potential. The technique is based on three assumptions regarding the compositional and spatial relationships of stream sediment data and their regional significance. These assumptions are: (1) compositionally separable classes exist and can be statistically distinguished; (2) the classification of multivariate data should minimize the pair probability of misclustering to establish useful compositional associations; and (3) a compositionally defined class represented by three or more contiguous cells within an array is a more important descriptor of a terrane than a class represented by spatial outliers.
Towards a Multi-scale Montecarlo Climate Emulator for Coastal Flooding and Long-Term Coastal Change Modeling: The Beautiful Problem

NASA Astrophysics Data System (ADS)

Rueda, A.; Alvarez Antolinez, J. A.; Hegermiller, C.; Serafin, K.; Anderson, D. L.; Ruggiero, P.; Barnard, P.; Erikson, L. H.; Vitousek, S.; Camus, P.; Tomas, A.; Gonzalez, M.; Mendez, F. J.

2016-02-01

Long-term coastal evolution and coastal flooding hazards are the result of the non-linear interaction of multiple oceanographic, hydrological, geological and meteorological forcings (e.g., astronomical tide, monthly mean sea level, large-scale storm surge, dynamic wave set-up, shoreline evolution, backshore erosion). Additionally, interannual variability and trends in storminess and sea level rise are climate drivers that must be considered. Moreover, the chronology of the hydraulic boundary conditions plays an important role since a collection of consecutive minor storm events can have more impact than the 100-yr return level event. Therefore, proper modeling of shoreline erosion, beach recovery and coastal flooding should consider the sequence of storms, the multivariate nature of the hydrodynamic forcings, and the different time scales of interest (seasonality, interannual and decadal variability). To address this `beautiful problem', we propose a hybrid approach that combines: (a) numerical hydrodynamic and morphodynamic models (SWAN for wave transformation, a shoreline change model, X-Beach for modeling infragravity waves and erosion of the backshore during extreme events and RFSM-EDA (Jamieson et al, 2012) for high resolution flooding of the coastal hinterland); (b) long-term data bases (observational and hindcast) of sea state parameters, astronomical tides and non-tidal residuals; and (c) statistical downscaling techniques, non-linear data mining, and extreme value models. The statistical downscaling approaches for multivariate variables are based on circulation patterns (Espejo et al., 2014), the chronology of the circulation patterns (Guanche et al, 2013) and the event hydrographs of multivariate extremes, resulting in a time-dependent climate emulator of hydraulic boundary conditions for coupled simulations of the coastal change and flooding models. 
References: Espejo et al (2014) Spectral ocean wave climate variability based on circulation patterns, J Phys Oc, doi: 10.1175/JPO-D-13-0276.1; Guanche et al (2013) Autoregressive logistic regression applied to atmospheric circulation patterns, Clim Dyn, doi: 10.1007/s00382-013-1690-3; Jamieson et al (2012) A highly efficient 2D flood model with sub-element topography, Proc. of the Inst Civil Eng., 165(10), 581-595
A Robust Bayesian Approach for Structural Equation Models with Missing Data

ERIC Educational Resources Information Center

Lee, Sik-Yum; Xia, Ye-Mao

2008-01-01

In this paper, normal/independent distributions, including but not limited to the multivariate t distribution, the multivariate contaminated distribution, and the multivariate slash distribution, are used to develop a robust Bayesian approach for analyzing structural equation models with complete or missing data. In the context of a nonlinear…
Controlled pattern imputation for sensitivity analysis of longitudinal binary and ordinal outcomes with nonignorable dropout.

PubMed

Tang, Yongqiang

2018-04-30

The controlled imputation method refers to a class of pattern mixture models that have been commonly used as sensitivity analyses of longitudinal clinical trials with nonignorable dropout in recent years. These pattern mixture models assume that participants in the experimental arm after dropout have similar response profiles to the control participants or have worse outcomes than otherwise similar participants who remain on the experimental treatment. In spite of its popularity, the controlled imputation has not been formally developed for longitudinal binary and ordinal outcomes partially due to the lack of a natural multivariate distribution for such endpoints. In this paper, we propose 2 approaches for implementing the controlled imputation for binary and ordinal data based respectively on the sequential logistic regression and the multivariate probit model. Efficient Markov chain Monte Carlo algorithms are developed for missing data imputation by using the monotone data augmentation technique for the sequential logistic regression and a parameter-expanded monotone data augmentation scheme for the multivariate probit model. We assess the performance of the proposed procedures by simulation and the analysis of a schizophrenia clinical trial and compare them with the fully conditional specification, last observation carried forward, and baseline observation carried forward imputation methods. Copyright © 2018 John Wiley & Sons, Ltd.
Tuning algorithms for fractional order internal model controllers for time delay processes

NASA Astrophysics Data System (ADS)

Muresan, Cristina I.; Dutta, Abhishek; Dulf, Eva H.; Pinar, Zehra; Maxim, Anca; Ionescu, Clara M.

2016-03-01

This paper presents two tuning algorithms for fractional-order internal model control (IMC) controllers for time delay processes. The two tuning algorithms are based on two specific closed-loop control configurations: the IMC control structure and the Smith predictor structure. In the latter, the equivalency between IMC and Smith predictor control structures is used to tune a fractional-order IMC controller as the primary controller of the Smith predictor structure. Fractional-order IMC controllers are designed in both cases in order to enhance the closed-loop performance and robustness of classical integer order IMC controllers. The tuning procedures are exemplified for both single-input-single-output as well as multivariable processes, described by first-order and second-order transfer functions with time delays. Different numerical examples are provided, including a general multivariable time delay process. Integer order IMC controllers are designed in each case, as well as fractional-order IMC controllers. The simulation results show that the proposed fractional-order IMC controller ensures an increased robustness to modelling uncertainties. Experimental results are also provided, for the design of a multivariable fractional-order IMC controller in a Smith predictor structure for a quadruple-tank system.

An approach to multivariable control of manipulators

NASA Technical Reports Server (NTRS)

Seraji, H.

1987-01-01

The paper presents simple schemes for multivariable control of multiple-joint robot manipulators in joint and Cartesian coordinates. The joint control scheme consists of two independent multivariable feedforward and feedback controllers. The feedforward controller is the minimal inverse of the linearized model of robot dynamics and contains only proportional-double-derivative (PD2) terms - implying feedforward from the desired position, velocity and acceleration. This controller ensures that the manipulator joint angles track any reference trajectories. The feedback controller is of proportional-integral-derivative (PID) type and is designed to achieve pole placement. This controller reduces any initial tracking error to zero as desired and also ensures that robust steady-state tracking of step-plus-exponential trajectories is achieved by the joint angles. Simple and explicit expressions of computation of the feedforward and feedback gains are obtained based on the linearized model of robot dynamics. This leads to computationally efficient schemes for either on-line gain computation or off-line gain scheduling to account for variations in the linearized robot model due to changes in the operating point. The joint control scheme is extended to direct control of the end-effector motion in Cartesian space. Simulation results are given for illustration.
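
The control law described above sums a feedforward term built from the desired position, velocity and acceleration with a PID feedback term acting on the tracking error. A scalar, single-joint sketch of those two pieces is given below. The class, function and gain names are hypothetical, and the paper's multivariable gain matrices and pole-placement formulas are not reproduced.

```python
class PID:
    """Discrete-time PID feedback on a scalar tracking error."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt  # rectangular integration
        # No derivative action on the first sample (no previous error yet)
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv


def feedforward(q_des, qd_des, qdd_des, a0, a1, a2):
    """PD2-style feedforward: weights on desired position, velocity, acceleration."""
    return a0 * q_des + a1 * qd_des + a2 * qdd_des
```

In use, the commanded input at each sample would be `feedforward(...) + pid.step(q_des - q_measured, dt)`, so the feedforward part supplies the nominal effort and the feedback part removes the residual tracking error.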
[Near infrared spectroscopy based process trajectory technology and its application in monitoring and controlling of traditional Chinese medicine manufacturing process].

PubMed

Li, Wen-Long; Qu, Hai-Bin

2016-10-01

In this paper, the principle of NIRS (near infrared spectroscopy)-based process trajectory technology was introduced. The main steps of the technique include: ① in-line collection of the process spectra of different techniques; ② unfolding of the 3-D process spectra; ③ determination of the process trajectories and their normal limits; ④ monitoring of the new batches with the established MSPC (multivariate statistical process control) models. Applications of the technology in chemical and biological medicines were reviewed briefly. Through a comprehensive introduction to our feasibility research on the monitoring of traditional Chinese medicine manufacturing processes using NIRS-based multivariate process trajectories, several important problems in practical applications that urgently need solutions are proposed, and the application prospects of the NIRS-based process trajectory technology are discussed at the end. Copyright© by the Chinese Pharmaceutical Association.
A new extranodal scoring system based on the prognostically relevant extranodal sites in diffuse large B-cell lymphoma, not otherwise specified treated with chemoimmunotherapy.

PubMed

Hwang, Hee Sang; Yoon, Dok Hyun; Suh, Cheolwon; Huh, Jooryung

2016-08-01

Extranodal involvement is a well-known prognostic factor in patients with diffuse large B-cell lymphomas (DLBCL). Nevertheless, the prognostic impact of the extranodal scoring system included in the conventional international prognostic index (IPI) has been questioned in an era where rituximab treatment has become widespread. We investigated the prognostic impacts of individual sites of extranodal involvement in 761 patients with DLBCL who received rituximab-based chemoimmunotherapy. Subsequently, we established a new extranodal scoring system based on extranodal sites, showing significant prognostic correlation, and compared this system with conventional scoring systems, such as the IPI and the National Comprehensive Cancer Network-IPI (NCCN-IPI). An internal validation procedure, using bootstrapped samples, was also performed for both univariate and multivariate models. Using multivariate analysis with a backward variable selection, we found nine extranodal sites (the liver, lung, spleen, central nervous system, bone marrow, kidney, skin, adrenal glands, and peritoneum) that remained significant for use in the final model. Our newly established extranodal scoring system, based on these sites, was better correlated with patient survival than standard scoring systems, such as the IPI and the NCCN-IPI. Internal validation by bootstrapping demonstrated an improvement in model performance of our modified extranodal scoring system. Our new extranodal scoring system, based on the prognostically relevant sites, may improve the performance of conventional prognostic models of DLBCL in the rituximab era and warrants further external validation using large study populations.
Selecting climate simulations for impact studies based on multivariate patterns of climate change.

PubMed

Mendlik, Thomas; Gobiet, Andreas

In climate change impact research it is crucial to carefully select the meteorological input for impact models. We present a method for model selection that enables the user to shrink the ensemble to a few representative members, conserving the model spread and accounting for model similarity. This is done in three steps: First, using principal component analysis for a multitude of meteorological parameters, to find common patterns of climate change within the multi-model ensemble. Second, detecting model similarities with regard to these multivariate patterns using cluster analysis. And third, sampling models from each cluster, to generate a subset of representative simulations. We present an application based on the ENSEMBLES regional multi-model ensemble with the aim to provide input for a variety of climate impact studies. We find that the two most dominant patterns of climate change relate to temperature and humidity patterns. The ensemble can be reduced from 25 to 5 simulations while still maintaining its essential characteristics. Having such a representative subset of simulations reduces computational costs for climate impact modeling and enhances the quality of the ensemble at the same time, as it prevents double-counting of dependent simulations that would lead to biased statistics. The online version of this article (doi:10.1007/s10584-015-1582-0) contains supplementary material, which is available to authorized users.
Borrowing of strength and study weights in multivariate and network meta-analysis.

PubMed

Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

2017-12-01

Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of 'borrowing of strength'. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).
Borrowing of strength and study weights in multivariate and network meta-analysis

PubMed Central

Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

2016-01-01

Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis). PMID:26546254
Application of Fluorescence Spectrometry With Multivariate Calibration to the Enantiomeric Recognition of Fluoxetine in Pharmaceutical Preparations.

PubMed

Poláček, Roman; Májek, Pavel; Hroboňová, Katarína; Sádecká, Jana

2016-04-01

Fluoxetine is the most prescribed antidepressant chiral drug worldwide. Its enantiomers have a different duration of serotonin inhibition. A novel, simple, and rapid method for determination of the enantiomeric composition of fluoxetine in pharmaceutical pills is presented. Specifically, emission, excitation, and synchronous fluorescence techniques were employed to obtain the spectral data, which were then analyzed with multivariate calibration methods, namely principal component regression (PCR) and partial least squares (PLS). The chiral recognition of fluoxetine enantiomers in the presence of β-cyclodextrin was based on diastereomeric complexes. The results of the multivariate calibration modeling indicated good prediction abilities. The results obtained for tablets were compared with those from chiral HPLC, and no significant differences were found by Fisher's (F) test and Student's t-test. The smallest residuals between reference or nominal values and predicted values were achieved by multivariate calibration of synchronous fluorescence spectral data. This conclusion is supported by calculated values of the figure of merit.
How to compare cross-lagged associations in a multilevel autoregressive model.

PubMed

Schuurman, Noémi K; Ferrer, Emilio; de Boer-Sonnenschein, Mieke; Hamaker, Ellen L

2016-06-01

By modeling variables over time it is possible to investigate the Granger-causal cross-lagged associations between variables. By comparing the standardized cross-lagged coefficients, the relative strength of these associations can be evaluated in order to determine important driving forces in the dynamic system. The aim of this study was twofold: first, to illustrate the added value of a multilevel multivariate autoregressive modeling approach for investigating these associations over more traditional techniques; and second, to discuss how the coefficients of the multilevel autoregressive model should be standardized for comparing the strength of the cross-lagged associations. The hierarchical structure of multilevel multivariate autoregressive models complicates standardization, because subject-based statistics or group-based statistics can be used to standardize the coefficients, and each method may result in different conclusions. We argue that in order to make a meaningful comparison of the strength of the cross-lagged associations, the coefficients should be standardized within persons. We further illustrate the bivariate multilevel autoregressive model and the standardization of the coefficients, and we show that disregarding individual differences in dynamics can prove misleading, by means of an empirical example on experienced competence and exhaustion in persons diagnosed with burnout. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
New robust bilinear least squares method for the analysis of spectral-pH matrix data.

PubMed

Goicoechea, Héctor C; Olivieri, Alejandro C

2005-07-01

A new second-order multivariate method has been developed for the analysis of spectral-pH matrix data, based on a bilinear least-squares (BLLS) model achieving the second-order advantage and handling multiple calibration standards. A simulated Monte Carlo study of synthetic absorbance-pH data allowed comparison of the newly proposed BLLS methodology with constrained parallel factor analysis (PARAFAC) and with the combination multivariate curve resolution-alternating least-squares (MCR-ALS) technique under different conditions of sample-to-sample pH mismatch and analyte-background ratio. The results indicate an improved prediction ability for the new method. Experimental data generated by measuring absorption spectra of several calibration standards of ascorbic acid and samples of orange juice were subjected to second-order calibration analysis with PARAFAC, MCR-ALS, and the new BLLS method. The results indicate that the latter method provides the best analytical results in regard to analyte recovery in samples of complex composition requiring strict adherence to the second-order advantage. Linear dependencies appear when multivariate data are produced by using the pH or a reaction time as one of the data dimensions, posing a challenge to classical multivariate calibration models. The presently discussed algorithm is useful for these latter systems.
Identifying Pleiotropic Genes in Genome-Wide Association Studies for Multivariate Phenotypes with Mixed Measurement Scales

PubMed Central

Williams, L. Keoki; Buu, Anne

2017-01-01

We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains the power at the level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches–dichotomizing all observed phenotypes or treating them as continuous variables–could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not be, otherwise, chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
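The Fisher's combination statistic named in the abstract above can be sketched in a few lines. This is only a minimal illustration of the textbook case, not the authors' procedure: the closed-form chi-square p-value below is valid only when the combined tests are independent, whereas the abstract describes estimating the null distribution empirically because the phenotypes are correlated. The function names are hypothetical.

```python
import math


def fisher_combination(p_values):
    """Return Fisher's statistic X = -2 * sum(ln p_i) for a list of p-values."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, k):
    """Upper tail of a chi-square variable with 2*k degrees of freedom.

    For even degrees of freedom the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)**i / i!.
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_p_value_independent(p_values):
    """Combined p-value. Assumption: the individual tests are independent."""
    stat = fisher_combination(p_values)
    return chi2_sf_even_df(stat, len(p_values))
```

With correlated phenotypes the statistic itself is computed the same way, but its reference distribution would have to be replaced by an estimated one, which this sketch does not attempt.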
FT-IR/ATR univariate and multivariate calibration models for in situ monitoring of sugars in complex microalgal culture media.

PubMed

Girard, Jean-Michel; Deschênes, Jean-Sébastien; Tremblay, Réjean; Gagnon, Jonathan

2013-09-01

The objective of this work is to develop a quick and simple method for the in situ monitoring of sugars in biological cultures. A new technology based on Attenuated Total Reflectance-Fourier Transform Infrared (FT-IR/ATR) spectroscopy in combination with an external light guiding fiber probe was tested, first to build predictive models from solutions of pure sugars, and secondly to use those models to monitor the sugars in the complex culture medium of mixotrophic microalgae. Quantification results from the univariate model were correlated with the total dissolved solids content (R² = 0.74). A vector normalized multivariate model was used to proportionally quantify the different sugars present in the complex culture medium and showed a predictive accuracy of >90% for sugars representing >20% of the total. This method offers an alternative to conventional sugar monitoring assays and could be used at-line or on-line in commercial scale production systems. Copyright © 2013 Elsevier Ltd. All rights reserved.
Nonstationary multivariate modeling of cerebral autoregulation during hypercapnia.

PubMed

Kostoglou, Kyriaki; Debert, Chantel T; Poulin, Marc J; Mitsis, Georgios D

2014-05-01

We examined the time-varying characteristics of cerebral autoregulation and hemodynamics during a step hypercapnic stimulus by using recursively estimated multivariate (two-input) models which quantify the dynamic effects of mean arterial blood pressure (ABP) and end-tidal CO2 tension (PETCO2) on middle cerebral artery blood flow velocity (CBFV). Beat-to-beat values of ABP and CBFV, as well as breath-to-breath values of PETCO2 during baseline and sustained euoxic hypercapnia were obtained in 8 female subjects. The multiple-input, single-output models used were based on the Laguerre expansion technique, and their parameters were updated using recursive least squares with multiple forgetting factors. The results reveal the presence of nonstationarities that confirm previously reported effects of hypercapnia on autoregulation, i.e. a decrease in the MABP phase lead, and suggest that the incorporation of PETCO2 as an additional model input yields less time-varying estimates of dynamic pressure autoregulation obtained from single-input (ABP-CBFV) models. Copyright © 2013 IPEM. Published by Elsevier Ltd. All rights reserved.
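The recursive estimation mentioned in the abstract above can be illustrated with a generic recursive least squares (RLS) update. This is a simplified sketch under stated assumptions: it uses a single forgetting factor and a two-parameter static model, whereas the abstract describes Laguerre-expansion models updated with multiple forgetting factors. The function name, the default forgetting factor, and the initial covariance `p0` are all illustrative choices, not values from the study.

```python
def rls_two_input(u1, u2, y, forgetting=0.98, p0=1e4):
    """Track theta in y[t] ~ theta[0]*u1[t] + theta[1]*u2[t] by recursive least squares.

    Returns the parameter estimate after each sample, so time variation can be inspected.
    """
    theta = [0.0, 0.0]
    # Large initial covariance: the starting guess carries almost no weight.
    P = [[p0, 0.0], [0.0, p0]]
    history = []
    for a, b, target in zip(u1, u2, y):
        x = (a, b)
        # Px = P @ x
        Px = [P[0][0] * x[0] + P[0][1] * x[1],
              P[1][0] * x[0] + P[1][1] * x[1]]
        denom = forgetting + x[0] * Px[0] + x[1] * Px[1]
        gain = [Px[0] / denom, Px[1] / denom]
        # Prediction error with the previous estimate.
        err = target - (theta[0] * x[0] + theta[1] * x[1])
        theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
        # P <- (P - gain * x^T P) / forgetting; x^T P equals Px^T because P is symmetric.
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(2)]
             for i in range(2)]
        history.append(tuple(theta))
    return history
```

A forgetting factor below 1 discounts old samples geometrically, which is what lets the estimate follow a change in the underlying dynamics; values closer to 1 give smoother but slower tracking.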
Multivariate dynamic Tobit models with lagged observed dependent variables: An effectiveness analysis of highway safety laws.

PubMed

Dong, Chunjiao; Xie, Kun; Zeng, Jin; Li, Xia

2018-04-01

Highway safety laws aim to influence driver behaviors so as to reduce the frequency and severity of crashes and their outcomes. A given highway safety law may have different effects on crashes of different severities. Understanding such effects can help policy makers upgrade current laws and hence improve traffic safety. To investigate the effects of highway safety laws on crashes across severities, multivariate models are needed to account for the interdependency in crash counts across severities. Based on the characteristics of the dependent variables, multivariate dynamic Tobit (MVDT) models are proposed to analyze crash counts that are aggregated at the state level. Lagged observed dependent variables are incorporated into the MVDT models to account for potential temporal correlation in crash data. Factors related to state highway safety laws are used as the explanatory variables, and socio-demographic and traffic factors are used as the control variables. Three models are developed and compared: an MVDT model with lagged observed dependent variables, an MVDT model with unobserved random variables, and a multivariate static Tobit (MVST) model. The results show that, among the investigated models, the MVDT model with lagged observed dependent variables has the best goodness-of-fit. The findings indicate that, compared to the MVST model, the MVDT models have better explanatory power and prediction accuracy. The MVDT model with lagged observed variables can better handle the stochasticity and dependency in the temporal evolution of the crash counts, and the estimated values from the model are closer to the observed values. The results show that more lives could be saved if law enforcement agencies make a sustained effort to educate the public about the importance of motorcyclists wearing helmets. Motor vehicle crash-related deaths, injuries, and property damage could be reduced if states enact laws for stricter text messaging rules, higher speeding fines, older licensing age, and stronger graduated licensing provisions. Injury and property-damage-only (PDO) crashes would be significantly reduced with stricter laws prohibiting the use of hand-held communication devices and higher fines for drunk driving. Copyright © 2018 Elsevier Ltd. All rights reserved.
A Comparison of Three Multivariate Models for Estimating Test Battery Reliability.

ERIC Educational Resources Information Center

Wood, Terry M.; Safrit, Margaret J.

1987-01-01

A comparison of three multivariate models (canonical reliability model, maximum generalizability model, canonical correlation model) for estimating test battery reliability indicated that the maximum generalizability model showed the least degree of bias, smallest errors in estimation, and the greatest relative efficiency across all experimental…
ProbOnto: ontology and knowledge base of probability distributions.

PubMed

Swat, Maciej J; Grenon, Pierre; Wimalaratne, Sarala

2016-09-01

Probability distributions play a central role in mathematical and statistical modelling. The encoding, annotation and exchange of such models could be greatly simplified by a resource providing a common reference for the definition of probability distributions. Although some resources exist, there is no suitably detailed and complex ontology, nor any database allowing programmatic access. ProbOnto is an ontology-based knowledge base of probability distributions, featuring more than 80 uni- and multivariate distributions with their defining functions, characteristics, relationships and re-parameterization formulas. It can be used for model annotation and facilitates the encoding of distribution-based models, related functions and quantities. Availability: http://probonto.org. Contact: mjswat@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution.

PubMed

Inouye, David; Yang, Eunho; Allen, Genevera; Ravikumar, Pradeep

2017-01-01

The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section.
Prediction of energy expenditure and physical activity in preschoolers

USDA-ARS's Scientific Manuscript database

Accurate, nonintrusive, and feasible methods are needed to predict energy expenditure (EE) and physical activity (PA) levels in preschoolers. Herein, we validated cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on accelerometry and heart rate (HR) ...
A model for prediction of color change after tooth bleaching based on CIELAB color space

NASA Astrophysics Data System (ADS)

Herrera, Luis J.; Santana, Janiley; Yebra, Ana; Rivas, María José; Pulgar, Rosa; Pérez, María M.

2017-08-01

An experimental study aiming to develop a model based on CIELAB color space for prediction of color change after a tooth bleaching procedure is presented. Multivariate linear regression models were obtained to predict the L*, a*, b* and W* post-bleaching values using the pre-bleaching L*, a* and b* values. Moreover, univariate linear regression models were obtained to predict the variation in chroma (C*), hue angle (h°) and W*. The results demonstrated that it is possible to estimate color change when using a carbamide peroxide tooth-bleaching system. The models obtained can be applied in the clinic to predict the colour change after bleaching.
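The chroma and hue angle quantities named in the abstract above follow from the a* and b* coordinates by standard CIELAB definitions, and a color change between two measurements is commonly summarized by the CIE76 difference ΔE*ab. The sketch below shows only these standard formulas; it does not reproduce the regression models of the study, whose coefficients are not given in the abstract, and it leaves out the W* index. Function names are illustrative.

```python
import math


def chroma(a, b):
    """CIELAB chroma C* = sqrt(a*^2 + b*^2)."""
    return math.hypot(a, b)


def hue_angle_deg(a, b):
    """CIELAB hue angle h = atan2(b*, a*), reported in degrees within [0, 360)."""
    return math.degrees(math.atan2(b, a)) % 360.0


def delta_e_ab(lab_before, lab_after):
    """CIE76 color difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab_before, lab_after)))
```

Given a predicted post-bleaching (L*, a*, b*) triple from any model, `delta_e_ab` applied to the pre-bleaching and predicted values yields the expected overall color change as a single number.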
Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD): The TRIPOD Statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-06-01

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). 
Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
Generational Sex And HIV Risk Among Indigenous Women In A Street-Based Urban Canadian Setting

PubMed Central

Bingham, Brittany; Leo, Diane; Zhang, Ruth; Montaner, Julio

2014-01-01

In Canada, indigenous women are overrepresented among new HIV infections and street-based sex workers. Scholars suggest that Aboriginal women’s HIV risk stems from intergenerational effects of colonisation and racial policies. This research examined generational sex work involvement among Aboriginal and non-Aboriginal women and the effect on risk for HIV acquisition. The sample included 225 women in street-based sex work and enrolled in a community-based prospective cohort, in partnership with local sex work and Aboriginal community partners. Bivariate and multivariate logistic regression modeled an independent relationship between Aboriginal ancestry and generational sex work; and the impact of generational sex work on HIV infection among Aboriginal sex workers. Aboriginal women (48%) were more likely to be HIV-positive, with 34% living with HIV compared to 24% non-Aboriginal. In multivariate logistic regression model, Aboriginal women remained 3 times more likely to experience generational sex work (aOR:2.97; 95%CI:1.5,5.8). Generational sex work was significantly associated with HIV (aOR=3.01, 95%CI: 1.67–4.58) in a confounder model restricted to Aboriginal women. High prevalence of generational sex work among Aboriginal women and 3-fold increased risk for HIV infection are concerning. Policy reforms and community-based, culturally safe and trauma informed HIV prevention initiatives are required for Indigenous sex workers. PMID:24654881

Generational sex work and HIV risk among Indigenous women in a street-based urban Canadian setting.

PubMed

Bingham, Brittany; Leo, Diane; Zhang, Ruth; Montaner, Julio; Shannon, Kate

2014-01-01

In Canada, Indigenous women are over-represented among new HIV infections and street-based sex workers. Scholars suggest that Aboriginal women's HIV risk stems from intergenerational effects of colonisation and racial policies. This research examined generational sex work involvement among Aboriginal and non-Aboriginal women and the effect on risk for HIV acquisition. The sample included 225 women in street-based sex work and enrolled in a community-based prospective cohort, in partnership with local sex work and Aboriginal community partners. Bivariate and multivariate logistic regression modeled an independent relationship between Aboriginal ancestry and generational sex work and the impact of generational sex work on HIV infection among Aboriginal sex workers. Aboriginal women (48%) were more likely to be HIV-positive, with 34% living with HIV compared to 24% non-Aboriginal women. In multivariate logistic regression model, Aboriginal women remained three times more likely to experience generational sex work (AOR:2.97; 95%CI:1.5,5.8). Generational sex work was significantly associated with HIV (AOR = 3.01, 95%CI: 1.67-4.58) in a confounder model restricted to Aboriginal women. High prevalence of generational sex work among Aboriginal women and three-fold increased risk for HIV infection are concerning. Policy reforms and community-based, culturally safe and trauma informed HIV-prevention initiatives are required for Indigenous sex workers.
Empirical study of the dependence of the results of multivariable flexible survival analyses on model selection strategy.

PubMed

Binquet, C; Abrahamowicz, M; Mahboubi, A; Jooste, V; Faivre, J; Bonithon-Kopp, C; Quantin, C

2008-12-30

Flexible survival models, which avoid assumptions about hazards proportionality (PH) or linearity of continuous covariates effects, bring the issues of model selection to a new level of complexity. Each 'candidate covariate' requires inter-dependent decisions regarding (i) its inclusion in the model, and representation of its effects on the log hazard as (ii) either constant over time or time-dependent (TD) and, for continuous covariates, (iii) either loglinear or non-loglinear (NL). Moreover, 'optimal' decisions for one covariate depend on the decisions regarding others. Thus, some efficient model-building strategy is necessary. We carried out an empirical study of the impact of the model selection strategy on the estimates obtained in flexible multivariable survival analyses of prognostic factors for mortality in 273 gastric cancer patients. We used 10 different strategies to select alternative multivariable parametric as well as spline-based models, allowing flexible modeling of non-parametric (TD and/or NL) effects. We employed 5-fold cross-validation to compare the predictive ability of alternative models. All flexible models indicated significant non-linearity and changes over time in the effect of age at diagnosis. Conventional 'parametric' models suggested the lack of period effect, whereas more flexible strategies indicated a significant NL effect. Cross-validation confirmed that flexible models predicted mortality better. The resulting differences in the 'final model' selected by various strategies also had an impact on the risk prediction for individual subjects. Overall, our analyses underline (a) the importance of accounting for significant non-parametric effects of covariates and (b) the need for developing accurate model selection strategies for flexible survival analyses. Copyright 2008 John Wiley & Sons, Ltd.
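The 5-fold cross-validation mentioned in the abstract above partitions the subjects into five disjoint test sets, each used once while the model is fitted on the remaining four. The following is a minimal sketch of such a split; the shuffling, the seed, and the function name are assumptions for illustration, since the abstract does not describe how folds were assigned.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation over n subjects."""
    idx = list(range(n))
    # Shuffle once so that fold membership does not depend on record order.
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

For a cohort of 273 subjects, `k_fold_indices(273)` produces three test folds of 55 and two of 54, and every subject appears in exactly one test fold.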
Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs

PubMed Central

Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin

2013-01-01

In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol-lowering drugs.

PubMed

Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin

2013-10-15

In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
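The Box-Cox transformation at the center of the two records above has a simple closed form, and letting each response have its own parameter amounts to applying it coordinate-wise. The sketch below shows only the transformation and its inverse; it does not implement the Bayesian model, the priors, or the sampling scheme described in the abstracts, and the function names and the example parameter values in the comment are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value: (y**lam - 1) / lam, or ln(y) when lam == 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive value")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


def transform_responses(responses, lambdas):
    """Apply a separate Box-Cox parameter to each response, e.g. (LDL-C, HDL-C, TG).

    Example call (illustrative values only): transform_responses((130.0, 45.0, 150.0), (0.5, 0.0, -0.5))
    """
    if len(responses) != len(lambdas):
        raise ValueError("one lambda is needed per response")
    return tuple(box_cox(y, lam) for y, lam in zip(responses, lambdas))
```

Choosing lam = 1 leaves the data linear up to a shift, while lam = 0 gives the log transform, so the parameter indexes a family running from no correction to a strong correction for right skew.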
A Multivariate Model and Analysis of Competitive Strategy in the U.S. Hardwood Lumber Industry

Treesearch

Robert J. Bush; Steven A. Sinclair

1991-01-01

Business-level competitive strategy in the hardwood lumber industry was modeled through the identification of strategic groups among large U.S. hardwood lumber producers. Strategy was operationalized using a measure based on the variables developed by Dess and Davis (1984). Factor and cluster analyses were used to define strategic groups along the dimensions of cost...
Numerical Simulation and Optimization of Directional Solidification Process of Single Crystal Superalloy Casting

PubMed Central

Zhang, Hang; Xu, Qingyan; Liu, Baicheng

2014-01-01

The rapid development of numerical modeling techniques has led to more accurate results in modeling metal solidification processes. In this study, the cellular automaton-finite difference (CA-FD) method was used to simulate the directional solidification (DS) process of single crystal (SX) superalloy blade samples. Experiments were carried out to validate the simulation results. Meanwhile, an intelligent model based on fuzzy control theory was built to optimize the complicated DS process. Several key parameters, such as mushy zone width and temperature difference at the cast-mold interface, were identified as the input variables. The input variables were passed through the multivariable fuzzy rule to obtain the output adjustment of the withdrawal rate (v), a key technological parameter. The multivariable fuzzy rule was built from the structural features of the casting (such as the relationship between section areas), the delay time of the temperature response to a change in v, and the professional experience of the operator. The fuzzy control model coupled with the CA-FD method could then be used to optimize v in real time during the manufacturing process. The optimized process was shown to be more flexible and adaptive for a steady and stray-grain-free DS process. PMID:28788535
Discrete Fourier Transform-Based Multivariate Image Analysis: Application to Modeling of Aromatase Inhibitory Activity.

PubMed

Barigye, Stephen J; Freitas, Matheus P; Ausina, Priscila; Zancan, Patricia; Sola-Penna, Mauro; Castillo-Garit, Juan A

2018-02-12

We recently generalized the formerly alignment-dependent multivariate image analysis applied to quantitative structure-activity relationships (MIA-QSAR) method through the application of the discrete Fourier transform (DFT), allowing for its application to noncongruent and structurally diverse chemical compound data sets. Here we report the first practical application of this method in the screening of molecular entities of therapeutic interest, with human aromatase inhibitory activity as the case study. We developed an ensemble classification model based on the two-dimensional (2D) DFT MIA-QSAR descriptors, with which we screened the NCI Diversity Set V (1593 compounds) and obtained 34 chemical compounds with possible aromatase inhibitory activity. These compounds were docked into the aromatase active site, and the 10 most promising compounds were selected for in vitro experimental validation. Of these compounds, 7419 (nonsteroidal) and 89 201 (steroidal) demonstrated satisfactory antiproliferative and aromatase inhibitory activities. The obtained results suggest that the 2D-DFT MIA-QSAR method may be useful in ligand-based virtual screening of new molecular entities of therapeutic utility.
A hybrid PCA-CART-MARS-based prognostic approach of the remaining useful life for aircraft engines.

PubMed

Sánchez Lasheras, Fernando; García Nieto, Paulino José; de Cos Juez, Francisco Javier; Mayo Bayón, Ricardo; González Suárez, Victor Manuel

2015-03-23

Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines.
A Hybrid PCA-CART-MARS-Based Prognostic Approach of the Remaining Useful Life for Aircraft Engines

PubMed Central

Lasheras, Fernando Sánchez; Nieto, Paulino José García; de Cos Juez, Francisco Javier; Bayón, Ricardo Mayo; Suárez, Victor Manuel González

2015-01-01

Prognostics is an engineering discipline that predicts the future health of a system. In this research work, a data-driven approach for prognostics is proposed. Indeed, the present paper describes a data-driven hybrid model for the successful prediction of the remaining useful life of aircraft engines. The approach combines the multivariate adaptive regression splines (MARS) technique with the principal component analysis (PCA), dendrograms and classification and regression trees (CARTs). Elements extracted from sensor signals are used to train this hybrid model, representing different levels of health for aircraft engines. In this way, this hybrid algorithm is used to predict the trends of these elements. Based on this fitting, one can determine the future health state of a system and estimate its remaining useful life (RUL) with accuracy. To evaluate the proposed approach, a test was carried out using aircraft engine signals collected from physical sensors (temperature, pressure, speed, fuel flow, etc.). Simulation results show that the PCA-CART-MARS-based approach can forecast faults long before they occur and can predict the RUL. The proposed hybrid model presents as its main advantage the fact that it does not require information about the previous operation states of the input variables of the engine. The performance of this model was compared with those obtained by other benchmark models (multivariate linear regression and artificial neural networks) also applied in recent years for the modeling of remaining useful life. Therefore, the PCA-CART-MARS-based approach is very promising in the field of prognostics of the RUL for aircraft engines. PMID:25806876
Rank-based methods for modeling dependence between loss triangles.

PubMed

Côté, Marie-Pier; Genest, Christian; Abdallah, Anas

2016-01-01

In order to determine the risk capital for their aggregate portfolio, property and casualty insurance companies must fit a multivariate model to the loss triangle data relating to each of their lines of business. As an inadequate choice of dependence structure may have an undesirable effect on reserve estimation, a two-stage inference strategy is proposed in this paper to assist with model selection and validation. Generalized linear models are first fitted to the margins. Standardized residuals from these models are then linked through a copula selected and validated using rank-based methods. The approach is illustrated with data from six lines of business of a large Canadian insurance company for which two hierarchical dependence models are considered, i.e., a fully nested Archimedean copula structure and a copula-based risk aggregation model.
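The rank-based second stage described above can be sketched as follows. This is only an illustration under stated assumptions: it uses a single bivariate Clayton copula (one member of the Archimedean family) and the textbook relation between Kendall's tau and the Clayton parameter, rather than the hierarchical structures considered in the paper. All function names are hypothetical.

```python
def pseudo_observations(xs):
    """Map residuals to rank-based pseudo-observations in (0, 1).

    Uses rank / (n + 1); ties are not handled in this sketch.
    """
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return [r / (n + 1) for r in ranks]

def kendall_tau(xs, ys):
    """Kendall's tau (tau-a): concordant minus discordant pairs over all pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)

def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) for the Clayton copula (0 < tau < 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("this inversion assumes positive dependence, 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)

# Hypothetical standardized residuals from two lines of business.
res_a = [1.0, 2.0, 3.0, 4.0]
res_b = [1.0, 3.0, 2.0, 4.0]
tau = kendall_tau(pseudo_observations(res_a), pseudo_observations(res_b))
theta = clayton_theta_from_tau(tau)
```

Because Kendall's tau depends only on ranks, it gives the same value on the raw residuals and on the pseudo-observations, which is what makes the dependence step insensitive to the choice of marginal models.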
Evaluating the predictive power of multivariate tensor-based morphometry in Alzheimer's disease progression via convex fused sparse group Lasso

NASA Astrophysics Data System (ADS)

Tsao, Sinchai; Gajawelli, Niharika; Zhou, Jiayu; Shi, Jie; Ye, Jieping; Wang, Yalin; Lepore, Natasha

2014-03-01

Prediction of Alzheimer's disease (AD) progression based on baseline measures allows us to understand disease progression and has implications in decisions concerning treatment strategy. To this end we combine a predictive multi-task machine learning method [1] with a novel MR-based multivariate morphometric surface map of the hippocampus [2] to predict future cognitive scores of patients. Previous work by Zhou et al. [1] has shown that a multi-task learning framework that performs prediction of all future time points (or tasks) simultaneously can be used to encode both sparsity and temporal smoothness. They showed that this can be used in predicting cognitive outcomes of Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects based on FreeSurfer-based baseline MRI features, MMSE score, demographic information and ApoE status. Whilst volumetric information may hold generalized information on brain status, we hypothesized that hippocampus-specific information may be more useful in predictive modeling of AD. To this end, we applied the recently developed multivariate tensor-based morphometry (mTBM) parametric surface analysis method of Shi et al. [2] to extract features from the hippocampal surface. We show that by combining the power of the multi-task framework with the sensitivity of mTBM features of the hippocampus surface, we are able to significantly improve predictive performance of ADAS cognitive scores 6, 12, 24, 36 and 48 months from baseline.
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis

ERIC Educational Resources Information Center

Ansari, Asim; Iyengar, Raghuram

2006-01-01

We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
[Predicting the outcome in severe injuries: an analysis of 2069 patients from the trauma register of the German Society of Traumatology (DGU)].

PubMed

Rixen, D; Raum, M; Bouillon, B; Schlosser, L E; Neugebauer, E

2001-03-01

On hospital admission numerous variables are documented from multiple trauma patients. The value of these variables for predicting outcome is controversial. The aim was to be able to determine, at admission, the probability of death of multiple trauma patients. Thus, a multivariate probability model was developed based on data obtained from the trauma registry of the Deutsche Gesellschaft für Unfallchirurgie (DGU). On hospital admission the DGU trauma registry collects more than 30 variables prospectively. In the first step of analysis, those variables were selected that were assumed from the literature to be clinical predictors of outcome. In a second step a univariate analysis of these variables was performed. For all primary variables with univariate significance in outcome prediction a multivariate logistic regression was performed in the third step and a multivariate prognostic model was developed. 2069 patients from 20 hospitals were prospectively included in the trauma registry from 01.01.1993-31.12.1997 (age 39 +/- 19 years; 70.0% males; ISS 22 +/- 13; 18.6% lethality). From more than 30 initially documented variables, the age, the GCS, the ISS, the base excess (BE) and the prothrombin time were the most important prognostic factors to predict the probability of death (P(death)). The following prognostic model was developed: $P(\text{death}) = 1 / \left(1 + e^{-[k + \beta_1(\text{age}) + \beta_2(\text{GCS}) + \beta_3(\text{ISS}) + \beta_4(\text{BE}) + \beta_5(\text{prothrombin time})]}\right)$, where $k = -0.1551$, $\beta_1 = 0.0438$ with p < 0.0001, $\beta_2 = -0.2067$ with p < 0.0001, $\beta_3 = 0.0252$ with p = 0.0071, $\beta_4 = -0.0840$ with p < 0.0001 and $\beta_5 = -0.0359$ with p < 0.0001. Each of the five variables contributed significantly to the multifactorial model. These data show that the age, GCS, ISS, base excess and prothrombin time are potentially important predictors to initially identify multiple trauma patients with a high risk of lethality. With the base excess and prothrombin time values, the only variables of this multifactorial model that can be therapeutically influenced, it might be possible to better guide early and aggressive therapy.
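The prognostic model quoted in this abstract is an ordinary logistic function of five admission variables, so it can be evaluated directly. The sketch below uses the published coefficients; the function name is an assumption, and so are the units of the inputs (the abstract does not state them, so base excess in mmol/l and prothrombin time in percent are assumed here). It is an illustration only, not a clinical tool.

```python
import math

# Coefficients as quoted in the abstract.
K = -0.1551
BETA_AGE = 0.0438
BETA_GCS = -0.2067
BETA_ISS = 0.0252
BETA_BE = -0.0840
BETA_PT = -0.0359

def p_death(age, gcs, iss, base_excess, prothrombin_time):
    """Probability of death from the logistic model 1 / (1 + exp(-z)).

    Units of base_excess and prothrombin_time are assumptions (see lead-in).
    """
    z = (K
         + BETA_AGE * age
         + BETA_GCS * gcs
         + BETA_ISS * iss
         + BETA_BE * base_excess
         + BETA_PT * prothrombin_time)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical patient: cohort mean age and ISS, GCS 15, BE 0, prothrombin time 100.
baseline = p_death(age=39, gcs=15, iss=22, base_excess=0.0, prothrombin_time=100.0)
```

The signs of the coefficients give the direction of each effect: higher age and ISS raise the predicted probability, while higher GCS, base excess and prothrombin time lower it.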
Modeling and managing risk early in software development

NASA Technical Reports Server (NTRS)

Briand, Lionel C.; Thomas, William M.; Hetmanski, Christopher J.

1993-01-01

In order to improve the quality of the software development process, we need to be able to build empirical multivariate models based on data collectable early in the software process. These models need to be both useful for prediction and easy to interpret, so that remedial actions may be taken in order to control and optimize the development process. We present an automated modeling technique which can be used as an alternative to regression techniques. We show how it can be used to facilitate the identification and aid the interpretation of the significant trends which characterize 'high risk' components in several Ada systems. Finally, we evaluate the effectiveness of our technique based on a comparison with logistic regression based models.
Spatial extremes modeling applied to extreme precipitation data in the state of Paraná

NASA Astrophysics Data System (ADS)

Olinda, R. A.; Blanchet, J.; dos Santos, C. A. C.; Ozaki, V. A.; Ribeiro, P. J., Jr.

2014-11-01

Most of the mathematical models developed for rare events are based on probabilistic models for extremes. Although the tools for statistical modeling of univariate and multivariate extremes are well developed, the extension of these tools to model spatial extremes is an area of very active research. A natural approach to such modeling is the theory of spatial extremes and max-stable processes, an infinite-dimensional extension of multivariate extreme value theory, which makes it possible to incorporate the correlation functions used in geostatistics and thus to assess extremal dependence by means of the extremal coefficient and the madogram. This work describes the application of such processes in modeling the spatial dependence of maximum monthly rainfall from the state of Paraná, based on historical series observed at weather stations. The proposed models consider the Euclidean space and a transformation referred to as space weather, which may explain the presence of directional effects resulting from synoptic weather patterns. This method is based on the theorem proposed by de Haan and on the models of Smith and Schlather. The isotropic and anisotropic behavior of these models is also verified via Monte Carlo simulation. Estimates are obtained by maximum pairwise likelihood and the models are compared using the Takeuchi Information Criterion. By modeling the dependence of spatial maxima, applied to maximum monthly rainfall data from the state of Paraná, it was possible to identify directional effects resulting from meteorological phenomena, which, in turn, are important for proper management of risks and environmental disasters in countries whose economies depend heavily on agribusiness.
Effects of Covariance Heterogeneity on Three Procedures for Analyzing Multivariate Repeated Measures Designs.

ERIC Educational Resources Information Center

Vallejo, Guillermo; Fidalgo, Angel; Fernandez, Paula

2001-01-01

Estimated empirical Type I error rate and power rate for three procedures for analyzing multivariate repeated measures designs: (1) the doubly multivariate model; (2) the Welch-James multivariate solution (H. Keselman, M. Carriere, and L. Lix, 1993); and (3) the multivariate version of the modified Brown-Forsythe procedure (M. Brown and A.…
Innovation Analysis | Energy Analysis | NREL

Science.gov Websites

. New empirical methods for estimating technical and commercial impact (based on patent citations and Commercial Breakthroughs, NREL employed regression models and multivariate simulations to compare social in the marketplace and found that: Web presence may provide a better representation of the commercial
Multivariate quadrature for representing cloud condensation nuclei activity of aerosol populations

DOE PAGES

Fierce, Laura; McGraw, Robert L.

2017-07-26

Sparse representations of atmospheric aerosols are needed for efficient regional- and global-scale chemical transport models. Here we introduce a new framework for representing aerosol distributions, based on the quadrature method of moments. Given a set of moment constraints, we show how linear programming, combined with an entropy-inspired cost function, can be used to construct optimized quadrature representations of aerosol distributions. The sparse representations derived from this approach accurately reproduce cloud condensation nuclei (CCN) activity for realistically complex distributions simulated by a particle-resolved model. Additionally, the linear programming techniques described in this study can be used to bound key aerosol properties, such as the number concentration of CCN. Unlike the commonly used sparse representations, such as modal and sectional schemes, the maximum-entropy approach described here is not constrained to pre-determined size bins or assumed distribution shapes. This study is a first step toward a particle-based aerosol scheme that will track multivariate aerosol distributions with sufficient computational efficiency for large-scale simulations.
Multivariate quadrature for representing cloud condensation nuclei activity of aerosol populations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fierce, Laura; McGraw, Robert L.

Sparse representations of atmospheric aerosols are needed for efficient regional- and global-scale chemical transport models. Here we introduce a new framework for representing aerosol distributions, based on the quadrature method of moments. Given a set of moment constraints, we show how linear programming, combined with an entropy-inspired cost function, can be used to construct optimized quadrature representations of aerosol distributions. The sparse representations derived from this approach accurately reproduce cloud condensation nuclei (CCN) activity for realistically complex distributions simulated by a particle-resolved model. Additionally, the linear programming techniques described in this study can be used to bound key aerosol properties, such as the number concentration of CCN. Unlike the commonly used sparse representations, such as modal and sectional schemes, the maximum-entropy approach described here is not constrained to pre-determined size bins or assumed distribution shapes. This study is a first step toward a particle-based aerosol scheme that will track multivariate aerosol distributions with sufficient computational efficiency for large-scale simulations.
Identifying the most appropriate age threshold for TNM stage grouping of well-differentiated thyroid cancer.

PubMed

Hendrickson-Rebizant, J; Sigvaldason, H; Nason, R W; Pathak, K A

2015-08-01

Age is integrated in most risk stratification systems for well-differentiated thyroid cancer (WDTC). The most appropriate age threshold for stage grouping of WDTC is debatable. The objective of this study was to evaluate the best age threshold for stage grouping by comparing multivariable models designed to evaluate the independent impact of various prognostic factors, including age-based stage grouping, on the disease-specific survival (DSS) of our population-based cohort. Data from a population-based thyroid cancer cohort of 2125 consecutive WDTC cases, diagnosed during 1970-2010, with a median follow-up of 11.5 years, were used to calculate DSS using the Kaplan-Meier method. Multivariable analysis with the Cox proportional hazards model was used to assess the independent impact of different prognostic factors on DSS. The Akaike information criterion (AIC), a measure of statistical model fit, was used to identify the most appropriate age threshold model. Delta AIC, Akaike weight, and evidence ratios were calculated to compare the relative strength of different models. The mean age of the patients was 47.3 years. DSS of the cohort was 95.6% and 92.8% at 10 and 20 years respectively. A threshold of 55 years, with the lowest AIC, was identified as the best model. Akaike weight indicated an 85% chance that this age threshold is the best among the compared models, and that it is 16.8 times more likely to be the best model than a threshold of 45 years. The age threshold of 55 years was found to be the best for TNM stage grouping. Copyright © 2015 Elsevier Ltd. All rights reserved.
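The model comparison quantities named in this abstract (delta AIC, Akaike weights, evidence ratios) follow standard formulas, sketched below. The AIC values in the example are made up for illustration and do not reproduce the study's results; the function names are likewise assumptions.

```python
import math

def akaike_weights(aics):
    """Return (delta AIC, Akaike weight) dictionaries for named models.

    delta_i = AIC_i - min(AIC); weight_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2).
    """
    best = min(aics.values())
    deltas = {name: value - best for name, value in aics.items()}
    relative = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(relative.values())
    weights = {name: r / total for name, r in relative.items()}
    return deltas, weights

def evidence_ratio(weights, model_a, model_b):
    """How many times more likely model_a is to be the best model than model_b."""
    return weights[model_a] / weights[model_b]

# Hypothetical AIC values for three candidate age thresholds.
aics = {"45 years": 1004.0, "50 years": 1002.0, "55 years": 1000.0}
deltas, weights = akaike_weights(aics)
ratio_55_vs_45 = evidence_ratio(weights, "55 years", "45 years")
```

The evidence ratio between two models depends only on the difference of their AIC values, exp(delta / 2), so it is unaffected by which other models are in the comparison set, whereas the Akaike weights are.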

[Modeling in value-based medicine].

PubMed

Neubauer, A S; Hirneiss, C; Kampik, A

2010-03-01

Modeling plays an important role in value-based medicine (VBM). It allows decision support by predicting potential clinical and economic consequences, frequently combining different sources of evidence. Based on relevant publications and examples focusing on ophthalmology, the key economic modeling methods are explained and definitions are given. The most frequently applied model types are decision trees, Markov models, and discrete event simulation (DES) models. Besides verifying internal validity, model validation includes comparison with other models (external validity) and ideally validation of the model's predictive properties. The existing uncertainty with any modeling should be clearly stated. This is true for economic modeling in VBM as well as when using disease risk models to support clinical decisions. In economic modeling, uni- and multivariate sensitivity analyses are usually applied; the key concepts here are tornado plots and cost-effectiveness acceptability curves. Given the existing uncertainty, modeling helps to make better informed decisions than without this additional information.
Multivariate Methods for Meta-Analysis of Genetic Association Studies.

PubMed

Dimou, Niki L; Pantavou, Katerina G; Braliou, Georgia G; Bagos, Pantelis G

2018-01-01

Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we briefly present univariate methods for meta-analysis and then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg equilibrium and gene-environment interactions are also presented. All available methods are illustrated with practical applications, and methodologies that could be developed in the future are discussed. Links to all available software implementing multivariate meta-analysis methods are also provided.
An analytics approach to designing patient centered medical homes.

PubMed

Ajorlou, Saeede; Shams, Issac; Yang, Kai

2015-03-01

Recently the patient centered medical home (PCMH) model has become a popular team-based approach focused on delivering more streamlined care to patients. In current practices of medical homes, a clinically based prediction framework is recommended because it can help match the portfolio capacity of PCMH teams with the actual load generated by a set of patients. Without such balance between clinical supply and demand, issues such as excessive under- and over-utilization of physicians, long waiting times for receiving the appropriate treatment, and non-continuity of care will eliminate many advantages of the medical home strategy. In this paper, by using the hierarchical generalized linear model with multivariate responses, we develop a clinical workload prediction model for care portfolio demands in a Bayesian framework. The model allows for heterogeneous variances and unstructured covariance matrices for nested random effects that arise through complex hierarchical care systems. We show that using a multivariate approach substantially enhances the precision of workload predictions at both primary and non-primary care levels. We also demonstrate that care demands depend not only on patient demographics but also on other utilization factors, such as length of stay. Our analyses of recent data from the Veterans Health Administration further indicate that risk adjustment for patient health conditions can considerably improve the predictive power of the model.
Development of the Complex General Linear Model in the Fourier Domain: Application to fMRI Multiple Input-Output Evoked Responses for Single Subjects

PubMed Central

Rio, Daniel E.; Rawlings, Robert R.; Woltz, Lawrence A.; Gilman, Jodi; Hommer, Daniel W.

2013-01-01

A linear time-invariant model based on statistical time series analysis in the Fourier domain for single subjects is further developed and applied to functional MRI (fMRI) blood-oxygen level-dependent (BOLD) multivariate data. This methodology was originally developed to analyze multiple stimulus input evoked response BOLD data. However, to analyze clinical data generated using a repeated measures experimental design, the model has been extended to handle multivariate time series data and demonstrated on control and alcoholic subjects taken from data previously analyzed in the temporal domain. Analysis of BOLD data is typically carried out in the time domain where the data has a high temporal correlation. These analyses generally employ parametric models of the hemodynamic response function (HRF) where prewhitening of the data is attempted using autoregressive (AR) models for the noise. However, this data can be analyzed in the Fourier domain. Here, assumptions made on the noise structure are less restrictive, and hypothesis tests can be constructed based on voxel-specific nonparametric estimates of the hemodynamic transfer function (HRF in the Fourier domain). This is especially important for experimental designs involving multiple states (either stimulus or drug induced) that may alter the form of the response function. PMID:23840281
Development of the complex general linear model in the Fourier domain: application to fMRI multiple input-output evoked responses for single subjects.

PubMed

Rio, Daniel E; Rawlings, Robert R; Woltz, Lawrence A; Gilman, Jodi; Hommer, Daniel W

2013-01-01

A linear time-invariant model based on statistical time series analysis in the Fourier domain for single subjects is further developed and applied to functional MRI (fMRI) blood-oxygen level-dependent (BOLD) multivariate data. This methodology was originally developed to analyze multiple stimulus input evoked response BOLD data. However, to analyze clinical data generated using a repeated measures experimental design, the model has been extended to handle multivariate time series data and demonstrated on control and alcoholic subjects taken from data previously analyzed in the temporal domain. Analysis of BOLD data is typically carried out in the time domain where the data has a high temporal correlation. These analyses generally employ parametric models of the hemodynamic response function (HRF) where prewhitening of the data is attempted using autoregressive (AR) models for the noise. However, this data can be analyzed in the Fourier domain. Here, assumptions made on the noise structure are less restrictive, and hypothesis tests can be constructed based on voxel-specific nonparametric estimates of the hemodynamic transfer function (HRF in the Fourier domain). This is especially important for experimental designs involving multiple states (either stimulus or drug induced) that may alter the form of the response function.
A New Approach in Generating Meteorological Forecasts for Ensemble Streamflow Forecasting using Multivariate Functions

NASA Astrophysics Data System (ADS)

Khajehei, S.; Madadgar, S.; Moradkhani, H.

2014-12-01

The reliability and accuracy of hydrological predictions are subject to various sources of uncertainty, including meteorological forcing, initial conditions, model parameters and model structure. To reduce the total uncertainty in hydrological applications, one approach is to reduce the uncertainty in meteorological forcing by using statistical methods based on conditional probability density functions (pdf). However, current methods require assuming a Gaussian distribution for the marginal distributions of the observed and modeled meteorology. Here we propose a Bayesian approach based on copula functions to develop the conditional distribution of the precipitation forecast needed to drive a hydrologic model for a sub-basin in the Columbia River Basin. Copula functions are introduced as an alternative approach to capturing the uncertainties related to meteorological forcing. Copulas join univariate marginal distributions into a multivariate joint distribution and can model the joint behavior of variables with any level of correlation and dependency. The method is applied to the monthly forecast of CPC with 0.25x0.25 degree resolution to reproduce the PRISM dataset over 1970-2000. Results are compared with the Ensemble Pre-Processor approach, a common procedure used by National Weather Service River Forecast Centers, in reproducing observed climatology during a ten-year verification period (2000-2010).
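The idea of conditioning an observed variable on a forecast through a copula can be sketched with a bivariate Gaussian copula, for which the conditional distribution has a closed form. The abstract does not say which copula family was used, so the Gaussian choice, the function name, and the correlation value below are assumptions made purely for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1

def conditional_quantile(u_forecast, p, rho):
    """p-quantile of V given U = u_forecast under a bivariate Gaussian copula.

    u_forecast: forecast value mapped to (0, 1) by its own marginal CDF
    p:          desired conditional probability level in (0, 1)
    rho:        copula correlation parameter in (-1, 1)
    Returns v in (0, 1); applying the inverse marginal CDF of the observed
    variable to v would give the conditional quantile in physical units.
    """
    z_u = _STD_NORMAL.inv_cdf(u_forecast)
    z_p = _STD_NORMAL.inv_cdf(p)
    z_v = rho * z_u + math.sqrt(1.0 - rho ** 2) * z_p
    return _STD_NORMAL.cdf(z_v)

# Hypothetical use: a 90% conditional band given a fairly wet forecast (u = 0.8).
lower = conditional_quantile(0.8, 0.05, rho=0.7)
median = conditional_quantile(0.8, 0.50, rho=0.7)
upper = conditional_quantile(0.8, 0.95, rho=0.7)
```

Because the copula works on probabilities rather than raw values, the marginal distributions of forecast and observation can be chosen freely, which is the property the abstract contrasts with methods that assume Gaussian marginals.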
Feasibility Study on the Use of On-line Multivariate Statistical Process Control for Safeguards Applications in Natural Uranium Conversion Plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ladd-Lively, Jennifer L

2014-01-01

The objective of this work was to determine the feasibility of using on-line multivariate statistical process control (MSPC) for safeguards applications in natural uranium conversion plants. Multivariate statistical process control is commonly used throughout industry for the detection of faults. For safeguards applications in uranium conversion plants, faults could include the diversion of intermediate products such as uranium dioxide, uranium tetrafluoride, and uranium hexafluoride. This study was limited to a 100 metric ton of uranium (MTU) per year natural uranium conversion plant (NUCP) using the wet solvent extraction method for the purification of uranium ore concentrate. A key component in the multivariate statistical methodology is the Principal Component Analysis (PCA) approach for the analysis of data, development of the base case model, and evaluation of future operations. The PCA approach was implemented through the use of singular value decomposition of the data matrix where the data matrix represents normal operation of the plant. Component mole balances were used to model each of the process units in the NUCP. However, this approach could be applied to any data set. The monitoring framework developed in this research could be used to determine whether or not a diversion of material has occurred at an NUCP as part of an International Atomic Energy Agency (IAEA) safeguards system. This approach can be used to identify the key monitoring locations, as well as locations where monitoring is unimportant. Detection limits at the key monitoring locations can also be established using this technique. Several faulty scenarios were developed to test the monitoring framework after the base case or normal operating conditions of the PCA model were established. In all of the scenarios, the monitoring framework was able to detect the fault. Overall this study was successful at meeting the stated objective.
Multivariable normal tissue complication probability model-based treatment plan optimization for grade 2-4 dysphagia and tube feeding dependence in head and neck radiotherapy.

PubMed

Kierkels, Roel G J; Wopken, Kim; Visser, Ruurd; Korevaar, Erik W; van der Schaaf, Arjen; Bijl, Hendrik P; Langendijk, Johannes A

2016-12-01

Radiotherapy of the head and neck is challenged by the relatively large number of organs-at-risk close to the tumor. Biologically-oriented objective functions (OF) could optimally distribute the dose among the organs-at-risk. We aimed to explore OFs based on multivariable normal tissue complication probability (NTCP) models for grade 2-4 dysphagia (DYS) and tube feeding dependence (TFD). One hundred head and neck cancer patients were studied. In addition to the clinical plan, two more plans (an OF_DYS plan and an OF_TFD plan) were optimized per patient. The NTCP models included up to four dose-volume parameters and other non-dosimetric factors. A fully automatic plan optimization framework was used to optimize the OF_NTCP-based plans. All OF_NTCP-based plans were reviewed and classified as clinically acceptable. On average, the Δdose and ΔNTCP were small when comparing the OF_DYS plan, OF_TFD plan, and clinical plan. For 5% of patients, NTCP_TFD was reduced by >5% using OF_TFD-based planning compared to the OF_DYS plans. Plan optimization using NTCP_DYS- and NTCP_TFD-based objective functions resulted in clinically acceptable plans. For patients with considerable risk factors of TFD, the OF_TFD steered the optimizer to dose distributions which directly led to slightly lower predicted NTCP_TFD values as compared to the other studied plans. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
MULTIVARIATE RECEPTOR MODELS AND MODEL UNCERTAINTY. (R825173)

EPA Science Inventory

Abstract
Estimation of the number of major pollution sources, the source composition profiles, and the source contributions are the main interests in multivariate receptor modeling. Due to lack of identifiability of the receptor model, however, the estimation cannot be...
A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution

PubMed Central

Inouye, David; Yang, Eunho; Allen, Genevera; Ravikumar, Pradeep

2017-01-01

The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section. PMID:28983398
QSAR modeling of cumulative environmental end-points for the prioritization of hazardous chemicals.

PubMed

Gramatica, Paola; Papa, Ester; Sangion, Alessandro

2018-01-24

The hazard of chemicals in the environment is inherently related to the molecular structure and derives simultaneously from various chemical properties/activities/reactivities. Models based on Quantitative Structure Activity Relationships (QSARs) are useful to screen, rank and prioritize chemicals that may have an adverse impact on humans and the environment. This paper reviews a selection of QSAR models (based on theoretical molecular descriptors) developed for cumulative multivariate endpoints, which were derived by mathematical combination of multiple effects and properties. The cumulative end-points provide an integrated holistic point of view to address environmentally relevant properties of chemicals.
Quantifying the impact of between-study heterogeneity in multivariate meta-analyses

PubMed Central

Jackson, Dan; White, Ian R; Riley, Richard D

2012-01-01

Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I² statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R² statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I², which we call I²_R. We also provide a multivariate H² statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I² statistic, I²_H. Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
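
For orientation, the univariate quantities that this abstract generalises can be computed as in the sketch below. It covers only the standard univariate definitions (Cochran's Q, H² = Q/df, and I² = (Q - df)/Q truncated at zero), not the multivariate statistics proposed in the paper. The function name `heterogeneity` and the three-study data are made up for illustration.

```python
def heterogeneity(effects, variances):
    """Univariate Cochran's Q, H^2 and I^2 from study effects and
    their within-study variances (inverse-variance weights)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    h2 = q / df                                   # H^2 = Q / degrees of freedom
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0  # I^2, truncated at zero
    return q, h2, i2

# Invented example: three studies with equal within-study variance.
q, h2, i2 = heterogeneity([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
```

With these made-up inputs Q is 8 on 2 degrees of freedom, so H² is 4 and I² is 0.75, meaning three quarters of the observed variation is attributed to between-study heterogeneity.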
Two models for evaluating landslide hazards

USGS Publications Warehouse

Davis, J.C.; Chung, C.-J.; Ohlmacher, G.C.

2006-01-01

Two alternative procedures for estimating landslide hazards were evaluated using data on topographic digital elevation models (DEMs) and bedrock lithologies in an area adjacent to the Missouri River in Atchison County, Kansas, USA. The two procedures are based on the likelihood ratio model but utilize different assumptions. The empirical likelihood ratio model is based on non-parametric empirical univariate frequency distribution functions under an assumption of conditional independence while the multivariate logistic discriminant model assumes that likelihood ratios can be expressed in terms of logistic functions. The relative hazards of occurrence of landslides were estimated by an empirical likelihood ratio model and by multivariate logistic discriminant analysis. Predictor variables consisted of grids containing topographic elevations, slope angles, and slope aspects calculated from a 30-m DEM. An integer grid of coded bedrock lithologies taken from digitized geologic maps was also used as a predictor variable. Both statistical models yield relative estimates in the form of the proportion of total map area predicted to already contain or to be the site of future landslides. The stabilities of estimates were checked by cross-validation of results from random subsamples, using each of the two procedures. Cell-by-cell comparisons of hazard maps made by the two models show that the two sets of estimates are virtually identical. This suggests that the empirical likelihood ratio and the logistic discriminant analysis models are robust with respect to the conditional independence assumption and the logistic function assumption, respectively, and that either model can be used successfully to evaluate landslide hazards. © 2006.
Fast-NPS-A Markov Chain Monte Carlo-based analysis tool to obtain structural information from single-molecule FRET measurements

NASA Astrophysics Data System (ADS)

Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens

2017-10-01

The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For an efficient local exploration we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. In order to handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility and thus the smFRET efficiency, we introduce dye models which can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided. Programme Files doi:http://dx.doi.org/10.17632/7ztzj63r68.1 Licencing provisions: Apache-2.0 Programming language: GUI in MATLAB (The MathWorks) and the core sampling engine in C++ Nature of problem: Sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data. Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.
Noise source and reactor stability estimation in a boiling water reactor using a multivariate autoregressive model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kanemoto, S.; Andoh, Y.; Sandoz, S.A.

1984-10-01

A method for evaluating reactor stability in boiling water reactors has been developed. The method is based on multivariate autoregressive (M-AR) modeling of steady-state neutron and process noise signals. In this method, two kinds of power spectral densities (PSDs) for the measured neutron signal and the corresponding noise source signal are separately identified by the M-AR modeling. The closed- and open-loop stability parameters are evaluated from these PSDs. The method is applied to actual plant noise data that were measured together with artificial perturbation test data. Stability parameters identified from noise data are compared to those from perturbation test data, and it is shown that both results are in good agreement. In addition to these stability estimations, driving noise sources for the neutron signal are evaluated by the M-AR modeling. Contributions from void, core flow, and pressure noise sources are quantitatively evaluated, and the void noise source is shown to be the most dominant.
TG study of the Li0.4Fe2.4Zn0.2O4 ferrite synthesis

NASA Astrophysics Data System (ADS)

Lysenko, E. N.; Nikolaev, E. V.; Surzhikov, A. P.

2016-02-01

In this paper, the kinetics of Li-Zn ferrite synthesis were studied by thermogravimetry (TG) through the simultaneous application of non-linear regression to several measurements run at different heating rates (multivariate non-linear regression). Using TG curves obtained at four heating rates and the Netzsch Thermokinetics software package, kinetic models with minimal adjustable parameters were selected to quantitatively describe the reaction of Li-Zn ferrite synthesis. The experimental TG curves clearly suggest a two-step process for the ferrite synthesis, and therefore a model-fitting kinetic analysis based on multivariate non-linear regression was conducted. The complex reaction was described by a two-step reaction scheme consisting of sequential reaction steps. The best results were obtained using the Jander three-dimensional diffusion model for the first step and the Ginstling-Brounshtein model for the second step. The kinetic parameters for the lithium-zinc ferrite synthesis reaction were found and discussed.
PharmML in Action: an Interoperable Language for Modeling and Simulation

PubMed Central

Bizzotto, R; Smith, G; Yvon, F; Kristensen, NR; Swat, MJ

2017-01-01

PharmML is an XML-based exchange format created with a focus on nonlinear mixed-effect (NLME) models used in pharmacometrics, but providing a very general framework that also allows describing mathematical and statistical models such as single-subject or nonlinear and multivariate regression models. This tutorial provides an overview of the structure of this language, brief suggestions on how to work with it, and use cases demonstrating its power and flexibility. PMID:28575551
A statistical approach for segregating cognitive task stages from multivariate fMRI BOLD time series.

PubMed

Demanuele, Charmaine; Bähner, Florian; Plichta, Michael M; Kirsch, Peter; Tost, Heike; Meyer-Lindenberg, Andreas; Durstewitz, Daniel

2015-01-01

Multivariate pattern analysis can reveal new information from neuroimaging data to illuminate human cognition and its disturbances. Here, we develop a methodological approach, based on multivariate statistical/machine learning and time series analysis, to discern cognitive processing stages from functional magnetic resonance imaging (fMRI) blood oxygenation level dependent (BOLD) time series. We apply this method to data recorded from a group of healthy adults whilst performing a virtual reality version of the delayed win-shift radial arm maze (RAM) task. This task has been frequently used to study working memory and decision making in rodents. Using linear classifiers and multivariate test statistics in conjunction with time series bootstraps, we show that different cognitive stages of the task, as defined by the experimenter, namely, the encoding/retrieval, choice, reward and delay stages, can be statistically discriminated from the BOLD time series in brain areas relevant for decision making and working memory. Discrimination of these task stages was significantly reduced during poor behavioral performance in dorsolateral prefrontal cortex (DLPFC), but not in the primary visual cortex (V1). Experimenter-defined dissection of time series into class labels based on task structure was confirmed by an unsupervised, bottom-up approach based on Hidden Markov Models. Furthermore, we show that different groupings of recorded time points into cognitive event classes can be used to test hypotheses about the specific cognitive role of a given brain region during task execution. We found that whilst the DLPFC strongly differentiated between task stages associated with different memory loads, but not between different visual-spatial aspects, the reverse was true for V1. 
Our methodology illustrates how different aspects of cognitive information processing during one and the same task can be separated and attributed to specific brain regions based on information contained in multivariate patterns of voxel activity.
Evaluation of in-line Raman data for end-point determination of a coating process: Comparison of Science-Based Calibration, PLS-regression and univariate data analysis.

PubMed

Barimani, Shirin; Kleinebudde, Peter

2017-10-01

A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft-copolymer and titanium dioxide to a maximum coating thickness of 80 µm. Raman spectroscopy was used as an in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares regression (PLS), and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination values (R²) higher than 0.99. The coating endpoints could be predicted with root mean square errors (RMSEP) less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
Predicting the multi-domain progression of Parkinson's disease: a Bayesian multivariate generalized linear mixed-effect model.

PubMed

Wang, Ming; Li, Zheng; Lee, Eun Young; Lewis, Mechelle M; Zhang, Lijun; Sterling, Nicholas W; Wagner, Daymond; Eslinger, Paul; Du, Guangwei; Huang, Xuemei

2017-09-25

It is challenging for current statistical models to predict clinical progression of Parkinson's disease (PD) because of the involvement of multiple domains and longitudinal data. Past univariate longitudinal or multivariate analyses from cross-sectional trials have limited power to predict individual outcomes or a single moment. The multivariate generalized linear mixed-effect model (GLMM) under the Bayesian framework was proposed to study multi-domain longitudinal outcomes obtained at baseline, 18, and 36 months. The outcomes included motor, non-motor, and postural instability scores from the MDS-UPDRS, and demographic and standardized clinical data were utilized as covariates. The dynamic prediction was performed for both internal and external subjects using the samples from the posterior distributions of the parameter estimates and random effects, and the predictive accuracy was evaluated based on the root mean square error (RMSE), absolute bias (AB) and the area under the receiver operating characteristic (ROC) curve. First, our prediction model identified clinical data that were differentially associated with motor, non-motor, and postural stability scores. Second, the predictive accuracy of our model for the training data was assessed, and prediction improved in particular for non-motor scores (RMSE and AB: 2.89 and 2.20) compared to univariate analysis (RMSE and AB: 3.04 and 2.35). Third, the individual-level predictions of longitudinal trajectories for the testing data were performed, with ~80% of observed values falling within the 95% credible intervals. Multivariate general mixed models hold promise to predict clinical progression of individual outcomes in PD. The data were obtained from Dr. Xuemei Huang's NIH grant R01 NS060722, part of the NINDS PD Biomarker Program (PDBP). All data were entered within 24 h of collection into the Data Management Repository (DMR), which is publicly available (https://pdbp.ninds.nih.gov/data-management).

Managing for resilience: an information theory-based ...

EPA Pesticide Factsheets

Ecosystems are complex and multivariate; hence, methods to assess the dynamics of ecosystems should have the capacity to evaluate multiple indicators simultaneously. Most research on identifying leading indicators of regime shifts has focused on univariate methods and simple models which have limited utility when evaluating real ecosystems, particularly because drivers are often unknown. We discuss some common univariate and multivariate approaches for detecting critical transitions in ecosystems and demonstrate their capabilities via case studies. Synthesis and applications. We illustrate the utility of an information theory-based index for assessing ecosystem dynamics. Trends in this index also provide a sentinel of both abrupt and gradual transitions in ecosystems. In response to the need to identify leading indicators of regime shifts in ecosystems, our research compares traditional indicators and Fisher information, an information theory-based method, by examining four case study systems. Results demonstrate the utility of the methods and offer great promise for quantifying and managing for resilience.
A Java-based fMRI processing pipeline evaluation system for assessment of univariate general linear model and multivariate canonical variate analysis-based pipelines.

PubMed

Zhang, Jing; Liang, Lichen; Anderson, Jon R; Gatewood, Lael; Rottenberg, David A; Strother, Stephen C

2008-01-01

As functional magnetic resonance imaging (fMRI) becomes widely used, the demand for evaluation of fMRI processing pipelines and validation of fMRI analysis results is increasing rapidly. The current NPAIRS package, an IDL-based fMRI processing pipeline evaluation framework, lacks system interoperability and the ability to evaluate general linear model (GLM)-based pipelines using prediction metrics. Thus, it cannot fully evaluate fMRI analytical software modules such as FSL.FEAT and NPAIRS.GLM. In order to overcome these limitations, a Java-based fMRI processing pipeline evaluation system was developed. It integrated YALE (a machine learning environment) into Fiswidgets (an fMRI software environment) to obtain system interoperability and applied an algorithm to measure GLM prediction accuracy. The results demonstrated that the system can evaluate fMRI processing pipelines with univariate GLM and multivariate canonical variates analysis (CVA)-based models on real fMRI data based on prediction accuracy (classification accuracy) and statistical parametric image (SPI) reproducibility. In addition, a preliminary study was performed where four fMRI processing pipelines with GLM and CVA modules such as FSL.FEAT and NPAIRS.CVA were evaluated with the system. The results indicated that (1) the system can compare different fMRI processing pipelines with heterogeneous models (NPAIRS.GLM, NPAIRS.CVA and FSL.FEAT) and rank their performance by automatic performance scoring, and (2) the rank of pipeline performance is highly dependent on the preprocessing operations. These results suggest that the system will be of value for the comparison, validation, standardization and optimization of functional neuroimaging software packages and fMRI processing pipelines.
Multivariate Normal Tissue Complication Probability Modeling of Heart Valve Dysfunction in Hodgkin Lymphoma Survivors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cella, Laura, E-mail: laura.cella@cnr.it; Department of Advanced Biomedical Sciences, Federico II University School of Medicine, Naples; Liuzzi, Raffaele

Purpose: To establish a multivariate normal tissue complication probability (NTCP) model for radiation-induced asymptomatic heart valvular defects (RVD). Methods and Materials: Fifty-six patients treated with sequential chemoradiation therapy for Hodgkin lymphoma (HL) were retrospectively reviewed for RVD events. Clinical information along with whole heart, cardiac chambers, and lung dose distribution parameters was collected, and the correlations to RVD were analyzed by means of Spearman's rank correlation coefficient (Rs). For the selection of the model order and parameters for NTCP modeling, a multivariate logistic regression method using resampling techniques (bootstrapping) was applied. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC). Results: When we analyzed the whole heart, a 3-variable NTCP model including the maximum dose, whole heart volume, and lung volume was shown to be the optimal predictive model for RVD (Rs = 0.573, P<.001, AUC = 0.83). When we analyzed the cardiac chambers individually, for the left atrium and for the left ventricle, an NTCP model based on 3 variables including the percentage volume exceeding 30 Gy (V30), cardiac chamber volume, and lung volume was selected as the most predictive model (Rs = 0.539, P<.001, AUC = 0.83; and Rs = 0.557, P<.001, AUC = 0.82, respectively). The NTCP values increase as heart maximum dose or cardiac chambers V30 increase. They also increase with larger volumes of the heart or cardiac chambers and decrease when lung volume is larger. Conclusions: We propose logistic NTCP models for RVD considering not only heart irradiation dose but also the combined effects of lung and heart volumes. Our study establishes the statistical evidence of the indirect effect of lung size on radio-induced heart toxicity.
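
The general form of a logistic NTCP model like the one described in this abstract can be sketched as follows. This is an illustration only: the intercept and coefficients are hypothetical placeholders, not the fitted values from the study, and were chosen merely so that their signs match the reported directions (risk rising with maximum heart dose and heart volume, falling with lung volume). The function name `ntcp` is likewise an assumption.

```python
import math

def ntcp(intercept, coefficients, predictors):
    """Logistic NTCP: 1 / (1 + exp(-S)), with S = intercept + sum(b_i * x_i)."""
    s = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-s))

# Hypothetical coefficients (NOT the study's fitted values) for:
# maximum heart dose [Gy], heart volume [cm^3], lung volume [cm^3].
B0 = -4.0
B = [0.08, 0.004, -0.0005]

low_dose = ntcp(B0, B, [20.0, 600.0, 3000.0])
high_dose = ntcp(B0, B, [40.0, 600.0, 3000.0])
large_lung = ntcp(B0, B, [40.0, 600.0, 5000.0])
```

With these placeholder values the predicted probability rises when the maximum dose is doubled and falls again when the lung volume is larger, which mirrors the qualitative behaviour stated in the abstract.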
Preliminary Multi-Variable Parametric Cost Model for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Hendrichs, Todd

2010-01-01

This slide presentation reviews the creation of a preliminary multi-variable cost model for the contract costs of making a space telescope. It discusses the methodology for collecting the data, the definition of the statistical analysis methodology, single-variable model results, testing of historical models, and an introduction to the multi-variable models.
Multivariate Models for Normal and Binary Responses in Intervention Studies

ERIC Educational Resources Information Center

Pituch, Keenan A.; Whittaker, Tiffany A.; Chang, Wanchen

2016-01-01

Use of multivariate analysis (e.g., multivariate analysis of variance) is common when normally distributed outcomes are collected in intervention research. However, when mixed responses--a set of normal and binary outcomes--are collected, standard multivariate analyses are no longer suitable. While mixed responses are often obtained in…
A simple rapid approach using coupled multivariate statistical methods, GIS and trajectory models to delineate areas of common oil spill risk

NASA Astrophysics Data System (ADS)

Guillen, George; Rainey, Gail; Morin, Michelle

2004-04-01

Currently, the Minerals Management Service uses the Oil Spill Risk Analysis model (OSRAM) to predict the movement of potential oil spills greater than 1000 bbl originating from offshore oil and gas facilities. OSRAM generates oil spill trajectories using meteorological and hydrological data input from either actual physical measurements or estimates generated from other hydrological models. OSRAM and many other models produce output matrices of average, maximum and minimum contact probabilities to specific landfall or target segments (columns) from oil spills at specific points (rows). Analysts and managers are often interested in identifying geographic areas or groups of facilities that pose similar risks to specific targets or groups of targets if a spill occurred. Unfortunately, due to the potentially large matrix generated by many spill models, this question is difficult to answer without the use of data reduction and visualization methods. In our study we utilized a multivariate statistical method called cluster analysis to group areas of similar risk based on potential distribution of landfall target trajectory probabilities. We also utilized ArcView™ GIS to display spill launch point groupings. The combination of GIS and multivariate statistical techniques in the post-processing of trajectory model output is a powerful tool for identifying and delineating areas of similar risk from multiple spill sources. We strongly encourage modelers and statistical and GIS software programmers to collaborate closely to produce a more seamless integration of these technologies and approaches to analyzing data. They are complementary methods that strengthen the overall assessment of spill risks.
Solving large mixed linear models using preconditioned conjugate gradient iteration.

PubMed

Strandén, I; Lidauer, M

1999-12-01

Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were reordered to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
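
As a reminder of what preconditioned conjugate gradient iteration involves, here is a minimal textbook sketch with a Jacobi (diagonal) preconditioner. It is not the three-step iteration-on-data technique from this paper: the coefficient matrix is held explicitly in memory as a small dense list of lists, and the function name `pcg` and the 2x2 system are assumptions made for illustration.

```python
import math

def pcg(a, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A using conjugate
    gradients with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    m_inv = [1.0 / a[i][i] for i in range(n)]     # inverse of diag(A)
    z = [m_inv[i] * r[i] for i in range(n)]       # preconditioned residual
    p = list(z)                                   # first search direction
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        # Matrix-vector product: the step the paper reorganises for speed.
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(sum(ri * ri for ri in r)) < tol:
            break                                 # converged
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# Small made-up system; the exact solution is x = [1/11, 7/11].
solution = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Each iteration costs one matrix-vector product, which is why the abstract focuses on how that multiplication is carried out when the equations are too large to keep in memory.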
Risk factors for lower extremity injuries among half marathon and marathon runners of the Lage Landen Marathon Eindhoven 2012: A prospective cohort study in the Netherlands.

PubMed

van Poppel, D; de Koning, J; Verhagen, A P; Scholten-Peeters, G G M

2016-02-01

To determine risk factors for running injuries during the Lage Landen Marathon Eindhoven 2012. Prospective cohort study. Population-based study. This study included 943 runners. Running injuries after the Lage Landen Marathon. Sociodemographic and training-related factors as well as lifestyle factors were considered as potential risk factors and assessed in a questionnaire 1 month before the running event. The association between potential risk factors and injuries was determined, per running distance separately, using univariate and multivariate logistic regression analysis. In total, 154 respondents sustained a running injury. Among the marathon runners, in the univariate model, body mass index ≥ 26 kg/m², ≤ 5 years of running experience, and often performing interval training, were significantly associated with running injuries, whereas in the multivariate model only ≤ 5 years of running experience and not performing interval training on a regular basis were significantly associated with running injuries. Among marathon runners, no multivariate model could be created because of the low number of injuries and participants. This study indicates that interval training on a regular basis may be recommended to marathon runners to reduce the risk of injury. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Dynamic Granger causality based on Kalman filter for evaluation of functional network connectivity in fMRI data

PubMed Central

Havlicek, Martin; Jan, Jiri; Brazdil, Milan; Calhoun, Vince D.

2015-01-01

Increasing interest in understanding dynamic interactions of brain neural networks leads to formulation of sophisticated connectivity analysis methods. Recent studies have applied Granger causality based on standard multivariate autoregressive (MAR) modeling to assess the brain connectivity. Nevertheless, one important flaw of this commonly proposed method is that it requires the analyzed time series to be stationary, whereas this assumption is mostly violated due to the weakly nonstationary nature of functional magnetic resonance imaging (fMRI) time series. Therefore, we propose an approach to dynamic Granger causality in the frequency domain for evaluating functional network connectivity in fMRI data. The effectiveness and robustness of the dynamic approach was significantly improved by combining a forward and backward Kalman filter that improved estimates compared to the standard time-invariant MAR modeling. In our method, the functional networks were first detected by independent component analysis (ICA), a computational method for separating a multivariate signal into maximally independent components. Then the measure of Granger causality was evaluated using generalized partial directed coherence that is suitable for bivariate as well as multivariate data. Moreover, this metric provides identification of causal relation in frequency domain, which allows one to distinguish the frequency components related to the experimental paradigm. The procedure of evaluating Granger causality via dynamic MAR was demonstrated on simulated time series as well as on two sets of group fMRI data collected during an auditory sensorimotor (SM) or auditory oddball discrimination (AOD) task. Finally, a comparison with the results obtained from a standard time-invariant MAR model was provided. PMID:20561919
Inherited behavioral susceptibility to adiposity in infancy: a multivariate genetic analysis of appetite and weight in the Gemini birth cohort.

PubMed

Llewellyn, Clare H; van Jaarsveld, Cornelia H M; Plomin, Robert; Fisher, Abigail; Wardle, Jane

2012-03-01

The behavioral susceptibility model proposes that inherited differences in traits such as appetite confer differential risk of weight gain and contribute to the heritability of weight. Evidence that the FTO gene may influence weight partly through its effects on appetite supports this model, but testing the behavioral pathways for multiple genes with very small effects is not feasible. Twin analyses make it possible to get a broad-based estimate of the extent of shared genetic influence between appetite and weight. The objective was to use multivariate twin analyses to test the hypothesis that associations between appetite and weight are underpinned by shared genetic effects. Data were from Gemini, a population-based birth cohort of twins (n = 4804) born in 2007. Infant weights at 3 mo were taken from the records of health professionals. Appetite was assessed at 3 mo for the milk-feeding period by using the Baby Eating Behaviour Questionnaire (BEBQ), a parent-reported measure of appetite [enjoyment of food, food responsiveness, slowness in eating (SE), satiety responsiveness (SR), and appetite size (AS)]. Multivariate quantitative genetic modeling was used to test for shared genetic influences. Significant correlations were found between all BEBQ traits and weight. Significant shared genetic influence was identified for weight with SE, SR, and AS; genetic correlations were between 0.22 and 0.37. Shared genetic effects explained 41-45% of these phenotypic associations. Differences in weight in infancy may be due partly to genetically determined differences in appetitive traits that confer differential susceptibility to obesogenic environments.
Discerning mild cognitive impairment and Alzheimer Disease from normal aging: morphologic characterization based on univariate and multivariate models.

PubMed

Liao, Weiqi; Long, Xiaojing; Jiang, Chunxiang; Diao, Yanjun; Liu, Xin; Zheng, Hairong; Zhang, Lijuan

2014-05-01

Differentiating mild cognitive impairment (MCI) and Alzheimer Disease (AD) from healthy aging remains challenging. This study aimed to explore the cerebral structural alterations of subjects with MCI or AD as compared to healthy elderly based on the individual and collective effects of cerebral morphologic indices using univariate and multivariate analyses. T1-weighted images (T1WIs) were retrieved from Alzheimer Disease Neuroimaging Initiative database for 116 subjects who were categorized into groups of healthy aging, MCI, and AD. Analysis of covariance (ANCOVA) and multivariate analysis of covariance (MANCOVA) were performed to explore the intergroup morphologic alterations indexed by surface area, curvature index, cortical thickness, and subjacent white matter volume with age and sex controlled as covariates, in 34 parcellated gyri regions of interest (ROIs) for both cerebral hemispheres based on the T1WI. Statistical parameters were mapped on the anatomic images to facilitate visual inspection. Global rather than region-specific structural alterations were revealed in groups of MCI and AD relative to healthy elderly using MANCOVA. ANCOVA revealed that the cortical thickness decreased more prominently in entorhinal, temporal, and cingulate cortices and was positively correlated with patients' cognitive performance in AD group but not in MCI. The temporal lobe features marked atrophy of white matter during the disease dynamics. Significant intercorrelations were observed among the morphologic indices with univariate analysis for given ROIs. Significant global structural alterations were identified in MCI and AD based on MANCOVA model with improved sensitivity. The intercorrelation among the morphologic indices may dampen the use of individual morphological parameter in featuring cerebral structural alterations. Decrease in cortical thickness is not reflective of the cognitive performance at the early stage of AD. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.
Bayesian experimental design for models with intractable likelihoods.

PubMed

Drovandi, Christopher C; Pettitt, Anthony N

2013-12-01

In this paper we present a methodology for designing experiments for efficiently estimating the parameters of models with computationally intractable likelihoods. The approach combines a commonly used methodology for robust experimental design, based on Markov chain Monte Carlo sampling, with approximate Bayesian computation (ABC) to ensure that no likelihood evaluations are required. The utility function considered for precise parameter estimation is based upon the precision of the ABC posterior distribution, which we form efficiently via the ABC rejection algorithm based on pre-computed model simulations. Our focus is on stochastic models and, in particular, we investigate the methodology for Markov process models of epidemics and macroparasite population evolution. The macroparasite example involves a multivariate process and we assess the loss of information from not observing all variables. © 2013, The International Biometric Society.
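The ABC rejection step and the precision-based utility described in this abstract can be illustrated with a minimal sketch. The toy binomial simulator, the uniform prior, the tolerance, and every name below are assumptions for illustration only; the record's epidemic and macroparasite models are not reproduced here.

```python
import random

def simulate(theta, n_trials, rng):
    """Hypothetical toy simulator: successes in n_trials Bernoulli(theta) trials."""
    return sum(rng.random() < theta for _ in range(n_trials))

def abc_rejection(observed, n_trials, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is within `tolerance` of the data.

    No likelihood is evaluated; only forward simulations are needed.
    """
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()                      # draw from a Uniform(0, 1) prior
        summary = simulate(theta, n_trials, rng)  # forward-simulate the model
        if abs(summary - observed) <= tolerance:
            accepted.append(theta)
    return accepted

rng = random.Random(0)
posterior = abc_rejection(observed=30, n_trials=100, n_draws=20000,
                          tolerance=2, rng=rng)

posterior_mean = sum(posterior) / len(posterior)
posterior_var = sum((t - posterior_mean) ** 2 for t in posterior) / len(posterior)

# A precision-style utility: larger when the ABC posterior is more concentrated.
precision = 1.0 / posterior_var
```

In a design setting, a utility of this kind would be evaluated for each candidate design and the design with the highest expected value preferred.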
Post-processing of multi-hydrologic model simulations for improved streamflow projections

NASA Astrophysics Data System (ADS)

Khajehei, Sepideh; Ahmadalipour, Ali; Moradkhani, Hamid

2016-04-01

Hydrologic model outputs are prone to bias and uncertainty due to knowledge deficiency in model and data. Uncertainty in hydroclimatic projections arises from uncertainty in the hydrologic model as well as the epistemic or aleatory uncertainties in GCM parameterization and development. This study is conducted to: 1) evaluate the recently developed multivariate post-processing method for historical simulations and 2) assess the effect of post-processing on the uncertainty and reliability of future streamflow projections in both high-flow and low-flow conditions. The first objective is addressed for the historical period of 1970-1999. Future streamflow projections are generated for 10 statistically downscaled GCMs from two widely used downscaling methods: Bias Corrected Statistically Downscaled (BCSD) and Multivariate Adaptive Constructed Analogs (MACA), over the period of 2010-2099 for two representative concentration pathways, RCP4.5 and RCP8.5. Three semi-distributed hydrologic models were employed and calibrated at 1/16 degree latitude-longitude resolution for over 100 points across the Columbia River Basin (CRB) in the Pacific Northwest, USA. Streamflow outputs are post-processed through a Bayesian framework based on copula functions. The post-processing approach relies on a transfer function developed from the bivariate joint distribution between the observations and simulations in the historical period. Results show that applying the post-processing technique leads to considerably higher accuracy in historical simulations and also reduces model uncertainty in future streamflow projections.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models

USGS Publications Warehouse

Anderson, Ryan; Clegg, Samuel M.; Frydenvang, Jens; Wiens, Roger C.; McLennan, Scott M.; Morris, Richard V.; Ehlmann, Bethany L.; Dyar, M. Darby

2017-01-01

Accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. The sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
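The idea of training regression models on limited composition ranges and blending them can be sketched with a toy example. The abstract does not specify the blending rule, so the rule below (a full-range model supplies a first guess that selects or linearly weights a low-range and a high-range model) is an assumption, as are the single-variable least-squares models, the range limits, and the synthetic data. The record itself uses PLS on LIBS spectra.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def predict(model, x):
    a, b = model
    return a + b * x

def rmse(pred, truth):
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5

# Toy calibration set: a nonlinear signal-to-concentration relationship.
signal = [i / 10 for i in range(101)]
conc = [s ** 2 for s in signal]

LOW_MAX, HIGH_MIN = 50.0, 30.0   # hypothetical, overlapping composition ranges
full = fit_line(signal, conc)
low = fit_line(*zip(*[(s, c) for s, c in zip(signal, conc) if c <= LOW_MAX]))
high = fit_line(*zip(*[(s, c) for s, c in zip(signal, conc) if c >= HIGH_MIN]))

def blended_predict(s):
    """Use the full model as a first guess, then pick or blend the sub-models."""
    first_guess = predict(full, s)
    if first_guess <= HIGH_MIN:
        return predict(low, s)
    if first_guess >= LOW_MAX:
        return predict(high, s)
    w = (first_guess - HIGH_MIN) / (LOW_MAX - HIGH_MIN)  # 0 at low end, 1 at high end
    return (1 - w) * predict(low, s) + w * predict(high, s)

rmse_full = rmse([predict(full, s) for s in signal], conc)
rmse_blend = rmse([blended_predict(s) for s in signal], conc)
```

On this toy data the blended prediction has a lower error than the single full-range line, because each sub-model only has to approximate the curve over a narrower range.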
Prostate Health Index improves multivariable risk prediction of aggressive prostate cancer.

PubMed

Loeb, Stacy; Shin, Sanghyuk S; Broyles, Dennis L; Wei, John T; Sanda, Martin; Klee, George; Partin, Alan W; Sokoll, Lori; Chan, Daniel W; Bangma, Chris H; van Schaik, Ron H N; Slawin, Kevin M; Marks, Leonard S; Catalona, William J

2017-07-01

To examine the use of the Prostate Health Index (PHI) as a continuous variable in multivariable risk assessment for aggressive prostate cancer in a large multicentre US study. The study population included 728 men, with prostate-specific antigen (PSA) levels of 2-10 ng/mL and a negative digital rectal examination, enrolled in a prospective, multi-site early detection trial. The primary endpoint was aggressive prostate cancer, defined as biopsy Gleason score ≥7. First, we evaluated whether the addition of PHI improves the performance of currently available risk calculators (the Prostate Cancer Prevention Trial [PCPT] and European Randomised Study of Screening for Prostate Cancer [ERSPC] risk calculators). We also designed and internally validated a new PHI-based multivariable predictive model, and created a nomogram. Of 728 men undergoing biopsy, 118 (16.2%) had aggressive prostate cancer. The PHI predicted the risk of aggressive prostate cancer across the spectrum of values. Adding PHI significantly improved the predictive accuracy of the PCPT and ERSPC risk calculators for aggressive disease. A new model was created using age, previous biopsy, prostate volume, PSA and PHI, with an area under the curve of 0.746. The bootstrap-corrected model showed good calibration with observed risk for aggressive prostate cancer and had net benefit on decision-curve analysis. Using PHI as part of multivariable risk assessment leads to a significant improvement in the detection of aggressive prostate cancer, potentially reducing harms from unnecessary prostate biopsy and overdiagnosis. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
Generic Raman-based calibration models enabling real-time monitoring of cell culture bioreactors.

PubMed

Mehdizadeh, Hamidreza; Lauri, David; Karry, Krizia M; Moshgbar, Mojgan; Procopio-Melino, Renee; Drapeau, Denis

2015-01-01

Raman-based multivariate calibration models have been developed for real-time in situ monitoring of multiple process parameters within cell culture bioreactors. Developed models are generic, in the sense that they are applicable to various products, media, and cell lines based on Chinese Hamster Ovarian (CHO) host cells, and are scalable to large pilot and manufacturing scales. Several batches using different CHO-based cell lines and corresponding proprietary media and process conditions have been used to generate calibration datasets, and models have been validated using independent datasets from separate batch runs. All models have been validated to be generic and capable of predicting process parameters with acceptable accuracy. The developed models allow monitoring multiple key bioprocess metabolic variables, and hence can be utilized as an important enabling tool for Quality by Design approaches which are strongly supported by the U.S. Food and Drug Administration. © 2015 American Institute of Chemical Engineers.
Improving the realism of hydrologic model through multivariate parameter estimation

NASA Astrophysics Data System (ADS)

Rakovec, Oldrich; Kumar, Rohini; Attinger, Sabine; Samaniego, Luis

2017-04-01

Increased availability and quality of near real-time observations should improve understanding of predictive skills of hydrological models. Recent studies have shown the limited capability of river discharge data alone to adequately constrain different components of distributed model parameterizations. In this study, the GRACE satellite-based total water storage (TWS) anomaly is used to complement the discharge data with the aim of improving the fidelity of the mesoscale hydrologic model (mHM) through multivariate parameter estimation. The study is conducted in 83 European basins covering a wide range of hydro-climatic regimes. The model parameterization complemented with the TWS anomalies leads to statistically significant improvements in (1) discharge simulations during low-flow periods, and (2) evapotranspiration estimates, which are evaluated against independent (FLUXNET) data. Overall, there is no significant deterioration in model performance for the discharge simulations when complemented by information from the TWS anomalies. However, considerable changes in the partitioning of precipitation into runoff components are noticed with the inclusion or exclusion of TWS during the parameter estimation. A cross-validation test carried out to assess the transferability and robustness of the calibrated parameters to other locations further confirms the benefit of complementary TWS data. In particular, the evapotranspiration estimates show more robust performance when TWS data are incorporated during the parameter estimation, in comparison with the benchmark model constrained against discharge only. This study highlights the value of incorporating multiple data sources during parameter estimation to improve the overall realism of the hydrologic model and its applications over large domains. Rakovec, O., Kumar, R., Attinger, S. and Samaniego, L. (2016): Improving the realism of hydrologic model functioning through multivariate parameter estimation. Water Resour. Res., 52, http://dx.doi.org/10.1002/2016WR019430
A Multivariate Model of Factors Influencing Technology Use by Preservice Teachers during Practice Teaching

ERIC Educational Resources Information Center

Liu, Shih-Hsiung

2012-01-01

Teacher education courses training and participating in school-based field practice are important processes for equipping preservice teachers with technology integration ability. However, preservice teachers still lack the ability and knowledge needed to teach successfully with technology. This paper investigates the significance of, and…
Robustness results in LQG based multivariable control designs

NASA Technical Reports Server (NTRS)

Lehtomaki, N. A.; Sandell, N. R., Jr.; Athans, M.

1980-01-01

The robustness of control systems with respect to model uncertainty is considered using simple frequency domain criteria. Results are derived under a common framework in which the minimum singular value of the return difference transfer matrix is the key quantity. In particular, the LQ and LQG robustness results are discussed.
Modeling an Outbreak of Anthrax

ERIC Educational Resources Information Center

Sturdivant, Rod; Watts, Krista

2010-01-01

This article presents material that has been used as a classroom activity in a calculus-based probability and statistics course. The application was used in the first few lessons of this course. Students had three previous semesters of math, including calculus (single and multivariable), differential equations, and a course in mathematical…

Physical Violence between Siblings: A Theoretical and Empirical Analysis

ERIC Educational Resources Information Center

Hoffman, Kristi L.; Kiecolt, K. Jill; Edwards, John N.

2005-01-01

This study develops and tests a theoretical model to explain sibling violence based on the feminist, conflict, and social learning theoretical perspectives and research in psychology and sociology. A multivariate analysis of data from 651 young adults generally supports hypotheses from all three theoretical perspectives. Males with brothers have…
Multivariate quantile mapping bias correction: an N-dimensional probability density function transform for climate model simulations of multiple variables

NASA Astrophysics Data System (ADS)

Cannon, Alex J.

2018-01-01

Most bias correction algorithms used in climatology, for example quantile mapping, are applied to univariate time series. They neglect the dependence between different variables. Those that are multivariate often correct only limited measures of joint dependence, such as Pearson or Spearman rank correlation. Here, an image processing technique designed to transfer colour information from one image to another—the N-dimensional probability density function transform—is adapted for use as a multivariate bias correction algorithm (MBCn) for climate model projections/predictions of multiple climate variables. MBCn is a multivariate generalization of quantile mapping that transfers all aspects of an observed continuous multivariate distribution to the corresponding multivariate distribution of variables from a climate model. When applied to climate model projections, changes in quantiles of each variable between the historical and projection period are also preserved. The MBCn algorithm is demonstrated on three case studies. First, the method is applied to an image processing example with characteristics that mimic a climate projection problem. Second, MBCn is used to correct a suite of 3-hourly surface meteorological variables from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) across a North American domain. Components of the Canadian Forest Fire Weather Index (FWI) System, a complicated set of multivariate indices that characterizes the risk of wildfire, are then calculated and verified against observed values. Third, MBCn is used to correct biases in the spatial dependence structure of CanRCM4 precipitation fields. Results are compared against a univariate quantile mapping algorithm, which neglects the dependence between variables, and two multivariate bias correction algorithms, each of which corrects a different form of inter-variable correlation structure. 
MBCn outperforms these alternatives, often by a large margin, particularly for annual maxima of the FWI distribution and spatiotemporal autocorrelation of precipitation fields.
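For context, the univariate quantile mapping that this abstract uses as a baseline can be sketched as follows. This is not MBCn: it corrects one variable at a time and ignores inter-variable dependence. The Gaussian toy data, sample sizes, and function names are assumptions for illustration.

```python
import bisect
import random

def empirical_cdf(sorted_sample, x):
    """Fraction of the sample that is <= x."""
    return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

def empirical_quantile(sorted_sample, p):
    """Quantile by linear interpolation between order statistics."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    return sorted_sample[lo] + (pos - lo) * (sorted_sample[hi] - sorted_sample[lo])

def quantile_map(model_values, model_hist, obs_hist):
    """Map each model value to the observed value at the same empirical quantile."""
    model_sorted, obs_sorted = sorted(model_hist), sorted(obs_hist)
    return [empirical_quantile(obs_sorted, empirical_cdf(model_sorted, x))
            for x in model_values]

def mean(values):
    return sum(values) / len(values)

# Toy data: the "model" is too warm and too variable relative to "observations".
rng = random.Random(1)
obs_hist = [rng.gauss(10.0, 2.0) for _ in range(2000)]
model_hist = [rng.gauss(12.0, 3.0) for _ in range(2000)]

corrected = quantile_map(model_hist, model_hist, obs_hist)
bias_before = mean(model_hist) - mean(obs_hist)
bias_after = mean(corrected) - mean(obs_hist)
```

Applying such a mapping separately to each variable matches the marginal distributions but leaves the dependence between variables uncorrected, which is the gap a multivariate method addresses.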
Partial Least Squares Calibration Modeling Towards the Multivariate Limit of Detection for Enriched Isotopic Mixtures via Laser Ablation Molecular Isotopic Spectroscopy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harris, Candace; Profeta, Luisa; Akpovo, Codjo

The pseudo-univariate limit of detection (LOD) was calculated for comparison with the multivariate interval. Compared with results from the pseudo-univariate LOD, the multivariate LOD includes other factors (i.e. signal uncertainties) and reveals the significance of creating models that use not only the analyte’s emission line but also its entire molecular spectrum.
A simplified parsimonious higher order multivariate Markov chain model

NASA Astrophysics Data System (ADS)

Wang, Chao; Yang, Chuan-sheng

2017-09-01

In this paper, a simplified parsimonious higher-order multivariate Markov chain model (SPHOMMCM) is presented. Moreover, a parameter estimation method for TPHOMMCM is given. Numerical experiments show the effectiveness of TPHOMMCM.
Clustering of change patterns using Fourier coefficients.

PubMed

Kim, Jaehee; Kim, Haseong

2008-01-15

To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a time period because biologically related gene groups can share the same change patterns. Many clustering algorithms have been proposed to group observation data. However, because of the complexity of the underlying functions there have not been many studies on grouping data based on change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. The sample Fourier coefficients not only provide information about the underlying functions, but also reduce the dimension. In addition, as their limiting distribution is a multivariate normal, a model-based clustering method incorporating statistical properties would be appropriate. This work is aimed at discovering gene groups with similar change patterns that share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. The model-based method is advantageous over other methods in our proposed model because the sample Fourier coefficients asymptotically follow the multivariate normal distribution. Change patterns are automatically estimated with the Fourier representation in our model. Our model was tested in simulations and on real gene data sets. The simulation results showed that the model-based clustering method with the sample Fourier coefficients has a lower clustering error rate than K-means clustering. Even when the number of repeated time points was small, the same results were obtained. We also applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. 
It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns. The R program is available upon request.
Texture-Based Correspondence Display

NASA Technical Reports Server (NTRS)

Gerald-Yamasaki, Michael

2004-01-01

Texture-based correspondence display is a methodology to display corresponding data elements in visual representations of complex multidimensional, multivariate data. Texture is utilized as a persistent medium to contain a visual representation model and as a means to create multiple renditions of data where color is used to identify correspondence. Corresponding data elements are displayed over a variety of visual metaphors in a normal rendering process without adding extraneous linking metadata creation and maintenance. The effectiveness of visual representation for understanding data is extended to the expression of the visual representation model in texture.
Multivariate η-μ fading distribution with arbitrary correlation model

NASA Astrophysics Data System (ADS)

Ghareeb, Ibrahim; Atiani, Amani

2018-03-01

An extensive analysis for the multivariate η-μ distribution with arbitrary correlation is presented, where novel analytical expressions for the multivariate probability density function, cumulative distribution function and moment generating function (MGF) of arbitrarily correlated and not necessarily identically distributed η-μ power random variables are derived. Also, this paper provides exact-form expression for the MGF of the instantaneous signal-to-noise ratio at the combiner output in a diversity reception system with maximal-ratio combining and post-detection equal-gain combining operating in slow frequency nonselective arbitrarily correlated not necessarily identically distributed η-μ fading channels. The average bit error probability of differentially detected quadrature phase shift keying signals with post-detection diversity reception system over arbitrarily correlated and not necessarily identical fading parameters η-μ fading channels is determined by using the MGF-based approach. The effect of fading correlation between diversity branches, fading severity parameters and diversity level is studied.
Piecewise multivariate modelling of sequential metabolic profiling data.

PubMed

Rantalainen, Mattias; Cloarec, Olivier; Ebbels, Timothy M D; Lundstedt, Torbjörn; Nicholson, Jeremy K; Holmes, Elaine; Trygg, Johan

2008-02-19

Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
An information-based network approach for protein classification

PubMed Central

Wan, Xiaogeng; Zhao, Xin; Yau, Stephen S. T.

2017-01-01

Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and phylogenetic trees to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, the protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method. PMID:28350835
Estimating suspended sediment load with multivariate adaptive regression spline, teaching-learning based optimization, and artificial bee colony models.

PubMed

Yilmaz, Banu; Aras, Egemen; Nacar, Sinan; Kankal, Murat

2018-05-23

The functional life of a dam is often determined by the rate of sediment delivery to its reservoir. Therefore, an accurate estimate of the sediment load in rivers with dams is essential for designing and predicting a dam's useful lifespan. The most credible method is direct measurement of sediment input, but this can be very costly and cannot always be implemented at all gauging stations. In this study, we tested various regression models to estimate suspended sediment load (SSL) at two gauging stations on the Çoruh River in Turkey, including artificial bee colony (ABC), teaching-learning-based optimization algorithm (TLBO), and multivariate adaptive regression splines (MARS). These models were also compared with one another and with classical regression analyses (CRA). Streamflow values and previously collected data of SSL were used as model inputs with predicted SSL data as output. Two different training and testing dataset configurations were used to reinforce the model accuracy. For the MARS method, the root mean square error value was found to range between 35% and 39% for the test data at the two gauging stations, which was lower than errors for other models. Error values were even lower (7% to 15%) using another dataset. Our results indicate that simultaneous measurements of streamflow with SSL provide the most effective parameter for obtaining accurate predictive models and that MARS is the most accurate model for predicting SSL. Copyright © 2017 Elsevier B.V. All rights reserved.
Sparse Multivariate Autoregressive Modeling for Mild Cognitive Impairment Classification

PubMed Central

Li, Yang; Wee, Chong-Yaw; Jie, Biao; Peng, Ziwen

2014-01-01

Brain connectivity networks derived from functional magnetic resonance imaging (fMRI) are becoming increasingly prevalent in research related to cognitive and perceptual processes. The capability to detect causal or effective connectivity is highly desirable for understanding the cooperative nature of brain networks, particularly when the ultimate goal is to obtain good performance of control-patient classification with biologically meaningful interpretations. Understanding directed functional interactions between brain regions via brain connectivity networks is a challenging task. Since many genetic and biomedical networks are intrinsically sparse, incorporating a sparsity property into connectivity modeling can make the derived models more biologically plausible. Accordingly, we propose an effective connectivity modeling of resting-state fMRI data based on the multivariate autoregressive (MAR) modeling technique, which is widely used to characterize temporal information of dynamic systems. This MAR modeling technique allows for the identification of effective connectivity using the Granger causality concept and reduces spurious causal connectivity in the assessment of directed functional interaction from fMRI data. A forward orthogonal least squares (OLS) regression algorithm is further used to construct a sparse MAR model. By applying the proposed modeling to mild cognitive impairment (MCI) classification, we identify several most discriminative regions, including middle cingulate gyrus, posterior cingulate gyrus, lingual gyrus and caudate regions, in line with results reported in previous findings. A relatively high classification accuracy of 91.89% is also achieved, with an increment of 5.4% compared to the fully-connected, non-directional Pearson-correlation-based functional connectivity approach. PMID:24595922
A tridiagonal parsimonious higher order multivariate Markov chain model

NASA Astrophysics Data System (ADS)

Wang, Chao; Yang, Chuan-sheng

2017-09-01

In this paper, we present a tridiagonal parsimonious higher-order multivariate Markov chain model (TPHOMMCM). Moreover, a method for estimating the parameters in TPHOMMCM is given. Numerical experiments illustrate the effectiveness of TPHOMMCM.
Understanding Activity Engagement Across Weekdays and Weekend Days: A Multivariate Multiple Discrete-Continuous Modeling Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Garikapati, Venu; Astroza, Sebastian; Bhat, Prerna C.

This paper is motivated by the increasing recognition that modeling activity-travel demand for a single day of the week, as is done in virtually all travel forecasting models, may be inadequate in capturing underlying processes that govern activity-travel scheduling behavior. The considerable variability in daily travel suggests that there are important complementary relationships and competing tradeoffs involved in scheduling and allocating time to various activities across days of the week. Both limited survey data availability and methodological challenges in modeling week-long activity-travel schedules have precluded the development of multi-day activity-travel demand models. With passive and technology-based data collection methods increasingly in vogue, the collection of multi-day travel data may become increasingly commonplace in the years ahead. This paper addresses the methodological challenge associated with modeling multi-day activity-travel demand by formulating a multivariate multiple discrete-continuous probit (MDCP) model system. The comprehensive framework ties together two MDCP model components, one corresponding to weekday time allocation and the other to weekend activity-time allocation. By tying the two MDCP components together, the model system also captures relationships in activity-time allocation between weekdays on the one hand and weekend days on the other. Model estimation on a week-long travel diary data set from the United Kingdom shows that there are significant inter-relationships between weekdays and weekend days in activity-travel scheduling behavior. The model system presented in this paper may serve as a higher-level multi-day activity scheduler in conjunction with existing daily activity-based travel models.
Misspecification of Cox regression models with composite endpoints

PubMed Central

Wu, Longyang; Cook, Richard J

2012-01-01

Researchers routinely adopt composite endpoints in multicenter randomized trials designed to evaluate the effect of experimental interventions in cardiovascular disease, diabetes, and cancer. Despite their widespread use, relatively little attention has been paid to the statistical properties of estimators of treatment effect based on composite endpoints. We consider this here in the context of multivariate models for time to event data in which copula functions link marginal distributions with a proportional hazards structure. We then examine the asymptotic and empirical properties of the estimator of treatment effect arising from a Cox regression model for the time to the first event. We point out that even when the treatment effect is the same for the component events, the limiting value of the estimator based on the composite endpoint is usually inconsistent for this common value. We find that in this context the limiting value is determined by the degree of association between the events, the stochastic ordering of events, and the censoring distribution. Within the framework adopted, marginal methods for the analysis of multivariate failure time data yield consistent estimators of treatment effect and are therefore preferred. We illustrate the methods by application to a recent asthma study. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22736519
Study of cyanotoxins presence from experimental cyanobacteria concentrations using a new data mining methodology based on multivariate adaptive regression splines in Trasona reservoir (Northern Spain).

PubMed

Garcia Nieto, P J; Sánchez Lasheras, F; de Cos Juez, F J; Alonso Fernández, J R

2011-11-15

There is an increasing need to describe cyanobacteria blooms since some cyanobacteria produce toxins, termed cyanotoxins. These can be toxic and dangerous to humans as well as to other animals and life in general. Cyanobacteria reproduce explosively under certain conditions. This results in algal blooms, which can become harmful to other species if the cyanobacteria involved produce cyanotoxins. In this research work, the evolution of cyanotoxins in Trasona reservoir (Principality of Asturias, Northern Spain) was studied successfully using a data mining methodology based on the multivariate adaptive regression splines (MARS) technique. The results of the present study are two-fold. On the one hand, the importance of the different kinds of cyanobacteria for the presence of cyanotoxins in the reservoir is presented through the MARS model; on the other hand, a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained. The agreement of the MARS model with experimental data confirmed its good performance. Finally, conclusions of this innovative research are presented. Copyright © 2011 Elsevier B.V. All rights reserved.
An open-source software package for multivariate modeling and clustering: applications to air quality management.

PubMed

Wang, Xiuquan; Huang, Guohe; Zhao, Shan; Guo, Junhong

2015-09-01

This paper presents an open-source software package, rSCA, which is developed based upon a stepwise cluster analysis method and serves as a statistical tool for modeling the relationships between multiple dependent and independent variables. The rSCA package is efficient in dealing with both continuous and discrete variables, as well as nonlinear relationships between the variables. It divides the sample sets of dependent variables into different subsets (or subclusters) through a series of cutting and merging operations based upon the theory of multivariate analysis of variance (MANOVA). The modeling results are given by a cluster tree, which includes both intermediate and leaf subclusters as well as the flow paths from the root of the tree to each leaf subcluster specified by a series of cutting and merging actions. The rSCA package is a handy and easy-to-use tool and is freely available at http://cran.r-project.org/package=rSCA . By applying the developed package to air quality management in an urban environment, we demonstrate its effectiveness in dealing with the complicated relationships among multiple variables in real-world problems.
MULTIVARIATE LINEAR MIXED MODELS FOR MULTIPLE OUTCOMES. (R824757)

EPA Science Inventory

We propose a multivariate linear mixed model (MLMM) for the analysis of multiple outcomes, which generalizes the latent variable model of Sammel and Ryan. The proposed model assumes a flexible correlation structure among the multiple outcomes, and allows a global test of the impact of ...
A New Predictive Model of Centerline Segregation in Continuous Cast Steel Slabs by Using Multivariate Adaptive Regression Splines Approach

PubMed Central

García Nieto, Paulino José; González Suárez, Victor Manuel; Álvarez Antón, Juan Carlos; Mayo Bayón, Ricardo; Sirgo Blanco, José Ángel; Díaz Fernández, Ana María

2015-01-01

The aim of this study was to obtain a predictive model able to perform an early detection of central segregation severity in continuous cast steel slabs. Segregation in steel cast products is an internal defect that can be very harmful when slabs are rolled in heavy plate mills. In this research work, the central segregation was studied successfully using a data mining methodology based on the multivariate adaptive regression splines (MARS) technique. For this purpose, the most important physical-chemical parameters are considered. The results of the present study are two-fold. First, the significance of each physical-chemical variable for the segregation is presented through the model. Second, a model for forecasting segregation is obtained. Regression with optimal hyperparameters was performed, and coefficients of determination equal to 0.93 for continuity factor estimation and 0.95 for average width were obtained when the MARS technique was applied to the experimental dataset. The agreement between experimental data and the model confirmed the good performance of the latter.
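As a rough illustration of the MARS technique named in the abstract above: a MARS model is a weighted sum of hinge basis functions of the form max(0, x - t) and max(0, t - x), where t is a knot. The sketch below evaluates such a model for one predictor. The knot, coefficients, and function names are hypothetical and are not taken from the study; fitting (the forward and backward passes that choose knots) is not shown.

```python
def hinge(x, knot, direction=+1):
    """Hinge basis function.

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model for a single predictor x.

    terms is a list of (coefficient, knot, direction) tuples.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model: one mirrored pair of hinges sharing a knot at x = 1.0.
TERMS = [(2.0, 1.0, +1), (-0.5, 1.0, -1)]

if __name__ == "__main__":
    for x in (0.0, 1.0, 3.0):
        print(x, mars_predict(x, 1.0, TERMS))
```

The mirrored pair makes the fitted curve piecewise linear with a change of slope at the knot, which is how this kind of model captures nonlinear effects of a variable.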
Identifying when tagged fishes have been consumed by piscivorous predators: application of multivariate mixture models to movement parameters of telemetered fishes

USGS Publications Warehouse

Romine, Jason G.; Perry, Russell W.; Johnston, Samuel V.; Fitzer, Christopher W.; Pagliughi, Stephen W.; Blake, Aaron R.

2013-01-01

Mixture models proved valuable as a means to differentiate between salmonid smolts and predators that consumed salmonid smolts. However, successful application of this method requires that telemetered fishes and their predators exhibit measurable differences in movement behavior. Our approach is flexible, allows inclusion of multiple track statistics and improves upon rule-based manual classification methods.
Comparing Within-Person Effects from Multivariate Longitudinal Models

ERIC Educational Resources Information Center

Bainter, Sierra A.; Howard, Andrea L.

2016-01-01

Several multivariate models are motivated to answer similar developmental questions regarding within-person (intraindividual) effects between 2 or more constructs over time, yet the within-person effects tested by each model are distinct. In this article, the authors clarify the types of within-person inferences that can be made from each model.…

Interpretability of Multivariate Brain Maps in Linear Brain Decoding: Definition, and Heuristic Quantification in Multivariate Analysis of MEG Time-Locked Effects.

PubMed

Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea

2016-01-01

Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future.
Interpretability of Multivariate Brain Maps in Linear Brain Decoding: Definition, and Heuristic Quantification in Multivariate Analysis of MEG Time-Locked Effects

PubMed Central

Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea

2017-01-01

Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future. PMID:28167896
A simplified dynamic model of the T700 turboshaft engine

NASA Technical Reports Server (NTRS)

Duyar, Ahmet; Gu, Zhen; Litt, Jonathan S.

1992-01-01

A simplified open-loop dynamic model of the T700 turboshaft engine, valid within the normal operating range of the engine, is developed. This model is obtained by linking linear state space models obtained at different engine operating points. Each linear model is developed from a detailed nonlinear engine simulation using a multivariable system identification and realization method. The simplified model may be used with a model-based real time diagnostic scheme for fault detection and diagnostics, as well as for open loop engine dynamics studies and closed loop control analysis utilizing a user generated control law.
Using state-space models to predict the abundance of juvenile and adult sea lice on Atlantic salmon.

PubMed

Elghafghuf, Adel; Vanderstichel, Raphael; St-Hilaire, Sophie; Stryhn, Henrik

2018-04-11

Sea lice are marine parasites affecting salmon farms, and are considered one of the most costly pests of the salmon aquaculture industry. Infestations of sea lice on farms significantly increase opportunities for the parasite to spread in the surrounding ecosystem, making control of this pest a challenging issue for salmon producers. The complexity of controlling sea lice on salmon farms requires frequent monitoring of the abundance of different sea lice stages over time. Industry-based data sets of counts of lice are amenable to multivariate time-series data analyses. In this study, two sets of multivariate autoregressive state-space models were applied to Chilean sea lice data from six Atlantic salmon production cycles on five isolated farms (at least 20 km seaway distance away from other known active farms), to evaluate the utility of these models for predicting sea lice abundance over time on farms. The models were constructed with different parameter configurations, and the analysis demonstrated large heterogeneity between production cycles for the autoregressive parameter, the effects of chemotherapeutant bath treatments, and the process-error variance. A model allowing for different parameters across production cycles had the best fit and the smallest overall prediction errors. However, pooling information across cycles for the drift and observation error parameters did not substantially affect model performance, thus reducing the number of necessary parameters in the model. Bath treatments had strong but variable effects for reducing sea lice burdens, and these effects were stronger for adult lice than juvenile lice. Our multivariate state-space models were able to handle different sea lice stages and provide predictions for sea lice abundance with reasonable accuracy up to five weeks out. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
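To make the idea of an autoregressive state-space model with a treatment-style drift term more concrete, here is a minimal scalar sketch. It assumes a hidden state x_t = b*x_{t-1} + u + w_t with process-error variance q, and an observation y_t = x_t + v_t with observation-error variance r. The study itself fits multivariate models across lice stages and production cycles; the scalar form, the parameter names, and the numbers below are illustrative assumptions only.

```python
def kalman_ar1(ys, b, u, q, r, x0=0.0, p0=1.0):
    """Kalman filter for a scalar AR(1) state-space model.

    ys may contain None for weeks with no count; those steps only predict.
    Returns a list of (filtered mean, filtered variance) pairs.
    """
    x, p = x0, p0
    filtered = []
    for y in ys:
        # Prediction step: propagate the state and its uncertainty.
        x = b * x + u
        p = b * b * p + q
        if y is not None:
            # Update step: blend the prediction with the new observation.
            gain = p / (p + r)
            x = x + gain * (y - x)
            p = (1.0 - gain) * p
        filtered.append((x, p))
    return filtered


def forecast(x, p, b, u, q, steps):
    """Multi-step-ahead forecast from the last filtered state."""
    out = []
    for _ in range(steps):
        x = b * x + u
        p = b * b * p + q
        out.append((x, p))
    return out


if __name__ == "__main__":
    # Hypothetical weekly log-abundance series with one missing week.
    series = [1.0, 1.4, None, 1.1, 0.9]
    last_x, last_p = kalman_ar1(series, b=0.8, u=0.2, q=0.1, r=0.05)[-1]
    for step, (m, v) in enumerate(forecast(last_x, last_p, 0.8, 0.2, 0.1, 5), 1):
        print(step, round(m, 3), round(v, 3))
```

The forecast variance grows with each step ahead, which is one way to see why prediction accuracy degrades at longer horizons.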
Modeling stochastic frontier based on vine copulas

NASA Astrophysics Data System (ADS)

Constantino, Michel; Candido, Osvaldo; Tabak, Benjamin M.; da Costa, Reginaldo Brito

2017-11-01

This article models a production function and analyzes the technical efficiency of listed companies in the United States, Germany and England between 2005 and 2012 based on the vine copula approach. Traditional estimates of the stochastic frontier assume that data are multivariate normally distributed and that there is no source of asymmetry. The proposed method based on vine copulas allows us to explore different types of asymmetry and multivariate distribution. Using data on product, capital and labor, we measure the relative efficiency of the vine production function and estimate the coefficient used in the stochastic frontier literature for comparison purposes. This production vine copula predicts the value added by firms with given capital and labor in a probabilistic way. It thereby stands in sharp contrast to the production function, where the output of firms is completely deterministic. The results show that, on average, S&P500 companies are more efficient than companies listed in England and Germany, which presented similar average efficiency coefficients. For comparative purposes, the traditional stochastic frontier was estimated and the results showed discrepancies between the coefficients obtained by the application of the two methods, traditional and frontier-vine, opening new paths of non-linear research.
Multivariate Quantification of the Solid State Phase Composition of Co-Amorphous Naproxen-Indomethacin.

PubMed

Beyer, Andreas; Grohganz, Holger; Löbmann, Korbinian; Rades, Thomas; Leopold, Claudia S

2015-10-27

To benefit from the optimized dissolution properties of active pharmaceutical ingredients in their amorphous forms, co-amorphisation as a viable tool to stabilize these amorphous phases is of both academic and industrial interest. Reports dealing with the physical stability and recrystallization behavior of co-amorphous systems are however limited to qualitative evaluations based on the corresponding X-ray powder diffractograms. Therefore, the objective of the study was to develop a quantification model based on X-ray powder diffractometry (XRPD), followed by a multivariate partial least squares regression approach that enables the simultaneous determination of up to four solid state fractions: crystalline naproxen, γ-indomethacin, α-indomethacin as well as co-amorphous naproxen-indomethacin. For this purpose, a calibration set that covers the whole range of possible combinations of the four components was prepared and analyzed by XRPD. In order to test the model performances, leave-one-out cross validation was performed and revealed root mean square errors of validation between 3.11% and 3.45% for the crystalline molar fractions and 5.57% for the co-amorphous molar fraction. In summary, even four solid state phases, involving one co-amorphous phase, can be quantified with this XRPD data-based approach.
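The leave-one-out cross validation mentioned in the abstract above can be sketched as follows: each calibration sample is held out in turn, the model is refit on the rest, and the root mean square error is taken over the held-out predictions. To stay self-contained, this sketch swaps in a one-predictor least-squares line for the partial least squares model used in the study, and the calibration data are invented for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_rmse(xs, ys):
    """Root mean square error of leave-one-out cross validation."""
    squared_errors = []
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out sample.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical calibration set: signal intensity vs. molar fraction (%).
    signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    fraction = [1.0, 19.0, 42.0, 58.0, 81.0, 99.0]
    print(round(loo_rmse(signal, fraction), 2))
```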
Early identification of patients requiring massive transfusion, embolization, or hemostatic surgery for traumatic hemorrhage: a systematic review protocol.

PubMed

Tran, Alexandre; Matar, Maher; Steyerberg, Ewout W; Lampron, Jacinthe; Taljaard, Monica; Vaillancourt, Christian

2017-04-13

Hemorrhage is a major cause of early mortality following a traumatic injury. The progression and consequences of significant blood loss occur quickly as death from hemorrhagic shock or exsanguination often occurs within the first few hours. The mainstay of treatment therefore involves early identification of patients at risk for hemorrhagic shock in order to provide blood products and control of the bleeding source if necessary. The intended scope of this review is to identify and assess combinations of predictors informing therapeutic decision-making for clinicians during the initial trauma assessment. The primary objective of this systematic review is to identify and critically assess any existing multivariable models predicting significant traumatic hemorrhage that requires intervention, defined as a composite outcome comprising massive transfusion, surgery for hemostasis, or angiography with embolization, for the purpose of external validation or updating in other study populations. If no suitable existing multivariable models are identified, the secondary objective is to identify candidate predictors to inform the development of a new prediction rule. We will search the EMBASE and MEDLINE databases for all randomized controlled trials and prospective and retrospective cohort studies developing or validating predictors of intervention for traumatic hemorrhage in adult patients 16 years of age or older. Eligible predictors must be available to the clinician during the first hour of trauma resuscitation and may be clinical, lab-based, or imaging-based. Outcomes of interest include the need for surgical intervention, angiographic embolization, or massive transfusion within the first 24 h. Data extraction will be performed independently by two reviewers. Items for extraction will be based on the CHARMS checklist. We will evaluate any existing models for relevance, quality, and the potential for external validation and updating in other populations. Relevance will be described in terms of appropriateness of outcomes and predictors. Quality criteria will include variable selection strategies, adequacy of sample size, handling of missing data, validation techniques, and measures of model performance. This systematic review will describe the availability of multivariable prediction models and summarize evidence regarding predictors that can be used to identify the need for intervention in patients with traumatic hemorrhage. PROSPERO CRD42017054589.
Estimating the decomposition of predictive information in multivariate systems

NASA Astrophysics Data System (ADS)

Faes, Luca; Kugiumtzis, Dimitris; Nollo, Giandomenico; Jurysta, Fabrice; Marinazzo, Daniele

2015-03-01

In the study of complex systems from observed multivariate time series, insight into the evolution of one system may be under investigation, which can be explained by the information storage of the system and the information transfer from other interacting systems. We present a framework for the model-free estimation of information storage and information transfer computed as the terms composing the predictive information about the target of a multivariate dynamical process. The approach tackles the curse of dimensionality employing a nonuniform embedding scheme that selects progressively, among the past components of the multivariate process, only those that contribute most, in terms of conditional mutual information, to the present target process. Moreover, it computes all information-theoretic quantities using a nearest-neighbor technique designed to compensate the bias due to the different dimensionality of individual entropy terms. The resulting estimators of prediction entropy, storage entropy, transfer entropy, and partial transfer entropy are tested on simulations of coupled linear stochastic and nonlinear deterministic dynamic processes, demonstrating the superiority of the proposed approach over the traditional estimators based on uniform embedding. The framework is then applied to multivariate physiologic time series, resulting in physiologically well-interpretable information decompositions of cardiovascular and cardiorespiratory interactions during head-up tilt and of joint brain-heart dynamics during sleep.
Estimating multivariate similarity between neuroimaging datasets with sparse canonical correlation analysis: an application to perfusion imaging.

PubMed

Rosa, Maria J; Mehta, Mitul A; Pich, Emilio M; Risterucci, Celine; Zelaya, Fernando; Reinders, Antje A T S; Williams, Steve C R; Dazzan, Paola; Doyle, Orla M; Marquand, Andre F

2015-01-01

An increasing number of neuroimaging studies are based on either combining more than one data modality (inter-modal) or combining more than one measurement from the same modality (intra-modal). To date, most intra-modal studies using multivariate statistics have focused on differences between datasets, for instance relying on classifiers to differentiate between effects in the data. However, to fully characterize these effects, multivariate methods able to measure similarities between datasets are needed. One classical technique for estimating the relationship between two datasets is canonical correlation analysis (CCA). However, in the context of high-dimensional data the application of CCA is extremely challenging. A recent extension of CCA, sparse CCA (SCCA), overcomes this limitation, by regularizing the model parameters while yielding a sparse solution. In this work, we modify SCCA with the aim of facilitating its application to high-dimensional neuroimaging data and finding meaningful multivariate image-to-image correspondences in intra-modal studies. In particular, we show how the optimal subset of variables can be estimated independently and we look at the information encoded in more than one set of SCCA transformations. We illustrate our framework using Arterial Spin Labeling data to investigate multivariate similarities between the effects of two antipsychotic drugs on cerebral blood flow.
Modeling in the quality by design environment: Regulatory requirements and recommendations for design space and control strategy appointment.

PubMed

Djuris, Jelena; Djuric, Zorica

2017-11-30

Mathematical models can be used as an integral part of the quality by design (QbD) concept throughout the product lifecycle for a variety of purposes, including appointment of the design space and control strategy, continual improvement and risk assessment. Examples of different mathematical modeling techniques (mechanistic, empirical and hybrid) in pharmaceutical development and process monitoring or control are provided in the presented review. In the QbD context, mathematical models are predominantly used to support design space and/or control strategies. Considering their impact on the final product quality, models can be divided into the following categories: high, medium and low impact models. Although there are regulatory guidelines on the topic of modeling applications, a review of QbD-based submissions containing modeling elements revealed concerns regarding the scale-dependency of design spaces and the verification of model predictions at commercial scale of manufacturing, especially regarding real-time release (RTR) models. The authors provide a critical overview of good modeling practices and introduce the concepts of multiple-unit, adaptive and dynamic design space, multivariate specifications and methods for process uncertainty analysis. RTR specification with a mathematical model and different approaches to multivariate statistical process control supporting process analytical technologies are also presented. Copyright © 2017 Elsevier B.V. All rights reserved.
Meta-Analytic Structural Equation Modeling (MASEM): Comparison of the Multivariate Methods

ERIC Educational Resources Information Center

Zhang, Ying

2011-01-01

Meta-analytic Structural Equation Modeling (MASEM) has drawn interest from many researchers recently. In doing MASEM, researchers usually first synthesize correlation matrices across studies using meta-analysis techniques and then analyze the pooled correlation matrix using structural equation modeling techniques. Several multivariate methods of…
MULTIVARIATE RECEPTOR MODELS-CURRENT PRACTICE AND FUTURE TRENDS. (R826238)

EPA Science Inventory

Multivariate receptor models have been applied to the analysis of air quality data for some time. However, solving the general mixture problem is important in several other fields. This paper looks at the panoply of these models with a view of identifying common challenges and ...
Nonlinear Decoupling Control With ANFIS-Based Unmodeled Dynamics Compensation for a Class of Complex Industrial Processes.

PubMed

Zhang, Yajun; Chai, Tianyou; Wang, Hong; Wang, Dianhui; Chen, Xinkai

2018-06-01

Complex industrial processes are multivariable and generally exhibit strong coupling among their control loops with heavy nonlinear nature. These make it very difficult to obtain an accurate model. As a result, the conventional and data-driven control methods are difficult to apply. Using a twin-tank level control system as an example, a novel multivariable decoupling control algorithm with adaptive neural-fuzzy inference system (ANFIS)-based unmodeled dynamics (UD) compensation is proposed in this paper for a class of complex industrial processes. At first, a nonlinear multivariable decoupling controller with UD compensation is introduced. Different from the existing methods, the decomposition estimation algorithm using ANFIS is employed to estimate the UD, and the desired estimating and decoupling control effects are achieved. Second, the proposed method does not require the complicated switching mechanism which has been commonly used in the literature. This significantly simplifies the obtained decoupling algorithm and its realization. Third, based on some new lemmas and theorems, the conditions on the stability and convergence of the closed-loop system are analyzed to show the uniform boundedness of all the variables. This is then followed by the summary on experimental tests on a heavily coupled nonlinear twin-tank system that demonstrates the effectiveness and the practicability of the proposed method.
Multivariate Geostatistical Analysis of Uncertainty for the Hydrodynamic Model of a Geological Trap for Carbon Dioxide Storage. Case study: Multilayered Geological Structure Vest Valcele, ROMANIA

NASA Astrophysics Data System (ADS)

Scradeanu, D.; Pagnejer, M.

2012-04-01

The purpose of the work is to evaluate the uncertainty of the hydrodynamic model for a multilayered geological structure, a potential trap for carbon dioxide storage. The hydrodynamic model is based on a conceptual model of the multilayered hydrostructure with three components: 1) spatial model; 2) parametric model and 3) energy model. The data needed to build the three components of the conceptual model are obtained from 240 boreholes explored by geophysical logging and seismic investigation, for the first two components, and from an experimental water injection test for the last one. The hydrodynamic model is a finite difference numerical model based on a 3D stratigraphic model with nine stratigraphic units (Badenian and Oligocene) and a 3D multiparameter model (porosity, permeability, hydraulic conductivity, storage coefficient, leakage etc.). The uncertainty of the two 3D models was evaluated using multivariate geostatistical tools: a) the cross-semivariogram for structural analysis, especially the study of anisotropy, and b) cokriging to reduce estimation variances in the specific situation where there is a cross-correlation between a variable and one or more variables that are undersampled. Important differences between univariate and bivariate anisotropy were identified. The minimised uncertainty of the parametric model (by cokriging) was transferred to the hydrodynamic model. The uncertainty distribution of the pressures generated by the water injection test was additionally filtered by the sensitivity of the numerical model. The obtained relative errors of the pressure distribution in the hydrodynamic model are 15-20%. The scientific research was performed in the frame of the European FP7 project "A multiple space and time scale approach for the quantification of deep saline formation for CO2 storage (MUSTANG)".
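For readers unfamiliar with the cross-semivariogram named in the abstract above, the standard empirical estimator for two variables u and v at lag h is half the average product of their paired increments over that lag. The sketch below assumes both variables are sampled at the same, regularly spaced locations along one line. The values are invented, and a real borehole data set would need irregular 3D lags and directional binning to study anisotropy.

```python
def cross_semivariogram(u, v, lag):
    """Empirical cross-semivariogram of two co-located, regularly spaced series.

    gamma_uv(lag) = sum((u[i+lag] - u[i]) * (v[i+lag] - v[i])) / (2 * N),
    where N is the number of pairs separated by `lag` samples.
    """
    if len(u) != len(v):
        raise ValueError("u and v must be sampled at the same locations")
    n_pairs = len(u) - lag
    if lag < 1 or n_pairs < 1:
        raise ValueError("lag must be between 1 and len(u) - 1")
    total = sum((u[i + lag] - u[i]) * (v[i + lag] - v[i]) for i in range(n_pairs))
    return total / (2.0 * n_pairs)


if __name__ == "__main__":
    # Hypothetical porosity (%) and log-permeability values along one borehole.
    porosity = [12.0, 14.0, 13.0, 17.0, 18.0, 16.0]
    log_perm = [1.1, 1.4, 1.2, 1.9, 2.0, 1.7]
    for lag in (1, 2, 3):
        print(lag, round(cross_semivariogram(porosity, log_perm, lag), 4))
```

Passing the same series as both arguments gives the ordinary (univariate) semivariogram, which is a convenient sanity check.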
Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design

PubMed Central

Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo

2016-01-01

Outcome-dependent sampling (ODS) is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known designs of this kind are the case-control design for binary responses, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for ODS with multivariate responses remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric, with all the underlying distributions of covariates modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of PCB exposure with hearing loss in children born to the Collaborative Perinatal Study. PMID:27966260
Regression Models For Multivariate Count Data

PubMed Central

Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

2016-01-01

Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. PMID:28348500
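As a rough illustration of the restrictive mean-variance structure mentioned above: under a multinomial model with total count n and category probabilities p_j, each count has variance n p_j (1 - p_j), so the variance is fixed once the mean is known. The sketch below compares this with the Dirichlet-multinomial, used here only as one familiar over-dispersed alternative (the abstract does not name its specific models). The function names and parameter values are illustrative assumptions.

```python
def multinomial_var(n, probs):
    """Variance of each count under Multinomial(n, probs)."""
    return [n * p * (1.0 - p) for p in probs]


def dirichlet_multinomial_var(n, alphas):
    """Variance of each count under a Dirichlet-multinomial with
    concentration parameters `alphas` (illustrative values only)."""
    a0 = sum(alphas)
    inflation = (n + a0) / (1.0 + a0)  # exceeds 1 whenever n > 1
    return [n * (a / a0) * (1.0 - a / a0) * inflation for a in alphas]


n = 100
alphas = [2.0, 3.0, 5.0]                    # hypothetical concentrations
probs = [a / sum(alphas) for a in alphas]   # matching multinomial means

mn_var = multinomial_var(n, probs)
dm_var = dirichlet_multinomial_var(n, alphas)

# Same means, but every variance is inflated by the same factor.
ratio = [d / m for d, m in zip(dm_var, mn_var)]
```

With these assumed values the two models share the same means, yet the Dirichlet-multinomial variances are larger by the factor (n + a0) / (1 + a0), which is the kind of over-dispersion a multinomial-logit model cannot represent.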
Regression Models For Multivariate Count Data.

PubMed

Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

2017-01-01

Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Evaluating a Multivariate Directional Connectivity Measure for Use in Electroencephalogram (EEG) Network Analysis Using a Conductance-Based Neuron Network Model

DTIC Science & Technology

2015-03-01

...of 7 information-theoretic criteria plotted against the model order used. The legend is labeled according to the figures in which the power spectra... spectrum (Brovelli et al. 2004). Fig. 2: Values of 7 information-theoretic criteria plotted against the model order used. The legend is labeled... Identification of directed influence: Granger causality, Kullback-Leibler divergence, and complexity. Neural Computation. 2012;24(7):1722–1739. doi:10.1162
The use of experimental design to find the operating maximum power point of PEM fuel cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Crăciunescu, Aurelian; Pătularu, Laurenţiu; Ciumbulea, Gloria

2015-03-10

Proton Exchange Membrane (PEM) fuel cells are difficult to model because of their complex nonlinear nature. This paper describes the development of a mathematical model of PEM fuel cells based on the Design of Experiment methodology. Design of Experiment provides a very efficient way to obtain a mathematical model of the studied multivariable system from only a few experiments. The results obtained can be used for the optimization and control of PEM fuel cell systems.
Multivariable control of the Space Shuttle Remote Manipulator System using linearization by state feedback. M.S. Thesis

NASA Technical Reports Server (NTRS)

Gettman, Chang-Ching LO

1993-01-01

This thesis develops and demonstrates an approach to nonlinear control system design using linearization by state feedback. The design provides improved transient response behavior allowing faster maneuvering of payloads by the SRMS. Modeling uncertainty is accounted for by using a second feedback loop designed around the feedback linearized dynamics. A classical feedback loop is developed to provide the easy implementation required for the relatively small on board computers. Feedback linearization also allows the use of higher bandwidth model based compensation in the outer loop, since it helps maintain stability in the presence of the nonlinearities typically neglected in model based designs.

Biomarkers for Early Detection of Clinically Relevant Prostate Cancer: A Multi-Institutional Validation Trial - Genomic Health, Inc. - EDRN Public Portal

Cancer.gov

Validate a panel of tissue-based biomarkers to determine the presence of or progression to clinically relevant prostate cancer at the time of diagnosis. Utilize a novel, biopsy based multi-gene quantitative RT-PCR assay developed by Genomic Health, Oncotype DX Prostate Cancer Assay, which discriminates aggressive from indolent cancer on multivariate modeling of PCa patients.
Predictive features of chronic kidney disease in atypical haemolytic uremic syndrome

PubMed Central

Jamme, Matthieu; Raimbourg, Quentin; Chauveau, Dominique; Seguin, Amélie; Presne, Claire; Perez, Pierre; Gobert, Pierre; Wynckel, Alain; Provôt, François; Delmas, Yahsou; Mousson, Christiane; Servais, Aude; Vrigneaud, Laurence; Veyradier, Agnès

2017-01-01

Chronic kidney disease (CKD) is a frequent and serious complication of atypical haemolytic uremic syndrome (aHUS). We aimed to develop a simple, accurate model to predict the risk of renal dysfunction in aHUS based on clinical and biological features available at hospital admission. Renal function at 1-year follow-up, based on an estimated glomerular filtration rate < 60 mL/min/1.73 m2 as assessed by the Modification of Diet in Renal Disease equation, was used as an indicator of significant CKD. Prospectively collected data from a cohort of 156 aHUS patients who did not receive eculizumab were used to identify predictors of CKD. Covariates associated with renal impairment were identified by multivariate analysis. The model performance was assessed and a scoring system for clinical practice was constructed from the regression coefficients. Multivariate analyses identified three predictors of CKD: a high serum creatinine level, a high mean arterial pressure and a mildly decreased platelet count. The prognostic model had good discriminative ability (area under the curve = 0.84). The scoring system ranged from 0 to 5, with corresponding risks of CKD ranging from 18% to 100%. This model accurately predicts development of 1-year CKD in patients with aHUS using clinical and biological features available on admission. After further validation, this model may assist in clinical decision making. PMID:28542627
An inflammation-based cumulative prognostic score system in patients with diffuse large B cell lymphoma in rituximab era.

PubMed

Sun, Feifei; Zhu, Jia; Lu, Suying; Zhen, Zijun; Wang, Juan; Huang, Junting; Ding, Zonghui; Zeng, Musheng; Sun, Xiaofei

2018-01-02

Systemic inflammatory parameters are associated with poor outcomes in patients with malignancies. Several inflammation-based cumulative prognostic score systems have been established for various solid tumors. However, there are few inflammation-based cumulative prognostic score systems for patients with diffuse large B cell lymphoma (DLBCL). We retrospectively reviewed 564 adult DLBCL patients who had received rituximab, cyclophosphamide, doxorubicin, vincristine and prednisolone (R-CHOP) therapy between Nov 1 2006 and Dec 30 2013 and assessed, by univariate and multivariate analysis, the prognostic significance of six systemic inflammatory parameters evaluated in previous studies: C-reactive protein (CRP), albumin levels, the lymphocyte-monocyte ratio (LMR), the neutrophil-lymphocyte ratio (NLR), the platelet-lymphocyte ratio (PLR) and fibrinogen levels. Multivariate analysis identified CRP, albumin levels and the LMR as three independent prognostic parameters for overall survival (OS). Based on these three factors, we constructed a novel inflammation-based cumulative prognostic score (ICPS) system. Four risk groups were formed: ICPS = 0, ICPS = 1, ICPS = 2 and ICPS = 3. Advanced multivariate analysis indicated that the ICPS model is a prognostic score system independent of the International Prognostic Index (IPI) for both progression-free survival (PFS) (p < 0.001) and OS (p < 0.001). The 3-year OS for patients with ICPS = 0, ICPS = 1, ICPS = 2 and ICPS = 3 was 95.6, 88.2, 76.0 and 62.2%, respectively (p < 0.001). The 3-year PFS for patients with ICPS = 0-1, ICPS = 2 and ICPS = 3 was 84.8, 71.6 and 54.5%, respectively (p < 0.001). The prognostic value of the ICPS model indicated that the degree of systemic inflammatory status was associated with clinical outcomes of patients with DLBCL in the rituximab era. The ICPS model was shown to classify risk groups more accurately than any single inflammatory prognostic parameter. These findings may be useful for identifying candidates for further inflammation-related mechanism research or novel anti-inflammation target therapies.
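The cumulative score described above can be sketched as a simple count of adverse factors. This is a minimal illustration that assumes each of the three parameters contributes one point, which is consistent with the 0 to 3 range but is not spelled out in the abstract. The cutoffs defining an adverse value are also not given, so the hypothetical function below takes booleans instead of laboratory values. The survival figures in the lookup table are the 3-year OS percentages quoted in the abstract.

```python
# 3-year overall survival (%) reported in the abstract for each ICPS group.
REPORTED_3Y_OS = {0: 95.6, 1: 88.2, 2: 76.0, 3: 62.2}


def icps(crp_adverse, albumin_adverse, lmr_adverse):
    """Count how many of the three parameters fall in their adverse range.

    Assumption: one point per adverse factor. The cutoffs that define
    'adverse' are not stated in the abstract, so the caller supplies
    booleans rather than raw CRP, albumin or LMR values.
    """
    flags = (crp_adverse, albumin_adverse, lmr_adverse)
    return sum(1 for flag in flags if flag)


# Hypothetical patient with adverse CRP and LMR but normal albumin.
score = icps(crp_adverse=True, albumin_adverse=False, lmr_adverse=True)
os_3y = REPORTED_3Y_OS[score]
```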
A "Model" Multivariable Calculus Course.

ERIC Educational Resources Information Center

Beckmann, Charlene E.; Schlicker, Steven J.

1999-01-01

Describes a rich, investigative approach to multivariable calculus. Introduces a project in which students construct physical models of surfaces that represent real-life applications of their choice. The models, along with student-selected datasets, serve as vehicles to study most of the concepts of the course from both continuous and discrete…
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

ERIC Educational Resources Information Center

Culpepper, Steven Andrew; Park, Trevor

2017-01-01

A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
A Sandwich-Type Standard Error Estimator of SEM Models with Multivariate Time Series

ERIC Educational Resources Information Center

Zhang, Guangjian; Chow, Sy-Miin; Ong, Anthony D.

2011-01-01

Structural equation models are increasingly used as a modeling tool for multivariate time series data in the social and behavioral sciences. Standard error estimators of SEM models, originally developed for independent data, require modifications to accommodate the fact that time series data are inherently dependent. In this article, we extend a…
Multivariate Autoregressive Modeling and Granger Causality Analysis of Multiple Spike Trains

PubMed Central

Krumin, Michael; Shoham, Shy

2010-01-01

Recent years have seen the emergence of microelectrode arrays and optical methods allowing simultaneous recording of spiking activity from populations of neurons in various parts of the nervous system. The analysis of multiple neural spike train data could benefit significantly from existing methods for multivariate time-series analysis which have proven to be very powerful in the modeling and analysis of continuous neural signals like EEG signals. However, those methods have not generally been well adapted to point processes. Here, we use our recent results on correlation distortions in multivariate Linear-Nonlinear-Poisson spiking neuron models to derive generalized Yule-Walker-type equations for fitting “hidden” Multivariate Autoregressive models. We use this new framework to perform Granger causality analysis in order to extract the directed information flow pattern in networks of simulated spiking neurons. We discuss the relative merits and limitations of the new method. PMID:20454705
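The classical Yule-Walker idea that the authors generalize can be sketched for a directly observed, zero-mean VAR(1) process x_t = A x_{t-1} + e_t. There the lag-1 and lag-0 covariances satisfy C1 = A C0, so A = C1 C0^{-1}. This is only the textbook estimator, not the authors' correction for spike-train correlation distortions. The coefficient matrix, sample size and helper names are illustrative assumptions.

```python
import random


def lagged_cov(xs, lag):
    """Sample estimate of E[x_t x_{t-lag}^T] for a zero-mean 2-D series."""
    n = len(xs) - lag
    c = [[0.0, 0.0], [0.0, 0.0]]
    for t in range(lag, len(xs)):
        for i in range(2):
            for j in range(2):
                c[i][j] += xs[t][i] * xs[t - lag][j]
    return [[v / n for v in row] for row in c]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def fit_var1(xs):
    """Yule-Walker estimate A = C1 * inverse(C0) for a VAR(1) model."""
    return matmul2(lagged_cov(xs, 1), inv2(lagged_cov(xs, 0)))


def simulate_var1(a, n, seed=0):
    """Simulate x_t = a x_{t-1} + e_t with unit-variance Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    out = []
    for _ in range(n):
        x = [a[0][0] * x[0] + a[0][1] * x[1] + rng.gauss(0.0, 1.0),
             a[1][0] * x[0] + a[1][1] * x[1] + rng.gauss(0.0, 1.0)]
        out.append(x)
    return out


# Hypothetical network: series 0 drives series 1, but not the reverse.
A_TRUE = [[0.5, 0.0], [0.4, 0.3]]
a_hat = fit_var1(simulate_var1(A_TRUE, 20000))
```

In this toy setting the estimated a_hat[1][0] stays clearly away from zero while a_hat[0][1] stays near zero, which is the asymmetry a Granger-type analysis reads as a directed influence from series 0 to series 1.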
Analysis of algae growth mechanism and water bloom prediction under the effect of multi-affecting factor.

PubMed

Wang, Li; Wang, Xiaoyi; Jin, Xuebo; Xu, Jiping; Zhang, Huiyan; Yu, Jiabin; Sun, Qian; Gao, Chong; Wang, Lingbin

2017-03-01

Current methods describe the formation process of algae inaccurately and predict water blooms with low precision. In this paper, the chemical mechanism of algae growth is analyzed, and a correlation analysis of chlorophyll-a and algal density is conducted by chemical measurement. Taking into account the influence of multiple factors on algae growth and water blooms, a comprehensive prediction method combining multivariate time series and an intelligent model is put forward. First, the main factors that affect the reproduction of the algae are analyzed through the process of photosynthesis. A compensation prediction method for multivariate time series analysis, based on a neural network and a Support Vector Machine, is then proposed; it is combined with Kernel Principal Component Analysis to reduce the dimension of the factors influencing blooms. Then, a Genetic Algorithm is applied to improve the generalization ability of the BP network and the Least Squares Support Vector Machine. Experimental results show that this method better compensates the prediction model of multivariate time series analysis, making it an effective way to improve the description accuracy of algae growth and the prediction precision of water blooms.
Multivariate curve-resolution analysis of pesticides in water samples from liquid chromatographic-diode array data.

PubMed

Maggio, Rubén M; Damiani, Patricia C; Olivieri, Alejandro C

2011-01-30

Liquid chromatographic-diode array detection data recorded for aqueous mixtures of 11 pesticides show the combined presence of strongly coeluting peaks, distortions in the time dimension between experimental runs, and the presence of potential interferents not modeled by the calibration phase in certain test samples. Due to the complexity of these phenomena, data were processed by a second-order multivariate algorithm based on multivariate curve resolution and alternating least-squares, which allows one to successfully model both the spectral and retention time behavior for all sample constituents. This led to the accurate quantitation of all analytes in a set of validation samples: aldicarb sulfoxide, oxamyl, aldicarb sulfone, methomyl, 3-hydroxy-carbofuran, aldicarb, propoxur, carbofuran, carbaryl, 1-naphthol and methiocarb. Limits of detection in the range 0.1-2 μg mL(-1) were obtained. Additionally, the second-order advantage for several analytes was achieved in samples containing several uncalibrated interferences. The limits of detection for all analytes were decreased by solid phase pre-concentration to values compatible to those officially recommended, i.e., in the order of 5 ng mL(-1). Copyright © 2010 Elsevier B.V. All rights reserved.
A Unified Framework for Association Analysis with Multiple Related Phenotypes

PubMed Central

Stephens, Matthew

2013-01-01

We consider the problem of assessing associations between multiple related outcome variables and a single explanatory variable of interest. This problem arises in many settings, including genetic association studies, where the explanatory variable is genotype at a genetic variant. We outline a framework for conducting this type of analysis, based on Bayesian model comparison and model averaging for multivariate regressions. This framework unifies several common approaches to this problem, and includes both standard univariate and standard multivariate association tests as special cases. The framework also unifies the problems of testing for associations and explaining associations – that is, identifying which outcome variables are associated with genotype. This provides an alternative to the usual, but conceptually unsatisfying, approach of resorting to univariate tests when explaining and interpreting significant multivariate findings. The method is computationally tractable genome-wide for modest numbers of phenotypes (e.g. 5–10), and can be applied to summary data, without access to raw genotype and phenotype data. We illustrate the methods on simulated examples and on a genome-wide association study of blood lipid traits, where we identify 18 potential novel genetic associations that were not identified by univariate analyses of the same data. PMID:23861737
Phylogenetic Factor Analysis.

PubMed

Tolkoff, Max R; Alfaro, Michael E; Baele, Guy; Lemey, Philippe; Suchard, Marc A

2018-05-01

Phylogenetic comparative methods explore the relationships between quantitative traits adjusting for shared evolutionary history. This adjustment often occurs through a Brownian diffusion process along the branches of the phylogeny that generates model residuals or the traits themselves. For high-dimensional traits, inferring all pair-wise correlations within the multivariate diffusion is limiting. To circumvent this problem, we propose phylogenetic factor analysis (PFA) that assumes a small unknown number of independent evolutionary factors arise along the phylogeny and these factors generate clusters of dependent traits. Set in a Bayesian framework, PFA provides measures of uncertainty on the factor number and groupings, combines both continuous and discrete traits, integrates over missing measurements and incorporates phylogenetic uncertainty with the help of molecular sequences. We develop Gibbs samplers based on dynamic programming to estimate the PFA posterior distribution, over 3-fold faster than for multivariate diffusion and a further order-of-magnitude more efficiently in the presence of latent traits. We further propose a novel marginal likelihood estimator for previously impractical models with discrete data and find that PFA also provides a better fit than multivariate diffusion in evolutionary questions in columbine flower development, placental reproduction transitions and triggerfish fin morphometry.
A joint modeling and estimation method for multivariate longitudinal data with mixed types of responses to analyze physical activity data generated by accelerometers.

PubMed

Li, Haocheng; Zhang, Yukun; Carroll, Raymond J; Keadle, Sarah Kozey; Sampson, Joshua N; Matthews, Charles E

2017-11-10

A mixed effect model is proposed to jointly analyze multivariate longitudinal data with continuous, proportion, count, and binary responses. The association of the variables is modeled through the correlation of random effects. We use a quasi-likelihood type approximation for nonlinear variables and transform the proposed model into a multivariate linear mixed model framework for estimation and inference. Via an extension to the EM approach, an efficient algorithm is developed to fit the model. The method is applied to physical activity data, which uses a wearable accelerometer device to measure daily movement and energy expenditure information. Our approach is also evaluated by a simulation study. Copyright © 2017 John Wiley & Sons, Ltd.
A tensor approach to modeling of nonhomogeneous nonlinear systems

NASA Technical Reports Server (NTRS)

Yurkovich, S.; Sain, M.

1980-01-01

Model following control methodology plays a key role in numerous application areas. Cases in point include flight control systems and gas turbine engine control systems. Typical uses of such a design strategy involve the determination of nonlinear models which generate requested control and response trajectories for various commands. Linear multivariable techniques provide trim about these motions; and protection logic is added to secure the hardware from excursions beyond the specification range. This paper reports upon experience in developing a general class of such nonlinear models based upon the idea of the algebraic tensor product.
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula

NASA Astrophysics Data System (ADS)

Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.

2016-03-01

A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
Network-Based Visual Analysis of Tabular Data

ERIC Educational Resources Information Center

Liu, Zhicheng

2012-01-01

Tabular data is pervasive in the form of spreadsheets and relational databases. Although tables often describe multivariate data without explicit network semantics, it may be advantageous to explore the data modeled as a graph or network for analysis. Even when a given table design conveys some static network semantics, analysts may want to look…
ASCAL: A Microcomputer Program for Estimating Logistic IRT Item Parameters.

ERIC Educational Resources Information Center

Vale, C. David; Gialluca, Kathleen A.

ASCAL is a microcomputer-based program for calibrating items according to the three-parameter logistic model of item response theory. It uses a modified multivariate Newton-Raphson procedure for estimating item parameters. This study evaluated this procedure using Monte Carlo Simulation Techniques. The current version of ASCAL was then compared to…
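For reference, the three-parameter logistic model that such a calibration program fits gives the probability of a correct response as P(theta) = c + (1 - c) / (1 + exp(-D a (theta - b))), with discrimination a, difficulty b and lower asymptote (guessing) c. The sketch below evaluates that response function only and does not reproduce ASCAL's modified Newton-Raphson estimation. The scaling constant D = 1.7 is a common convention assumed here, and the function name and item values are hypothetical.

```python
import math


def p_correct_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (guessing). D is a conventional scaling
    constant and is an assumption of this sketch.
    """
    z = D * a * (theta - b)
    return c + (1.0 - c) / (1.0 + math.exp(-z))


# Hypothetical item: at theta == b the probability is halfway
# between the guessing level c and 1.
p_at_difficulty = p_correct_3pl(theta=0.5, a=1.2, b=0.5, c=0.2)
p_low_ability = p_correct_3pl(theta=-10.0, a=1.2, b=0.5, c=0.2)
```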
A Penalized Likelihood Framework For High-Dimensional Phylogenetic Comparative Methods And An Application To New-World Monkeys Brain Evolution.

PubMed

Julien, Clavel; Leandro, Aristide; Hélène, Morlon

2018-06-19

Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performance as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performance are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using the Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA, in the R packages RPANDA and mvMORPH. Finally, we illustrate the utility of the proposed framework by evaluating the fit of evolutionary models, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find clear support for an Early-burst model, suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
Regional vertical total electron content (VTEC) modeling together with satellite and receiver differential code biases (DCBs) using semi-parametric multivariate adaptive regression B-splines (SP-BMARS)

NASA Astrophysics Data System (ADS)

Durmaz, Murat; Karslioglu, Mahmut Onur

2015-04-01

There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.
Part 2. Development of Enhanced Statistical Methods for Assessing Health Effects Associated with an Unknown Number of Major Sources of Multiple Air Pollutants.

PubMed

Park, Eun Sug; Symanski, Elaine; Han, Daikwon; Spiegelman, Clifford

2015-06-01

A major difficulty with assessing source-specific health effects is that source-specific exposures cannot be measured directly; rather, they need to be estimated by a source-apportionment method such as multivariate receptor modeling. The uncertainty in source apportionment (uncertainty in source-specific exposure estimates and model uncertainty due to the unknown number of sources and identifiability conditions) has been largely ignored in previous studies. Also, spatial dependence of multipollutant data collected from multiple monitoring sites has not yet been incorporated into multivariate receptor modeling. The objectives of this project are (1) to develop a multipollutant approach that incorporates both sources of uncertainty in source-apportionment into the assessment of source-specific health effects and (2) to develop enhanced multivariate receptor models that can account for spatial correlations in the multipollutant data collected from multiple sites. We employed a Bayesian hierarchical modeling framework consisting of multivariate receptor models, health-effects models, and a hierarchical model on latent source contributions. For the health model, we focused on the time-series design in this project. Each combination of number of sources and identifiability conditions (additional constraints on model parameters) defines a different model. We built a set of plausible models with extensive exploratory data analyses and with information from previous studies, and then computed posterior model probability to estimate model uncertainty. Parameter estimation and model uncertainty estimation were implemented simultaneously by Markov chain Monte Carlo (MCMC) methods. We validated the methods using simulated data. We illustrated the methods using PM2.5 (particulate matter ≤ 2.5 μm in aerodynamic diameter) speciation data and mortality data from Phoenix, Arizona, and Houston, Texas. 
The Phoenix data included counts of cardiovascular deaths and daily PM2.5 speciation data from 1995-1997. The Houston data included respiratory mortality data and 24-hour PM2.5 speciation data sampled every six days from a region near the Houston Ship Channel in years 2002-2005. We also developed a Bayesian spatial multivariate receptor modeling approach that, while simultaneously dealing with the unknown number of sources and identifiability conditions, incorporated spatial correlations in the multipollutant data collected from multiple sites into the estimation of source profiles and contributions based on the discrete process convolution model for multivariate spatial processes. This new modeling approach was applied to 24-hour ambient air concentrations of 17 volatile organic compounds (VOCs) measured at nine monitoring sites in Harris County, Texas, during years 2000 to 2005. Simulation results indicated that our methods were accurate in identifying the true model and estimated parameters were close to the true values. The results from our methods agreed in general with previous studies on the source apportionment of the Phoenix data in terms of estimated source profiles and contributions. However, we had a greater number of statistically insignificant findings, which was likely a natural consequence of incorporating uncertainty in the estimated source contributions into the health-effects parameter estimation. For the Houston data, a model with five sources (that seemed to be Sulfate-Rich Secondary Aerosol, Motor Vehicles, Industrial Combustion, Soil/Crustal Matter, and Sea Salt) showed the highest posterior model probability among the candidate models considered when fitted simultaneously to the PM2.5 and mortality data. There was a statistically significant positive association between respiratory mortality and same-day PM2.5 concentrations attributed to one of the sources (probably industrial combustion). 
The Bayesian spatial multivariate receptor modeling approach applied to the VOC data led to a highest posterior model probability for a model with five sources (that seemed to be refinery, petrochemical production, gasoline evaporation, natural gas, and vehicular exhaust) among several candidate models, with the number of sources varying between three and seven and with different identifiability conditions. Our multipollutant approach assessing source-specific health effects is more advantageous than a single-pollutant approach in that it can estimate total health effects from multiple pollutants and can also identify emission sources that are responsible for adverse health effects. Our Bayesian approach can incorporate not only uncertainty in the estimated source contributions, but also model uncertainty that has not been addressed in previous studies on assessing source-specific health effects. The new Bayesian spatial multivariate receptor modeling approach enables predictions of source contributions at unmonitored sites, minimizing exposure misclassification and providing improved exposure estimates along with their uncertainty estimates, as well as accounting for uncertainty in the number of sources and identifiability conditions.
Describing the Elephant: Structure and Function in Multivariate Data.

ERIC Educational Resources Information Center

McDonald, Roderick P.

1986-01-01

There is a unity underlying the diversity of models for the analysis of multivariate data. Essentially, they constitute a family of models, most generally nonlinear, for structural/functional relations between variables drawn from a behavior domain. (Author)

Non-parametric directionality analysis - Extension for removal of a single common predictor and application to time series.

PubMed

Halliday, David M; Senik, Mohd Harizal; Stevenson, Carl W; Mason, Rob

2016-08-01

The ability to infer network structure from multivariate neuronal signals is central to computational neuroscience. Directed network analyses typically use parametric approaches based on auto-regressive (AR) models, where networks are constructed from estimates of AR model parameters. However, the validity of using low order AR models for neurophysiological signals has been questioned. A recent article introduced a non-parametric approach to estimate directionality in bivariate data; non-parametric approaches are free from concerns over model validity. We extend the non-parametric framework to include measures of directed conditional independence, using scalar measures that decompose the overall partial correlation coefficient summatively by direction, and a set of functions that decompose the partial coherence summatively by direction. A time domain partial correlation function allows both time and frequency views of the data to be constructed. The conditional independence estimates are conditioned on a single predictor. The framework is applied to simulated cortical neuron networks and mixtures of Gaussian time series data with known interactions. It is also applied to experimental data consisting of local field potential recordings from bilateral hippocampus in anaesthetised rats. The framework offers a novel alternative non-parametric approach to estimating directed interactions in multivariate neuronal recordings, with increased flexibility in dealing with both spike train and time series data. Copyright © 2016 Elsevier B.V. All rights reserved.
Drunk driving detection based on classification of multivariate time series.

PubMed

Li, Zhenlong; Jin, Xue; Zhao, Xiaohua

2015-09-01

This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
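The pipeline described above (piecewise linear representation, bottom-up merging, slope and time-interval features) can be sketched for a single signal as follows. This is a minimal illustration rather than the authors' implementation: the function names, the sum-of-squared-errors merge cost, and the `max_error` stopping threshold are assumptions, and the support vector machine step is omitted.

```python
def fit_line(t, y):
    """Least-squares line through (t, y); returns (slope, intercept, sse)."""
    n = len(t)
    mt = sum(t) / n
    my = sum(y) / n
    stt = sum((a - mt) ** 2 for a in t)
    slope = sum((a - mt) * (b - my) for a, b in zip(t, y)) / stt if stt else 0.0
    intercept = my - slope * mt
    sse = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(t, y))
    return slope, intercept, sse


def bottom_up_segments(t, y, max_error):
    """Bottom-up segmentation (assumed variant).

    Start from the finest segments (pairs of neighbouring samples) and keep
    merging the adjacent pair whose merged line fit has the smallest squared
    error, until the cheapest merge would exceed max_error.
    Segments are (start, end) index pairs with end inclusive; neighbouring
    segments share their boundary sample.
    """
    segs = [(i, i + 1) for i in range(len(t) - 1)]

    def cost(a, b):
        return fit_line(t[a[0]:b[1] + 1], y[a[0]:b[1] + 1])[2]

    while len(segs) > 1:
        costs = [cost(segs[i], segs[i + 1]) for i in range(len(segs) - 1)]
        i = min(range(len(costs)), key=costs.__getitem__)
        if costs[i] > max_error:
            break
        segs[i:i + 2] = [(segs[i][0], segs[i + 1][1])]
    return segs


def segment_features(t, y, segs):
    """Return one (slope, time interval) pair per segment."""
    return [(fit_line(t[a:b + 1], y[a:b + 1])[0], t[b] - t[a]) for a, b in segs]
```

Applied separately to each recorded channel (for example lateral position and steering angle), the resulting (slope, time interval) pairs would form the feature vector handed to a classifier.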
Estimating brain connectivity when few data points are available: Perspectives and limitations.

PubMed

Antonacci, Yuri; Toppi, Jlenia; Caschera, Stefano; Anzolin, Alessandra; Mattia, Donatella; Astolfi, Laura

2017-07-01

Methods based on the use of multivariate autoregressive modeling (MVAR) have proved to be an accurate and flexible tool for the estimation of brain functional connectivity. The multivariate approach, however, implies the use of a model whose complexity (in terms of number of parameters) increases quadratically with the number of signals included in the problem. This can often lead to an underdetermined problem and to the condition of multicollinearity. The aim of this paper is to introduce and test an approach based on Ridge Regression combined with a modified version of the statistics usually adopted for these methods, to broaden the estimation of brain connectivity to those conditions in which current methods fail, due to the lack of enough data points. We tested the performances of this new approach, in comparison with the classical approach based on ordinary least squares (OLS), by means of a simulation study implementing different ground-truth networks, under different network sizes and different levels of data points. Simulation results showed that the new approach provides better performances, in terms of accuracy of the parameters estimation and false positives/false negatives rates, in all conditions related to a low data points/model dimension ratio, and may thus be exploited to estimate and validate estimated patterns at single-trial level or when short time data segments are available.
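To make the underdetermination concrete: an MVAR model of order p on N signals has N²p coefficients, so short recordings can leave fewer equations than unknowns. The sketch below contrasts ordinary least squares with a ridge penalty on the same lagged design matrix. It is a standard-library illustration under assumed names (`solve`, `lagged_design`, `fit_mvar`, `ridge`), and it does not reproduce the modified test statistics mentioned in the abstract.

```python
def solve(a, b):
    """Solve a @ x = b (a square, b a matrix) by Gauss-Jordan elimination
    with partial pivoting. Inputs are lists of rows."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def lagged_design(series, order):
    """series: list of time points, each a list of n_signals values.
    Returns X (each row stacks lags t-1 .. t-order) and Y (values at t)."""
    x = [sum((series[t - k] for k in range(1, order + 1)), [])
         for t in range(order, len(series))]
    y = [series[t][:] for t in range(order, len(series))]
    return x, y


def fit_mvar(series, order, ridge=0.0):
    """Coefficient matrix B minimising ||Y - X B||^2 + ridge * ||B||^2.
    ridge=0.0 is ordinary least squares; ridge>0 adds ridge * I to X'X."""
    x, y = lagged_design(series, order)
    d = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [[sum(rx[i] * ry[j] for rx, ry in zip(x, y))
            for j in range(len(y[0]))] for i in range(d)]
    return solve(xtx, xty)
```

With three signals, order 2 and only five time points there are three equations per signal for six unknowns, so the OLS normal equations are rank deficient, whereas any `ridge > 0` makes the system solvable at the cost of shrinking the coefficients toward zero.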
A prospective cohort study on radiation-induced hypothyroidism: development of an NTCP model.

PubMed

Boomsma, Marjolein J; Bijl, Hendrik P; Christianen, Miranda E M C; Beetz, Ivo; Chouvalova, Olga; Steenbakkers, Roel J H M; van der Laan, Bernard F A M; Wolffenbuttel, Bruce H R; Oosting, Sjoukje F; Schilstra, Cornelis; Langendijk, Johannes A

2012-11-01

To establish a multivariate normal tissue complication probability (NTCP) model for radiation-induced hypothyroidism. The thyroid-stimulating hormone (TSH) level of 105 patients treated with (chemo-) radiation therapy for head-and-neck cancer was prospectively measured during a median follow-up of 2.5 years. Hypothyroidism was defined as elevated serum TSH with decreased or normal free thyroxin (T4). A multivariate logistic regression model with bootstrapping was used to determine the most important prognostic variables for radiation-induced hypothyroidism. Thirty-five patients (33%) developed primary hypothyroidism within 2 years after radiation therapy. An NTCP model based on 2 variables, including the mean thyroid gland dose and the thyroid gland volume, was most predictive for radiation-induced hypothyroidism. NTCP values increased with higher mean thyroid gland dose (odds ratio [OR]: 1.064/Gy) and decreased with higher thyroid gland volume (OR: 0.826/cm(3)). Model performance was good with an area under the curve (AUC) of 0.85. This is the first prospective study resulting in an NTCP model for radiation-induced hypothyroidism. The probability of hypothyroidism rises with increasing dose to the thyroid gland, whereas it reduces with increasing thyroid gland volume. Copyright © 2012 Elsevier Inc. All rights reserved.
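The reported odds ratios map directly onto logistic regression coefficients, since each coefficient is the natural logarithm of the corresponding odds ratio. The sketch below shows that relationship. The two odds ratios come from the abstract; the intercept is not reported there, so `intercept` is a purely hypothetical argument and the absolute probabilities this function returns are illustrative only.

```python
import math

OR_DOSE_PER_GY = 1.064     # odds ratio per Gy of mean thyroid dose (from the abstract)
OR_VOLUME_PER_CM3 = 0.826  # odds ratio per cm^3 of thyroid volume (from the abstract)


def ntcp(mean_dose_gy, volume_cm3, intercept):
    """Two-variable logistic NTCP.

    `intercept` is NOT given in the abstract; the caller must supply a value,
    and any value used here is a placeholder.
    """
    s = (intercept
         + math.log(OR_DOSE_PER_GY) * mean_dose_gy
         + math.log(OR_VOLUME_PER_CM3) * volume_cm3)
    return 1.0 / (1.0 + math.exp(-s))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)
```

Whatever intercept is chosen, raising the mean dose by 1 Gy multiplies the odds by 1.064 and adding 1 cm³ of gland volume multiplies them by 0.826, which is the direction of effect stated in the abstract.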
Multivariate analysis applied to the study of spatial distributions found in drug-eluting stent coatings by confocal Raman microscopy.

PubMed

Balss, Karin M; Long, Frederick H; Veselov, Vladimir; Orana, Argjenta; Akerman-Revis, Eugena; Papandreou, George; Maryanoff, Cynthia A

2008-07-01

Multivariate data analysis was applied to confocal Raman measurements on stents coated with the polymers and drug used in the CYPHER Sirolimus-eluting Coronary Stents. Partial least-squares (PLS) regression was used to establish three independent calibration curves for the coating constituents: sirolimus, poly(n-butyl methacrylate) [PBMA], and poly(ethylene-co-vinyl acetate) [PEVA]. The PLS calibrations were based on average spectra generated from each spatial location profiled. The PLS models were tested on six unknown stent samples to assess accuracy and precision. The wt % difference between PLS predictions and laboratory assay values for sirolimus was less than 1 wt % for the composite of the six unknowns, while the polymer models were estimated to be less than 0.5 wt % difference for the combined samples. The linearity and specificity of the three PLS models were also demonstrated. In contrast to earlier univariate models, the PLS models achieved mass balance with better accuracy. This analysis was extended to evaluate the spatial distribution of the three constituents. Quantitative bitmap images of drug-eluting stent coatings are presented for the first time to assess the local distribution of components.
Prospective inverse associations of sex hormone concentrations in men with biomarkers of inflammation and oxidative stress.

PubMed

Haring, Robin; Baumeister, Sebastian E; Völzke, Henry; Dörr, Marcus; Kocher, Thomas; Nauck, Matthias; Wallaschofski, Henri

2012-01-01

The suggested associations between sex hormone concentrations and inflammatory biomarkers in men originate from cross-sectional studies and small-scale clinical trials. But prior studies have not investigated longitudinal associations. Overall, 1344 men aged 20-79 years from the population-based cohort Study of Health in Pomerania were followed up for 5.0 (median) years. We used multivariable regression models to analyze cross-sectional and longitudinal associations of serum sex hormone concentrations (total testosterone [TT], sex hormone-binding globulin [SHBG], calculated free testosterone [free T], and dehydroepiandrosterone sulfate [DHEAS]) with biomarkers of inflammation (fibrinogen, high-sensitive C-reactive protein [hsCRP], and white blood cell count [WBC]) and oxidative stress (γ-glutamyl transferase [GGT]) using ordinary least square regression and generalized estimating equation models, respectively. Cross-sectional models revealed borderline associations of sex hormone concentrations with hsCRP, WBC, and GGT levels that were not retained after multivariable adjustment. Longitudinal multivariable analyses revealed an inverse association of baseline TT, free T, and DHEAS concentrations with change in fibrinogen levels (per SD decrement in TT, 0.25 [95% confidence interval, 0.04-0.45]; in free T, 0.30 [0.09-0.51]; and in DHEAS, 0.23 [0.11-0.36]). Furthermore, baseline DHEAS concentrations were inversely associated with change in WBC levels (per SD decrement, 0.53 [0.24-0.82]). Baseline TT, SHBG, free T, and DHEAS concentrations were also inversely associated with change in GGT after multivariable adjustment. The present study is the first to demonstrate prospective inverse associations between sex hormone concentrations and markers of inflammation and oxidative stress in men. Additional studies are warranted to elucidate potential mechanisms underlying the revealed associations.
Pregnancy outcome of patients following bariatric surgery as compared with obese women: a population-based study.

PubMed

Shai, Daniel; Shoham-Vardi, Ilana; Amsalem, Doron; Silverberg, Daniel; Levi, Isaac; Sheiner, Eyal

2014-02-01

To evaluate pregnancy outcome and rates of anemia in patients following bariatric operation in comparison with obese pregnant women. A retrospective population-based study comparing pregnancy outcome of patients following bariatric surgery with that of the obese population was conducted. Multivariate logistic regression models were constructed to control for confounders. To evaluate the change in hemoglobin levels, we included women who had one pregnancy before the bariatric surgery and one following the surgery or two pregnancies for women with obesity. This study included 326 women who had one pregnancy before and one after bariatric surgery and 1612 obese women who had at least two consecutive deliveries. Using a multivariable logistic regression model, controlling for confounders such as maternal age, patients following bariatric surgery had lower rates of gestational diabetes mellitus (OR 0.7; 95% CI 0.5-0.9; p = 0.49) and macrosomia (OR 0.3; 95% CI 0.2-0.5; p < 0.001) as compared with obese parturients. Women post bariatric surgery were more likely to be anemic (hemoglobin <10 g/dL) as compared to obese parturients (48% versus 37%; OR, 1.5; 95% CI, 1.2-1.9; p < 0.001). A significant decline in hemoglobin level was noted in patients following bariatric surgery (a decline of 0.33 g/dL versus 0.18 g/dL between two consecutive pregnancies of obese women). Using another multivariable model with anemia as the outcome variable, bariatric surgery was noted as a risk factor for anemia (adjusted OR = 1.45, 95%CI 1.13-1.86, p = 0.004). Women following bariatric surgery have lower risk for gestational diabetes mellitus and fetal macrosomia as compared with obese parturients. Nevertheless, bariatric surgery is a risk factor for anemia.
Cole-Cole, linear and multivariate modeling of capacitance data for on-line monitoring of biomass.

PubMed

Dabros, Michal; Dennewald, Danielle; Currie, David J; Lee, Mark H; Todd, Robert W; Marison, Ian W; von Stockar, Urs

2009-02-01

This work evaluates three techniques of calibrating capacitance (dielectric) spectrometers used for on-line monitoring of biomass: modeling of cell properties using the theoretical Cole-Cole equation, linear regression of dual-frequency capacitance measurements on biomass concentration, and multivariate (PLS) modeling of scanning dielectric spectra. The performance and robustness of each technique is assessed during a sequence of validation batches in two experimental settings of differing signal noise. In more noisy conditions, the Cole-Cole model had significantly higher biomass concentration prediction errors than the linear and multivariate models. The PLS model was the most robust in handling signal noise. In less noisy conditions, the three models performed similarly. Estimates of the mean cell size were done additionally using the Cole-Cole and PLS models, the latter technique giving more satisfactory results.
Quantitative skeletal maturation estimation using cone-beam computed tomography-generated cervical vertebral images: a pilot study in 5- to 18-year-old Japanese children.

PubMed

Byun, Bo-Ram; Kim, Yong-Il; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Hwang, Dea-Seok; Park, Soo-Byung; Son, Woo-Sung

2015-11-01

The purpose of this study was to establish multivariable regression models for the estimation of skeletal maturation status in Japanese boys and girls using the cone-beam computed tomography (CBCT)-based cervical vertebral maturation (CVM) assessment method and hand-wrist radiography. The analyzed sample consisted of hand-wrist radiographs and CBCT images from 47 boys and 57 girls. To quantitatively evaluate the correlation between the skeletal maturation status and measurement ratios, a CBCT-based CVM assessment method was applied to the second, third, and fourth cervical vertebrae. Pearson's correlation coefficient analysis and multivariable regression analysis were used to determine the ratios for each of the cervical vertebrae (p < 0.05). Four characteristic parameters ((OH2 + PH2)/W2, (OH2 + AH2)/W2, D2, AH3/W3), as independent variables, were used to build the multivariable regression models: for the Japanese boys, the skeletal maturation status according to the CBCT-based quantitative cervical vertebral maturation (QCVM) assessment was 5.90 + 99.11 × AH3/W3 - 14.88 × (OH2 + AH2)/W2 + 13.24 × D2; for the Japanese girls, it was 41.39 + 59.52 × AH3/W3 - 15.88 × (OH2 + PH2)/W2 + 10.93 × D2. The CBCT-generated CVM images proved very useful to the definition of the cervical vertebral body and the odontoid process. The newly developed CBCT-based QCVM assessment method showed a high correlation between the derived ratios from the second cervical vertebral body and odontoid process. There are high correlations between the skeletal maturation status and the ratios of the second cervical vertebra based on the remnant of dentocentral synchondrosis.
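The two regression equations quoted above translate directly into code. In this sketch the coefficients are copied from the abstract, while the function and parameter names are assumptions; how the ratios and D2 are measured on the CBCT images, and the scale of the returned score, are not specified in the abstract and are left to the caller.

```python
def qcvm_boys(ah3_w3, oh2_ah2_w2, d2):
    """Skeletal maturation status for boys.

    ah3_w3      : AH3/W3 ratio
    oh2_ah2_w2  : (OH2 + AH2)/W2 ratio
    d2          : D2 parameter
    """
    return 5.90 + 99.11 * ah3_w3 - 14.88 * oh2_ah2_w2 + 13.24 * d2


def qcvm_girls(ah3_w3, oh2_ph2_w2, d2):
    """Skeletal maturation status for girls.

    ah3_w3      : AH3/W3 ratio
    oh2_ph2_w2  : (OH2 + PH2)/W2 ratio
    d2          : D2 parameter
    """
    return 41.39 + 59.52 * ah3_w3 - 15.88 * oh2_ph2_w2 + 10.93 * d2
```

Note that the two models share AH3/W3 and D2 but differ in the second-vertebra ratio: the boys' equation uses (OH2 + AH2)/W2 and the girls' equation uses (OH2 + PH2)/W2.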
Determination of thiamine HCl and pyridoxine HCl in pharmaceutical preparations using UV-visible spectrophotometry and genetic algorithm based multivariate calibration methods.

PubMed

Ozdemir, Durmus; Dinc, Erdal

2004-07-01

Simultaneous determination of binary mixtures pyridoxine hydrochloride and thiamine hydrochloride in a vitamin combination using UV-visible spectrophotometry and classical least squares (CLS) and three newly developed genetic algorithm (GA) based multivariate calibration methods was demonstrated. The three genetic multivariate calibration methods are Genetic Classical Least Squares (GCLS), Genetic Inverse Least Squares (GILS) and Genetic Regression (GR). The sample data set contains the UV-visible spectra of 30 synthetic mixtures (8 to 40 microg/ml) of these vitamins and 10 tablets containing 250 mg from each vitamin. The spectra cover the range from 200 to 330 nm in 0.1 nm intervals. Several calibration models were built with the four methods for the two components. Overall, the standard error of calibration (SEC) and the standard error of prediction (SEP) for the synthetic data were in the range of <0.01 and 0.43 microg/ml for all the four methods. The SEP values for the tablets were in the range of 2.91 and 11.51 mg/tablets. A comparison of genetic algorithm selected wavelengths for each component using GR method was also included.
A Network-Based Algorithm for Clustering Multivariate Repeated Measures Data

NASA Technical Reports Server (NTRS)

Koslovsky, Matthew; Arellano, John; Schaefer, Caroline; Feiveson, Alan; Young, Millennia; Lee, Stuart

2017-01-01

The National Aeronautics and Space Administration (NASA) Astronaut Corps is a unique occupational cohort for which vast amounts of measures data have been collected repeatedly in research or operational studies pre-, in-, and post-flight, as well as during multiple clinical care visits. In exploratory analyses aimed at generating hypotheses regarding physiological changes associated with spaceflight exposure, such as impaired vision, it is of interest to identify anomalies and trends across these expansive datasets. Multivariate clustering algorithms for repeated measures data may help parse the data to identify homogeneous groups of astronauts that have higher risks for a particular physiological change. However, available clustering methods may not be able to accommodate the complex data structures found in NASA data, since the methods often rely on strict model assumptions, require equally-spaced and balanced assessment times, cannot accommodate missing data or differing time scales across variables, and cannot process continuous and discrete data simultaneously. To fill this gap, we propose a network-based, multivariate clustering algorithm for repeated measures data that can be tailored to fit various research settings. Using simulated data, we demonstrate how our method can be used to identify patterns in complex data structures found in practice.
Predicting exposure-response associations of ambient particulate matter with mortality in 73 Chinese cities.

PubMed

Madaniyazi, Lina; Guo, Yuming; Chen, Renjie; Kan, Haidong; Tong, Shilu

2016-01-01

Estimating the burden of mortality associated with particulates requires knowledge of exposure-response associations. However, the evidence on exposure-response associations is limited in many cities, especially in developing countries. In this study, we predicted associations of particulates smaller than 10 μm in aerodynamic diameter (PM10) with mortality in 73 Chinese cities. The meta-regression model was used to test and quantify which city-specific characteristics contributed significantly to the heterogeneity of PM10-mortality associations for 16 Chinese cities. Then, those city-specific characteristics with statistically significant regression coefficients were treated as independent variables to build multivariate meta-regression models. The model with the best fitness was used to predict PM10-mortality associations in 73 Chinese cities in 2010. Mean temperature, PM10 concentration and green space per capita could best explain the heterogeneity in PM10-mortality associations. Based on city-specific characteristics, we were able to develop multivariate meta-regression models to predict associations between air pollutants and health outcomes reasonably well. Copyright © 2015 Elsevier Ltd. All rights reserved.
Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs

Treesearch

Daniel A. Yaussy; Robert L. Brisbin

1983-01-01

A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...
A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

DTIC Science & Technology

2017-09-01

efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to...Bayesian hierarchical modeling, Markov chain Monte Carlo methods, Metropolis algorithm, machine learning, atmospheric prediction...scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components
Predictive and mechanistic multivariate linear regression models for reaction development

PubMed Central

Santiago, Celine B.; Guo, Jing-Yao

2018-01-01

Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
A Copula-Based Conditional Probabilistic Forecast Model for Wind Power Ramps

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hodge, Brian S; Krishnan, Venkat K; Zhang, Jie

Efficient management of wind ramping characteristics can significantly reduce wind integration costs for balancing authorities. By considering the stochastic dependence of wind power ramp (WPR) features, this paper develops a conditional probabilistic wind power ramp forecast (cp-WPRF) model based on Copula theory. The WPRs dataset is constructed by extracting ramps from a large dataset of historical wind power. Each WPR feature (e.g., rate, magnitude, duration, and start-time) is separately forecasted by considering the coupling effects among different ramp features. To accurately model the marginal distributions with a copula, a Gaussian mixture model (GMM) is adopted to characterize the WPR uncertainty and features. The Canonical Maximum Likelihood (CML) method is used to estimate parameters of the multivariable copula. The optimal copula model is chosen based on the Bayesian information criterion (BIC) from each copula family. Finally, the best conditions based cp-WPRF model is determined by predictive interval (PI) based evaluation metrics. Numerical simulations on publicly available wind power data show that the developed copula-based cp-WPRF model can predict WPRs with a high level of reliability and sharpness.
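The first stage of a canonical maximum likelihood fit replaces each feature by rank-based pseudo-observations, so that the copula parameters can be estimated without committing to the marginal distributions. The sketch below illustrates this for two features and a Gaussian copula only. It is not the paper's procedure: the function names are assumptions, ties are assumed absent, and the normal-scores correlation is used as a common closed-form approximation to the maximum likelihood estimate rather than a numerical optimiser.

```python
from statistics import NormalDist


def pseudo_observations(values):
    """Rank-transform a sample to the open interval (0, 1): u_i = rank_i / (n + 1).
    Assumes no tied values."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def gaussian_copula_rho(x, y):
    """Approximate Gaussian copula correlation between two ramp features.

    The pseudo-observations are mapped to standard normal scores and their
    correlation is returned. The scores are symmetric about zero by
    construction, so no explicit centring is applied.
    """
    inv = NormalDist().inv_cdf
    zx = [inv(u) for u in pseudo_observations(x)]
    zy = [inv(u) for u in pseudo_observations(y)]
    sxy = sum(a * b for a, b in zip(zx, zy))
    sxx = sum(a * a for a in zx)
    syy = sum(b * b for b in zy)
    return sxy / (sxx * syy) ** 0.5
```

Because only ranks enter the estimate, any strictly increasing transformation of a feature (for example a change of units) leaves the fitted dependence unchanged.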
Time-series panel analysis (TSPA): multivariate modeling of temporal associations in psychotherapy process.

PubMed

Ramseyer, Fabian; Kupper, Zeno; Caspar, Franz; Znoj, Hansjörg; Tschacher, Wolfgang

2014-10-01

Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective by its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA-models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a 2nd step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. TSPA allowed a prototypical process pattern to be identified, where patient's alliance and self-efficacy were linked by a temporal feedback-loop. Furthermore, therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy. PsycINFO Database Record (c) 2014 APA, all rights reserved.
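The two-level idea (fit a vector auto-regression per patient, then aggregate across patients) can be sketched for two session-level variables and lag 1. Everything below is an illustrative assumption rather than the TSPA procedure itself: the function names, the absence of an intercept (each series is assumed to be centred beforehand), and the use of a plain element-wise mean as the group-level aggregate.

```python
def fit_var1_2d(series):
    """OLS fit of x_t = A x_{t-1} + e_t for a two-variable series.
    series: list of [v0, v1] per session. Returns A as two rows."""
    prev, curr = series[:-1], series[1:]
    s00 = sum(p[0] * p[0] for p in prev)
    s01 = sum(p[0] * p[1] for p in prev)
    s11 = sum(p[1] * p[1] for p in prev)
    det = s00 * s11 - s01 * s01
    if abs(det) < 1e-12:
        raise ValueError("lagged values are collinear")
    a = []
    for k in (0, 1):
        c0 = sum(p[0] * c[k] for p, c in zip(prev, curr))
        c1 = sum(p[1] * c[k] for p, c in zip(prev, curr))
        # Closed-form solution of the 2x2 normal equations for row k of A.
        a.append([(s11 * c0 - s01 * c1) / det, (s00 * c1 - s01 * c0) / det])
    return a


def aggregate(models):
    """Element-wise mean of per-patient coefficient matrices (assumed aggregate)."""
    n = len(models)
    return [[sum(m[i][j] for m in models) / n for j in range(2)] for i in range(2)]


def simulate_var1_2d(a, x0, steps):
    """Noise-free helper for demonstrations: iterate x_t = A x_{t-1}."""
    out = [list(x0)]
    for _ in range(steps):
        p = out[-1]
        out.append([a[0][0] * p[0] + a[0][1] * p[1],
                    a[1][0] * p[0] + a[1][1] * p[1]])
    return out
```

In the returned matrix, the off-diagonal entries are the cross-lagged effects: a non-zero entry in row 0, column 1 together with a non-zero entry in row 1, column 0 would correspond to the kind of feedback loop between two variables that the abstract describes.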
Modelling nitrate pollution pressure using a multivariate statistical approach: the case of Kinshasa groundwater body, Democratic Republic of Congo

NASA Astrophysics Data System (ADS)

Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik

2016-03-01

A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.
Prediction of the birch pollen season characteristics in Cracow, Poland using an 18-year data series.

PubMed

Dorota, Myszkowska

2013-03-01

The aim of the study was to construct a model forecasting the birch pollen season characteristics in Cracow on the basis of an 18-year data series. The study was performed using the volumetric method (Lanzoni/Burkard trap). The 98/95 % method was used to calculate the pollen season. Spearman's correlation test was applied to find the relationship between the meteorological parameters and pollen season characteristics. To construct the predictive model, backward stepwise multiple regression analysis was used, including the multi-collinearity of variables. The predictive models best fitted the pollen season start and end, especially models containing two independent variables. The peak concentration value was predicted with a higher prediction error. Also, the accuracy of the models predicting the pollen season characteristics in 2009 was higher in comparison with 2010. Both the multi-variable model and the one-variable model for the beginning of the pollen season included air temperature during the last 10 days of February, while the multi-variable model also included humidity at the beginning of April. The models forecasting the end of the pollen season were based on temperature in March-April, while the peak day was predicted using the temperature during the last 10 days of March.
Power of Models in Longitudinal Study: Findings from a Full-Crossed Simulation Design

ERIC Educational Resources Information Center

Fang, Hua; Brooks, Gordon P.; Rizzo, Maria L.; Espy, Kimberly Andrews; Barcikowski, Robert S.

2009-01-01

Because the power properties of traditional repeated measures and hierarchical multivariate linear models have not been clearly determined in the balanced design for longitudinal studies in the literature, the authors present a power comparison study of traditional repeated measures and hierarchical multivariate linear models under 3…

Species distribution modelling for plant communities: Stacked single species or multivariate modelling approaches?

Treesearch

Emilie B. Henderson; Janet L. Ohmann; Matthew J. Gregory; Heather M. Roberts; Harold S.J. Zald

2014-01-01

Landscape management and conservation planning require maps of vegetation composition and structure over large regions. Species distribution models (SDMs) are often used for individual species, but projects mapping multiple species are rarer. We compare maps of plant community composition assembled by stacking results from many SDMs with multivariate maps constructed...
IRT-ZIP Modeling for Multivariate Zero-Inflated Count Data

ERIC Educational Resources Information Center

Wang, Lijuan

2010-01-01

This study introduces an item response theory-zero-inflated Poisson (IRT-ZIP) model to investigate psychometric properties of multiple items and predict individuals' latent trait scores for multivariate zero-inflated count data. In the model, two link functions are used to capture two processes of the zero-inflated count data. Item parameters are…
Predictors of workplace sexual health policy at sex work establishments in the Philippines.

PubMed

Withers, M; Dornig, K; Morisky, D E

2007-09-01

Based on the literature, we identified manager and establishment characteristics that we hypothesized are related to workplace policies that support HIV protective behavior. We developed a sexual health policy index consisting of 11 items as our outcome variable. We utilized both bivariate and multivariate analysis of variance. The significant variables in our bivariate analyses (establishment type, number of employees, manager age, and membership in manager association) were entered into a multivariate regression model. The model was significant (p<.01), and predicted 42% of the variability in the development and management of a workplace sexual health policy supportive of condom use. The significant predictors were number of employees and establishment type. In addition to individually-focused CSW interventions, HIV prevention programs should target managers and establishment policies. Future HIV prevention programs may need to focus on helping smaller establishments, in particular those with fewer employees, to build capacity and develop sexual health policy guidelines.
The Impact of Work Ability on Work Motivation and Health: A Longitudinal Study Based on Older Employees.

PubMed

Feißel, Annemarie; Swart, Enno; March, Stefanie

2018-05-01

Work participation is determined by work motivation and work ability with health as a significant component. Within the lidA-study, we explore the impact of work ability on work motivation and health with consideration of further influencing factors. Four thousand one hundred nine older employees were interviewed two times (t0 = 2011, t1 = 2014). Two multivariate analyses were performed regarding the influence of work ability on work motivation (Model 1) and health (Model 2). Within the multivariate analysis, of all the influencing factors, work ability has the strongest effect on work motivation (F = 37.761) and health (F = 76.402). It appears as a decisive determinant for both dimensions. Regarding the results, it is useful to focus on the work ability of older employees in order to maintain and boost their work motivation and health.
Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models

PubMed Central

Chiu, Chi-yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-ling; Xiong, Momiao; Fan, Ruzong

2017-01-01

To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen at least for two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data. PMID:28000696
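For readers unfamiliar with the three statistics named in this abstract, their standard textbook definitions are sketched below. The symbols H (hypothesis sums-of-squares-and-cross-products matrix) and E (error matrix) are notation assumed here for illustration; the abstract does not give the authors' own notation or the details of their F approximations.

```latex
% Standard MANOVA test statistics (notation assumed, not taken from the abstract).
% H: hypothesis SSCP matrix, E: error SSCP matrix.
\begin{align*}
  \text{Wilks's Lambda:}          \quad & \Lambda = \frac{\det(E)}{\det(H + E)} \\
  \text{Pillai--Bartlett trace:}  \quad & V = \operatorname{tr}\!\left[ H (H + E)^{-1} \right] \\
  \text{Hotelling--Lawley trace:} \quad & U = \operatorname{tr}\!\left[ H E^{-1} \right]
\end{align*}
```

Each statistic is then transformed to an approximate F-distributed quantity to obtain a p-value.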
Risk prediction for myocardial infarction via generalized functional regression models.

PubMed

Ieva, Francesca; Paganoni, Anna M

2016-08-01

In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Thus, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.
A review of the application of propensity score methods yielded increasing use, advantages in specific settings, but not substantially different estimates compared with conventional multivariable methods

PubMed Central

Stürmer, Til; Joshi, Manisha; Glynn, Robert J.; Avorn, Jerry; Rothman, Kenneth J.; Schneeweiss, Sebastian

2006-01-01

Objective: Propensity score analyses attempt to control for confounding in non-experimental studies by adjusting for the likelihood that a given patient is exposed. Such analyses have been proposed to address confounding by indication, but there is little empirical evidence that they achieve better control than conventional multivariate outcome modeling. Study design and methods: Using PubMed and Science Citation Index, we assessed the use of propensity scores over time and critically evaluated studies published through 2003. Results: Use of propensity scores increased from a total of 8 papers before 1998 to 71 in 2003. Most of the 177 published studies abstracted assessed medications (N=60) or surgical interventions (N=51), mainly in cardiology and cardiac surgery (N=90). Whether PS methods or conventional outcome models were used to control for confounding had little effect on results in those studies in which such comparison was possible. Only 9 out of 69 studies (13%) had an effect estimate that differed by more than 20% from that obtained with a conventional outcome model in all PS analyses presented. Conclusions: Publication of results based on propensity score methods has increased dramatically, but there is little evidence that these methods yield substantially different estimates compared with conventional multivariable methods. PMID:16632131
Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models.

PubMed

Chiu, Chi-Yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-Ling; Xiong, Momiao; Fan, Ruzong

2017-02-01

To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen at least for two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Neeway, James J.; Rieke, Peter C.; Parruzot, Benjamin P.

In far-from-equilibrium conditions, the dissolution of borosilicate glasses used to immobilize nuclear waste is known to be a function of both temperature and pH. The aim of this paper is to study effects of these variables on three model waste glasses (SON68, ISG, AFCI). To do this, experiments were conducted at temperatures of 23, 40, 70, and 90 °C and pH(RT) values of 9, 10, 11, and 12 with the single-pass flow-through (SPFT) test method. The results from these tests were then used to parameterize a kinetic rate model based on transition state theory. Both the absolute dissolution rates and the rate model parameters are compared with previous results. Discrepancies in the absolute dissolution rates as compared to those obtained using other test methods are discussed. Rate model parameters for the three glasses studied here are nearly equivalent within error and in relative agreement with previous studies. The results were analyzed with a linear multivariate regression (LMR) and a nonlinear multivariate regression performed with the use of the Glass Corrosion Modeling Tool (GCMT), which is capable of providing a robust uncertainty analysis. This robust analysis highlights the high degree of correlation of various parameters in the kinetic rate model. As more data are obtained on borosilicate glasses with varying compositions, the effect of glass composition on the rate parameter values could possibly be obtained. This would allow for the possibility of predicting the forward dissolution rate of glass based solely on composition.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens

We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models

DOE PAGES

Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens; ...

2016-12-15

We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
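The idea of blending range-limited sub-models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how the blending is done, so the linear weighting in the overlap region, the function name `blend_submodels`, the edge values, and the stand-in models are all hypothetical.

```python
def blend_submodels(spectrum, full_model, low_model, high_model, lo_edge, hi_edge):
    """Predict a composition by blending two range-limited sub-models.

    full_model gives a first-pass estimate that only decides which
    sub-model(s) to trust. At or below lo_edge the low-range model is used,
    at or above hi_edge the high-range model, and in between the two
    predictions are linearly weighted (an assumed blending rule).
    """
    if not lo_edge < hi_edge:
        raise ValueError("lo_edge must be smaller than hi_edge")
    first_guess = full_model(spectrum)
    if first_guess <= lo_edge:
        return low_model(spectrum)
    if first_guess >= hi_edge:
        return high_model(spectrum)
    # Weight of the high-range model grows from 0 to 1 across the overlap.
    w_high = (first_guess - lo_edge) / (hi_edge - lo_edge)
    return (1.0 - w_high) * low_model(spectrum) + w_high * high_model(spectrum)


# Hypothetical stand-ins for trained regression models; a real calibration
# would use fitted models (for example PLS) that take a full spectrum.
full = lambda x: x
low = lambda x: x - 1.0
high = lambda x: x + 1.0

low_only = blend_submodels(30.0, full, low, high, lo_edge=40.0, hi_edge=60.0)
blended = blend_submodels(50.0, full, low, high, lo_edge=40.0, hi_edge=60.0)
high_only = blend_submodels(70.0, full, low, high, lo_edge=40.0, hi_edge=60.0)
```

Because the sub-models are passed in as plain callables, the same wrapper works with any regression method, which matches the abstract's remark that the approach is not tied to PLS.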
A flexible model for multivariate interval-censored survival times with complex correlation structure.

PubMed

Falcaro, Milena; Pickles, Andrew

2007-02-10

We focus on the analysis of multivariate survival times with highly structured interdependency and subject to interval censoring. Such data are common in developmental genetics and genetic epidemiology. We propose a flexible mixed probit model that deals naturally with complex but uninformative censoring. The recorded ages of onset are treated as possibly censored ordinal outcomes with the interval censoring mechanism seen as arising from a coarsened measurement of a continuous variable observed as falling between subject-specific thresholds. This bypasses the requirement for the failure times to be observed as falling into non-overlapping intervals. The assumption of a normal age-of-onset distribution of the standard probit model is relaxed by embedding within it a multivariate Box-Cox transformation whose parameters are jointly estimated with the other parameters of the model. Complex decompositions of the underlying multivariate normal covariance matrix of the transformed ages of onset become possible. The new methodology is here applied to a multivariate study of the ages of first use of tobacco and first consumption of alcohol without parental permission in twins. The proposed model allows estimation of the genetic and environmental effects that are shared by both of these risk behaviours as well as those that are specific. 2006 John Wiley & Sons, Ltd.
Unsupervised classification of multivariate geostatistical data: Two algorithms

NASA Astrophysics Data System (ADS)

Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

2015-12-01

With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
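The first algorithm described above, agglomerative clustering in which two clusters may merge only if they are neighbours in a graph built on the coordinates, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fixed-radius neighbour graph, the centroid distance used as merge criterion, and the function name `spatial_agglomerative` are assumptions made for the example.

```python
import math


def spatial_agglomerative(coords, values, radius, n_clusters):
    """Merge clusters bottom-up, allowing only spatially adjacent merges.

    coords: list of (x, y) sample locations
    values: list of equal-length tuples of measured variables
    radius: two samples are graph neighbours if at most this far apart
    Returns one cluster label per sample.
    """
    n = len(coords)
    # Proximity graph in coordinate space (assumed fixed-radius rule).
    edges = {(i, j) for i in range(n) for j in range(i + 1, n)
             if math.dist(coords[i], coords[j]) <= radius}
    members = {i: [i] for i in range(n)}  # cluster id -> sample indices

    def centroid(cid):
        pts = [values[i] for i in members[cid]]
        return tuple(sum(col) / len(pts) for col in zip(*pts))

    while len(members) > n_clusters and edges:
        # Closest pair of adjacent clusters in attribute space;
        # the edge itself breaks ties deterministically.
        a, b = min(edges, key=lambda e: (math.dist(centroid(e[0]), centroid(e[1])), e))
        members[a].extend(members.pop(b))
        # Redirect edges of the absorbed cluster b to a, then drop self-loops.
        edges = {tuple(sorted((a if u == b else u, a if v == b else v)))
                 for u, v in edges}
        edges = {(u, v) for u, v in edges if u != v}

    labels = [0] * n
    for label, cid in enumerate(sorted(members)):
        for i in members[cid]:
            labels[i] = label
    return labels


# Toy transect: the last sample resembles the first three in value
# but is separated from them in space by two high-valued samples.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
values = [(1.0,), (1.1,), (0.9,), (5.0,), (5.2,), (1.0,)]
labels = spatial_agglomerative(coords, values, radius=1.0, n_clusters=3)
```

In the toy transect the last sample ends up in its own class even though its value matches the first group, which is the spatial coherence the proximity condition is meant to enforce.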
A general framework for multivariate multi-index drought prediction based on Multivariate Ensemble Streamflow Prediction (MESP)

NASA Astrophysics Data System (ADS)

Hao, Zengchao; Hao, Fanghua; Singh, Vijay P.

2016-08-01

Drought is among the costliest natural hazards worldwide and extreme drought events in recent years have caused huge losses to various sectors. Drought prediction is therefore critically important for providing early warning information to aid decision making to cope with drought. Due to the complicated nature of drought, it has been recognized that the univariate drought indicator may not be sufficient for drought characterization and hence multivariate drought indices have been developed for drought monitoring. Alongside the substantial effort in drought monitoring with multivariate drought indices, it is of equal importance to develop a drought prediction method with multivariate drought indices to integrate drought information from various sources. This study proposes a general framework for multivariate multi-index drought prediction that is capable of integrating complementary prediction skills from multiple drought indices. The Multivariate Ensemble Streamflow Prediction (MESP) is employed to sample from historical records for obtaining statistical prediction of multiple variables, which is then used as inputs to achieve multivariate prediction. The framework is illustrated with a linearly combined drought index (LDI), which is a commonly used multivariate drought index, based on climate division data in California and New York in the United States with different seasonality of precipitation. The predictive skill of LDI (represented with persistence) is assessed by comparison with the univariate drought index and results show that the LDI prediction skill is less affected by seasonality than the meteorological drought prediction based on SPI. Prediction results from the case study show that the proposed multivariate drought prediction outperforms the persistence prediction, implying a satisfactory performance of multivariate drought prediction. The proposed method would be useful for drought prediction to integrate drought information from various sources for early drought warning.
A nomogram based on mammary ductoscopic indicators for evaluating the risk of breast cancer in intraductal neoplasms with nipple discharge.

PubMed

Lian, Zhen-Qiang; Wang, Qi; Zhang, An-Qin; Zhang, Jiang-Yu; Han, Xiao-Rong; Yu, Hai-Yun; Xie, Si-Mei

2015-04-01

Mammary ductoscopy (MD) is commonly used to detect intraductal lesions associated with nipple discharge. This study investigated the relationships between ductoscopic image-based indicators and breast cancer risk, and developed a nomogram for evaluating breast cancer risk in intraductal neoplasms with nipple discharge. A total of 879 consecutive inpatients (916 breasts) with nipple discharge who underwent selective duct excision for intraductal neoplasms detected by MD from June 2008 to April 2014 were analyzed retrospectively. A nomogram was developed using a multivariate logistic regression model based on data from a training set (687 cases) and validated in an independent validation set (229 cases). A Youden-derived cut-off value was assigned to the nomogram for the diagnosis of breast cancer. Color of discharge, location, appearance, and surface of neoplasm, and morphology of ductal wall were independent predictors for breast cancer in multivariate logistic regression analysis. A nomogram based on these predictors performed well. The P value of the Hosmer-Lemeshow test for the prediction model was 0.36. Area under the curve values of 0.812 (95 % confidence interval (CI) 0.763-0.860) and 0.738 (95 % CI 0.635-0.841) were obtained in the training and validation sets, respectively. The accuracies of the nomogram for breast cancer diagnosis were 71.2 % in the training set and 75.5 % in the validation set. We developed a nomogram for evaluating breast cancer risk in intraductal neoplasms with nipple discharge based on MD image findings. This model may aid individual risk assessment and guide treatment in clinical practice.
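A Youden-derived cut-off, as mentioned in this abstract, is the threshold that maximizes Youden's J = sensitivity + specificity - 1. The sketch below shows that calculation in plain Python. The function name `youden_cutoff` and the risk scores and outcomes are invented for illustration; they are not data from the study.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) maximizing Youden's J = sensitivity + specificity - 1.

    A case is called positive when its score is >= threshold.
    labels use 1 for diseased and 0 for non-diseased cases.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical predicted risks and true outcomes (1 = cancer, 0 = benign).
risks = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80, 0.90]
outcomes = [0, 0, 0, 1, 0, 1, 1]
cutoff, j_stat = youden_cutoff(risks, outcomes)
```

In practice the scores would be the predicted probabilities from the fitted logistic regression model, and the cut-off would be chosen on the training set and then applied unchanged to the validation set.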
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M

2015-01-20

Prediction models are developed to aid health-care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health-care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-02-01

Prediction models are developed to aid healthcare providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision-making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) initiative developed a set of recommendations for the reporting of studies developing, validating or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, healthcare professionals and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Stichting European Society for Clinical Investigation Journal Foundation.
Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-01-06

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD)

PubMed Central

Reitsma, Johannes B.; Altman, Douglas G.; Moons, Karel G.M.

2015-01-01

Background— Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. Methods— The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. Results— The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. Conclusions— To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). PMID:25561516
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement

PubMed Central

Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M

2015-01-01

Prediction models are developed to aid health-care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health-care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). PMID:25562432

Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M

2015-02-01

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Royal College of Obstetricians and Gynaecologists.
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. The TRIPOD Group.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-01-13

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 The Authors.
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-01-06

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Transparent reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-02-01

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). Copyright © 2015 Elsevier Inc. All rights reserved.
Data-Driven Nonlinear Subspace Modeling for Prediction and Control of Molten Iron Quality Indices in Blast Furnace Ironmaking

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Ping; Song, Heda; Wang, Hong

Blast furnace (BF) ironmaking is a nonlinear dynamic process with complicated physical-chemical reactions, where multi-phase and multi-field coupling and large time delay occur during its operation. In BF operation, the molten iron temperature (MIT) as well as Si, P and S contents of molten iron are the most essential molten iron quality (MIQ) indices, whose measurement, modeling and control have always been important issues in the metallurgical engineering and automation fields. This paper develops a novel data-driven nonlinear state space modeling approach for the prediction and control of multivariate MIQ indices by integrating hybrid modeling and control techniques. First, to improve modeling efficiency, a data-driven hybrid method combining canonical correlation analysis and correlation analysis is proposed to identify the most influential controllable variables as the modeling inputs from the multitudinous factors that would affect the MIQ indices. Then, a Hammerstein model for the prediction of MIQ indices is established using the LS-SVM based nonlinear subspace identification method. Such a model is further simplified by using the piecewise cubic Hermite interpolating polynomial method to fit the complex nonlinear kernel function. Compared to the original Hammerstein model, this simplified model not only significantly reduces the computational complexity, but also has almost the same reliability and accuracy for a stable prediction of MIQ indices. Last, in order to verify the practicability of the developed model, it is applied in designing a genetic algorithm based nonlinear predictive controller for multivariate MIQ indices by directly taking the established model as a predictor. Industrial experiments show the advantages and effectiveness of the proposed approach.
Order Selection for General Expression of Nonlinear Autoregressive Model Based on Multivariate Stepwise Regression

NASA Astrophysics Data System (ADS)

Shi, Jinfei; Zhu, Songqing; Chen, Ruwen

2017-12-01

An order selection method based on multiple stepwise regression is proposed for the General Expression of Nonlinear Autoregressive (GNAR) model; it converts the model order problem into variable selection for a multiple linear regression equation. The partial autocorrelation function is adopted to define the linear terms of the GNAR model. The result is set as the initial model, and the nonlinear terms are then introduced gradually. Statistics are chosen to assess how both the newly introduced and the existing variables improve the model characteristics, and these are used to decide which model variables to retain or eliminate. The optimal model is then obtained through a measurement of data fitting or a significance test. Simulation and classic time-series data experiments show that the proposed method is simple, reliable and applicable to practical engineering.
Multivariate-$t$ nonlinear mixed models with application to censored multi-outcome AIDS studies.

PubMed

Lin, Tsung-I; Wang, Wan-Lun

2017-10-01

In multivariate longitudinal HIV/AIDS studies, multi-outcome repeated measures on each patient over time may contain outliers, and the viral loads are often subject to an upper or lower limit of detection depending on the quantification assays. In this article, we consider an extension of the multivariate nonlinear mixed-effects model by adopting a joint multivariate-$t$ distribution for random effects and within-subject errors and taking the censoring information of multiple responses into account. The proposed model is called the multivariate-$t$ nonlinear mixed-effects model with censored responses (MtNLMMC), allowing for analyzing multi-outcome longitudinal data exhibiting nonlinear growth patterns with censorship and fat-tailed behavior. Utilizing the Taylor-series linearization method, a pseudo-data version of the expectation conditional maximization either (ECME) algorithm is developed for iteratively carrying out maximum likelihood estimation. We illustrate our techniques with two data examples from HIV/AIDS studies. Experimental results indicate that the MtNLMMC performs favorably compared to its Gaussian analogue and some existing approaches. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Multivariate analysis of longitudinal rates of change.

PubMed

Bryan, Matthew; Heagerty, Patrick J

2016-12-10

Longitudinal data allow direct comparison of the change in patient outcomes associated with treatment or exposure. Frequently, several longitudinal measures are collected that either reflect a common underlying health status, or characterize processes that are influenced in a similar way by covariates such as exposure or demographic characteristics. Statistical methods that can combine multivariate response variables into common measures of covariate effects have been proposed in the literature. Current methods for characterizing the relationship between covariates and the rate of change in multivariate outcomes are limited to select models. For example, 'accelerated time' methods have been developed which assume that covariates rescale time in longitudinal models for disease progression. In this manuscript, we detail an alternative multivariate model formulation that directly structures longitudinal rates of change and that permits a common covariate effect across multiple outcomes. We detail maximum likelihood estimation for a multivariate longitudinal mixed model. We show via asymptotic calculations the potential gain in power that may be achieved with a common analysis of multiple outcomes. We apply the proposed methods to the analysis of a trivariate outcome for infant growth and compare rates of change for HIV infected and uninfected infants. Copyright © 2016 John Wiley & Sons, Ltd.
Voxelwise multivariate analysis of multimodality magnetic resonance imaging

PubMed Central

Naylor, Melissa G.; Cardenas, Valerie A.; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin

2015-01-01

Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remains a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. PMID:23408378
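The per-outcome Bonferroni step in approach (1) of the record above can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name and the p-values are hypothetical, and the joint adjustment over all voxels that the abstract also mentions is not shown.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return True for each p-value that stays significant after Bonferroni correction."""
    m = len(p_values)
    if m == 0:
        return []
    # Each individual test is compared against alpha divided by the number of tests.
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Hypothetical p-values for one voxel, one per MRI modality
# (structural, perfusion-weighted, diffusion-weighted).
decisions = bonferroni_reject([0.030, 0.012, 0.200])
print(decisions)
```

With three outcomes the per-test threshold is 0.05 / 3, so only the second hypothetical p-value survives the correction.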
Novel risk score of contrast-induced nephropathy after percutaneous coronary intervention.

PubMed

Ji, Ling; Su, XiaoFeng; Qin, Wei; Mi, XuHua; Liu, Fei; Tang, XiaoHong; Li, Zi; Yang, LiChuan

2015-08-01

Contrast-induced nephropathy (CIN) post-percutaneous coronary intervention (PCI) is a major cause of acute kidney injury. In this study, we established a comprehensive risk score model to assess the risk of CIN after the PCI procedure, which could be easily used in a clinical environment. A total of 805 PCI patients, divided into an analysis cohort (70%) and a validation cohort (30%), were enrolled retrospectively in this study. Risk factors for CIN were identified using univariate analysis and multivariate logistic regression in the analysis cohort. The risk score model was developed based on multiple regression coefficients. Sensitivity and specificity of the new risk score system were validated in the validation cohort. The new risk score model was compared with previously reported models. The incidence of post-PCI CIN in the analysis cohort (n = 565) was 12%. Considerably high CIN incidence (50%) was observed in patients with chronic kidney disease (CKD). Age >75, body mass index (BMI) >25, myoglobin level, cardiac function level, hypoalbuminaemia, history of CKD, intra-aortic balloon pump (IABP) and peripheral vascular disease (PVD) were identified as independent risk factors of post-PCI CIN. A novel risk score model was established using multivariate regression coefficients, which showed the highest sensitivity and specificity (0.917, 95%CI 0.877-0.957) compared with previous models. A new post-PCI CIN risk score model was developed based on a retrospective study of 805 patients. Application of this model might be helpful to predict CIN in patients undergoing the PCI procedure. © 2015 Asian Pacific Society of Nephrology.
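One common way to turn logistic regression coefficients into an integer risk score is to scale each coefficient by the smallest one and round. The sketch below assumes that convention; the abstract does not say how the points were actually derived, and the coefficient values and factor names here are hypothetical, not taken from the study.

```python
def points_from_coefficients(coefs):
    """Scale each coefficient by the smallest absolute coefficient and round to integer points."""
    base = min(abs(b) for b in coefs.values())
    if base == 0:
        raise ValueError("coefficients must be non-zero")
    return {name: round(b / base) for name, b in coefs.items()}


def risk_score(points, patient):
    """Sum the points of every risk factor that is present in the patient record."""
    return sum(pts for name, pts in points.items() if patient.get(name, False))


# Hypothetical regression coefficients (NOT the published values).
coefs = {"age_over_75": 0.6, "ckd_history": 1.8, "iabp": 1.2}
points = points_from_coefficients(coefs)

# Hypothetical patient with two of the three risk factors.
patient = {"age_over_75": True, "ckd_history": True, "iabp": False}
score = risk_score(points, patient)
print(points, score)
```

A score built this way trades some accuracy for bedside usability, which matches the abstract's aim of a model that is easy to use in a clinical environment.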
Decision-support models for empiric antibiotic selection in Gram-negative bloodstream infections.

PubMed

MacFadden, D R; Coburn, B; Shah, N; Robicsek, A; Savage, R; Elligsen, M; Daneman, N

2018-04-25

Early empiric antibiotic therapy in patients can improve clinical outcomes in Gram-negative bacteraemia. However, the widespread prevalence of antibiotic-resistant pathogens compromises our ability to provide adequate therapy while minimizing use of broad antibiotics. We sought to determine whether readily available electronic medical record data could be used to develop predictive models for decision support in Gram-negative bacteraemia. We performed a multi-centre cohort study, in Canada and the USA, of hospitalized patients with Gram-negative bloodstream infection from April 2010 to March 2015. We analysed multivariable models for prediction of antibiotic susceptibility at two empiric windows: Gram-stain-guided and pathogen-guided treatment. Decision-support models for empiric antibiotic selection were developed based on three clinical decision thresholds of acceptable adequate coverage (80%, 90% and 95%). A total of 1832 patients with Gram-negative bacteraemia were evaluated. Multivariable models showed good discrimination across countries and at both Gram-stain-guided (12 models, areas under the curve (AUCs) 0.68-0.89, optimism-corrected AUCs 0.63-0.85) and pathogen-guided (12 models, AUCs 0.75-0.98, optimism-corrected AUCs 0.64-0.95) windows. Compared to antibiogram-guided therapy, decision-support models of antibiotic selection incorporating individual patient characteristics and prior culture results have the potential to increase use of narrower-spectrum antibiotics (in up to 78% of patients) while reducing inadequate therapy. Multivariable models using readily available epidemiologic factors can be used to predict antimicrobial susceptibility in infecting pathogens with reasonable discriminatory ability. 
Implementation of sequential predictive models for real-time individualized empiric antibiotic decision-making has the potential to both optimize adequate coverage for patients while minimizing overuse of broad-spectrum antibiotics, and therefore requires further prospective evaluation. Readily available epidemiologic risk factors can be used to predict susceptibility of Gram-negative organisms among patients with bacteraemia, using automated decision-making models. Copyright © 2018 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
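The idea of a clinical decision threshold of acceptable coverage, as used in the record above, can be sketched as a simple selection rule: pick the narrowest agent whose predicted susceptibility meets the threshold. This is an assumed reading for illustration, not the authors' implementation; the drug labels and probabilities are hypothetical placeholders.

```python
def choose_antibiotic(predicted_susceptibility, spectrum_order, threshold):
    """Return the narrowest agent whose predicted susceptibility meets the threshold, else None."""
    for drug in spectrum_order:  # ordered from narrowest to broadest spectrum
        if predicted_susceptibility.get(drug, 0.0) >= threshold:
            return drug
    return None


# Hypothetical agents and model-predicted susceptibility probabilities for one patient.
spectrum_order = ["drug_narrow", "drug_medium", "drug_broad"]
predicted = {"drug_narrow": 0.82, "drug_medium": 0.93, "drug_broad": 0.99}

# The three coverage thresholds named in the abstract.
choices = {t: choose_antibiotic(predicted, spectrum_order, t) for t in (0.80, 0.90, 0.95)}
print(choices)
```

Raising the threshold pushes the choice toward broader agents, which is the trade-off between adequate coverage and overuse that the abstract describes.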
A generalized K statistic for estimating phylogenetic signal from shape and other high-dimensional multivariate data.

PubMed

Adams, Dean C

2014-09-01

Phylogenetic signal is the tendency for closely related species to display similar trait values due to their common ancestry. Several methods have been developed for quantifying phylogenetic signal in univariate traits and for sets of traits treated simultaneously, and the statistical properties of these approaches have been extensively studied. However, methods for assessing phylogenetic signal in high-dimensional multivariate traits like shape are less well developed, and their statistical performance is not well characterized. In this article, I describe a generalization of the K statistic of Blomberg et al. that is useful for quantifying and evaluating phylogenetic signal in highly dimensional multivariate data. The method (K(mult)) is found from the equivalency between statistical methods based on covariance matrices and those based on distance matrices. Using computer simulations based on Brownian motion, I demonstrate that the expected value of K(mult) remains at 1.0 as trait variation among species is increased or decreased, and as the number of trait dimensions is increased. By contrast, estimates of phylogenetic signal found with a squared-change parsimony procedure for multivariate data change with increasing trait variation among species and with increasing numbers of trait dimensions, confounding biological interpretations. I also evaluate the statistical performance of hypothesis testing procedures based on K(mult) and find that the method displays appropriate Type I error and high statistical power for detecting phylogenetic signal in high-dimensional data. Statistical properties of K(mult) were consistent for simulations using bifurcating and random phylogenies, for simulations using different numbers of species, for simulations that varied the number of trait dimensions, and for different underlying models of trait covariance structure. 
Overall these findings demonstrate that K(mult) provides a useful means of evaluating phylogenetic signal in high-dimensional multivariate traits. Finally, I illustrate the utility of the new approach by evaluating the strength of phylogenetic signal for head shape in a lineage of Plethodon salamanders. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
DUALITY IN MULTIVARIATE RECEPTOR MODEL. (R831078)

EPA Science Inventory

Multivariate receptor models are used for source apportionment of multiple observations of compositional data of air pollutants that obey mass conservation. Singular value decomposition of the data leads to two sets of eigenvectors. One set of eigenvectors spans a space in whi...
A model and nomogram to predict tumor site origin for squamous cell cancer confined to cervical lymph nodes.

PubMed

Ali, Arif N; Switchenko, Jeffrey M; Kim, Sungjin; Kowalski, Jeanne; El-Deiry, Mark W; Beitler, Jonathan J

2014-11-15

The current study was conducted to develop a multifactorial statistical model to predict the specific head and neck (H&N) tumor site origin in cases of squamous cell carcinoma confined to the cervical lymph nodes ("unknown primaries"). The Surveillance, Epidemiology, and End Results (SEER) database was analyzed for patients with an H&N tumor site who were diagnosed between 2004 and 2011. The SEER patients were identified according to their H&N primary tumor site and clinically positive cervical lymph node levels at the time of presentation. The SEER patient data set was randomly divided into 2 data sets for the purposes of internal split-sample validation. The effects of cervical lymph node levels, age, race, and sex on H&N primary tumor site were examined using univariate and multivariate analyses. Multivariate logistic regression models and an associated set of nomograms were developed based on relevant factors to provide probabilities of tumor site origin. Analysis of the SEER database identified 20,011 patients with H&N disease with both site-level and lymph node-level data. Sex, race, age, and lymph node levels were associated with primary H&N tumor site (nasopharynx, hypopharynx, oropharynx, and larynx) in the multivariate models. Internal validation techniques affirmed the accuracy of these models on separate data. The incorporation of epidemiologic and lymph node data into a predictive model has the potential to provide valuable guidance to clinicians in the treatment of patients with squamous cell carcinoma confined to the cervical lymph nodes. © 2014 The Authors. Cancer published by Wiley Periodicals, Inc. on behalf of American Cancer Society.
Prospects of second generation artificial intelligence tools in calibration of chemical sensors.

PubMed

Braibanti, Antonio; Rao, Rupenaguntla Sambasiva; Ramam, Veluri Anantha; Rao, Gollapalli Nageswara; Rao, Vaddadi Venkata Panakala

2005-05-01

Multivariate data-driven calibration models with neural networks (NNs) are developed for binary (Cu++ and Ca++) and quaternary (K+, Ca++, NO3- and Cl-) ion-selective electrode (ISE) data. The response profiles of ISEs with concentrations are non-linear and sub-Nernstian. This task represents function approximation of multivariate, multi-response, correlated, non-linear data with unknown noise structure, i.e. multi-component calibration/prediction in chemometric parlance. Radial basis function (RBF) and Fuzzy-ARTMAP-NN models implemented in the software packages TRAJAN and Professional II are employed for the calibration. The optimum NN models reported are based on residuals in concentration space. Being a data-driven information technology, NN does not require a model, a prior or posterior distribution of the data, or a noise structure. Missing information, spikes or newer trends in different concentration ranges can be modeled through novelty detection. Two simulated data sets generated from mathematical functions are modeled as a function of the number of data points and of network parameters such as the number of neurons and nearest neighbors. The success of RBF and Fuzzy-ARTMAP-NNs in developing adequate calibration models for experimental data and function approximation models for more complex simulated data sets supports AI2 (artificial intelligence, 2nd generation) as a promising technology in quantitation.
PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

NASA Astrophysics Data System (ADS)

Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

2014-10-01

The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. 
Finally, the conclusions of this research work are presented on the basis of these numerical calculations with the multivariate adaptive regression splines (MARS) technique.
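MARS models are sums of hinge basis functions of the form max(0, x - t) and max(0, t - x), which is what makes them easy to interpret. The sketch below evaluates a one-variable model of that form. It only illustrates the functional form, under the assumption of hypothetical knots and coefficients; it is not the fitted PM10 model from the record above and does not perform the MARS fitting procedure.

```python
def hinge(x, knot, direction=1):
    """MARS basis function: max(0, x - knot) for direction +1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model; terms are (coefficient, knot, direction) tuples."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model: a mirrored pair of hinges sharing one knot at x = 10.
terms = [(2.0, 10.0, 1), (-0.5, 10.0, -1)]
y_low = mars_predict(4.0, 20.0, terms)    # below the knot, only the second hinge is active
y_high = mars_predict(13.0, 20.0, terms)  # above the knot, only the first hinge is active
print(y_low, y_high)
```

Each coefficient is the slope on one side of its knot, so the contribution of an input variable can be read directly from the terms.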
Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design.

PubMed

Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo

2017-03-15

Outcome-dependent sampling (ODS) is a cost-effective sampling scheme in which one observes the exposure with a probability that depends on the outcome. Well-known designs of this kind are the case-control design for binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for the ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric, where all the underlying distributions of covariates are modeled nonparametrically using the empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of association of polychlorinated biphenyl exposure to hearing loss in children born to the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.
Multivariate modelling of endophenotypes associated with the metabolic syndrome in Chinese twins.

PubMed

Pang, Z; Zhang, D; Li, S; Duan, H; Hjelmborg, J; Kruse, T A; Kyvik, K O; Christensen, K; Tan, Q

2010-12-01

The common genetic and environmental effects on endophenotypes related to the metabolic syndrome have been investigated using bivariate and multivariate twin models. This paper extends the pairwise analysis approach by introducing independent and common pathway models to Chinese twin data. The aim was to explore the common genetic architecture in the development of these phenotypes in the Chinese population. Three multivariate models including the full saturated Cholesky decomposition model, the common factor independent pathway model and the common factor common pathway model were fitted to 695 pairs of Chinese twins representing six phenotypes including BMI, total cholesterol, total triacylglycerol, fasting glucose, HDL and LDL. Performances of the nested models were compared with that of the full Cholesky model. Cross-phenotype correlation coefficients gave clear indication of common genetic or environmental backgrounds in the phenotypes. Decomposition of phenotypic correlation by the Cholesky model revealed that the observed phenotypic correlation among lipid phenotypes had genetic and unique environmental backgrounds. Both pathway models suggest a common genetic architecture for lipid phenotypes, which is distinct from that of the non-lipid phenotypes. The declining performance with model restriction indicates biological heterogeneity in development among some of these phenotypes. Our multivariate analyses revealed common genetic and environmental backgrounds for the studied lipid phenotypes in Chinese twins. Model performance showed that physiologically distinct endophenotypes may follow different genetic regulations.
Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): the TRIPOD Statement.

PubMed

Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M

2015-02-01

Prediction models are developed to aid healthcare providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision-making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a web-based survey and revised during a 3-day meeting in June 2011 with methodologists, healthcare professionals and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. A complete checklist is available at http://www.tripod-statement.org. © 2015 American College of Physicians.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

NASA Astrophysics Data System (ADS)

Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

2016-03-01

Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, while nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures from the International Affective Picture System to induce affective states in thirty subjects, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for the estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
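As a rough illustration of what a higher-order multivariable polynomial regression involves, the sketch below expands the input variables into all monomials up to a chosen total degree and fits the coefficients by ordinary least squares. The function names (`poly_features`, `fit_poly`, `predict_poly`) and the use of plain least squares are assumptions made for illustration; the abstract does not state how the authors built or fitted their model.

```python
import itertools
import numpy as np

def poly_features(X, degree):
    """Expand X (n_samples, n_inputs) into all monomials up to `degree`.

    Hypothetical helper: the constant term comes first, followed by every
    product of input columns with total degree 1..degree.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    cols = [np.ones(n)]
    for d in range(1, degree + 1):
        for combo in itertools.combinations_with_replacement(range(p), d):
            cols.append(np.prod(X[:, list(combo)], axis=1))
    return np.column_stack(cols)

def fit_poly(X, y, degree):
    """Least-squares fit of the polynomial coefficients (an assumed estimator)."""
    Phi = poly_features(X, degree)
    coef, *_ = np.linalg.lstsq(Phi, np.asarray(y, dtype=float), rcond=None)
    return coef

def predict_poly(X, coef, degree):
    """Evaluate the fitted polynomial at new inputs."""
    return poly_features(X, degree) @ coef
```

With two inputs and degree 3 the expansion has 10 terms, so the number of coefficients grows quickly with the order; in practice the degree would need to be chosen against held-out data.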

Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

PubMed Central

Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

2016-01-01

Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, while nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures from the International Affective Picture System to induce affective states in thirty subjects, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for the estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states. PMID:26996254
Discrimination of inflammatory bowel disease using Raman spectroscopy and linear discriminant analysis methods

NASA Astrophysics Data System (ADS)

Ding, Hao; Cao, Ming; DuPont, Andrew W.; Scott, Larry D.; Guha, Sushovan; Singhal, Shashideep; Younes, Mamoun; Pence, Isaac; Herline, Alan; Schwartz, David; Xu, Hua; Mahadevan-Jansen, Anita; Bi, Xiaohong

2016-03-01

Inflammatory bowel disease (IBD) is an idiopathic disease that is typically characterized by chronic inflammation of the gastrointestinal tract. Recently much effort has been devoted to the development of novel diagnostic tools that can assist physicians for fast, accurate, and automated diagnosis of the disease. Previous research based on Raman spectroscopy has shown promising results in differentiating IBD patients from normal screening cases. In the current study, we examined IBD patients in vivo through a colonoscope-coupled Raman system. Optical diagnosis for IBD discrimination was conducted based on full-range spectra using multivariate statistical methods. Further, we incorporated several feature selection methods in machine learning into the classification model. The diagnostic performance for disease differentiation was significantly improved after feature selection. Our results showed that improved IBD diagnosis can be achieved using Raman spectroscopy in combination with multivariate analysis and feature selection.
Multivariate meta-analysis: potential and promise.

PubMed

Jackson, Dan; Riley, Richard; White, Ian R

2011-09-10

The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day 'Multivariate meta-analysis' event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.
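For orientation, the multivariate random effects model referred to above is usually written as a two-level normal model. The notation below (study index i, within-study covariance S_i, between-study covariance Σ) is a common convention chosen here for illustration, not notation taken from the article.

```latex
% y_i: vector of estimated effects from study i (one entry per outcome)
% S_i: within-study covariance matrix, treated as known
% Sigma: between-study covariance matrix, to be estimated
\[
\mathbf{y}_i \mid \boldsymbol{\theta}_i \sim N(\boldsymbol{\theta}_i, \mathbf{S}_i),
\qquad
\boldsymbol{\theta}_i \sim N(\boldsymbol{\mu}, \boldsymbol{\Sigma})
\quad\Longrightarrow\quad
\mathbf{y}_i \sim N(\boldsymbol{\mu}, \mathbf{S}_i + \boldsymbol{\Sigma}).
\]
% With a single outcome this reduces to the standard univariate model
\[
y_i \sim N(\mu, s_i^2 + \tau^2).
\]
```

The off-diagonal entries of S_i and Σ are what allow one outcome to borrow strength from another; they are also the source of the additional assumptions the abstract warns about.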
Multivariate meta-analysis: Potential and promise

PubMed Central

Jackson, Dan; Riley, Richard; White, Ian R

2011-01-01

The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052
BANYAN. XI. The BANYAN Σ Multivariate Bayesian Algorithm to Identify Members of Young Associations with 150 pc

NASA Astrophysics Data System (ADS)

Gagné, Jonathan; Mamajek, Eric E.; Malo, Lison; Riedel, Adric; Rodriguez, David; Lafrenière, David; Faherty, Jacqueline K.; Roy-Loubier, Olivier; Pueyo, Laurent; Robin, Annie C.; Doyon, René

2018-03-01

BANYAN Σ is a new Bayesian algorithm to identify members of young stellar associations within 150 pc of the Sun. It includes 27 young associations with ages in the range ∼1–800 Myr, modeled with multivariate Gaussians in six-dimensional (6D) XYZUVW space. It is the first such multi-association classification tool to include the nearest sub-groups of the Sco-Cen OB star-forming region, the IC 2602, IC 2391, Pleiades and Platais 8 clusters, and the ρ Ophiuchi, Corona Australis, and Taurus star formation regions. A model of field stars is built from a mixture of multivariate Gaussians based on the Besançon Galactic model. The algorithm can derive membership probabilities for objects with only sky coordinates and proper motion, but can also include parallax and radial velocity measurements, as well as spectrophotometric distance constraints from sequences in color–magnitude or spectral type–magnitude diagrams. BANYAN Σ benefits from an analytical solution to the Bayesian marginalization integrals over unknown radial velocities and distances that makes it more accurate and significantly faster than its predecessor BANYAN II. A contamination versus hit rate analysis is presented and demonstrates that BANYAN Σ achieves a better classification performance than other moving group tools available in the literature, especially in terms of cross-contamination between young associations. An updated list of bona fide members in the 27 young associations, augmented by the Gaia-DR1 release, as well as all parameters for the 6D multivariate Gaussian models for each association and the Galactic field neighborhood within 300 pc are presented. This new tool will make it possible to analyze large data sets such as the upcoming Gaia-DR2 to identify new young stars. IDL and Python versions of BANYAN Σ are made available with this publication, and a more limited online web tool is available at http://www.exoplanetes.umontreal.ca/banyan/banyansigma.php.
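The core classification idea, Bayes' rule applied to one multivariate Gaussian per group, can be sketched as follows. This is a simplified illustration that assumes every coordinate of the object is measured; it does not reproduce the marginalization over unknown radial velocity and distance described in the abstract, and the function names are hypothetical rather than part of the released IDL or Python code.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log-density of a multivariate Gaussian, evaluated via a Cholesky factor."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    L = np.linalg.cholesky(np.asarray(cov, dtype=float))
    z = np.linalg.solve(L, x - mean)           # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (mean.size * np.log(2.0 * np.pi) + log_det + z @ z)

def membership_probabilities(x, means, covs, priors):
    """Posterior probability that x belongs to each group (assumed full data).

    means, covs, priors: one entry per hypothesis (associations plus field).
    """
    log_post = np.array([
        np.log(p) + log_gauss(x, m, c)
        for m, c, p in zip(means, covs, priors)
    ])
    log_post -= log_post.max()                 # guard against underflow
    post = np.exp(log_post)
    return post / post.sum()
```

Working in log space matters here because a six-dimensional Gaussian evaluated far from its mean underflows quickly in double precision.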
Stress and Personal Resource as Predictors of the Adjustment of Parents to Autistic Children: A Multivariate Model

ERIC Educational Resources Information Center

Siman-Tov, Ayelet; Kaniel, Shlomo

2011-01-01

The research validates a multivariate model that predicts parental adjustment to coping successfully with an autistic child. The model comprises four elements: parental stress, parental resources, parental adjustment and the child's autism symptoms. 176 parents of children aged 6 to 16 diagnosed with PDD answered several questionnaires…
Multivariate mixed linear model analysis of longitudinal data: an information-rich statistical technique for analyzing disease resistance data

USDA-ARS's Scientific Manuscript database

The mixed linear model (MLM) is currently among the most advanced and flexible statistical modeling techniques and its use in tackling problems in plant pathology has begun surfacing in the literature. The longitudinal MLM is a multivariate extension that handles repeatedly measured data, such as r...
Multivariate Regression Analysis and Slaughter Livestock,

DTIC Science & Technology

AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
Predicting Post-Treatment-Initiation Alcohol Use among Patients with Severe Mental Illness and Alcohol Use Disorders

ERIC Educational Resources Information Center

Bradizza, Clara M.; Maisto, Stephen A.; Vincent, Paula C.; Stasiewicz, Paul R.; Connors, Gerard J.; Mercer, Nicole D.

2009-01-01

Few investigators studying alcohol abuse among individuals with a severe mental illness (SMI) have examined predictors of posttreatment alcohol outcomes. In the present study, a multivariate approach based on a theoretical model was used to study the relationship between psychosocial factors and post-treatment-initiation alcohol use. Predictors of…
The Roles of Victim and Offender Substance Use in Sexual Assault Outcomes

ERIC Educational Resources Information Center

Brecklin, Leanne R.; Ullman, Sarah E.

2010-01-01

The impact of victim and offender preassault substance use on the outcomes of sexual assault incidents was analyzed. Nine hundred and seventy female sexual assault victims were identified from the first wave of a longitudinal study based on a convenience sampling strategy. Multivariate models showed that victim injury was more likely in assaults…
Real-time monitoring of a coffee roasting process with near infrared spectroscopy using multivariate statistical analysis: A feasibility study.

PubMed

Catelani, Tiago A; Santos, João Rodrigo; Páscoa, Ricardo N M J; Pezza, Leonardo; Pezza, Helena R; Lopes, João A

2018-03-01

This work proposes the use of near infrared (NIR) spectroscopy in diffuse reflectance mode and multivariate statistical process control (MSPC) based on principal component analysis (PCA) for real-time monitoring of the coffee roasting process. The main objective was the development of an MSPC methodology able to detect disturbances to the roasting process early, using real-time acquisition of NIR spectra. A total of fifteen roasting batches were defined according to an experimental design to develop the MSPC models. This methodology was tested on a set of five batches where disturbances of different nature were imposed to simulate real faulty situations. Some of these batches were used to optimize the model while the remaining ones were used to test the methodology. A modelling strategy based on a time sliding window provided the best results in terms of distinguishing batches with and without disturbances, using typical MSPC charts: Hotelling's T² and squared prediction error statistics. A PCA model encompassing a time window of four minutes with three principal components was able to efficiently detect all disturbances assayed. NIR spectroscopy combined with the MSPC approach proved to be an adequate auxiliary tool for coffee roasters to detect faults in a conventional roasting process in real-time. Copyright © 2017 Elsevier B.V. All rights reserved.
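The two monitoring statistics named above can be computed from a PCA model of reference (fault-free) data as in the following sketch. It is a generic PCA-based MSPC illustration, not the authors' sliding-window implementation; the function names are hypothetical, and control limits, which a real chart would need, are left out.

```python
import numpy as np

def fit_pca_monitor(X_ref, n_components):
    """Fit a PCA monitoring model on reference spectra (rows = observations)."""
    X_ref = np.asarray(X_ref, dtype=float)
    mean = X_ref.mean(axis=0)
    _, s, Vt = np.linalg.svd(X_ref - mean, full_matrices=False)
    P = Vt[:n_components].T                          # loadings, orthonormal columns
    score_var = s[:n_components] ** 2 / (X_ref.shape[0] - 1)
    return mean, P, score_var

def t2_and_spe(x, mean, P, score_var):
    """Hotelling's T^2 and squared prediction error (SPE) for one observation."""
    xc = np.asarray(x, dtype=float) - mean
    t = xc @ P                                       # scores in the PCA subspace
    t2 = float(np.sum(t ** 2 / score_var))           # variation inside the model
    resid = xc - P @ t                               # part the model cannot explain
    spe = float(resid @ resid)
    return t2, spe
```

T² flags observations that are unusual within the modelled subspace, while SPE flags observations that leave it, which is why the two charts are normally read together.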
Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique

NASA Astrophysics Data System (ADS)

Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong

2004-03-01

Some issues related to nondestructive evaluation of valid acidity in intact apples by means of the Fourier transform near infrared (FTNIR) (800-2631 nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration analysis such as partial least squares (PLS) analysis and principal component regression (PCR) technique. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments was also investigated. Models based on smoothing spectra were slightly worse than models based on derivative spectra and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model had a correlation coefficient (0.871), a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC by PLS analysis. The results point out the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models, however, the results cannot meet the demand of actual application. Therefore, further study is required for better calibration and prediction.
DSM-based problem gambling: increasing the odds of heavy drinking in a national sample of U.S. college athletes?

PubMed

Huang, Jiun-Hau; Jacobs, Durand F; Derevensky, Jeffrey L

2011-03-01

Despite previously found co-occurrence of youth gambling and alcohol use, their relationship has not been systematically explored in a national sample using DSM-based gambling measures and multivariate modeling, adjusted for potential confounders. This study aimed to empirically examine the prevalence patterns and odds of at-least-weekly alcohol use and heavy episodic drinking (HED) in relation to various levels of gambling severity in college athletes. Multivariate logistic regression analyses were performed on data from a national sample of 20,739 U.S. college athletes from the first National Collegiate Athletic Association national survey of gambling and health-risk behaviors. Prevalence of at-least-weekly alcohol use significantly increased as DSM-IV-based gambling severity increased, from non-gambling (24.5%) to non-problem gambling (43.7%) to sub-clinical gambling (58.5%) to problem gambling (67.6%). Multivariate results indicated that all levels of gambling were associated with significantly elevated risk of at-least-weekly HED, from non-problem (OR = 1.25) to sub-clinical (OR = 1.75) to problem gambling (OR = 3.22); the steep increase in the relative risk also suggested a possible quadratic relationship between gambling level and HED risk. Notably, adjusted odds ratios showed problem gambling had the strongest association with at-least-weekly HED, followed by marijuana (OR = 3.08) and cigarette use (OR = 2.64). Gender interactions and differences were also identified and assessed. In conclusion, attention should be paid to college athletes exhibiting gambling problems, especially considering their empirical multivariate associations with high-risk drinking; accordingly, screening for problem gambling is recommended. More research is warranted to elucidate the etiologic mechanisms of these associations. Copyright © 2010 Elsevier Ltd. All rights reserved.
Univariate and multivariate spatial models of health facility utilisation for childhood fevers in an area on the coast of Kenya.

PubMed

Ouma, Paul O; Agutu, Nathan O; Snow, Robert W; Noor, Abdisalan M

2017-09-18

Precise quantification of health service utilisation is important for the estimation of disease burden and allocation of health resources. Current approaches to mapping health facility utilisation rely on spatial accessibility alone as the predictor. However, other spatially varying social, demographic and economic factors may affect the use of health services. The exclusion of these factors can lead to the inaccurate estimation of health facility utilisation. Here, we compare the accuracy of a univariate spatial model, developed only from estimated travel time, to a multivariate model that also includes relevant social, demographic and economic factors. A theoretical surface of travel time to the nearest public health facility was developed. Travel times from this surface were assigned to each child reported to have had fever in the Kenya demographic and health survey of 2014 (KDHS 2014). The relationship of child treatment seeking for fever with travel time, household and individual factors from the KDHS 2014 was determined using multilevel mixed modelling. The Bayesian information criterion (BIC) and likelihood ratio tests (LRT) were used to measure how selected factors improve parsimony and goodness of fit of the time model. Using the mixed model, a univariate spatial model of health facility utilisation was fitted using travel time as the predictor. The mixed model was also used to compute a multivariate spatial model of utilisation, using travel time and modelled surfaces of selected household and individual factors as predictors. The univariate and multivariate spatial models were then compared using the area under the receiver operating characteristic curve (AUC) and a percent correct prediction (PCP) test. The best fitting multivariate model had travel time, household wealth index and number of children in household as the predictors. These factors reduced the BIC of the time model from 4008 to 2959, a change which was confirmed by the LRT. Although there was a high correlation of the two modelled probability surfaces (Adj R² = 88%), the multivariate model had better AUC compared to the univariate model; 0.83 versus 0.73 and PCP 0.61 versus 0.45 values. Our study shows that a model that uses travel time, as well as household and individual-level socio-demographic factors, results in a more accurate estimation of use of health facilities for the treatment of childhood fever, compared to one that relies on only travel time.
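The AUC used to compare the two models can be computed directly from predicted probabilities and observed outcomes. The sketch below uses the rank-based (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. It is a generic illustration with a hypothetical function name, not the authors' code, and it assumes both outcome classes are present.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1/True for children who sought treatment, 0/False otherwise
            (an assumed coding); scores: predicted probabilities.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]          # every positive/negative pair
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (pos.size * neg.size)
```

The pairwise form needs memory proportional to the number of positive/negative pairs, so for survey-sized data a rank-sum formulation would be the more economical choice.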
Adjustment of automatic control systems of production facilities at coal processing plants using multivariant physico- mathematical models

NASA Astrophysics Data System (ADS)

Evtushenko, V. F.; Myshlyaev, L. P.; Makarov, G. V.; Ivushkin, K. A.; Burkova, E. V.

2016-10-01

The structure of multi-variant physical and mathematical models of a control system is presented, together with its application to adjusting the automatic control system (ACS) of production facilities, using a coal processing plant as an example.
A simplified parsimonious higher order multivariate Markov chain model with new convergence condition

NASA Astrophysics Data System (ADS)

Wang, Chao; Yang, Chuan-sheng

2017-09-01

In this paper, we present a simplified parsimonious higher-order multivariate Markov chain model with a new convergence condition (TPHOMMCM-NCC). Moreover, an estimation method for the parameters of TPHOMMCM-NCC is given. Numerical experiments illustrate the effectiveness of TPHOMMCM-NCC.
Gene flow from domesticated species to wild relatives: migration load in a model of multivariate selection.

PubMed

Tufto, Jarle

2010-01-01

Domesticated species frequently spread their genes into populations of wild relatives through interbreeding. The domestication process often involves artificial selection for economically desirable traits. This can lead to an indirect response in unknown correlated traits and a reduction in fitness of domesticated individuals in the wild. Previous models for the effect of gene flow from domesticated species to wild relatives have assumed that evolution occurs in one dimension. Here, I develop a quantitative genetic model for the balance between migration and multivariate stabilizing selection. Different forms of correlational selection consistent with a given observed ratio between average fitness of domesticated and wild individuals offset the phenotypic means at migration-selection balance away from predictions based on simpler one-dimensional models. For almost all parameter values, correlational selection leads to a reduction in the migration load. For ridge selection, this reduction arises because the distance by which the immigrants deviate from the local optimum is in effect reduced. For realistic parameter values, however, the effect of correlational selection on the load is small, suggesting that simpler one-dimensional models may still be adequate in terms of predicting mean population fitness and viability.
Various forms of indexing HDMR for modelling multivariate classification problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aksu, Çağrı; Tunga, M. Alper

2014-12-10

The Indexing HDMR method was recently developed for modelling multivariate interpolation problems. The method uses the Plain HDMR philosophy in partitioning the given multivariate data set into less variate data sets and then constructing an analytical structure through these partitioned data sets to represent the given multidimensional problem. Indexing HDMR makes HDMR applicable to classification problems having real world data. Mostly, we do not know all possible class values in the domain of the given problem, that is, we have a non-orthogonal data structure. However, Plain HDMR needs an orthogonal data structure in the given problem to be modelled. In this sense, the main idea of this work is to offer various forms of Indexing HDMR to successfully model these real life classification problems. To test these different forms, several well-known multivariate classification problems given in UCI Machine Learning Repository were used and it was observed that the accuracy results lie between 80% and 95% which are very satisfactory.
Multivariate random-parameters zero-inflated negative binomial regression model: an application to estimate crash frequencies at intersections.

PubMed

Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan

2014-09-01

Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help focus on high-risk situations and come up with safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is that the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that the MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.
Insights on multivariate updates of physical and biogeochemical ocean variables using an Ensemble Kalman Filter and an idealized model of upwelling

NASA Astrophysics Data System (ADS)

Yu, Liuqian; Fennel, Katja; Bertino, Laurent; Gharamti, Mohamad El; Thompson, Keith R.

2018-06-01

Effective data assimilation methods for incorporating observations into marine biogeochemical models are required to improve hindcasts, nowcasts and forecasts of the ocean's biogeochemical state. Recent assimilation efforts have shown that updating model physics alone can degrade biogeochemical fields while only updating biogeochemical variables may not improve a model's predictive skill when the physical fields are inaccurate. Here we systematically investigate whether multivariate updates of physical and biogeochemical model states are superior to only updating either physical or biogeochemical variables. We conducted a series of twin experiments in an idealized ocean channel that experiences wind-driven upwelling. The forecast model was forced with biased wind stress and perturbed biogeochemical model parameters compared to the model run representing the "truth". Taking advantage of the multivariate nature of the deterministic Ensemble Kalman Filter (DEnKF), we assimilated different combinations of synthetic physical (sea surface height, sea surface temperature and temperature profiles) and biogeochemical (surface chlorophyll and nitrate profiles) observations. We show that when biogeochemical and physical properties are highly correlated (e.g., thermocline and nutricline), multivariate updates of both are essential for improving model skill and can be accomplished by assimilating either physical (e.g., temperature profiles) or biogeochemical (e.g., nutrient profiles) observations. In our idealized domain, the improvement is largely due to a better representation of nutrient upwelling, which results in a more accurate nutrient input into the euphotic zone. In contrast, assimilating surface chlorophyll improves the model state only slightly, because surface chlorophyll contains little information about the vertical density structure. 
We also show that a degradation of the correlation between observed subsurface temperature and nutrient fields, which has been an issue in several previous assimilation studies, can be reduced by multivariate updates of physical and biogeochemical fields.
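The mechanism behind a multivariate update, where an observation of one variable corrects another through the ensemble cross-covariance, can be sketched in a few lines. The code below follows the deterministic EnKF form as it is commonly written (mean updated with the full Kalman gain, anomalies with half the gain); the function name, the two-variable toy state and the absence of localization or inflation are simplifying assumptions, not details of the study's configuration.

```python
import numpy as np

def denkf_update(E, H, y, R):
    """One deterministic-EnKF-style analysis step.

    E: (n_state, n_members) forecast ensemble
    H: (n_obs, n_state) linear observation operator
    y: (n_obs,) observations; R: (n_obs, n_obs) observation error covariance
    """
    n_members = E.shape[1]
    x_mean = E.mean(axis=1, keepdims=True)
    A = E - x_mean                                   # ensemble anomalies
    HA = H @ A
    P_yy = HA @ HA.T / (n_members - 1) + R           # innovation covariance
    P_xy = A @ HA.T / (n_members - 1)                # state-observation cross-covariance
    K = P_xy @ np.linalg.inv(P_yy)                   # Kalman gain
    x_mean_a = x_mean + K @ (np.reshape(y, (-1, 1)) - H @ x_mean)
    A_a = A - 0.5 * K @ HA                           # half-gain anomaly update
    return x_mean_a + A_a
```

In a toy state of (temperature, nutrient) with anticorrelated ensemble members, observing only a warmer temperature lowers the analysed nutrient as well, because the corresponding row of the gain is built from the cross-covariance between the two variables.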

Universal Pressure Ulcer Prevention Bundle With WOC Nurse Support.

PubMed

Anderson, Megan; Finch Guthrie, Patricia; Kraft, Wendy; Reicks, Patty; Skay, Carol; Beal, Alan L

2015-01-01

This study examined the effectiveness of a universal pressure ulcer prevention bundle (UPUPB) applied to intensive care unit (ICU) patients combined with proactive, semiweekly WOC nurse rounds. The UPUPB was compared to a standard guideline with referral-based WOC nurse involvement, measuring adherence to 5 evidence-based prevention interventions and incidence of pressure ulcers. The study used a quasi-experimental, pre- and postintervention design in which each phase included different subjects. Descriptive methods assisted in exploring the content of WOC nurse rounds. One hundred eighty-one pre- and 146 postintervention subjects who met inclusion criteria and were admitted to ICU for more than 24 hours participated in the study. The research setting was 3 ICUs located at North Memorial Medical Center in Minneapolis, Minnesota. Data collection included admission/discharge skin assessments, chart reviews for 5 evidence-based interventions and patient characteristics, and WOC nurse rounding logs. Subjects whose initial skin assessment showed intact skin on admission were enrolled; prephase subjects received standard care and postphase subjects received the UPUPB. Skin assessments on ICU discharge and chart reviews throughout the stay determined the presence of unit-acquired pressure ulcers and skin care received. Analysis included description of WOC nurse rounds, t-tests for guideline adherence, and multivariate analysis for intervention effect on pressure ulcer incidence. Unit assignment, Braden Scale score, and ICU length of stay were covariates for a multivariate model based on bivariate logistic regression screening. The incidence of unit-acquired pressure ulcers decreased from 15.5% to 2.1%. WOC nurses logged 204 rounds over 6 months, focusing primarily on early detection of pressure sources. 
Data analysis revealed significantly increased adherence to heel elevation (t = -3.905, df = 325, P < .001) and repositioning (t = -2.441, df = 325, P < .015). Multivariate logistic regression modeling showed a significant reduction in unit-acquired pressure ulcers (P < .001). The intervention increased the Nagelkerke R-Square value by 0.099 (P < .001) over the value of 0.297 (P < .001) obtained when including only covariates, for a final model value of 0.396 (P < .001). The UPUPB with WOC nurse rounds resulted in a statistically significant and clinically relevant reduction in the incidence of pressure ulcers.
Socioeconomic Drought in a Changing Climate: Modeling and Management

NASA Astrophysics Data System (ADS)

AghaKouchak, Amir; Mehran, Ali; Mazdiyasni, Omid

2016-04-01

Drought is typically defined based on meteorological, hydrological and land surface conditions. However, in many parts of the world, anthropogenic changes and water management practices have significantly altered local water availability. Socioeconomic drought refers to conditions whereby the available water supply cannot satisfy the human and environmental water needs. Surface water reservoirs provide resilience against local climate variability (e.g., droughts), and play a major role in regional water management. This presentation focuses on a framework for describing socioeconomic drought based on both water supply and demand information. We present a multivariate approach as a measure of socioeconomic drought, termed Multivariate Standardized Reliability and Resilience Index (MSRRI; Mehran et al., 2015). This model links the information on inflow and surface reservoir storage to water demand. MSRRI integrates a "top-down" and a "bottom-up" approach for describing socioeconomic drought. The "top-down" component describes processes that cannot be simply controlled or altered by local decision-makers and managers (e.g., precipitation, climate variability, climate change), whereas the "bottom-up" component focuses on the local resilience, and societal capacity to respond to droughts. The two components (termed the Inflow-Demand Reliability (IDR) indicator and the Water Storage Resilience (WSR) indicator) are integrated using a nonparametric multivariate approach. We use this framework to assess the socioeconomic drought during the Australian Millennium Drought (1998-2010) and the 2011-2014 California Droughts. MSRRI provides additional information on socioeconomic drought onset, development and termination based on local resilience and human demand that cannot be obtained from the commonly used drought indicators. We show that MSRRI can be used for water management scenario analysis (e.g., local water availability based on different human water demand scenarios). 
Finally, we provide examples of using the proposed modeling framework for analyzing water availability in a changing climate considering local conditions. Reference: Mehran A., Mazdiyasni O., AghaKouchak A., 2015, A Hybrid Framework for Assessing Socioeconomic Drought: Linking Climate Variability, Local Resilience, and Demand, Journal of Geophysical Research, 120 (15), 7520-7533, doi: 10.1002/2015JD023147
Taste phenotype associates with cardiovascular disease risk factors via diet quality in multivariate modeling.

PubMed

Sharafi, Mastaneh; Rawal, Shristi; Fernandez, Maria Luz; Huedo-Medina, Tania B; Duffy, Valerie B

2018-05-08

Sensations from foods and beverages drive dietary choices, which in turn, affect risk of diet-related diseases. Perception of these sensations varies with environmental and genetic influences. This observational study aimed to examine associations between chemosensory phenotype, diet and cardiovascular disease (CVD) risk. Reportedly healthy women (n = 110, average age 45 ± 9 years) participated in laboratory-based measures of chemosensory phenotype (taste and smell function, propylthiouracil (PROP) bitterness) and CVD risk factors (waist circumference, blood pressure, serum lipids). Diet variables included preference and intake of sweet/high-fat foods, dietary restraint, and diet quality based on reported preference (Healthy Eating Preference Index-HEPI) and intake (Healthy Eating Index-HEI). We found that females who reported high preference yet low consumption of sweet/high-fat foods had the highest dietary restraint and depressed quinine taste function. PROP nontasters were more likely to report lower diet quality; PROP supertasters were more likely to consume but not like a healthy diet. Multivariate structural models were fitted to identify predictors of CVD risk factors. Reliable latent taste (quinine taste function, PROP tasting) and smell (odor intensity) variables were identified, with taste explaining more variance in the CVD risk factors. Lower bitter taste perception was associated with elevated risk. In multivariate models, the HEPI completely mediated the taste-adiposity and taste-HDL associations and partially mediated the taste-triglyceride or taste-systolic blood pressure associations. The taste-LDL pathway was significant and direct. The HEI could not replace HEPI in adequate models. However, using a latent diet quality variable with HEPI and HEI increased the strength of association between diet quality and adiposity or CVD risk factors. 
In conclusion, bitter taste phenotype was associated with CVD risk factors via diet quality, particularly when assessed by level of food liking/disliking. Copyright © 2018 Elsevier Inc. All rights reserved.
Multivariate generalized multifactor dimensionality reduction to detect gene-gene interactions

PubMed Central

2013-01-01

Background Recently, one of the greatest challenges in genome-wide association studies is to detect gene-gene and/or gene-environment interactions for common complex human diseases. Ritchie et al. (2001) proposed multifactor dimensionality reduction (MDR) method for interaction analysis. MDR is a combinatorial approach to reduce multi-locus genotypes into high-risk and low-risk groups. Although MDR has been widely used for case-control studies with binary phenotypes, several extensions have been proposed. One of these methods, a generalized MDR (GMDR) proposed by Lou et al. (2007), allows adjusting for covariates and applying to both dichotomous and continuous phenotypes. GMDR uses the residual score of a generalized linear model of phenotypes to assign either high-risk or low-risk group, while MDR uses the ratio of cases to controls. Methods In this study, we propose multivariate GMDR, an extension of GMDR for multivariate phenotypes. Jointly analysing correlated multivariate phenotypes may have more power to detect susceptible genes and gene-gene interactions. We construct generalized estimating equations (GEE) with multivariate phenotypes to extend generalized linear models. Using the score vectors from GEE we discriminate high-risk from low-risk groups. We applied the multivariate GMDR method to the blood pressure data of the 7,546 subjects from the Korean Association Resource study: systolic blood pressure (SBP) and diastolic blood pressure (DBP). We compare the results of multivariate GMDR for SBP and DBP to the results from separate univariate GMDR for SBP and DBP, respectively. We also applied the multivariate GMDR method to the repeatedly measured hypertension status from 5,466 subjects and compared its result with those of univariate GMDR at each time point. 
Results Results from the univariate and multivariate GMDR in two-locus models with both blood pressure and hypertension phenotypes indicate the best combinations of SNPs whose interaction is significantly associated with risk for high blood pressure or hypertension. Although the test balanced accuracy (BA) of multivariate analysis was not always greater than that of univariate analysis, the multivariate BAs were more stable with smaller standard deviations. Conclusions In this study, we have developed a multivariate GMDR method using the GEE approach. Multivariate GMDR is useful for correlated multiple phenotypes of interest. PMID:24565370
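The cell-labelling idea behind the original MDR, as described in this abstract (high-risk versus low-risk groups from the ratio of cases to controls), and the balanced accuracy used to compare models can be sketched as follows. This is only an illustration of the case/control-ratio rule, not the score-based GMDR or the GEE-based multivariate extension the authors propose; the toy genotypes, the function names, and the choice of the overall case/control ratio as threshold are assumptions.

```python
from collections import Counter


def mdr_labels(genotypes, phenotypes, threshold=None):
    """Label each multi-locus genotype cell 'high' or 'low' risk by case/control ratio."""
    cases = Counter(g for g, y in zip(genotypes, phenotypes) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, phenotypes) if y == 0)
    if threshold is None:
        # Assumption: use the overall case/control ratio as the cut-off.
        threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        n_case, n_ctrl = cases[cell], controls[cell]
        # A cell containing cases but no controls is treated as high risk.
        high = n_ctrl == 0 or n_case / n_ctrl > threshold
        labels[cell] = "high" if high else "low"
    return labels


def balanced_accuracy(genotypes, phenotypes, labels):
    """Mean of sensitivity and specificity when 'high' cells predict case status."""
    tp = fn = tn = fp = 0
    for g, y in zip(genotypes, phenotypes):
        predicted_case = labels[g] == "high"
        if y == 1:
            tp += predicted_case
            fn += not predicted_case
        else:
            fp += predicted_case
            tn += not predicted_case
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Toy two-locus data (hypothetical): each tuple is one subject's genotype pair.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (1, 1), (0, 0)]
phenotypes = [0, 0, 1, 0, 1, 1, 1, 0]   # 1 = case, 0 = control

labels = mdr_labels(genotypes, phenotypes)
ba = balanced_accuracy(genotypes, phenotypes, labels)
print(labels, ba)
```

In practice the labels would be learned on training folds and the balanced accuracy evaluated on held-out folds; the sketch evaluates on the same toy data purely for brevity.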
A Novel MiRNA-Based Predictive Model for Biochemical Failure Following Post-Prostatectomy Salvage Radiation Therapy

PubMed Central

Stegmaier, Petra; Drendel, Vanessa; Mo, Xiaokui; Ling, Stella; Fabian, Denise; Manring, Isabel; Jilg, Cordula A.; Schultze-Seemann, Wolfgang; McNulty, Maureen; Zynger, Debra L.; Martin, Douglas; White, Julia; Werner, Martin; Grosu, Anca L.; Chakravarti, Arnab

2015-01-01

Purpose To develop a microRNA (miRNA)-based predictive model for prostate cancer patients of 1) time to biochemical recurrence after radical prostatectomy and 2) biochemical recurrence after salvage radiation therapy following documented biochemical disease progression post-radical prostatectomy. Methods Forty-three patients who had undergone salvage radiation therapy following biochemical failure after radical prostatectomy with greater than 4 years of follow-up data were identified. Formalin-fixed, paraffin-embedded tissue blocks were collected for all patients and total RNA was isolated from 1mm cores enriched for tumor (>70%). Eight hundred miRNAs were analyzed simultaneously using the nCounter human miRNA v2 assay (NanoString Technologies; Seattle, WA). Univariate and multivariate Cox proportional hazards regression models as well as receiver operating characteristics were used to identify statistically significant miRNAs that were predictive of biochemical recurrence. Results Eighty-eight miRNAs were identified to be significantly (p<0.05) associated with biochemical failure post-prostatectomy by multivariate analysis and clustered into two groups that correlated with early (≤ 36 months) versus late recurrence (>36 months). Nine miRNAs were identified to be significantly (p<0.05) associated by multivariate analysis with biochemical failure after salvage radiation therapy. A new predictive model for biochemical recurrence after salvage radiation therapy was developed; this model consisted of miR-4516 and miR-601 together with Gleason score and lymph node status. The area under the ROC curve (AUC) was improved to 0.83 compared to that of 0.66 for Gleason score and lymph node status alone. Conclusion miRNA signatures can distinguish patients who fail soon after radical prostatectomy versus late failures, giving insight into which patients may need adjuvant therapy. 
Notably, two novel miRNAs (miR-4516 and miR-601) were identified that significantly improve prediction of biochemical failure post-salvage radiation therapy compared to clinico-histopathological factors, supporting the use of miRNAs within clinically used predictive models. Both findings warrant further validation studies. PMID:25760964
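The area under the ROC curve used to compare the two predictive models can be read as the probability that a randomly chosen patient with the event receives a higher risk score than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the scores and outcomes are hypothetical and are not data from the study.

```python
def auc(scores, outcomes):
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical risk scores and outcomes (1 = biochemical failure, 0 = no failure).
scores = [0.9, 0.8, 0.4, 0.35]
outcomes = [1, 0, 1, 0]
print(auc(scores, outcomes))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.66 and 0.83 should be read.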
Quantum attack-resistent certificateless multi-receiver signcryption scheme.

PubMed

Li, Huixian; Chen, Xubao; Pang, Liaojun; Shi, Weisong

2013-01-01

The existing certificateless signcryption schemes were designed mainly based on traditional public key cryptography, in which security relies on hard problems such as integer factorization and discrete logarithms. However, these problems can be solved efficiently by quantum computing, so the existing certificateless signcryption schemes are vulnerable to quantum attack. Multivariate public key cryptography (MPKC), which can resist quantum attack, is one of the alternative solutions to guarantee the security of communications in the post-quantum age. Motivated by these concerns, we proposed a new construction of the certificateless multi-receiver signcryption scheme (CLMSC) based on MPKC. The new scheme inherits the security of MPKC, which can withstand quantum attack. Multivariate quadratic polynomial operations, which have lower computation complexity than bilinear pairing operations, are employed in signcrypting a message for a certain number of receivers in our scheme. Security analysis shows that our scheme is a secure MPKC-based scheme. We proved its security under the hardness of the Multivariate Quadratic (MQ) problem and its unforgeability under the Isomorphism of Polynomials (IP) assumption in the random oracle model. The analysis results show that our scheme also has the security properties of non-repudiation, perfect forward secrecy, perfect backward secrecy and public verifiability. Compared with the existing schemes in terms of computation complexity and ciphertext length, our scheme is more efficient, which makes it suitable for terminals with low computation capacity like smart cards.
A Method for Comparing Multivariate Time Series with Different Dimensions

PubMed Central

Tapinos, Avraam; Mendes, Pedro

2013-01-01

In many situations it is desirable to compare dynamical systems based on their behavior. Similarity of behavior often implies similarity of internal mechanisms or dependency on common extrinsic factors. While there are widely used methods for comparing univariate time series, most dynamical systems are characterized by multivariate time series. Yet, comparison of multivariate time series has been limited to cases where they share a common dimensionality. A semi-metric is a distance function that has the properties of non-negativity, symmetry and reflexivity, but not sub-additivity. Here we develop a semi-metric – SMETS – that can be used for comparing groups of time series that may have different dimensions. To demonstrate its utility, the method is applied to dynamic models of biochemical networks and to portfolios of shares. The former is an example of a case where the dependencies between system variables are known, while in the latter the system is treated (and behaves) as a black box. PMID:23393554
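The semi-metric properties named in this abstract (non-negativity, symmetry and reflexivity, without sub-additivity) can be checked numerically on a small example. The sketch below uses squared Euclidean distance as a well-known semi-metric; it is a stand-in chosen for illustration and is not the SMETS measure itself, whose definition is not given in the abstract.

```python
from itertools import product


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def check_properties(d, points):
    """Test the four distance properties of d on every combination of the given points."""
    pairs = list(product(points, repeat=2))
    triples = list(product(points, repeat=3))
    return {
        "non_negative": all(d(x, y) >= 0 for x, y in pairs),
        "symmetric": all(d(x, y) == d(y, x) for x, y in pairs),
        "reflexive": all(d(x, x) == 0 for x in points),
        # Sub-additivity is the triangle inequality d(x, z) <= d(x, y) + d(y, z).
        "sub_additive": all(d(x, z) <= d(x, y) + d(y, z) for x, y, z in triples),
    }


# Three collinear points are enough to break the triangle inequality.
points = [(0.0,), (1.0,), (2.0,)]
props = check_properties(squared_distance, points)
print(props)
```

Here the distance from 0 to 2 is 4, while the two intermediate steps sum to 2, so the first three properties hold and sub-additivity fails, which is exactly the profile of a semi-metric.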
Liquid chromatography with diode array detection and multivariate curve resolution for the selective and sensitive quantification of estrogens in natural waters.

PubMed

Pérez, Rocío L; Escandar, Graciela M

2014-07-04

Following the green analytical chemistry principles, an efficient strategy involving second-order data provided by liquid chromatography (LC) with diode array detection (DAD) was applied for the simultaneous determination of estriol, 17β-estradiol, 17α-ethinylestradiol and estrone in natural water samples. After a simple pre-concentration step, LC-DAD matrix data were rapidly obtained (in less than 5 min) with a chromatographic system operating isocratically. Applying a second-order calibration algorithm based on multivariate curve resolution with alternating least-squares (MCR-ALS), successful resolution was achieved in the presence of sample constituents that strongly coelute with the analytes. The flexibility of this multivariate model allowed the quantification of the four estrogens in tap, mineral, underground and river water samples. Limits of detection in the range between 3 and 13 ng L(-1), and relative prediction errors from 2 to 11% were achieved. Copyright © 2014 Elsevier B.V. All rights reserved.
Multivariate fault isolation of batch processes via variable selection in partial least squares discriminant analysis.

PubMed

Yan, Zhengbing; Kuang, Te-Hui; Yao, Yuan

2017-09-01

In recent years, multivariate statistical monitoring of batch processes has become a popular research topic, wherein multivariate fault isolation is an important step aiming at the identification of the faulty variables contributing most to the detected process abnormality. Although contribution plots have been commonly used in statistical fault isolation, such methods suffer from the smearing effect between correlated variables. In particular, in batch process monitoring, the high autocorrelations and cross-correlations that exist in variable trajectories make the smearing effect unavoidable. To address such a problem, a variable selection-based fault isolation method is proposed in this research, which transforms the fault isolation problem into a variable selection problem in partial least squares discriminant analysis and solves it by calculating a sparse partial least squares model. As different from the traditional methods, the proposed method emphasizes the relative importance of each process variable. Such information may help process engineers in conducting root-cause diagnosis. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Multiscale climate emulator of multimodal wave spectra: MUSCLE-spectra

NASA Astrophysics Data System (ADS)

Rueda, Ana; Hegermiller, Christie A.; Antolinez, Jose A. A.; Camus, Paula; Vitousek, Sean; Ruggiero, Peter; Barnard, Patrick L.; Erikson, Li H.; Tomás, Antonio; Mendez, Fernando J.

2017-02-01

Characterization of multimodal directional wave spectra is important for many offshore and coastal applications, such as marine forecasting, coastal hazard assessment, and design of offshore wave energy farms and coastal structures. However, the multivariate and multiscale nature of wave climate variability makes this complex problem tractable using computationally expensive numerical models. So far, the skill of statistical-downscaling model-based parametric (unimodal) wave conditions is limited in large ocean basins such as the Pacific. The recent availability of long-term directional spectral data from buoys and wave hindcast models allows for development of stochastic models that include multimodal sea-state parameters. This work introduces a statistical downscaling framework based on weather types to predict multimodal wave spectra (e.g., significant wave height, mean wave period, and mean wave direction from different storm systems, including sea and swells) from large-scale atmospheric pressure fields. For each weather type, variables of interest are modeled using the categorical distribution for the sea-state type, the Generalized Extreme Value (GEV) distribution for wave height and wave period, a multivariate Gaussian copula for the interdependence between variables, and a Markov chain model for the chronology of daily weather types. We apply the model to the southern California coast, where local seas and swells from both the Northern and Southern Hemispheres contribute to the multimodal wave spectrum. This work allows attribution of particular extreme multimodal wave events to specific atmospheric conditions, expanding knowledge of time-dependent, climate-driven offshore and coastal sea-state conditions that have a significant influence on local nearshore processes, coastal morphology, and flood hazards.
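Two of the ingredients listed in this abstract, a Markov chain for the chronology of daily weather types and a GEV distribution for wave height within each type, can be sketched with the standard library alone. All weather-type names, transition probabilities and GEV parameters below are hypothetical, and the sketch omits the categorical sea-state model and the Gaussian copula, so it should not be read as the authors' emulator.

```python
import math
import random

WEATHER_TYPES = ["WT1", "WT2", "WT3"]

# Hypothetical daily transition probabilities; each row sums to 1.
TRANSITIONS = {
    "WT1": [0.7, 0.2, 0.1],
    "WT2": [0.3, 0.5, 0.2],
    "WT3": [0.2, 0.3, 0.5],
}

# Hypothetical GEV parameters (location, scale, shape) for wave height in metres.
GEV_PARAMS = {
    "WT1": (1.0, 0.3, 0.05),
    "WT2": (1.8, 0.5, 0.10),
    "WT3": (2.5, 0.8, -0.10),
}


def gev_quantile(u, loc, scale, shape):
    """Inverse CDF of the GEV distribution for a probability u in (0, 1)."""
    if shape == 0.0:
        # Gumbel limit of the GEV family.
        return loc - scale * math.log(-math.log(u))
    return loc + scale * ((-math.log(u)) ** (-shape) - 1.0) / shape


def simulate(n_days, seed=0):
    """Return a list of (weather type, wave height) pairs, one per day."""
    rng = random.Random(seed)
    current = WEATHER_TYPES[0]
    series = []
    for _ in range(n_days):
        loc, scale, shape = GEV_PARAMS[current]
        # Keep u strictly inside (0, 1) so the logarithms stay finite.
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        series.append((current, gev_quantile(u, loc, scale, shape)))
        # Draw tomorrow's weather type from today's row of the transition matrix.
        current = rng.choices(WEATHER_TYPES, weights=TRANSITIONS[current])[0]
    return series


series = simulate(30)
for day, (weather_type, height) in enumerate(series[:5], start=1):
    print(day, weather_type, round(height, 2))
```

Conditioning the height distribution on the weather type is what lets extreme simulated sea states be traced back to a particular large-scale atmospheric pattern.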
Multiscale Climate Emulator of Multimodal Wave Spectra: MUSCLE-spectra

NASA Astrophysics Data System (ADS)

Rueda, A.; Hegermiller, C.; Alvarez Antolinez, J. A.; Camus, P.; Vitousek, S.; Ruggiero, P.; Barnard, P.; Erikson, L. H.; Tomas, A.; Mendez, F. J.

2016-12-01

Characterization of multimodal directional wave spectra is important for many offshore and coastal applications, such as marine forecasting, coastal hazard assessment, and design of offshore wave energy farms and coastal structures. However, the multivariate and multiscale nature of wave climate variability makes this problem complex yet tractable using computationally-expensive numerical models. So far, the skill of statistical-downscaling models based parametric (unimodal) wave conditions is limited in large ocean basins such as the Pacific. The recent availability of long-term directional spectral data from buoys and wave hindcast models allows for development of stochastic models that include multimodal sea-state parameters. This work introduces a statistical-downscaling framework based on weather types to predict multimodal wave spectra (e.g., significant wave height, mean wave period, and mean wave direction from different storm systems, including sea and swells) from large-scale atmospheric pressure fields. For each weather type, variables of interest are modeled using the categorical distribution for the sea-state type, the Generalized Extreme Value (GEV) distribution for wave height and wave period, a multivariate Gaussian copula for the interdependence between variables, and a Markov chain model for the chronology of daily weather types. We apply the model to the Southern California coast, where local seas and swells from both the Northern and Southern Hemispheres contribute to the multimodal wave spectrum. This work allows attribution of particular extreme multimodal wave events to specific atmospheric conditions, expanding knowledge of time-dependent, climate-driven offshore and coastal sea-state conditions that have a significant influence on local nearshore processes, coastal morphology, and flood hazards.
Self-consistent core-pedestal transport simulations with neural network accelerated models

DOE PAGES

Meneghini, Orso; Smith, Sterling P.; Snyder, Philip B.; ...

2017-07-12

Fusion whole device modeling simulations require comprehensive models that are simultaneously physically accurate, fast, robust, and predictive. In this paper we describe the development of two neural-network (NN) based models as a means to perform a non-linear multivariate regression of theory-based models for the core turbulent transport fluxes, and the pedestal structure. Specifically, we find that a NN-based approach can be used to consistently reproduce the results of the TGLF and EPED1 theory-based models over a broad range of plasma regimes, and with a computational speedup of several orders of magnitude. These models are then integrated into a predictive workflow that allows prediction with self-consistent core-pedestal coupling of the kinetic profiles within the last closed flux surface of the plasma. Finally, the NN paradigm is capable of breaking the speed-accuracy trade-off that is expected of traditional numerical physics models, and can provide the missing link towards self-consistent coupled core-pedestal whole device modeling simulations that are physically accurate and yet take only seconds to run.
Self-consistent core-pedestal transport simulations with neural network accelerated models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Meneghini, Orso; Smith, Sterling P.; Snyder, Philip B.

Fusion whole device modeling simulations require comprehensive models that are simultaneously physically accurate, fast, robust, and predictive. In this paper we describe the development of two neural-network (NN) based models as a means to perform a non-linear multivariate regression of theory-based models for the core turbulent transport fluxes, and the pedestal structure. Specifically, we find that a NN-based approach can be used to consistently reproduce the results of the TGLF and EPED1 theory-based models over a broad range of plasma regimes, and with a computational speedup of several orders of magnitude. These models are then integrated into a predictive workflow that allows prediction with self-consistent core-pedestal coupling of the kinetic profiles within the last closed flux surface of the plasma. Finally, the NN paradigm is capable of breaking the speed-accuracy trade-off that is expected of traditional numerical physics models, and can provide the missing link towards self-consistent coupled core-pedestal whole device modeling simulations that are physically accurate and yet take only seconds to run.
Self-consistent core-pedestal transport simulations with neural network accelerated models

NASA Astrophysics Data System (ADS)

Meneghini, O.; Smith, S. P.; Snyder, P. B.; Staebler, G. M.; Candy, J.; Belli, E.; Lao, L.; Kostuk, M.; Luce, T.; Luda, T.; Park, J. M.; Poli, F.

2017-08-01

Fusion whole device modeling simulations require comprehensive models that are simultaneously physically accurate, fast, robust, and predictive. In this paper we describe the development of two neural-network (NN) based models as a means to perform a non-linear multivariate regression of theory-based models for the core turbulent transport fluxes, and the pedestal structure. Specifically, we find that a NN-based approach can be used to consistently reproduce the results of the TGLF and EPED1 theory-based models over a broad range of plasma regimes, and with a computational speedup of several orders of magnitude. These models are then integrated into a predictive workflow that allows prediction with self-consistent core-pedestal coupling of the kinetic profiles within the last closed flux surface of the plasma. The NN paradigm is capable of breaking the speed-accuracy trade-off that is expected of traditional numerical physics models, and can provide the missing link towards self-consistent coupled core-pedestal whole device modeling simulations that are physically accurate and yet take only seconds to run.
Estimation and model selection of semiparametric multivariate survival functions under general censorship.

PubMed

Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

2010-07-01

We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.
Estimation and model selection of semiparametric multivariate survival functions under general censorship

PubMed Central

Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

2013-01-01

We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided. PMID:24790286
Usual Dietary Intakes: SAS Macros for Fitting Multivariate Measurement Error Models & Estimating Multivariate Usual Intake Distributions

Cancer.gov

The following SAS macros can be used to create a multivariate usual intake distribution for multiple dietary components that are consumed nearly every day or episodically. A SAS macro for performing balanced repeated replication (BRR) variance estimation is also included.
Comparative Robustness of Recent Methods for Analyzing Multivariate Repeated Measures Designs

ERIC Educational Resources Information Center

Seco, Guillermo Vallejo; Gras, Jaime Arnau; Garcia, Manuel Ato

2007-01-01

This study evaluated the robustness of two recent methods for analyzing multivariate repeated measures when the assumptions of covariance homogeneity and multivariate normality are violated. Specifically, the authors' work compares the performance of the modified Brown-Forsythe (MBF) procedure and the mixed-model procedure adjusted by the…
Scalable Joint Models for Reliable Uncertainty-Aware Event Prediction.

PubMed

Soleimani, Hossein; Hensman, James; Saria, Suchi

2017-08-21

Missing data and noisy observations pose significant challenges for reliably predicting events from irregularly sampled multivariate time series (longitudinal) data. Imputation methods, which are typically used for completing the data prior to event prediction, lack a principled mechanism to account for the uncertainty due to missingness. Alternatively, state-of-the-art joint modeling techniques can be used for jointly modeling the longitudinal and event data and compute event probabilities conditioned on the longitudinal observations. These approaches, however, make strong parametric assumptions and do not easily scale to multivariate signals with many observations. Our proposed approach consists of several key innovations. First, we develop a flexible and scalable joint model based upon sparse multiple-output Gaussian processes. Unlike state-of-the-art joint models, the proposed model can explain highly challenging structure including non-Gaussian noise while scaling to large data. Second, we derive an optimal policy for predicting events using the distribution of the event occurrence estimated by the joint model. The derived policy trades-off the cost of a delayed detection versus incorrect assessments and abstains from making decisions when the estimated event probability does not satisfy the derived confidence criteria. Experiments on a large dataset show that the proposed framework significantly outperforms state-of-the-art techniques in event prediction.
Multivariate Cryptography Based on Clipped Hopfield Neural Network.

PubMed

Wang, Jia; Cheng, Lee-Ming; Su, Tong

2018-02-01

Designing secure and efficient multivariate public key cryptosystems [multivariate cryptography (MVC)] to strengthen the security of RSA and ECC in conventional and quantum computational environments has remained a challenging research problem in recent years. In this paper, we describe a multivariate public key cryptosystem based on an extended Clipped Hopfield Neural Network (CHNN) and implement it using the MVC (CHNN-MVC) framework operated in space. The Diffie-Hellman key exchange algorithm is extended into the matrix field, which illustrates the feasibility of its new applications in both classic and postquantum cryptography. The efficiency and security of our proposed new public key cryptosystem CHNN-MVC are simulated and found to be NP-hard. The proposed algorithm will strengthen multivariate public key cryptosystems and allow practical hardware realization.

Analysis/forecast experiments with a multivariate statistical analysis scheme using FGGE data

NASA Technical Reports Server (NTRS)

Baker, W. E.; Bloom, S. C.; Nestler, M. S.

1985-01-01

A three-dimensional, multivariate, statistical analysis method, optimal interpolation (OI) is described for modeling meteorological data from widely dispersed sites. The model was developed to analyze FGGE data at the NASA-Goddard Laboratory of Atmospherics. The model features a multivariate surface analysis over the oceans, including maintenance of the Ekman balance and a geographically dependent correlation function. Preliminary comparisons are made between the OI model and similar schemes employed at the European Center for Medium Range Weather Forecasts and the National Meteorological Center. The OI scheme is used to provide input to a GCM, and model error correlations are calculated for forecasts of 500 mb vertical water mixing ratios and the wind profiles. Comparisons are made between the predictions and measured data. The model is shown to be as accurate as a successive corrections model out to 4.5 days.
Multivariate Bayesian modeling of known and unknown causes of events--an application to biosurveillance.

PubMed

Shen, Yanna; Cooper, Gregory F

2012-09-01

This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
PharmML in Action: an Interoperable Language for Modeling and Simulation.

PubMed

Bizzotto, R; Comets, E; Smith, G; Yvon, F; Kristensen, N R; Swat, M J

2017-10-01

PharmML is an XML-based exchange format created with a focus on nonlinear mixed-effect (NLME) models used in pharmacometrics, but providing a very general framework that also allows describing mathematical and statistical models such as single-subject or nonlinear and multivariate regression models. This tutorial provides an overview of the structure of this language, brief suggestions on how to work with it, and use cases demonstrating its power and flexibility. © 2017 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
SU-F-R-51: Radiomics in CT Perfusion Maps of Head and Neck Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nesteruk, M; Riesterer, O; Veit-Haibach, P

2016-06-15

Purpose: The aim of this study was to test the predictive value of radiomics features of CT perfusion (CTP) for tumor control, based on a preselection of radiomics features in a robustness study. Methods: 11 patients with head and neck cancer (HNC) and 11 patients with lung cancer were included in the robustness study to preselect stable radiomics parameters. Data from 36 HNC patients treated with definitive radiochemotherapy (median follow-up 30 months) was used to build a predictive model based on these parameters. All patients underwent pre-treatment CTP. 315 texture parameters were computed for three perfusion maps: blood volume, blood flow and mean transit time. The variability of texture parameters was tested with respect to non-standardizable perfusion computation factors (noise level and artery contouring) using intraclass correlation coefficients (ICC). The parameter with the highest ICC in the correlated group of parameters (inter-parameter Spearman correlations) was tested for its predictive value. The final model to predict tumor control was built using multivariate Cox regression analysis with backward selection of the variables. For comparison, a predictive model based on tumor volume was created. Results: Ten parameters were found to be stable in both HNC and lung cancer regarding potentially non-standardizable factors after the correction for inter-parameter correlations. In the multivariate backward selection of the variables, blood flow entropy showed a highly significant impact on tumor control (p=0.03) with concordance index (CI) of 0.76. Blood flow entropy was significantly lower in the patient group with controlled tumors at 18 months (p<0.1). The new model showed a higher concordance index compared to the tumor volume model (CI=0.68). Conclusion: The preselection of variables in the robustness study allowed building a predictive radiomics-based model of tumor control in HNC despite a small patient cohort.
This model was found to be superior to the volume-based model. The project was supported by the KFSP Tumor Oxygenation of the University of Zurich, by a grant of the Center for Clinical Research, University and University Hospital Zurich and by a research grant from Merck (Schweiz) AG.
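The concordance index used above to compare the two models can be sketched as follows. This is a simplified version of Harrell's C for right-censored data, assuming that a higher risk score means earlier expected failure and ignoring pairs with tied times; the function name and the toy data are hypothetical and do not come from the study.

```python
def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C for right-censored survival data.

    times: follow-up time for each patient
    events: 1 if the event (e.g. loss of tumor control) was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, one censored at t=15.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
good_scores = [0.9, 0.7, 0.4, 0.1]
bad_scores = [0.1, 0.4, 0.7, 0.9]

print(concordance_index(times, events, good_scores))
print(concordance_index(times, events, bad_scores))
```

A value of 0.5 corresponds to random ordering, which is the baseline against which values such as 0.76 and 0.68 are read.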
FACTOR ANALYTIC MODELS OF CLUSTERED MULTIVARIATE DATA WITH INFORMATIVE CENSORING

EPA Science Inventory

This paper describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censorin...
A study of pH-dependent photodegradation of amiloride by a multivariate curve resolution approach to combined kinetic and acid-base titration UV data.

PubMed

De Luca, Michele; Ioele, Giuseppina; Mas, Sílvia; Tauler, Romà; Ragno, Gaetano

2012-11-21

Amiloride photostability at different pH values was studied in depth by applying Multivariate Curve Resolution Alternating Least Squares (MCR-ALS) to the UV spectrophotometric data from drug solutions exposed to stressing irradiation. Resolution of all degradation photoproducts was possible by simultaneous spectrophotometric analysis of kinetic photodegradation and acid-base titration experiments. Amiloride photodegradation was shown to be strongly dependent on pH. Two hard modelling constraints were sequentially used in MCR-ALS for the unambiguous resolution of all the species involved in the photodegradation process. An amiloride acid-base system was defined by using the equilibrium constraint, and the photodegradation pathway was modelled taking into account the kinetic constraint. The simultaneous analysis of photodegradation and titration experiments revealed the presence of eight different species, which were differently distributed according to pH and time. Concentration profiles of all the species as well as their pure spectra were resolved and kinetic rate constants were estimated. The values of rate constants changed with pH and under alkaline conditions the degradation pathway and photoproducts also changed. These results were compared to those obtained by LC-MS analysis from drug photodegradation experiments. MS analysis allowed the identification of up to five species and showed the simultaneous presence of more than one acid-base equilibrium.
Incorporating Single-nucleotide Polymorphisms Into the Lyman Model to Improve Prediction of Radiation Pneumonitis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tucker, Susan L.; Li Minghuan; Xu Ting

2013-01-01

Purpose: To determine whether single-nucleotide polymorphisms (SNPs) in genes associated with DNA repair, cell cycle, transforming growth factor-β, tumor necrosis factor and receptor, folic acid metabolism, and angiogenesis can significantly improve the fit of the Lyman-Kutcher-Burman (LKB) normal-tissue complication probability (NTCP) model of radiation pneumonitis (RP) risk among patients with non-small cell lung cancer (NSCLC). Methods and Materials: Sixteen SNPs from 10 different genes (XRCC1, XRCC3, APEX1, MDM2, TGFβ, TNFα, TNFR, MTHFR, MTRR, and VEGF) were genotyped in 141 NSCLC patients treated with definitive radiation therapy, with or without chemotherapy. The LKB model was used to estimate the risk of severe (grade ≥3) RP as a function of mean lung dose (MLD), with SNPs and patient smoking status incorporated into the model as dose-modifying factors. Multivariate analyses were performed by adding significant factors to the MLD model in a forward stepwise procedure, with significance assessed using the likelihood-ratio test. Bootstrap analyses were used to assess the reproducibility of results under variations in the data. Results: Five SNPs were selected for inclusion in the multivariate NTCP model based on MLD alone. SNPs associated with an increased risk of severe RP were in genes for TGFβ, VEGF, TNFα, XRCC1 and APEX1. With smoking status included in the multivariate model, the SNPs significantly associated with increased risk of RP were in genes for TGFβ, VEGF, and XRCC3. Bootstrap analyses selected a median of 4 SNPs per model fit, with the 6 genes listed above selected most often. Conclusions: This study provides evidence that SNPs can significantly improve the predictive ability of the Lyman MLD model. With a small number of SNPs, it was possible to distinguish cohorts with >50% risk vs <10% risk of RP when they were exposed to high MLDs.
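The LKB model with mean lung dose is commonly written as NTCP = Φ((MLD − TD50) / (m · TD50)), where Φ is the standard normal cumulative distribution function. The sketch below assumes that a dose-modifying factor (DMF) simply multiplies MLD before it enters that formula, which is one common way to encode such factors and may differ in detail from the authors' parameterisation. The function names and all parameter values (TD50, m, DMF) are hypothetical and chosen only for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lkb_ntcp(mld_gy, td50_gy, m, dmf=1.0):
    """LKB NTCP as a function of mean lung dose (MLD).

    mld_gy:  mean lung dose in Gy
    td50_gy: dose giving a 50% complication probability
    m:       slope parameter of the dose-response curve
    dmf:     assumed dose-modifying factor (>1 raises the effective dose,
             e.g. for a risk genotype; 1.0 means no modification)
    """
    t = (dmf * mld_gy - td50_gy) / (m * td50_gy)
    return normal_cdf(t)


# Illustrative values only, not fitted estimates from the study.
baseline = lkb_ntcp(mld_gy=20.0, td50_gy=30.0, m=0.4)
with_risk_snp = lkb_ntcp(mld_gy=20.0, td50_gy=30.0, m=0.4, dmf=1.3)
print(f"baseline risk: {baseline:.3f}")
print(f"risk with DMF 1.3: {with_risk_snp:.3f}")
```

Under this form, each added SNP or smoking term shifts the effective dose rather than adding a separate term to the probability, which is why the factors are described as dose-modifying.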
An Examination of the Domain of Multivariable Functions Using the Pirie-Kieren Model

ERIC Educational Resources Information Center

Sengul, Sare; Yildiz, Sevda Goktepe

2016-01-01

The aim of this study is to employ the Pirie-Kieren model so as to examine the understandings relating to the domain of multivariable functions held by primary school mathematics preservice teachers. The data obtained was categorized according to Pirie-Kieren model and demonstrated visually in tables and bar charts. The study group consisted of…
Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs

Treesearch

Andrew F. Howard; Daniel A. Yaussy

1986-01-01

A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...
A Multivariate Model for the Meta-Analysis of Study Level Survival Data at Multiple Times

ERIC Educational Resources Information Center

Jackson, Dan; Rollins, Katie; Coughlin, Patrick

2014-01-01

Motivated by our meta-analytic dataset involving survival rates after treatment for critical leg ischemia, we develop and apply a new multivariate model for the meta-analysis of study level survival data at multiple times. Our data set involves 50 studies that provide mortality rates at up to seven time points, which we model simultaneously, and…
Analytical framework for reconstructing heterogeneous environmental variables from mammal community structure.

PubMed

Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C

2015-01-01

We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.
Support vector machine learning-based fMRI data group analysis.

PubMed

Wang, Ze; Childress, Anna R; Wang, Jiongjiong; Detre, John A

2007-07-15

To explore the multivariate nature of fMRI data and to consider the inter-subject brain response discrepancies, a multivariate and brain response model-free method is fundamentally required. Two such methods are presented in this paper by integrating a machine learning algorithm, the support vector machine (SVM), and the random effect model. Without any brain response modeling, SVM was used to extract a whole brain spatial discriminance map (SDM), representing the brain response difference between the contrasted experimental conditions. Population inference was then obtained through the random effect analysis (RFX) or permutation testing (PMU) on the individual subjects' SDMs. Applied to arterial spin labeling (ASL) perfusion fMRI data, SDM RFX yielded lower false-positive rates in the null hypothesis test and higher detection sensitivity for synthetic activations with varying cluster size and activation strengths, compared to the univariate general linear model (GLM)-based RFX. For a sensory-motor ASL fMRI study, both SDM RFX and SDM PMU yielded similar activation patterns to GLM RFX and GLM PMU, respectively, but with higher t values and cluster extensions at the same significance level. Capitalizing on the absence of temporal noise correlation in ASL data, this study also incorporated PMU in the individual-level GLM and SVM analyses accompanied by group-level analysis through RFX or group-level PMU. Providing inferences on the probability of being activated or deactivated at each voxel, these individual-level PMU-based group analysis methods can be used to threshold the analysis results of GLM RFX, SDM RFX or SDM PMU.
Detection and Attribution of Simulated Climatic Extreme Events and Impacts: High Sensitivity to Bias Correction

NASA Astrophysics Data System (ADS)

Sippel, S.; Otto, F. E. L.; Forkel, M.; Allen, M. R.; Guillod, B. P.; Heimann, M.; Reichstein, M.; Seneviratne, S. I.; Kirsten, T.; Mahecha, M. D.

2015-12-01

Understanding, quantifying and attributing the impacts of climatic extreme events and variability is crucial for societal adaptation in a changing climate. However, climate model simulations generated for this purpose typically exhibit pronounced biases in their output that hinder any straightforward assessment of impacts. To overcome this issue, various bias correction strategies are routinely used to alleviate climate model deficiencies, most of which have been criticized for physical inconsistency and the non-preservation of the multivariate correlation structure. We assess how biases and their correction affect the quantification and attribution of simulated extremes and variability in i) climatological variables and ii) impacts on ecosystem functioning as simulated by a terrestrial biosphere model. Our study demonstrates that assessments of simulated climatic extreme events and impacts in the terrestrial biosphere are highly sensitive to bias correction schemes with major implications for the detection and attribution of these events. We introduce a novel ensemble-based resampling scheme based on a large regional climate model ensemble generated by the distributed weather@home setup[1], which fully preserves the physical consistency and multivariate correlation structure of the model output. We use extreme value statistics to show that this procedure considerably improves the representation of climatic extremes and variability. Subsequently, biosphere-atmosphere carbon fluxes are simulated using a terrestrial ecosystem model (LPJ-GSI) to further demonstrate the sensitivity of ecosystem impacts to the methodology of bias correcting climate model output. We find that uncertainties arising from bias correction schemes are comparable in magnitude to model structural and parameter uncertainties.
The present study consists of a first attempt to alleviate climate model biases in a physically consistent way and demonstrates that this yields improved simulations of climate extremes and associated impacts. [1] http://www.climateprediction.net/weatherathome/
Two-part models with stochastic processes for modelling longitudinal semicontinuous data: Computationally efficient inference and modelling the overall marginal mean.

PubMed

Yiu, Sean; Tom, Brian Dm

2017-01-01

Several researchers have described two-part models with patient-specific stochastic processes for analysing longitudinal semicontinuous data. In theory, such models can offer greater flexibility than the standard two-part model with patient-specific random effects. However, in practice, the high dimensional integrations involved in the marginal likelihood (i.e. integrated over the stochastic processes) significantly complicate model fitting. Thus, only non-standard, computationally intensive procedures based on simulating the marginal likelihood have so far been proposed. In this paper, we describe an efficient method of implementation by demonstrating how the high dimensional integrations involved in the marginal likelihood can be computed efficiently. Specifically, by using a property of the multivariate normal distribution and the standard marginal cumulative distribution function identity, we transform the marginal likelihood so that the high dimensional integrations are contained in the cumulative distribution function of a multivariate normal distribution, which can then be efficiently evaluated. Hence, maximum likelihood estimation can be used to obtain parameter estimates and asymptotic standard errors (from the observed information matrix) of model parameters. We describe our proposed efficient implementation procedure for the standard two-part model parameterisation and when it is of interest to directly model the overall marginal mean. The methodology is applied on a psoriatic arthritis data set concerning functional disability.
Voxelwise multivariate analysis of multimodality magnetic resonance imaging.

PubMed

Naylor, Melissa G; Cardenas, Valerie A; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin

2014-03-01

Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remain a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. Copyright © 2013 Wiley Periodicals, Inc.
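The first approach above, separate univariate tests per outcome followed by a correction across outcomes, can be sketched with a Bonferroni adjustment at a single voxel. This is a minimal illustration only: the p-values are invented, the function name is hypothetical, and the additional correction over all voxels described in the abstract is not shown.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return, for each p-value, whether it is significant after
    Bonferroni correction for the number of tests performed."""
    m = len(p_values)
    threshold = alpha / m  # each test is held to alpha divided by m
    return [p <= threshold for p in p_values]


# Hypothetical p-values at one voxel for three MRI modalities
# (structural, perfusion-weighted, diffusion-weighted).
p_at_voxel = [0.01, 0.02, 0.04]
print(bonferroni_reject(p_at_voxel))
```

Bonferroni controls the error rate without modelling how the outcomes covary, which makes it conservative when outcomes are correlated; that is the situation in which the abstract reports the multivariate model to be more powerful.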
Multivariate Analysis of Longitudinal Rates of Change

PubMed Central

Bryan, Matthew; Heagerty, Patrick J.

2016-01-01

Longitudinal data allow direct comparison of the change in patient outcomes associated with treatment or exposure. Frequently, several longitudinal measures are collected that either reflect a common underlying health status, or characterize processes that are influenced in a similar way by covariates such as exposure or demographic characteristics. Statistical methods that can combine multivariate response variables into common measures of covariate effects have been proposed by Roy and Lin [1]; Proust-Lima, Letenneur and Jacqmin-Gadda [2]; and Gray and Brookmeyer [3] among others. Current methods for characterizing the relationship between covariates and the rate of change in multivariate outcomes are limited to select models. For example, Gray and Brookmeyer [3] introduce an “accelerated time” method which assumes that covariates rescale time in longitudinal models for disease progression. In this manuscript we detail an alternative multivariate model formulation that directly structures longitudinal rates of change, and that permits a common covariate effect across multiple outcomes. We detail maximum likelihood estimation for a multivariate longitudinal mixed model. We show via asymptotic calculations the potential gain in power that may be achieved with a common analysis of multiple outcomes. We apply the proposed methods to the analysis of a trivariate outcome for infant growth and compare rates of change for HIV infected and uninfected infants. PMID:27417129
Groundwater source contamination mechanisms: Physicochemical profile clustering, risk factor analysis and multivariate modelling

NASA Astrophysics Data System (ADS)

Hynds, Paul; Misstear, Bruce D.; Gill, Laurence W.; Murphy, Heather M.

2014-04-01

An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p = 0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p < 0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.
Mapping information exposure on social media to explain differences in HPV vaccine coverage in the United States.

PubMed

Dunn, Adam G; Surian, Didi; Leask, Julie; Dey, Aditi; Mandl, Kenneth D; Coiera, Enrico

2017-05-25

Together with access, acceptance of vaccines affects human papillomavirus (HPV) vaccine coverage, yet little is known about media's role. Our aim was to determine whether measures of information exposure derived from Twitter could be used to explain differences in coverage in the United States. We conducted an analysis of exposure to information about HPV vaccines on Twitter, derived from 273.8 million exposures to 258,418 tweets posted between 1 October 2013 and 30 October 2015. Tweets were classified by topic using machine learning methods. Proportional exposure to each topic was used to construct multivariable models for predicting state-level HPV vaccine coverage, and compared to multivariable models constructed using socioeconomic factors: poverty, education, and insurance. Outcome measures included correlations between coverage and the individual topics and socioeconomic factors; and differences in the predictive performance of the multivariable models. Topics corresponding to media controversies were most closely correlated with coverage (both positively and negatively); education and insurance were highest among socioeconomic indicators. Measures of information exposure explained 68% of the variance in one dose 2015 HPV vaccine coverage in females (males: 63%). In comparison, models based on socioeconomic factors explained 42% of the variance in females (males: 40%). Measures of information exposure derived from Twitter explained differences in coverage that were not explained by socioeconomic factors. Vaccine coverage was lower in states where safety concerns, misinformation, and conspiracies made up higher proportions of exposures, suggesting that negative representations of vaccines in the media may reflect or influence vaccine acceptance. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Physiology-Based Modeling May Predict Surgical Treatment Outcome for Obstructive Sleep Apnea

PubMed Central

Li, Yanru; Ye, Jingying; Han, Demin; Cao, Xin; Ding, Xiu; Zhang, Yuhuan; Xu, Wen; Orr, Jeremy; Jen, Rachel; Sands, Scott; Malhotra, Atul; Owens, Robert

2017-01-01

Study Objectives: To test whether the integration of both anatomical and nonanatomical parameters (ventilatory control, arousal threshold, muscle responsiveness) in a physiology-based model will improve the ability to predict outcomes after upper airway surgery for obstructive sleep apnea (OSA). Methods: In 31 patients who underwent upper airway surgery for OSA, loop gain and arousal threshold were calculated from preoperative polysomnography (PSG). Three models were compared: (1) a multiple regression based on an extensive list of PSG parameters alone; (2) a multivariate regression using PSG parameters plus PSG-derived estimates of loop gain, arousal threshold, and other trait surrogates; (3) a physiological model incorporating selected variables as surrogates of anatomical and nonanatomical traits important for OSA pathogenesis. Results: Although preoperative loop gain was positively correlated with postoperative apnea-hypopnea index (AHI) (P = .008) and arousal threshold was negatively correlated (P = .011), in both model 1 and 2, the only significant variable was preoperative AHI, which explained 42% of the variance in postoperative AHI. In contrast, the physiological model (model 3), which included AHIREM (anatomy term), fraction of events that were hypopnea (arousal term), the ratio of AHIREM and AHINREM (muscle responsiveness term), loop gain, and central/mixed apnea index (control of breathing terms), was able to explain 61% of the variance in postoperative AHI. Conclusions: Although loop gain and arousal threshold are associated with residual AHI after surgery, only preoperative AHI was predictive using multivariate regression modeling. Instead, incorporating selected surrogates of physiological traits on the basis of OSA pathophysiology created a model that has more association with actual residual AHI. Commentary: A commentary on this article appears in this issue on page 1023. 
Clinical Trial Registration: ClinicalTrials.Gov; Title: The Impact of Sleep Apnea Treatment on Physiology Traits in Chinese Patients With Obstructive Sleep Apnea; Identifier: NCT02696629; URL: https://clinicaltrials.gov/show/NCT02696629 Citation: Li Y, Ye J, Han D, Cao X, Ding X, Zhang Y, Xu W, Orr J, Jen R, Sands S, Malhotra A, Owens R. Physiology-based modeling may predict surgical treatment outcome for obstructive sleep apnea. J Clin Sleep Med. 2017;13(9):1029–1037. PMID:28818154
Mathematical Formulation of Multivariate Euclidean Models for Discrimination Methods.

ERIC Educational Resources Information Center

Mullen, Kenneth; Ennis, Daniel M.

1987-01-01

Multivariate models for the triangular and duo-trio methods are described, and theoretical methods are compared to a Monte Carlo simulation. Implications are discussed for a new theory of multidimensional scaling which challenges the traditional assumption that proximity measures and perceptual distances are monotonically related. (Author/GDC)
