multivariate models showed: Topics by Science.gov

Sample records for multivariate models showed

Analyzing Multiple Outcomes in Clinical Research Using Multivariate Multilevel Models

PubMed Central

Baldwin, Scott A.; Imel, Zac E.; Braithwaite, Scott R.; Atkins, David C.

2014-01-01

Objective: Multilevel models have become a standard data analysis approach in intervention research. Although the vast majority of intervention studies involve multiple outcome measures, few studies use multivariate analysis methods. The authors discuss multivariate extensions to the multilevel model that can be used by psychotherapy researchers. Method and Results: Using simulated longitudinal treatment data, the authors show how multivariate models extend common univariate growth models and how the multivariate model can be used to examine multivariate hypotheses involving fixed effects (e.g., does the size of the treatment effect differ across outcomes?) and random effects (e.g., is change in one outcome related to change in the other?). An online supplemental appendix provides annotated computer code and simulated example data for implementing a multivariate model. Conclusions: Multivariate multilevel models are flexible, powerful models that can enhance clinical research. PMID:24491071
A Comparison of Three Multivariate Models for Estimating Test Battery Reliability.

ERIC Educational Resources Information Center

Wood, Terry M.; Safrit, Margaret J.

1987-01-01

A comparison of three multivariate models (canonical reliability model, maximum generalizability model, canonical correlation model) for estimating test battery reliability indicated that the maximum generalizability model showed the least degree of bias, smallest errors in estimation, and the greatest relative efficiency across all experimental…
A simplified parsimonious higher order multivariate Markov chain model

NASA Astrophysics Data System (ADS)

Wang, Chao; Yang, Chuan-sheng

2017-09-01

In this paper, a simplified parsimonious higher-order multivariate Markov chain model (SPHOMMCM) is presented. Moreover, a parameter estimation method for TPHOMMCM is given. Numerical experiments show the effectiveness of TPHOMMCM.
Up-scaling of multi-variable flood loss models from objects to land use units at the meso-scale

NASA Astrophysics Data System (ADS)

Kreibich, Heidi; Schröter, Kai; Merz, Bruno

2016-05-01

Flood risk management increasingly relies on risk analyses, including loss modelling. Most of the flood loss models usually applied in standard practice have in common that complex damaging processes are described by simple approaches like stage-damage functions. Novel multi-variable models significantly improve loss estimation on the micro-scale and may also be advantageous for large-scale applications. However, more input parameters also reveal additional uncertainty, even more in upscaling procedures for meso-scale applications, where the parameters need to be estimated on a regional area-wide basis. To gain more knowledge about challenges associated with the up-scaling of multi-variable flood loss models the following approach is applied: Single- and multi-variable micro-scale flood loss models are up-scaled and applied on the meso-scale, namely on the basis of ATKIS land-use units. Application and validation are undertaken in 19 municipalities, which were affected during the 2002 flood by the River Mulde in Saxony, Germany, by comparison to official loss data provided by the Saxon Relief Bank (SAB). In the meso-scale case study based model validation, most multi-variable models show smaller errors than the uni-variable stage-damage functions. The results show the suitability of the up-scaling approach, and, in accordance with micro-scale validation studies, that multi-variable models are an improvement in flood loss modelling also on the meso-scale. However, uncertainties remain high, stressing the importance of uncertainty quantification. Thus, the development of probabilistic loss models, like BT-FLEMO used in this study, which inherently provide uncertainty information, is the way forward.
Predictive model for falling in Parkinson disease patients.

PubMed

Custodio, Nilton; Lira, David; Herrera-Perez, Eder; Montesinos, Rosa; Castro-Suarez, Sheila; Cuenca-Alfaro, Jose; Cortijo, Patricia

2016-12-01

Falls are a common complication of advancing Parkinson's disease (PD). Although numerous risk factors are known, reliable predictors of future falls are still lacking. The aim of this study was to develop a multivariate model to predict falling in PD patients. A prospective cohort of forty-nine PD patients was studied. The area under the receiver-operating characteristic curve (AUC) was calculated to evaluate the predictive performance of the proposed multivariate model. The median PD duration and UPDRS-III score in the cohort were 6 years and 24 points, respectively. Falls occurred in 18 PD patients (30%). Predictive factors for falling identified by univariate analysis were age, PD duration, physical activity, and scores of UPDRS motor, FOG, ACE, IFS, PFAQ and GDS (p-value < 0.001), as well as fear of falling score (p-value = 0.04). The final multivariate model (PD duration, FOG, ACE, and physical activity) showed an AUC = 0.9282 (correctly classified = 89.83%; sensitivity = 92.68%; specificity = 83.33%). This study showed that our multivariate model has high performance in predicting falling in a sample of PD patients.
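As a rough illustration of the performance measures quoted in this abstract (AUC, sensitivity, specificity, percentage correctly classified), the following sketch computes them for a small made-up set of predicted fall probabilities. The scores, labels, the 0.5 cut-off, and the function names are all hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_summary(scores, labels, threshold=0.5):
    """Sensitivity, specificity and accuracy at a fixed score threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }


# Hypothetical predicted probabilities of falling (1 = fell, 0 = did not fall).
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model_auc = auc(scores, labels)
summary = classification_summary(scores, labels)
print(model_auc, summary)
```

The pairwise definition of AUC is used here because it needs no external library; for real data a tested routine from a statistics package would normally be preferred.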
Multivariate spatial models of excess crash frequency at area level: case of Costa Rica.

PubMed

Aguero-Valverde, Jonathan

2013-10-01

Recently, areal models of crash frequency have been used in the analysis of various area-wide factors affecting road crashes. On the other hand, disease mapping methods are commonly used in epidemiology to assess the relative risk of the population at different spatial units. A natural next step is to combine these two approaches to estimate the excess crash frequency at area level as a measure of absolute crash risk. Furthermore, multivariate spatial models of crash severity are explored in order to account for both frequency and severity of crashes and control for the spatial correlation frequently found in crash data. This paper aims to extend the concept of safety performance functions to be used in areal models of crash frequency. A multivariate spatial model is used for that purpose and compared to its univariate counterpart. A full Bayes hierarchical approach is used to estimate the models of crash frequency at canton level for Costa Rica. An intrinsic multivariate conditional autoregressive model is used for modeling spatial random effects. The results show that the multivariate spatial model performs better than its univariate counterpart in terms of the penalized goodness-of-fit measure Deviance Information Criterion. Additionally, the effects of the spatial smoothing due to the multivariate spatial random effects are evident in the estimation of excess equivalent property damage only crashes. Copyright © 2013 Elsevier Ltd. All rights reserved.
Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy.

PubMed

Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L

2017-05-07

In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
Esophageal wall dose-surface maps do not improve the predictive performance of a multivariable NTCP model for acute esophageal toxicity in advanced stage NSCLC patients treated with intensity-modulated (chemo-)radiotherapy

NASA Astrophysics Data System (ADS)

Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.

2017-05-01

In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
Insights on multivariate updates of physical and biogeochemical ocean variables using an Ensemble Kalman Filter and an idealized model of upwelling

NASA Astrophysics Data System (ADS)

Yu, Liuqian; Fennel, Katja; Bertino, Laurent; Gharamti, Mohamad El; Thompson, Keith R.

2018-06-01

Effective data assimilation methods for incorporating observations into marine biogeochemical models are required to improve hindcasts, nowcasts and forecasts of the ocean's biogeochemical state. Recent assimilation efforts have shown that updating model physics alone can degrade biogeochemical fields while only updating biogeochemical variables may not improve a model's predictive skill when the physical fields are inaccurate. Here we systematically investigate whether multivariate updates of physical and biogeochemical model states are superior to only updating either physical or biogeochemical variables. We conducted a series of twin experiments in an idealized ocean channel that experiences wind-driven upwelling. The forecast model was forced with biased wind stress and perturbed biogeochemical model parameters compared to the model run representing the "truth". Taking advantage of the multivariate nature of the deterministic Ensemble Kalman Filter (DEnKF), we assimilated different combinations of synthetic physical (sea surface height, sea surface temperature and temperature profiles) and biogeochemical (surface chlorophyll and nitrate profiles) observations. We show that when biogeochemical and physical properties are highly correlated (e.g., thermocline and nutricline), multivariate updates of both are essential for improving model skill and can be accomplished by assimilating either physical (e.g., temperature profiles) or biogeochemical (e.g., nutrient profiles) observations. In our idealized domain, the improvement is largely due to a better representation of nutrient upwelling, which results in a more accurate nutrient input into the euphotic zone. In contrast, assimilating surface chlorophyll improves the model state only slightly, because surface chlorophyll contains little information about the vertical density structure. 
We also show that a degradation of the correlation between observed subsurface temperature and nutrient fields, which has been an issue in several previous assimilation studies, can be reduced by multivariate updates of physical and biogeochemical fields.
Multivariable model predictive control design of reactive distillation column for Dimethyl Ether production

NASA Astrophysics Data System (ADS)

Wahid, A.; Putra, I. G. E. P.

2018-03-01

Dimethyl ether (DME) as an alternative clean energy has attracted growing attention in recent years. DME production via reactive distillation has potential for capital cost and energy requirement savings. However, the combination of reaction and distillation in a single column makes the reactive distillation process a very complex multivariable system with high process non-linearity and strong interaction between process variables. This study investigates a multivariable model predictive control (MPC) based on a two-point temperature control strategy for the DME reactive distillation column to maintain the purities of both product streams. The process model is estimated by a first order plus dead time model. The DME and water purities are maintained by controlling a stage temperature in the rectifying and stripping sections, respectively. The results show that the model predictive controller responded faster than a conventional PI controller, as shown by its smaller integral of squared error (ISE) values. In addition, the MPC controller is able to handle the loop interactions well.
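To make the two technical terms in this abstract concrete, the sketch below evaluates the unit-step response of a first order plus dead time (FOPDT) model and computes an integral of squared error (ISE) against a setpoint. It is only an open-loop illustration: the gain, time constant, dead time, and function names are hypothetical, and neither the MPC nor the PI controller from the study is implemented.

```python
import math


def fopdt_step_response(t, gain, time_constant, dead_time):
    """Output of a FOPDT model at time t after a unit step at t = 0:
    zero during the dead time, then a first-order rise towards `gain`."""
    if t < dead_time:
        return 0.0
    return gain * (1.0 - math.exp(-(t - dead_time) / time_constant))


def integral_squared_error(errors, dt):
    """ISE = integral of e(t)^2 dt, approximated with the trapezoidal rule."""
    return sum(0.5 * (a * a + b * b) * dt for a, b in zip(errors, errors[1:]))


# Hypothetical model parameters (not from the paper).
gain, time_constant, dead_time = 1.0, 5.0, 2.0
dt, n_steps = 0.01, 10_000  # simulate 100 time units
times = [i * dt for i in range(n_steps + 1)]

setpoint = 1.0
errors = [setpoint - fopdt_step_response(t, gain, time_constant, dead_time)
          for t in times]
ise = integral_squared_error(errors, dt)

# For unit gain and a unit setpoint the exact value is
# dead_time + time_constant / 2.
print(ise)
```

A smaller ISE means the output spends less time far from the setpoint, which is the sense in which the abstract uses it to compare controllers.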
Studying Resist Stochastics with the Multivariate Poisson Propagation Model

DOE PAGES

Naulleau, Patrick; Anderson, Christopher; Chao, Weilun; ...

2014-01-01

Progress in the ultimate performance of extreme ultraviolet resist has arguably decelerated in recent years, suggesting an approach to stochastic limits both in photon counts and material parameters. Here we report on the performance of a variety of leading extreme ultraviolet resists both with and without chemical amplification. The measured performance is compared to stochastic modeling results using the Multivariate Poisson Propagation Model. The results show that the best materials are indeed nearing modeled performance limits.
Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

NASA Astrophysics Data System (ADS)

Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

2009-08-01

In this study, two techniques for modeling electricity consumption of the Jordanian industrial sector are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as a function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have a significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, a comparison based on the square root of the average squared error of the data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.
Multivariate optical computing using a digital micromirror device for fluorescence and Raman spectroscopy.

PubMed

Smith, Zachary J; Strombom, Sven; Wachsmann-Hogiu, Sebastian

2011-08-29

A multivariate optical computer has been constructed consisting of a spectrograph, digital micromirror device, and photomultiplier tube that is capable of determining absolute concentrations of individual components of a multivariate spectral model. We present experimental results on ternary mixtures, showing accurate quantification of chemical concentrations based on integrated intensities of fluorescence and Raman spectra measured with a single point detector. We additionally show in simulation that point measurements based on principal component spectra retain the ability to classify cancerous from noncancerous T cells.
Order-restricted inference for multivariate longitudinal data with applications to the natural history of hearing loss.

PubMed

Rosen, Sophia; Davidov, Ori

2012-07-20

Multivariate outcomes are often measured longitudinally. For example, in hearing loss studies, hearing thresholds for each subject are measured repeatedly over time at several frequencies. Thus, each patient is associated with a multivariate longitudinal outcome. The multivariate mixed-effects model is a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, it is known that hearing thresholds, at every frequency, increase with age. Moreover, this age-related threshold elevation is monotone in frequency, that is, the higher the frequency, the higher, on average, is the rate of threshold elevation. This means that there is a natural ordering among the different frequencies in the rate of hearing loss. In practice, this amounts to imposing a set of constraints on the different frequencies' regression coefficients modeling the mean effect of time and age at entry to the study on hearing thresholds. The aforementioned constraints should be accounted for in the analysis. The result is a multivariate longitudinal model with restricted parameters. We propose estimation and testing procedures for such models. We show that ignoring the constraints may lead to misleading inferences regarding the direction and the magnitude of various effects. Moreover, simulations show that incorporating the constraints substantially improves the mean squared error of the estimates and the power of the tests. We used this methodology to analyze a real hearing loss study. Copyright © 2012 John Wiley & Sons, Ltd.
Quantifying the impact of between-study heterogeneity in multivariate meta-analyses

PubMed Central

Jackson, Dan; White, Ian R; Riley, Richard D

2012-01-01

Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I^2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R^2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I^2, which we call I_R^2. We also provide a multivariate H^2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I^2 statistic, I_H^2. Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
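For orientation, the familiar univariate quantities that this abstract generalises can be sketched as follows: Cochran's Q, H^2 as Q divided by its degrees of freedom, and I^2 as (Q - df)/Q truncated at zero. The study effects and variances are made up for the example, the function name is hypothetical, and the multivariate statistics proposed in the paper are not implemented here.

```python
def univariate_heterogeneity(effects, variances):
    """Cochran's Q, H^2 = Q / df and I^2 = max(0, (Q - df) / Q) for a
    univariate meta-analysis with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    h2 = q / df
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, h2, i2


# Hypothetical study estimates and their within-study variances.
effects = [0.0, 1.0, 2.0]
variances = [0.25, 0.25, 0.25]

q, h2, i2 = univariate_heterogeneity(effects, variances)
print(q, h2, i2)
```

With these inputs the pooled estimate is 1.0, so Q = 8 on 2 degrees of freedom, H^2 = 4 and I^2 = 0.75, consistent with the identity I^2 = (H^2 - 1)/H^2.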
Applying the multivariate time-rescaling theorem to neural population models

PubMed Central

Gerhard, Felipe; Haslinger, Robert; Pipa, Gordon

2011-01-01

Statistical models of neural activity are integral to modern neuroscience. Recently, interest has grown in modeling the spiking activity of populations of simultaneously recorded neurons to study the effects of correlations and functional connectivity on neural information processing. However, any statistical model must be validated by an appropriate goodness-of-fit test. Kolmogorov-Smirnov tests based upon the time-rescaling theorem have proven to be useful for evaluating point-process-based statistical models of single-neuron spike trains. Here we discuss the extension of the time-rescaling theorem to the multivariate (neural population) case. We show that even in the presence of strong correlations between spike trains, models which neglect couplings between neurons can be erroneously passed by the univariate time-rescaling test. We present the multivariate version of the time-rescaling theorem, and provide a practical step-by-step procedure for applying it towards testing the sufficiency of neural population models. Using several simple analytically tractable models and also more complex simulated and real data sets, we demonstrate that important features of the population activity can only be detected using the multivariate extension of the test. PMID:21395436
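The univariate test that this abstract builds on can be sketched in a few lines. Under a correct model, integrating the model intensity between successive spikes yields unit-rate exponential intervals, which map to uniform values that can be checked with a Kolmogorov-Smirnov statistic. The example below assumes a single simulated spike train and a constant-rate model; the rates, seed, and function names are hypothetical, and the multivariate extension described in the paper is not implemented.

```python
import math
import random


def rescaled_uniforms(spike_times, model_rate):
    """Time-rescale inter-spike intervals under a constant-rate model.
    tau = model_rate * ISI is Exp(1) if the model is correct, so
    z = 1 - exp(-tau) should then be uniform on [0, 1]."""
    zs, previous = [], 0.0
    for t in spike_times:
        tau = model_rate * (t - previous)
        zs.append(1.0 - math.exp(-tau))
        previous = t
    return zs


def ks_statistic_uniform(zs):
    """Largest gap between the empirical CDF of zs and the uniform CDF."""
    zs = sorted(zs)
    n = len(zs)
    return max(max((i + 1) / n - z, z - i / n) for i, z in enumerate(zs))


# Simulate a homogeneous Poisson spike train (hypothetical rate and seed).
rng = random.Random(0)
true_rate = 10.0
spikes, t = [], 0.0
for _ in range(2000):
    t += rng.expovariate(true_rate)
    spikes.append(t)

ks_correct = ks_statistic_uniform(rescaled_uniforms(spikes, true_rate))
ks_wrong = ks_statistic_uniform(rescaled_uniforms(spikes, 2.0 * true_rate))
print(ks_correct, ks_wrong)
```

The correctly specified rate should give a KS statistic near zero, while the doubled rate should give a clearly larger one; an approximate 95% rejection bound for n spikes is 1.36 / sqrt(n).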
Multivariate analysis of longitudinal rates of change.

PubMed

Bryan, Matthew; Heagerty, Patrick J

2016-12-10

Longitudinal data allow direct comparison of the change in patient outcomes associated with treatment or exposure. Frequently, several longitudinal measures are collected that either reflect a common underlying health status, or characterize processes that are influenced in a similar way by covariates such as exposure or demographic characteristics. Statistical methods that can combine multivariate response variables into common measures of covariate effects have been proposed in the literature. Current methods for characterizing the relationship between covariates and the rate of change in multivariate outcomes are limited to select models. For example, 'accelerated time' methods have been developed which assume that covariates rescale time in longitudinal models for disease progression. In this manuscript, we detail an alternative multivariate model formulation that directly structures longitudinal rates of change and that permits a common covariate effect across multiple outcomes. We detail maximum likelihood estimation for a multivariate longitudinal mixed model. We show via asymptotic calculations the potential gain in power that may be achieved with a common analysis of multiple outcomes. We apply the proposed methods to the analysis of a trivariate outcome for infant growth and compare rates of change for HIV infected and uninfected infants. Copyright © 2016 John Wiley & Sons, Ltd.
Voxelwise multivariate analysis of multimodality magnetic resonance imaging

PubMed Central

Naylor, Melissa G.; Cardenas, Valerie A.; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin

2015-01-01

Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome, and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remain a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. PMID:23408378
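The Bonferroni adjustment mentioned for the multiple-univariate approach can be sketched as follows. Each of the m p-values (for example, one per MRI modality at a given voxel) is multiplied by m and capped at 1 before comparison with the significance level. The p-values, the 0.05 level, and the function name are hypothetical; the voxelwise correction over the whole brain used in the study is not reproduced.

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni-adjusted p-values and reject/retain decisions."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    rejected = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, rejected


# Hypothetical p-values for three outcomes tested at the same voxel.
p_values = [0.01, 0.02, 0.04]
adjusted, rejected = bonferroni(p_values)
print(adjusted, rejected)
```

Here only the first outcome remains significant after adjustment, which illustrates why the correction can cost power when outcomes are correlated.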
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

NASA Astrophysics Data System (ADS)

Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

2016-03-01

Methods for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals, using computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks, have been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

PubMed Central

Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

2016-01-01

Methods for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals, using computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks, have been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method is able to obtain efficient correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide certain indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel one for efficiently estimating human affective states. PMID:26996254

A comparison of bivariate, multivariate random-effects, and Poisson correlated gamma-frailty models to meta-analyze individual patient data of ordinal scale diagnostic tests.

PubMed

Simoneau, Gabrielle; Levis, Brooke; Cuijpers, Pim; Ioannidis, John P A; Patten, Scott B; Shrier, Ian; Bombardier, Charles H; de Lima Osório, Flavia; Fann, Jesse R; Gjerdingen, Dwenda; Lamers, Femke; Lotrakul, Manote; Löwe, Bernd; Shaaban, Juwita; Stafford, Lesley; van Weert, Henk C P M; Whooley, Mary A; Wittkampf, Karin A; Yeung, Albert S; Thombs, Brett D; Benedetti, Andrea

2017-11-01

Individual patient data (IPD) meta-analyses are increasingly common in the literature. In the context of estimating the diagnostic accuracy of ordinal or semi-continuous scale tests, sensitivity and specificity are often reported for a given threshold or a small set of thresholds, and a meta-analysis is conducted via a bivariate approach to account for their correlation. When IPD are available, sensitivity and specificity can be pooled for every possible threshold. Our objective was to compare the bivariate approach, which can be applied separately at every threshold, to two multivariate methods: the ordinal multivariate random-effects model and the Poisson correlated gamma-frailty model. Our comparison was empirical, using IPD from 13 studies that evaluated the diagnostic accuracy of the 9-item Patient Health Questionnaire depression screening tool, and included simulations. The empirical comparison showed that the implementation of the two multivariate methods is more laborious in terms of computational time and sensitivity to user-supplied values compared to the bivariate approach. Simulations showed that ignoring the within-study correlation of sensitivity and specificity across thresholds did not worsen inferences with the bivariate approach compared to the Poisson model. The ordinal approach was not suitable for simulations because the model was highly sensitive to user-supplied starting values. We tentatively recommend the bivariate approach rather than more complex multivariate methods for IPD diagnostic accuracy meta-analyses of ordinal scale tests, although the limited type of diagnostic data considered in the simulation study restricts the generalization of our findings. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Applying Multivariate Discrete Distributions to Genetically Informative Count Data.

PubMed

Kirkpatrick, Robert M; Neale, Michael C

2016-03-01

We present a novel method of conducting biometric analysis of twin data when the phenotypes are integer-valued counts, which often show an L-shaped distribution. Monte Carlo simulation is used to compare five likelihood-based approaches to modeling: our multivariate discrete method, when its distributional assumptions are correct, when they are incorrect, and three other methods in common use. With data simulated from a skewed discrete distribution, recovery of twin correlations and proportions of additive genetic and common environment variance was generally poor for the Normal, Lognormal and Ordinal models, but good for the two discrete models. Sex-separate applications to substance-use data from twins in the Minnesota Twin Family Study showed superior performance of two discrete models. The new methods are implemented using R and OpenMx and are freely available.
Linear models of coregionalization for multivariate lattice data: Order-dependent and order-free cMCARs.

PubMed

MacNab, Ying C

2016-08-01

This paper concerns multivariate conditional autoregressive models defined by linear combinations of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressives is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing the deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on a spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.
Multivariate modelling of endophenotypes associated with the metabolic syndrome in Chinese twins.

PubMed

Pang, Z; Zhang, D; Li, S; Duan, H; Hjelmborg, J; Kruse, T A; Kyvik, K O; Christensen, K; Tan, Q

2010-12-01

The common genetic and environmental effects on endophenotypes related to the metabolic syndrome have been investigated using bivariate and multivariate twin models. This paper extends the pairwise analysis approach by introducing independent and common pathway models to Chinese twin data. The aim was to explore the common genetic architecture in the development of these phenotypes in the Chinese population. Three multivariate models including the full saturated Cholesky decomposition model, the common factor independent pathway model and the common factor common pathway model were fitted to 695 pairs of Chinese twins representing six phenotypes including BMI, total cholesterol, total triacylglycerol, fasting glucose, HDL and LDL. Performances of the nested models were compared with that of the full Cholesky model. Cross-phenotype correlation coefficients gave clear indication of common genetic or environmental backgrounds in the phenotypes. Decomposition of phenotypic correlation by the Cholesky model revealed that the observed phenotypic correlation among lipid phenotypes had genetic and unique environmental backgrounds. Both pathway models suggest a common genetic architecture for lipid phenotypes, which is distinct from that of the non-lipid phenotypes. The declining performance with model restriction indicates biological heterogeneity in development among some of these phenotypes. Our multivariate analyses revealed common genetic and environmental backgrounds for the studied lipid phenotypes in Chinese twins. Model performance showed that physiologically distinct endophenotypes may follow different genetic regulations.
An assessment on the use of bivariate, multivariate and soft computing techniques for collapse susceptibility in GIS environ

NASA Astrophysics Data System (ADS)

Yilmaz, Işik; Marschalko, Marian; Bednarik, Martin

2013-04-01

This paper compares and discusses the use of bivariate, multivariate and soft computing techniques for collapse susceptibility modelling. Conditional probability (CP), logistic regression (LR) and artificial neural network (ANN) models, representing the bivariate, multivariate and soft computing techniques respectively, were used in GIS-based collapse susceptibility mapping in an area of the Sivas basin (Turkey). Collapse-related factors, directly or indirectly related to the causes of collapse occurrence, such as distance from faults, slope angle and aspect, topographical elevation, distance from drainage, topographic wetness index (TWI), stream power index (SPI), Normalized Difference Vegetation Index (NDVI) by means of vegetation cover, and distance from roads and settlements, were used in the collapse susceptibility analyses. In the last stage of the analyses, collapse susceptibility maps were produced from the models, and they were then compared by means of their validations. Although the Area Under Curve (AUC) values obtained from the three models showed that the map produced by the soft computing (ANN) model appears more accurate than the others, the accuracies of all three models can be considered relatively similar. The results also showed that conditional probability is an essential method for preparing collapse susceptibility maps and is highly compatible with GIS operating features.
Multivariate Time Series Decomposition into Oscillation Components.

PubMed

Matsuda, Takeru; Komaki, Fumiyasu

2017-08-01

Many time series are considered to be a superposition of several oscillation components. We have proposed a method for decomposing univariate time series into oscillation components and estimating their phases (Matsuda & Komaki, 2017). In this study, we extend that method to multivariate time series. We assume that several oscillators underlie the given multivariate time series and that each variable corresponds to a superposition of the projections of the oscillators. Thus, the oscillators superpose on each variable with amplitude and phase modulation. Based on this idea, we develop Gaussian linear state-space models and use them to decompose the given multivariate time series. The model parameters are estimated from data using the empirical Bayes method, and the number of oscillators is determined using the Akaike information criterion. Therefore, the proposed method extracts underlying oscillators in a data-driven manner and enables investigation of phase dynamics in a given multivariate time series. Numerical results show the effectiveness of the proposed method. From monthly mean north-south sunspot number data, the proposed method reveals an interesting phase relationship.
Multivariate meta-analysis using individual participant data

PubMed Central

Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.

2016-01-01

When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484
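The bootstrap idea mentioned in this abstract can be sketched as follows for two binary outcomes measured on the same patients in a single trial: resample patients with replacement within each arm, recompute both treatment-effect estimates, and take the correlation of the two estimates across replicates. The function names, the use of log odds ratios with a 0.5 continuity correction, and the toy data are assumptions for this example and are not taken from the authors' software code.

```python
import math
import random


def log_odds_ratio(treated, control):
    """Log odds ratio of a 0/1 outcome, adding 0.5 to every cell."""
    a = sum(treated) + 0.5
    b = len(treated) - sum(treated) + 0.5
    c = sum(control) + 0.5
    d = len(control) - sum(control) + 0.5
    return math.log(a * d / (b * c))


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_within_study_corr(treated, control, n_boot, rng):
    """Each patient is an (outcome1, outcome2) pair of 0/1 values."""
    est1, est2 = [], []
    for _ in range(n_boot):
        # Resample whole patients so the pairing of outcomes is preserved.
        t = [rng.choice(treated) for _ in treated]
        c = [rng.choice(control) for _ in control]
        est1.append(log_odds_ratio([p[0] for p in t], [p[0] for p in c]))
        est2.append(log_odds_ratio([p[1] for p in t], [p[1] for p in c]))
    return pearson_r(est1, est2)


# Hypothetical trial: 40 patients per arm, two binary outcomes each.
rng = random.Random(0)
treated = [(1, 1)] * 12 + [(1, 0)] * 6 + [(0, 1)] * 4 + [(0, 0)] * 18
control = [(1, 1)] * 20 + [(1, 0)] * 5 + [(0, 1)] * 5 + [(0, 0)] * 10
rho = bootstrap_within_study_corr(treated, control, n_boot=500, rng=rng)
print(round(rho, 3))
```

Resampling whole patients, rather than each outcome separately, is what lets the replicates carry the within-study dependence between the two estimates.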
Modelling world gold prices and USD foreign exchange relationship using multivariate GARCH model

NASA Astrophysics Data System (ADS)

Ping, Pung Yean; Ahmad, Maizah Hura Binti

2014-12-01

Gold is a popular investment commodity, and its price series has often been modeled using univariate models. The objective of this paper is to show that there is a co-movement between the gold price and the USD foreign exchange rate. Using the effect of the USD foreign exchange rate on the gold price, a model that can be used to forecast future gold prices is developed. For this purpose, the paper proposes a multivariate (bivariate) GARCH model. Using daily prices of both series from 01.01.2000 to 05.05.2014, a causal relation between the two series under study is found and a bivariate GARCH model is produced.
Voxelwise multivariate analysis of multimodality magnetic resonance imaging.

PubMed

Naylor, Melissa G; Cardenas, Valerie A; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin

2014-03-01

Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remain a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. Copyright © 2013 Wiley Periodicals, Inc.
Multivariate Analysis of Longitudinal Rates of Change

PubMed Central

Bryan, Matthew; Heagerty, Patrick J.

2016-01-01

Longitudinal data allow direct comparison of the change in patient outcomes associated with treatment or exposure. Frequently, several longitudinal measures are collected that either reflect a common underlying health status, or characterize processes that are influenced in a similar way by covariates such as exposure or demographic characteristics. Statistical methods that can combine multivariate response variables into common measures of covariate effects have been proposed by Roy and Lin [1]; Proust-Lima, Letenneur and Jacqmin-Gadda [2]; and Gray and Brookmeyer [3] among others. Current methods for characterizing the relationship between covariates and the rate of change in multivariate outcomes are limited to select models. For example, Gray and Brookmeyer [3] introduce an “accelerated time” method which assumes that covariates rescale time in longitudinal models for disease progression. In this manuscript we detail an alternative multivariate model formulation that directly structures longitudinal rates of change, and that permits a common covariate effect across multiple outcomes. We detail maximum likelihood estimation for a multivariate longitudinal mixed model. We show via asymptotic calculations the potential gain in power that may be achieved with a common analysis of multiple outcomes. We apply the proposed methods to the analysis of a trivariate outcome for infant growth and compare rates of change for HIV infected and uninfected infants. PMID:27417129
NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES

PubMed Central

He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.

2017-01-01

Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225
Robustness of reduced-order multivariable state-space self-tuning controller

NASA Technical Reports Server (NTRS)

Yuan, Zhuzhi; Chen, Zengqiang

1994-01-01

In this paper, we present a quantitative analysis of the robustness of a reduced-order pole-assignment state-space self-tuning controller for a multivariable adaptive control system in which the order of the real process is higher than that of the model used in the controller design. The stability analysis shows that, under a specific bounded modelling error, the real closed-loop system adaptively controlled by the reduced-order state-space self-tuner is BIBO stable in the presence of unmodelled dynamics.
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China.

PubMed

Pei, Ling-Ling; Li, Qin; Wang, Zheng-Xin

2018-03-08

The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China's pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N)) model based on the nonlinear least square (NLS) method. The Gauss-Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N) and the NLS-based TNGM (1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO₂ and dust, alongside GDP per capita in China during the period 1996-2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N) model presents greater precision when forecasting WDPC, SO₂ emissions and dust emissions per capita, compared to the traditional GM (1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO₂ and dust reduce accordingly.
Efficient Global Aerodynamic Modeling from Flight Data

NASA Technical Reports Server (NTRS)

Morelli, Eugene A.

2012-01-01

A method for identifying global aerodynamic models from flight data in an efficient manner is explained and demonstrated. A novel experiment design technique was used to obtain dynamic flight data over a range of flight conditions with a single flight maneuver. Multivariate polynomials and polynomial splines were used with orthogonalization techniques and statistical modeling metrics to synthesize global nonlinear aerodynamic models directly and completely from flight data alone. Simulation data and flight data from a subscale twin-engine jet transport aircraft were used to demonstrate the techniques. Results showed that global multivariate nonlinear aerodynamic dependencies could be accurately identified using flight data from a single maneuver. Flight-derived global aerodynamic model structures, model parameter estimates, and associated uncertainties were provided for all six nondimensional force and moment coefficients for the test aircraft. These models were combined with a propulsion model identified from engine ground test data to produce a high-fidelity nonlinear flight simulation very efficiently. Prediction testing using a multi-axis maneuver showed that the identified global model accurately predicted aircraft responses.
User Selection Criteria of Airspace Designs in Flexible Airspace Management

NASA Technical Reports Server (NTRS)

Lee, Hwasoo E.; Lee, Paul U.; Jung, Jaewoo; Lai, Chok Fung

2011-01-01

Multivariate random-parameters zero-inflated negative binomial regression model: an application to estimate crash frequencies at intersections.

PubMed

Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan

2014-09-01

Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help focus on high-risk situations and come up with safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is that the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that the MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.
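To make the "excess zero counts" idea concrete, here is a minimal sketch of the univariate zero-inflated negative binomial probability mass function, which mixes a point mass at zero with an ordinary negative binomial count. The parameter names (`pi` for the zero-inflation probability, `mu` for the negative binomial mean, `r` for its dispersion) and the numeric values are assumptions for this example; the multivariate random-parameters structure and the Bayesian estimation described in the abstract are not reproduced.

```python
import math


def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu and dispersion r (variance mu + mu**2 / r)."""
    log_p = (
        math.lgamma(y + r) - math.lgamma(y + 1) - math.lgamma(r)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi, mu, r):
    """Zero-inflated NB: extra zero with probability pi, NB count otherwise."""
    base = (1.0 - pi) * nb_pmf(y, mu, r)
    return pi + base if y == 0 else base


# Hypothetical parameter values, chosen only for illustration.
p_zero_nb = nb_pmf(0, mu=2.0, r=1.5)
p_zero_zinb = zinb_pmf(0, pi=0.3, mu=2.0, r=1.5)
total = sum(zinb_pmf(y, pi=0.3, mu=2.0, r=1.5) for y in range(500))
print(p_zero_nb, p_zero_zinb, total)
```

Comparing the two zero probabilities shows how the inflation term raises the share of crash-free observations without changing the shape of the positive counts.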
Determination of rice syrup adulterant concentration in honey using three-dimensional fluorescence spectra and multivariate calibrations

NASA Astrophysics Data System (ADS)

Chen, Quansheng; Qi, Shuai; Li, Huanhuan; Han, Xiaoyan; Ouyang, Qin; Zhao, Jiewen

2014-10-01

To rapidly and efficiently detect the presence of adulterants in honey, the three-dimensional fluorescence spectroscopy (3DFS) technique was employed with the help of multivariate calibration. The data of 3D fluorescence spectra were compressed using characteristic extraction and principal component analysis (PCA). Then, partial least squares (PLS) and back propagation neural network (BP-ANN) algorithms were used for modeling. The model was optimized by cross validation, and its performance was evaluated according to the root mean square error of prediction (RMSEP) and the correlation coefficient (R) in the prediction set. The results showed that the BP-ANN model was superior to the PLS models, and the optimum prediction results of the mixed group (sunflower ± longan ± buckwheat ± rape) model were achieved as follows: RMSEP = 0.0235 and R = 0.9787 in the prediction set. The study demonstrated that the 3D fluorescence spectroscopy technique combined with multivariate calibration has high potential in rapid, nondestructive, and accurate quantitative analysis of honey adulteration.
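The two evaluation metrics named in this abstract can be computed as in the short sketch below. The function names and the adulterant concentrations are hypothetical, and the calculation is the standard textbook definition of each metric rather than the authors' own code.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between reference and predicted values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical reference and predicted adulterant fractions.
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.12, 0.18, 0.33, 0.39]
print(rmsep(y_true, y_pred), pearson_r(y_true, y_pred))
```

A lower RMSEP and an R closer to 1 on the held-out prediction set both indicate better calibration, which is the sense in which the abstract compares the BP-ANN and PLS models.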
SMURC: High-Dimension Small-Sample Multivariate Regression With Covariance Estimation.

PubMed

Bayar, Belhassen; Bouaynaya, Nidhal; Shterenberg, Roman

2017-03-01

We consider a high-dimension low sample-size multivariate regression problem that accounts for correlation of the response variables. The system is underdetermined as there are more parameters than samples. We show that the maximum likelihood approach with covariance estimation is senseless because the likelihood diverges. We subsequently propose a normalization of the likelihood function that guarantees convergence. We call this method small-sample multivariate regression with covariance (SMURC) estimation. We derive an optimization problem and its convex approximation to compute SMURC. Simulation results show that the proposed algorithm outperforms the regularized likelihood estimator with known covariance matrix and the sparse conditional Gaussian graphical model. We also apply SMURC to the inference of the wing-muscle gene network of the Drosophila melanogaster (fruit fly).
Identifying pleiotropic genes in genome-wide association studies from related subjects using the linear mixed model and Fisher combination function.

PubMed

Yang, James J; Williams, L Keoki; Buu, Anne

2017-08-24

A multivariate genome-wide association test is proposed for analyzing data on multivariate quantitative phenotypes collected from related subjects. The proposed method is a two-step approach. The first step models the association between the genotype and marginal phenotype using a linear mixed model. The second step uses the correlation between residuals of the linear mixed model to estimate the null distribution of the Fisher combination test statistic. The simulation results show that the proposed method controls the type I error rate and is more powerful than the marginal tests across different population structures (admixed or non-admixed) and relatedness (related or independent). The statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that applying the multivariate association test may facilitate identification of the pleiotropic genes contributing to the risk for alcohol dependence commonly expressed by four correlated phenotypes. This study proposes a multivariate method for identifying pleiotropic genes while adjusting for cryptic relatedness and population structure between subjects. The two-step approach is not only powerful but also computationally efficient even when the number of subjects and the number of phenotypes are both very large.
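For orientation, the Fisher combination statistic for k marginal p-values is T = -2 Σ ln p_i. The sketch below computes T and, purely as a simplifying assumption, refers it to a chi-square distribution with 2k degrees of freedom, which is valid only when the marginal tests are independent. The abstract's second step instead estimates the null distribution from the correlation between residuals; that step is not reproduced here, and the function names and p-values are hypothetical.

```python
import math


def fisher_statistic(p_values):
    """Fisher combination statistic T = -2 * sum(ln p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, k):
    """P(X > x) for a chi-square variable with 2k degrees of freedom.

    Closed form for even degrees of freedom:
    exp(-x/2) * sum_{j=0}^{k-1} (x/2)**j / j!
    """
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))


def fisher_combined_p(p_values):
    """Combined p-value under the (assumed) independence null."""
    return chi2_sf_even_df(fisher_statistic(p_values), len(p_values))


# Hypothetical marginal p-values for four correlated phenotypes.
marginal = [0.04, 0.20, 0.01, 0.15]
print(fisher_statistic(marginal), fisher_combined_p(marginal))
```

With correlated phenotypes the independence reference distribution is generally miscalibrated, which is the reason an empirically estimated null is needed in the setting the abstract describes.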
The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference: an application to longitudinal modeling.

PubMed

Heggeseth, Brianna C; Jewell, Nicholas P

2013-07-20

Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence, that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate that the model is misspecified. Body mass index data from a national longitudinal study are used to demonstrate the effects of misspecification on potential inferences made in practice. Copyright © 2013 John Wiley & Sons, Ltd.

Multivariate meta-analysis using individual participant data.

PubMed

Riley, R D; Price, M J; Jackson, D; Wardle, M; Gueyffier, F; Wang, J; Staessen, J A; White, I R

2015-06-01

When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment-covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. © 2014 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.
Bayesian inference on risk differences: an application to multivariate meta-analysis of adverse events in clinical trials.

PubMed

Chen, Yong; Luo, Sheng; Chu, Haitao; Wei, Peng

2013-05-01

Multivariate meta-analysis is useful in combining evidence from independent studies which involve several comparisons among groups based on a single outcome. For binary outcomes, the commonly used statistical models for multivariate meta-analysis are multivariate generalized linear mixed effects models which assume risks, after some transformation, follow a multivariate normal distribution with possible correlations. In this article, we consider an alternative model for multivariate meta-analysis where the risks are modeled by the multivariate beta distribution proposed by Sarmanov (1966). This model has several attractive features compared to the conventional multivariate generalized linear mixed effects models, including simplicity of the likelihood function, no need to specify a link function, and a closed-form expression of distribution functions for study-specific risk differences. We investigate the finite sample performance of this model by simulation studies and illustrate its use with an application to multivariate meta-analysis of adverse events of tricyclic antidepressants treatment in clinical trials.
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China

PubMed Central

Pei, Ling-Ling; Li, Qin

2018-01-01

The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China’s pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N)) model based on the nonlinear least square (NLS) method. The Gauss–Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N) and the NLS-based TNGM (1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO2 and dust, alongside GDP per capita in China during the period 1996–2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N) model presents greater precision when forecasting WDPC, SO2 emissions and dust emissions per capita, compared to the traditional GM (1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO2 and dust reduce accordingly. PMID:29517985
Modeling and Simulation of Upset-Inducing Disturbances for Digital Systems in an Electromagnetic Reverberation Chamber

NASA Technical Reports Server (NTRS)

Torres-Pomales, Wilfredo

2014-01-01

This report describes a modeling and simulation approach for disturbance patterns representative of the environment experienced by a digital system in an electromagnetic reverberation chamber. The disturbance is modeled by a multi-variate statistical distribution based on empirical observations. Extended versions of the Rejection Sampling and Inverse Transform Sampling techniques are developed to generate multi-variate random samples of the disturbance. The results show that Inverse Transform Sampling returns samples with higher fidelity relative to the empirical distribution. This work is part of an ongoing effort to develop a resilience assessment methodology for complex safety-critical distributed systems.
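The two sampling techniques named in this abstract can be illustrated in their basic univariate form, drawing from a small discrete empirical distribution. This is a textbook sketch under stated assumptions: the category labels, probabilities, uniform proposal and function names are all hypothetical, and the extended multi-variate versions developed in the report are not reproduced.

```python
import bisect
import itertools
import random
from collections import Counter


def inverse_transform_sample(values, probs, rng):
    """Draw one value by inverting the cumulative distribution."""
    cdf = list(itertools.accumulate(probs))
    u = rng.random() * cdf[-1]
    idx = bisect.bisect_right(cdf, u)
    # Guard against floating-point rounding at the upper edge.
    return values[min(idx, len(values) - 1)]


def rejection_sample(values, probs, rng):
    """Draw one value using a uniform proposal over the categories."""
    p_max = max(probs)
    while True:
        i = rng.randrange(len(values))
        # Accept candidate i with probability probs[i] / p_max.
        if rng.random() * p_max < probs[i]:
            return values[i]


# Hypothetical empirical distribution of a disturbance level.
rng = random.Random(42)
values = ["low", "medium", "high"]
probs = [0.2, 0.5, 0.3]
n = 20000

freq_inverse = Counter(inverse_transform_sample(values, probs, rng) for _ in range(n))
freq_reject = Counter(rejection_sample(values, probs, rng) for _ in range(n))
for v in values:
    print(v, freq_inverse[v] / n, freq_reject[v] / n)
```

Both samplers target the same distribution; rejection sampling discards some candidates, whereas inverse transform sampling uses every uniform draw.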
Kernel canonical-correlation Granger causality for multiple time series

NASA Astrophysics Data System (ADS)

Wu, Guorong; Duan, Xujun; Liao, Wei; Gao, Qing; Chen, Huafu

2011-04-01

Canonical-correlation analysis as a multivariate statistical technique has been applied to multivariate Granger causality analysis to infer information flow in complex systems. It shows unique appeal and great superiority over the traditional vector autoregressive method, due to the simplified procedure that detects causal interaction between multiple time series, and the avoidance of potential model estimation problems. However, it is limited to the linear case. Here, we extend the framework of canonical correlation to include the estimation of multivariate nonlinear Granger causality for drawing inference about directed interaction. Its feasibility and effectiveness are verified on simulated data.
A new multivariate zero-adjusted Poisson model with applications to biomedicine.

PubMed

Liu, Yin; Tian, Guo-Liang; Tang, Man-Lai; Yuen, Kam Chuen

2018-05-25

Although advances have recently been made in modeling multivariate count data, existing models still have several limitations: (i) The multivariate Poisson log-normal model (Aitchison and Ho, ) cannot be used to fit multivariate count data with excess zero-vectors; (ii) The multivariate zero-inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero-truncated/deflated count data and it is difficult to apply to high-dimensional cases; (iii) The Type I multivariate zero-adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure for random components that are all positive or negative. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components; that is, some of the correlation coefficients can be positive while others can be negative. We then develop its important distributional properties, and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
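To make the terms zero-inflated, zero-deflated and zero-truncated concrete, here is a minimal sketch of the standard univariate zero-adjusted Poisson probability mass function. It is not the multivariate distribution proposed in the paper; the parameter names `phi` (probability of a zero) and `lam` (Poisson rate) are assumptions chosen for illustration.

```python
import math


def zap_pmf(x, phi, lam):
    """Univariate zero-adjusted Poisson pmf.

    phi is the probability of observing zero; positive counts follow a
    zero-truncated Poisson(lam) scaled by (1 - phi).
    """
    if x == 0:
        return phi
    poisson = math.exp(-lam) * lam ** x / math.factorial(x)
    truncated = poisson / (1.0 - math.exp(-lam))
    return (1.0 - phi) * truncated


# phi > exp(-lam): zero-inflated; phi < exp(-lam): zero-deflated;
# phi == 0: zero-truncated; phi == exp(-lam): ordinary Poisson.
total = sum(zap_pmf(x, 0.3, 2.0) for x in range(60))
```

Setting `phi` equal to `exp(-lam)` recovers the ordinary Poisson distribution, which is a quick sanity check on the parameterization.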
Predicting crash frequency for multi-vehicle collision types using multivariate Poisson-lognormal spatial model: A comparative analysis.

PubMed

Hosseinpour, Mehdi; Sahebi, Sina; Zamzuri, Zamira Hasanah; Yahaya, Ahmad Shukri; Ismail, Noriszura

2018-06-01

According to crash configuration and pre-crash conditions, traffic crashes are classified into different collision types. Based on the literature, multi-vehicle crashes, such as head-on, rear-end, and angle crashes, are more frequent than single-vehicle crashes, and most often result in serious consequences. From a methodological point of view, the majority of prior studies focused on multi-vehicle collisions have employed univariate count models to estimate crash counts separately by collision type. However, univariate models fail to account for correlations which may exist between different collision types. Among others, the multivariate Poisson lognormal (MVPLN) model with spatial correlation is a promising multivariate specification because it allows not only for unobserved heterogeneity (extra-Poisson variation) and dependencies between collision types, but also for spatial correlation between adjacent sites. However, the MVPLN spatial model has rarely been applied in previous research for simultaneously modelling crash counts by collision type. Therefore, this study aims at utilizing an MVPLN spatial model to estimate crash counts for four different multi-vehicle collision types, including head-on, rear-end, angle, and sideswipe collisions. To investigate the performance of the MVPLN spatial model, a two-stage model and a univariate Poisson lognormal (UNPLN) spatial model were also developed in this study. Detailed information on roadway characteristics, traffic volume, and crash history was collected on 407 homogeneous segments from Malaysian federal roads. The results indicate that the MVPLN spatial model outperforms the other models compared in terms of goodness-of-fit measures. The results also show that the inclusion of spatial heterogeneity in the multivariate model significantly improves the model fit, as indicated by the Deviance Information Criterion (DIC).
The correlation between crash types is high and positive, implying that the occurrence of a specific collision type is highly associated with the occurrence of other crash types on the same road segment. These results support the utilization of the MVPLN spatial model when predicting crash counts by collision manner. In terms of contributing factors, the results show that distinct crash types are attributed to different subsets of explanatory variables. Copyright © 2018 Elsevier Ltd. All rights reserved.
Methodological challenges to multivariate syndromic surveillance: a case study using Swiss animal health data.

PubMed

Vial, Flavie; Wei, Wei; Held, Leonhard

2016-12-20

In an era of ubiquitous electronic collection of animal health data, multivariate surveillance systems (which concurrently monitor several data streams) should have a greater probability of detecting disease events than univariate systems. However, despite their limitations, univariate aberration detection algorithms are used in most active syndromic surveillance (SyS) systems because of their ease of application and interpretation. On the other hand, a stochastic modelling-based approach to multivariate surveillance offers more flexibility, allowing for the retention of historical outbreaks, for overdispersion and for non-stationarity. While such methods are not new, they are yet to be applied to animal health surveillance data. We applied one such stochastic model, Held and colleagues' two-component model, to two multivariate animal health datasets from Switzerland. In our first application, multivariate time series of the number of laboratory test requests were derived from Swiss animal diagnostic laboratories. We compared the performance of the two-component model to parallel monitoring using an improved Farrington algorithm and found that both methods yield a satisfactorily low false alarm rate. However, the calibration test of the two-component model on the one-step-ahead predictions proved satisfactory, making such an approach suitable for outbreak prediction. In our second application, the two-component model was applied to the multivariate time series of the number of cattle abortions and the number of test requests for bovine viral diarrhea (a disease that often results in abortions). We found that there is a two-day lagged effect from the number of abortions to the number of test requests. We further compared the joint modelling and univariate modelling of the number of laboratory test requests time series. The joint modelling approach showed evidence of superiority in terms of forecasting abilities.
Stochastic modelling approaches offer the potential to address more realistic surveillance scenarios through, for example, the inclusion of time-series-specific parameters, or of covariates known to have an impact on syndrome counts. Nevertheless, many methodological challenges to multivariate surveillance of animal SyS data still remain. Deciding on the amount of corroboration among data streams that is required to escalate into an alert is not a trivial task given the sparse data on the events under consideration (e.g. disease outbreaks).
Membrane Introduction Mass Spectrometry Combined with an Orthogonal Partial-Least Squares Calibration Model for Mixture Analysis.

PubMed

Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu

2017-01-01

The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
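The calibration figures quoted in this abstract (RMSEC, RMSECV, RMSEP, recovery and RSD) all reduce to a few simple formulas. The sketch below shows one common way to compute them; the function names and numbers are hypothetical, and some chemometrics packages apply a degrees-of-freedom correction to RMSEC that is not included here.

```python
import math
import statistics


def rmse(predicted, reference):
    """Root mean square error between predicted and reference concentrations.

    Evaluated on calibration samples it gives RMSEC, on cross-validation
    predictions RMSECV, and on an independent test set RMSEP.
    """
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def recovery_percent(measured, spiked):
    """Recovery: measured concentration as a percentage of the spiked amount."""
    return 100.0 * measured / spiked


def rsd_percent(replicates):
    """Relative standard deviation of repeated measurements, in percent."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


# Hypothetical values, not taken from the paper.
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_recovery = recovery_percent(9.0, 10.0)
example_rsd = rsd_percent([9.0, 10.0, 11.0])
```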
Univariate and multivariate spatial models of health facility utilisation for childhood fevers in an area on the coast of Kenya.

PubMed

Ouma, Paul O; Agutu, Nathan O; Snow, Robert W; Noor, Abdisalan M

2017-09-18

Precise quantification of health service utilisation is important for the estimation of disease burden and allocation of health resources. Current approaches to mapping health facility utilisation rely on spatial accessibility alone as the predictor. However, other spatially varying social, demographic and economic factors may affect the use of health services. The exclusion of these factors can lead to the inaccurate estimation of health facility utilisation. Here, we compare the accuracy of a univariate spatial model, developed only from estimated travel time, to a multivariate model that also includes relevant social, demographic and economic factors. A theoretical surface of travel time to the nearest public health facility was developed. Travel times from this surface were assigned to each child reported to have had fever in the Kenya demographic and health survey of 2014 (KDHS 2014). The relationship of child treatment seeking for fever with travel time, household and individual factors from the KDHS 2014 was determined using multilevel mixed modelling. The Bayesian information criterion (BIC) and likelihood ratio test (LRT) were used to measure how selected factors improve parsimony and goodness of fit of the time model. Using the mixed model, a univariate spatial model of health facility utilisation was fitted using travel time as the predictor. The mixed model was also used to compute a multivariate spatial model of utilisation, using travel time and modelled surfaces of selected household and individual factors as predictors. The univariate and multivariate spatial models were then compared using the area under the receiver operating characteristic curve (AUC) and a percent correct prediction (PCP) test. The best fitting multivariate model had travel time, household wealth index and number of children in household as the predictors. These factors reduced BIC of the time model from 4008 to 2959, a change which was confirmed by the LRT.
Although there was a high correlation of the two modelled probability surfaces (adjusted R² = 88%), the multivariate model had a better AUC (0.83 versus 0.73) and PCP (0.61 versus 0.45) than the univariate model. Our study shows that a model that uses travel time, as well as household and individual-level socio-demographic factors, results in a more accurate estimation of use of health facilities for the treatment of childhood fever, compared to one that relies on only travel time.
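For readers unfamiliar with the two comparison metrics, the following sketch computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and PCP as the share of cases classified correctly at a cut-off. The 0.5 cut-off and the toy data are assumptions; the abstract does not state how the study thresholded its predictions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def percent_correct(scores, labels, threshold=0.5):
    """Share of cases whose thresholded prediction matches the outcome."""
    correct = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return correct / len(labels)


# Toy predicted probabilities of seeking treatment (hypothetical).
toy_scores = [0.9, 0.2, 0.8, 0.1]
toy_labels = [1, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_pcp = percent_correct(toy_scores, toy_labels)
```

The pairwise loop is quadratic in the number of cases; it is written for clarity rather than speed.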
Identifying Pleiotropic Genes in Genome-Wide Association Studies for Multivariate Phenotypes with Mixed Measurement Scales

PubMed Central

Williams, L. Keoki; Buu, Anne

2017-01-01

We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains the power at the level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches–dichotomizing all observed phenotypes or treating them as continuous variables–could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not be, otherwise, chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
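The Fisher's combination statistic mentioned above can be sketched in a few lines. The version below assumes the per-phenotype p-values are independent, in which case the statistic follows a chi-square distribution with 2k degrees of freedom; the abstract indicates that the authors instead estimate the null distribution while accounting for correlation between phenotypes, so this is only the classical baseline. Function names are hypothetical.

```python
import math


def fisher_statistic(p_values):
    """Fisher's combination statistic: -2 times the sum of log p-values."""
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, df):
    """Chi-square survival function, using the closed form for even df."""
    k = df // 2
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Combined p-value under the independence assumption."""
    statistic = fisher_statistic(p_values)
    return chi2_sf_even_df(statistic, 2 * len(p_values))


# Two hypothetical marginal p-values for one marker.
combined = fisher_combined_p([0.05, 0.05])
```

With positively correlated phenotypes the independence-based p-value tends to be too small, which is the motivation for estimating the null distribution as described in the abstract.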
Association of educational status with cardiovascular disease: Teheran Lipid and Glucose Study.

PubMed

Hajsheikholeslami, Farhad; Hatami, Masumeh; Hadaegh, Farzad; Ghanbarian, Arash; Azizi, Fereidoun

2011-06-01

The aim of this study was to evaluate the associations between educational level and cardiovascular disease (CVD) in an older Iranian population. To estimate the odds ratio (OR) of educational level in a cross-sectional study, logistic regression analysis was used on 1,788 men and 2,204 women (222 men and 204 women positive based on their CVD status) aged ≥ 45 years. In men, educational levels of college degree and literacy level below diploma were inversely associated with CVD in the multivariate model [0.52 (0.28-0.94), 0.61 (0.40-0.92), respectively], but diploma level did not show any significant association with CVD, neither in the crude model nor in the multivariate model. In women, increase in educational level was inversely associated with risk of CVD in the crude model, but in the multivariate adjusted model, literacy level below diploma decreased risk of CVD by 39%, compared with illiteracy. Our findings support those of developed countries that, along with other CVD risk factors, educational status has an inverse association with CVD among a representative Iranian population of older men and women.
Multivariate dynamic Tobit models with lagged observed dependent variables: An effectiveness analysis of highway safety laws.

PubMed

Dong, Chunjiao; Xie, Kun; Zeng, Jin; Li, Xia

2018-04-01

Highway safety laws aim to influence driver behaviors so as to reduce the frequency and severity of crashes, and their outcomes. A given highway safety law may have different effects on crashes of different severities. Understanding such effects can help policy makers upgrade current laws and hence improve traffic safety. To investigate the effects of highway safety laws on crashes across severities, multivariate models are needed to account for the interdependency issues in crash counts across severities. Based on the characteristics of the dependent variables, multivariate dynamic Tobit (MVDT) models are proposed to analyze crash counts that are aggregated at the state level. Lagged observed dependent variables are incorporated into the MVDT models to account for potential temporal correlation issues in crash data. The state highway safety law related factors are used as the explanatory variables and socio-demographic and traffic factors are used as the control variables. Three models, an MVDT model with lagged observed dependent variables, an MVDT model with unobserved random variables, and a multivariate static Tobit (MVST) model, are developed and compared. The results show that among the investigated models, the MVDT models with lagged observed dependent variables have the best goodness-of-fit. The findings indicate that, compared to the MVST, the MVDT models have better explanatory power and prediction accuracy. The MVDT model with lagged observed variables can better handle the stochasticity and dependency in the temporal evolution of the crash counts and the estimated values from the model are closer to the observed values. The results show that more lives could be saved if law enforcement agencies can make a sustained effort to educate the public about the importance of motorcyclists wearing helmets.
Motor vehicle crash-related deaths, injuries, and property damages could be reduced if states enact laws for stricter text messaging rules, higher speeding fines, older licensing age, and stronger graduated licensing provisions. Injury and PDO crashes would be significantly reduced with stricter laws prohibiting the use of hand-held communication devices and higher fines for drunk driving. Copyright © 2018 Elsevier Ltd. All rights reserved.
Multivariate Strategies in Functional Magnetic Resonance Imaging

ERIC Educational Resources Information Center

Hansen, Lars Kai

2007-01-01

We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a "mind reading" predictive multivariate fMRI model.
Investigating College and Graduate Students' Multivariable Reasoning in Computational Modeling

ERIC Educational Resources Information Center

Wu, Hsin-Kai; Wu, Pai-Hsing; Zhang, Wen-Xin; Hsu, Ying-Shao

2013-01-01

Drawing upon the literature in computational modeling, multivariable reasoning, and causal attribution, this study aims at characterizing multivariable reasoning practices in computational modeling and revealing the nature of understanding about multivariable causality. We recruited two freshmen, two sophomores, two juniors, two seniors, four…
A Multivariate Model for the Study of Parental Acceptance-Rejection and Child Abuse.

ERIC Educational Resources Information Center

Rohner, Ronald P.; Rohner, Evelyn C.

This paper proposes a multivariate strategy for the study of parental acceptance-rejection and child abuse and describes a research study on parental rejection and child abuse which illustrates the advantages of using a multivariate (rather than a simple-model) approach. The multivariate model is a combination of three simple models used to study…
Multiscale analysis of information dynamics for linear multivariate processes.

PubMed

Faes, Luca; Montalto, Alessandro; Stramaglia, Sebastiano; Nollo, Giandomenico; Marinazzo, Daniele

2016-08-01

In the study of complex physical and physiological systems represented by multivariate time series, an issue of great interest is the description of the system dynamics over a range of different temporal scales. While information-theoretic approaches to the multiscale analysis of complex dynamics are being increasingly used, the theoretical properties of the applied measures are poorly understood. This study introduces for the first time a framework for the analytical computation of information dynamics for linear multivariate stochastic processes explored at different time scales. After showing that the multiscale processing of a vector autoregressive (VAR) process introduces a moving average (MA) component, we describe how to represent the resulting VARMA process using state-space (SS) models and how to exploit the SS model parameters to compute analytical measures of information storage and information transfer for the original and rescaled processes. The framework is then used to quantify multiscale information dynamics for simulated unidirectionally and bidirectionally coupled VAR processes, showing that rescaling may lead to insightful patterns of information storage and transfer but also to potentially misleading behaviors.
[Monitoring method of extraction process for Schisandrae Chinensis Fructus based on near infrared spectroscopy and multivariate statistical process control].

PubMed

Xu, Min; Zhang, Lei; Yue, Hong-Shui; Pang, Hong-Wei; Ye, Zheng-Liang; Ding, Li

2017-10-01

The aim was to establish an on-line monitoring method for the extraction process of Schisandrae Chinensis Fructus, the formula medicinal material of Yiqi Fumai lyophilized injection, by combining near infrared spectroscopy with multivariate data analysis technology. A multivariate statistical process control (MSPC) model was established based on 5 normal production batches, and 2 test batches were monitored using PC scores, DModX and Hotelling T² control charts. The results showed that the MSPC model had a good monitoring ability for the extraction process. Applying the MSPC model to the actual production process could effectively achieve on-line monitoring of the extraction process of Schisandrae Chinensis Fructus, and could reflect changes in material properties during production in real time. The established process monitoring method could provide a reference for the application of process analysis technology in the process quality control of traditional Chinese medicine injections. Copyright© by the Chinese Pharmaceutical Association.
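A Hotelling T² control chart of the kind named in this abstract can be sketched as follows. The sketch assumes that principal component scores are already available, mean-centred and mutually uncorrelated in the reference batches, so that T² reduces to a sum of squared scores scaled by their reference variances. The data and function names are hypothetical, and the control limit (normally derived from an F distribution) is not computed here.

```python
import statistics


def score_variances(reference_scores):
    """Variance of each principal component score over the reference batches."""
    columns = list(zip(*reference_scores))
    return [statistics.variance(column) for column in columns]


def hotelling_t2(scores, variances):
    """Hotelling T2 for one observation given per-component variances.

    Valid in this diagonal form only when the component scores are
    uncorrelated, as PCA scores are on the data used to fit the model.
    """
    return sum(t * t / v for t, v in zip(scores, variances))


# Hypothetical two-component scores from normal operating batches.
reference = [[1.0, 2.0], [-1.0, -2.0], [1.0, -2.0], [-1.0, 2.0]]
variances = score_variances(reference)

# T2 for a new spectrum's scores; compare against a chosen control limit.
t2_new = hotelling_t2([2.0, 0.0], variances)
```

In monitoring, each new spectrum yields one T² value that is plotted over time; points above the control limit flag a batch that is drifting away from normal operation.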
Extensions to Multivariate Space Time Mixture Modeling of Small Area Cancer Data.

PubMed

Carroll, Rachel; Lawson, Andrew B; Faes, Christel; Kirby, Russell S; Aregay, Mehreteab; Watjou, Kevin

2017-05-09

Oral cavity and pharynx cancer, even when considered together, is a fairly rare disease. Implementation of multivariate modeling with lung and bronchus cancer, as well as melanoma cancer of the skin, could lead to better inference for oral cavity and pharynx cancer. The multivariate structure of these models is accomplished via the use of shared random effects, as well as other multivariate prior distributions. The results in this paper indicate that care should be taken when executing these types of models, and that multivariate mixture models may not always be the ideal option, depending on the data of interest.
PHI and PCA3 improve the prognostic performance of PRIAS and Epstein criteria in predicting insignificant prostate cancer in men eligible for active surveillance.

PubMed

Cantiello, Francesco; Russo, Giorgio Ivan; Cicione, Antonio; Ferro, Matteo; Cimino, Sebastiano; Favilla, Vincenzo; Perdonà, Sisto; De Cobelli, Ottavio; Magno, Carlo; Morgia, Giuseppe; Damiano, Rocco

2016-04-01

To assess the performance of prostate health index (PHI) and prostate cancer antigen 3 (PCA3) when added to the PRIAS or Epstein criteria in predicting the presence of pathologically insignificant prostate cancer (IPCa) in patients who underwent radical prostatectomy (RP) but were eligible for active surveillance (AS). An observational retrospective study was performed in 188 PCa patients treated with laparoscopic or robot-assisted RP but eligible for AS according to Epstein or PRIAS criteria. Blood and urinary specimens were collected before initial prostate biopsy for PHI and PCA3 measurements. Multivariate logistic regression analyses and decision curve analysis (DCA) were carried out to identify predictors of IPCa using the updated ERSPC definition. In the multivariate analyses, the inclusion of both PCA3 and PHI significantly increased the accuracy of the Epstein multivariate model in predicting IPCa with an increase of 17% (AUC = 0.77) and of 32% (AUC = 0.92), respectively. The inclusion of both PCA3 and PHI also increased the predictive accuracy of the PRIAS multivariate model with an increase of 29% (AUC = 0.87) and of 39% (AUC = 0.97), respectively. DCA revealed that the multivariable models with the addition of PHI or PCA3 showed a greater net benefit and performed better than the reference models. In a direct comparison, PHI outperformed PCA3, resulting in a higher net benefit. In the same cohort of patients eligible for AS, the addition of PHI and PCA3 to Epstein or PRIAS models improved their prognostic performance. PHI resulted in greater net benefit in predicting IPCa compared to PCA3.
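The net benefit reported by decision curve analysis has a compact definition: true positives per patient minus false positives per patient, with the latter weighted by the odds of the threshold probability. The sketch below implements that standard formula together with the usual "treat all" reference strategy; the toy probabilities and outcomes are hypothetical and unrelated to the study cohort.

```python
def net_benefit(probabilities, outcomes, threshold):
    """Net benefit of acting on predictions at or above the threshold."""
    n = len(outcomes)
    true_pos = sum(
        1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 1
    )
    false_pos = sum(
        1 for p, y in zip(probabilities, outcomes) if p >= threshold and y == 0
    )
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the reference strategy that classifies everyone positive."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted probabilities and observed outcomes.
toy_probabilities = [0.9, 0.7, 0.4, 0.1]
toy_outcomes = [1, 0, 1, 0]
nb_model = net_benefit(toy_probabilities, toy_outcomes, 0.2)
nb_all = net_benefit_treat_all(toy_outcomes, 0.2)
```

A decision curve plots these quantities across a range of thresholds; a model is considered useful where its curve lies above both the "treat all" and the "treat none" (net benefit zero) lines.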

Probabilistic flood damage modelling at the meso-scale

NASA Astrophysics Data System (ADS)

Kreibich, Heidi; Botto, Anna; Schröter, Kai; Merz, Bruno

2014-05-01

Decisions on flood risk management and adaptation are usually based on risk analyses. Such analyses are associated with significant uncertainty, even more so if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention in recent years, they are still not standard practice for flood risk assessments. Most damage models have in common that complex damaging processes are described by simple, deterministic approaches like stage-damage functions. Novel probabilistic, multi-variate flood damage models have been developed and validated on the micro-scale using a data-mining approach, namely bagging decision trees (Merz et al. 2013). In this presentation we show how the model BT-FLEMO (Bagging decision Tree based Flood Loss Estimation MOdel) can be applied on the meso-scale, namely on the basis of ATKIS land-use units. The model is applied in 19 municipalities which were affected during the 2002 flood by the River Mulde in Saxony, Germany. The application of BT-FLEMO provides a probability distribution of estimated damage to residential buildings per municipality. Validation is undertaken on the one hand via a comparison with eight other damage models including stage-damage functions as well as multi-variate models. On the other hand the results are compared with official damage data provided by the Saxon Relief Bank (SAB). The results show that uncertainties of damage estimation remain high. Thus, the significant advantage of this probabilistic flood loss estimation model BT-FLEMO is that it inherently provides quantitative information about the uncertainty of the prediction. Reference: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64.
iVAR: a program for imputing missing data in multivariate time series using vector autoregressive models.

PubMed

Liu, Siwei; Molenaar, Peter C M

2014-12-01

This article introduces iVAR, an R program for imputing missing data in multivariate time series on the basis of vector autoregressive (VAR) models. We conducted a simulation study to compare iVAR with three methods for handling missing data: listwise deletion, imputation with sample means and variances, and multiple imputation ignoring time dependency. The results showed that iVAR produces better estimates for the cross-lagged coefficients than do the other three methods. We demonstrate the use of iVAR with an empirical example of time series electrodermal activity data and discuss the advantages and limitations of the program.
Centralized PI control for high dimensional multivariable systems based on equivalent transfer function.

PubMed

Luan, Xiaoli; Chen, Qiang; Liu, Fei

2014-09-01

This article presents a new scheme to design a full matrix controller for high dimensional multivariable processes based on the equivalent transfer function (ETF). Unlike the existing ETF method, the proposed ETF is derived directly by exploiting the relationship between the equivalent closed-loop transfer function and the inverse of the open-loop transfer function. Based on the obtained ETF, the full matrix controller is designed utilizing the existing PI tuning rules. The newly proposed ETF model can more accurately represent the original processes. Furthermore, the full matrix centralized controller design method proposed in this paper is applicable to high dimensional multivariable systems with satisfactory performance. Comparison with other multivariable controllers shows that the designed ETF-based controller is superior with respect to design complexity and obtained performance. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Interpretability of Multivariate Brain Maps in Linear Brain Decoding: Definition, and Heuristic Quantification in Multivariate Analysis of MEG Time-Locked Effects.

PubMed

Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea

2016-01-01

Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. 
More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future.
Interpretability of Multivariate Brain Maps in Linear Brain Decoding: Definition, and Heuristic Quantification in Multivariate Analysis of MEG Time-Locked Effects

PubMed Central

Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea

2017-01-01

Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. 
More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future. PMID:28167896
Linkage Analysis of a Model Quantitative Trait in Humans: Finger Ridge Count Shows Significant Multivariate Linkage to 5q14.1

PubMed Central

Medland, Sarah E; Loesch, Danuta Z; Mdzewski, Bogdan; Zhu, Gu; Montgomery, Grant W; Martin, Nicholas G

2007-01-01

The finger ridge count (a measure of pattern size) is one of the most heritable complex traits studied in humans and has been considered a model human polygenic trait in quantitative genetic analysis. Here, we report the results of the first genome-wide linkage scan for finger ridge count in a sample of 2,114 offspring from 922 nuclear families. Both univariate linkage to the absolute ridge count (a sum of all the ridge counts on all ten fingers), and multivariate linkage analyses of the counts on individual fingers, were conducted. The multivariate analyses yielded significant linkage to 5q14.1 (Logarithm of odds [LOD] = 3.34, pointwise-empirical p-value = 0.00025) that was predominantly driven by linkage to the ring, index, and middle fingers. The strongest univariate linkage was to 1q42.2 (LOD = 2.04, point-wise p-value = 0.002, genome-wide p-value = 0.29). In summary, the combination of univariate and multivariate results was more informative than simple univariate analyses alone. Patterns of quantitative trait loci factor loadings consistent with developmental fields were observed, and the simple pleiotropic model underlying the absolute ridge count was not sufficient to characterize the interrelationships between the ridge counts of individual fingers. PMID:17907812
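As a reading aid for the LOD scores quoted above: a LOD score is the base-10 logarithm of a likelihood ratio, so it can be put on the usual likelihood-ratio scale by multiplying by 2 ln(10). The short sketch below only performs that rescaling. The function name is illustrative, and no p-value is derived because the appropriate reference distribution depends on the linkage model, which the abstract does not specify.

```python
import math


def lod_to_lr_statistic(lod):
    """Convert a LOD score (log10 likelihood ratio) to 2 * ln(likelihood ratio)."""
    return 2.0 * math.log(10.0) * lod


# LOD values reported in the abstract above.
multivariate_stat = lod_to_lr_statistic(3.34)
univariate_stat = lod_to_lr_statistic(2.04)
print(round(multivariate_stat, 2), round(univariate_stat, 2))
```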
Analysis of multivariate longitudinal kidney function outcomes using generalized linear mixed models.

PubMed

Jaffa, Miran A; Gebregziabher, Mulugeta; Jaffa, Ayad A

2015-06-14

Renal transplant patients are mandated to have continuous assessment of their kidney function over time to monitor disease progression determined by changes in blood urea nitrogen (BUN), serum creatinine (Cr), and estimated glomerular filtration rate (eGFR). Multivariate analysis of these outcomes that aims at identifying the differential factors that affect disease progression is of great clinical significance. Thus our study aims at demonstrating the application of different joint modeling approaches with random coefficients on a cohort of renal transplant patients and presenting a comparison of their performance through a pseudo-simulation study. The objective of this comparison is to identify the model with best performance and to determine whether accuracy compensates for complexity in the different multivariate joint models. We propose a novel application of multivariate Generalized Linear Mixed Models (mGLMM) to analyze multiple longitudinal kidney function outcomes collected over 3 years on a cohort of 110 renal transplantation patients. The correlated outcomes BUN, Cr, and eGFR and the effects of various covariates such as patient's gender, age and race on these markers were determined holistically using different mGLMMs. The performance of the various mGLMMs that encompass shared random intercept (SHRI), shared random intercept and slope (SHRIS), separate random intercept (SPRI) and separate random intercept and slope (SPRIS) was assessed to identify the one that has the best fit and most accurate estimates. A bootstrap pseudo-simulation study was conducted to gauge the tradeoff between the complexity and accuracy of the models. Accuracy was determined using two measures: the mean of the differences between the estimates of the bootstrapped datasets and the true beta obtained from the application of each model on the renal dataset, and the mean of the square of these differences.
The results showed that SPRI provided the most accurate estimates and did not exhibit any computational or convergence problems. Accuracy increased as the level of complexity increased from the shared random coefficient models to the separate random coefficient alternatives, with SPRI showing the best fit and the most accurate estimates.
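The two accuracy measures described in this abstract (the mean difference between bootstrap estimates and the reference coefficient, and the mean squared difference) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the numbers are made up, and the "true beta" is assumed to be a single scalar coefficient.

```python
from statistics import mean


def bias_and_mse(boot_estimates, reference_beta):
    """Return (mean difference, mean squared difference) against a reference value."""
    diffs = [b - reference_beta for b in boot_estimates]
    return mean(diffs), mean([d * d for d in diffs])


# Hypothetical bootstrap estimates of one coefficient; not data from the study.
boot = [0.52, 0.47, 0.50, 0.55, 0.46]
bias, mse = bias_and_mse(boot, reference_beta=0.50)
print(bias, mse)
```

A model with a smaller absolute mean difference and a smaller mean squared difference would be judged more accurate under these two measures.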
Multivariate methods for evaluating the efficiency of electrodialytic removal of heavy metals from polluted harbour sediments.

PubMed

Pedersen, Kristine Bondo; Kirkelund, Gunvor M; Ottosen, Lisbeth M; Jensen, Pernille E; Lejon, Tore

2015-01-01

Chemometrics was used to develop a multivariate model based on 46 previously reported electrodialytic remediation experiments (EDR) of five different harbour sediments. The model predicted final concentrations of Cd, Cu, Pb and Zn as a function of current density, remediation time, stirring rate, dry/wet sediment, cell set-up as well as sediment properties. Evaluation of the model showed that remediation time and current density had the highest comparative influence on the clean-up levels. Individual models for each heavy metal showed variance in the variable importance, indicating that the targeted heavy metals were bound to different sediment fractions. Based on the results, a PLS model was used to design five new EDR experiments of a sixth sediment to achieve specified clean-up levels of Cu and Pb. The removal efficiencies were up to 82% for Cu and 87% for Pb and the targeted clean-up levels were met in four out of five experiments. The clean-up levels were better than predicted by the model, which could hence be used for predicting an approximate remediation strategy; the modelling power will however improve with more data included. Copyright © 2014 Elsevier B.V. All rights reserved.
Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design

PubMed Central

Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo

2016-01-01

The outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme where one observes the exposure with a probability that depends on the outcome. Well-known examples of such designs are the case-control design for a binary response, the case-cohort design for failure time data and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for the ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric where all the underlying distributions of covariates are modeled nonparametrically using the empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of association of PCB exposure to hearing loss in children born to the Collaborative Perinatal Study. PMID:27966260
A multivariate time series approach to modeling and forecasting demand in the emergency department.

PubMed

Jones, Spencer S; Evans, R Scott; Allen, Todd L; Thomas, Alun; Haug, Peter J; Welch, Shari J; Snow, Gregory L

2009-02-01

The goals of this investigation were to study the temporal relationships between the demands for key resources in the emergency department (ED) and the inpatient hospital, and to develop multivariate forecasting models. Hourly data were collected from three diverse hospitals for the year 2006. Descriptive analysis and model fitting were carried out using graphical and multivariate time series methods. Multivariate models were compared to a univariate benchmark model in terms of their ability to provide out-of-sample forecasts of ED census and the demands for diagnostic resources. Descriptive analyses revealed little temporal interaction between the demand for inpatient resources and the demand for ED resources at the facilities considered. Multivariate models provided more accurate forecasts of ED census and of the demands for diagnostic resources. Our results suggest that multivariate time series models can be used to reliably forecast ED patient census; however, forecasts of the demands for diagnostic resources were not sufficiently reliable to be useful in the clinical setting.
Comparison of Multidimensional Item Response Models: Multivariate Normal Ability Distributions versus Multivariate Polytomous Ability Distributions. Research Report. ETS RR-08-45

ERIC Educational Resources Information Center

Haberman, Shelby J.; von Davier, Matthias; Lee, Yi-Hsuan

2008-01-01

Multidimensional item response models can be based on multivariate normal ability distributions or on multivariate polytomous ability distributions. For the case of simple structure in which each item corresponds to a unique dimension of the ability vector, some applications of the two-parameter logistic model to empirical data are employed to…
Forecasting of municipal solid waste quantity in a developing country using multivariate grey models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Intharathirat, Rotchana, E-mail: rotchana.in@gmail.com; Abdul Salam, P., E-mail: salam@ait.ac.th; Kumar, S., E-mail: kumar@ait.ac.th

Highlights: • Grey model can be used to forecast MSW quantity accurately with limited data. • Prediction interval overcomes the uncertainty of MSW forecast effectively. • A multivariate model gives accuracy associated with factors affecting MSW quantity. • Population, urbanization, employment and household size play a role in MSW quantity. - Abstract: In order to plan, manage and use municipal solid waste (MSW) in a sustainable way, accurate forecasting of MSW generation and composition plays a key role. It is difficult to produce reliable estimates using the existing models due to the limited data available in developing countries. This study aims to forecast MSW collected in Thailand, with prediction intervals over the long term, by using an optimized multivariate grey model, which is a mathematical approach. For multivariate models, the representative factors of residential and commercial sectors affecting waste collected are identified, classified and quantified based on statistics and mathematics of grey system theory. Results show that GMC (1, 5), the grey model with convolution integral, is the most accurate with the least error of 1.16% MAPE. MSW collected would increase 1.40% per year from 43,435–44,994 tonnes per day in 2013 to 55,177–56,735 tonnes per day in 2030. This model also illustrates that population density is the most important factor affecting MSW collected, followed by urbanization, proportion of employment and household size, respectively. This means that the representative factors of the commercial sector may affect MSW collected more than those of the residential sector. Results can help decision makers to develop measures and policies for waste management over the long term.
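The accuracy figure quoted in this abstract is a mean absolute percentage error (MAPE). A minimal sketch of that metric is given below. The function name and the sample numbers are illustrative only and are not taken from the study, and the grey model itself is not reproduced here.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical observed and forecast values (arbitrary units).
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(observed, predicted))
```

Because each error is divided by the observed value, MAPE is scale-free, which makes it convenient for comparing forecasts of quantities with different magnitudes.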
The intervals method: a new approach to analyse finite element outputs using multivariate statistics

PubMed Central

De Esteban-Trivigno, Soledad; Püschel, Thomas A.; Fortuny, Josep

2017-01-01

Background In this paper, we propose a new method, named the intervals’ method, to analyse data from finite element models in a comparative multivariate framework. As a case study, several armadillo mandibles are analysed, showing that the proposed method is useful to distinguish and characterise biomechanical differences related to diet/ecomorphology. Methods The intervals’ method consists of generating a set of variables, each one defined by an interval of stress values. Each variable is expressed as a percentage of the area of the mandible occupied by those stress values. Afterwards these newly generated variables can be analysed using multivariate methods. Results Applying this novel method to the biological case study of whether armadillo mandibles differ according to dietary groups, we show that the intervals’ method is a powerful tool to characterize biomechanical performance and how this relates to different diets. This allows us to positively discriminate between specialist and generalist species. Discussion We show that the proposed approach is a useful methodology not affected by the characteristics of the finite element mesh. Additionally, the positive discriminating results obtained when analysing a difficult case study suggest that the proposed method could be a very useful tool for comparative studies in finite element analysis using multivariate statistical approaches. PMID:29043107
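The variable-generation step described in the Methods can be sketched as below: each variable is the percentage of total area whose stress falls within one interval. This is an illustrative reading of the abstract rather than the authors' implementation. The interval edges, the element data, and the choice to treat the last interval as open-ended are all assumptions.

```python
from bisect import bisect_right


def interval_variables(stresses, areas, edges):
    """Percent of total area per stress interval.

    Interval i covers [edges[i], edges[i + 1]); the last interval is
    open-ended and covers every stress at or above edges[-1] (an assumption).
    """
    totals = [0.0] * len(edges)
    for stress, area in zip(stresses, areas):
        index = bisect_right(edges, stress) - 1
        if index < 0:
            raise ValueError("stress value lies below the first interval edge")
        totals[index] += area
    total_area = sum(areas)
    return [100.0 * t / total_area for t in totals]


# Hypothetical per-element stress values and element areas.
element_stress = [0.5, 1.5, 1.7, 3.2]
element_area = [1.0, 2.0, 1.0, 1.0]
variables = interval_variables(element_stress, element_area, edges=[0.0, 1.0, 2.0, 3.0])
print(variables)
```

Weighting by element area rather than counting elements is one plausible reason such variables would be insensitive to mesh density, since splitting an element into smaller ones leaves the area in each interval unchanged.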
Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

NASA Astrophysics Data System (ADS)

Pradhan, Biswajeet

2010-05-01

This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%).
Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
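The cross-application design (coefficients fitted in each of three areas applied to the data of every area, giving nine hazard maps) can be sketched as follows. All coefficients and grid-cell values are invented for illustration, and only two of the ten factors are used to keep the example short.

```python
import math
from itertools import product


def landslide_probability(intercept, coeffs, factors):
    """Logistic model: probability from a linear combination of factor values."""
    z = intercept + sum(c * x for c, x in zip(coeffs, factors))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical (intercept, [slope coefficient, NDVI coefficient]) per area.
models = {
    "Penang": (-2.0, [0.05, -1.0]),
    "Cameron": (-1.5, [0.04, -0.8]),
    "Selangor": (-2.5, [0.06, -1.2]),
}

# Hypothetical grid cells per area, each given as [slope in degrees, NDVI].
cells = {
    "Penang": [[10.0, 0.6], [35.0, 0.2]],
    "Cameron": [[20.0, 0.5], [40.0, 0.1]],
    "Selangor": [[5.0, 0.7], [30.0, 0.3]],
}

# One hazard map for every (coefficient source, data area) pair: 3 x 3 = 9.
hazard_maps = {}
for coeff_area, data_area in product(models, cells):
    intercept, coeffs = models[coeff_area]
    hazard_maps[(coeff_area, data_area)] = [
        landslide_probability(intercept, coeffs, cell) for cell in cells[data_area]
    ]

print(len(hazard_maps))
```

The three maps where both keys match correspond to the same-area cases in the abstract, and the remaining six are the cross-application cases.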
Stochastic modelling of temperatures affecting the in situ performance of a solar-assisted heat pump: The multivariate approach and physical interpretation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Loveday, D.L.; Craggs, C.

Box-Jenkins-based multivariate stochastic modeling is carried out using data recorded from a domestic heating system. The system comprises an air-source heat pump sited in the roof space of a house, solar assistance being provided by the conventional tile roof acting as a radiation absorber. Multivariate models are presented which illustrate the time-dependent relationships between three air temperatures - at external ambient, at entry to, and at exit from, the heat pump evaporator. Using a deterministic modeling approach, physical interpretations are placed on the results of the multivariate technique. It is concluded that the multivariate Box-Jenkins approach is a suitable technique for building thermal analysis. Application to multivariate model-based control is discussed, with particular reference to building energy management systems. It is further concluded that stochastic modeling of data drawn from a short monitoring period offers a means of retrofitting an advanced model-based control system in existing buildings, which could be used to optimize energy savings. An approach to system simulation is suggested.
Multivariate statistical model for 3D image segmentation with application to medical images.

PubMed

John, Nigel M; Kabuka, Mansur R; Ibrahim, Mohamed O

2003-12-01

In this article we describe a statistical model that was developed to segment brain magnetic resonance images. The statistical segmentation algorithm was applied after a pre-processing stage involving the use of a 3D anisotropic filter along with histogram equalization techniques. The segmentation algorithm makes use of prior knowledge and a probability-based multivariate model designed to semi-automate the process of segmentation. The algorithm was applied to images obtained from the Center for Morphometric Analysis at Massachusetts General Hospital as part of the Internet Brain Segmentation Repository (IBSR). The developed algorithm showed improved accuracy over the k-means, adaptive Maximum Apriori Probability (MAP), biased MAP, and other algorithms. Experimental results showing the segmentation and the results of comparisons with other algorithms are provided. Results are based on an overlap criterion against expertly segmented images from the IBSR. The algorithm produced average results of approximately 80% overlap with the expertly segmented images (compared with 85% for manual segmentation and 55% for other algorithms).
FBST for Cointegration Problems

NASA Astrophysics Data System (ADS)

Diniz, M.; Pereira, C. A. B.; Stern, J. M.

2008-11-01

In order to estimate causal relations, time series econometrics has to be aware of spurious correlation, a problem first mentioned by Yule [21]. To solve the problem, one can work with differenced series or use multivariate models like VAR or VEC models. In this case, the analysed series are going to present a long run relation, i.e. a cointegration relation. Even though the Bayesian literature about inference on VAR/VEC models is quite advanced, Bauwens et al. [2] highlight that "the topic of selecting the cointegrating rank has not yet given very useful and convincing results." This paper presents the Full Bayesian Significance Test applied to cointegration rank selection tests in multivariate (VAR/VEC) time series models and shows how to implement it using data sets available in the literature as well as simulated data sets. A standard non-informative prior is assumed.
A Penalized Likelihood Framework For High-Dimensional Phylogenetic Comparative Methods And An Application To New-World Monkeys Brain Evolution.

PubMed

Julien, Clavel; Leandro, Aristide; Hélène, Morlon

2018-06-19

Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performance as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performance are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using the Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA, in the R packages RPANDA and mvMORPH. Finally, we illustrate the utility of the newly proposed framework by evaluating the fit of evolutionary models, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find clear support for an Early-burst model, suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
A refined method for multivariate meta-analysis and meta-regression.

PubMed

Jackson, Daniel; Riley, Richard D

2014-02-20

Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
Characterizing multivariate decoding models based on correlated EEG spectral features

PubMed Central

McFarland, Dennis J.

2013-01-01

Objective Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. PMID:23466267

Neuroanatomical morphometric characterization of sex differences in youth using statistical learning.

PubMed

Sepehrband, Farshid; Lynch, Kirsten M; Cabeen, Ryan P; Gonzalez-Zacarias, Clio; Zhao, Lu; D'Arcy, Mike; Kesselman, Carl; Herting, Megan M; Dinov, Ivo D; Toga, Arthur W; Clark, Kristi A

2018-05-15

Exploring neuroanatomical sex differences using a multivariate statistical learning approach can yield insights that cannot be derived with univariate analysis. While gross differences in total brain volume are well-established, uncovering the more subtle, regional sex-related differences in neuroanatomy requires a multivariate approach that can accurately model spatial complexity as well as the interactions between neuroanatomical features. Here, we developed a multivariate statistical learning model using a support vector machine (SVM) classifier to predict sex from MRI-derived regional neuroanatomical features from a single-site study of 967 healthy youth from the Philadelphia Neurodevelopmental Cohort (PNC). Then, we validated the multivariate model on an independent dataset of 682 healthy youth from the multi-site Pediatric Imaging, Neurocognition and Genetics (PING) cohort study. The trained model exhibited an 83% cross-validated prediction accuracy, and correctly predicted the sex of 77% of the subjects from the independent multi-site dataset. Results showed that cortical thickness of the middle occipital lobes and the angular gyri are major predictors of sex. Results also demonstrated the inferential benefits of going beyond classical regression approaches to capture the interactions among brain features in order to better characterize sex differences in male and female youths. We also identified specific cortical morphological measures and parcellation techniques, such as cortical thickness as derived from the Destrieux atlas, that are better able to discriminate between males and females in comparison to other brain atlases (Desikan-Killiany, Brodmann and subcortical atlases). Copyright © 2018 Elsevier Inc. All rights reserved.
A revision of chiggers of the minuta species-group (Acari: Trombiculidae: Neotrombicula Hirst, 1925) using multivariate morphometrics.

PubMed

Stekolnikov, Alexandr A; Klimov, Pavel B

2010-09-01

We revise chiggers belonging to the minuta-species group (genus Neotrombicula Hirst, 1925) from the Palaearctic using size-free multivariate morphometrics. This approach allowed us to resolve several diagnostic problems. We show that the widely distributed Neotrombicula scrupulosa Kudryashova, 1993 forms three spatially and ecologically isolated groups different from each other in size or shape (morphometric property) only: specimens from the Caucasus are distinct from those from Asia in shape, whereas the Asian specimens from plains and mountains are different from each other in size. We developed a multivariate classification model to separate three closely related species: N. scrupulosa, N. lubrica Kudryashova, 1993 and N. minuta Schluger, 1966. This model is based on five shape variables selected from an initial 17 variables by a best subset analysis using a custom size-correction subroutine. The variable selection procedure slightly improved the predictive power of the model, suggesting that it not only removed redundancy but also reduced 'noise' in the dataset. The overall classification accuracy of this model is 96.2, 96.2 and 95.5%, as estimated by internal validation, external validation and jackknife statistics, respectively. Our analyses resulted in one new synonymy: N. dimidiata Stekolnikov, 1995 is considered to be a synonym of N. lubrica. Both N. scrupulosa and N. lubrica are recorded from new localities. A key to species of the minuta-group incorporating results from our multivariate analyses is presented.
TATES: Efficient Multivariate Genotype-Phenotype Analysis for Genome-Wide Association Studies

PubMed Central

van der Sluis, Sophie; Posthuma, Danielle; Dolan, Conor V.

2013-01-01

To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor. PMID:23359524
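To make the p-value combination step more concrete, the sketch below implements the classical Simes procedure, which TATES extends. It is not TATES itself: as the abstract notes, TATES additionally corrects for correlations between the phenotype components, and that correction is deliberately omitted here. The function name and the example p-values are illustrative.

```python
def simes_p(pvalues):
    """Classical Simes combined p-value: min over ranks j of m * p_(j) / j."""
    if not pvalues:
        raise ValueError("at least one p-value is required")
    m = len(pvalues)
    ordered = sorted(pvalues)
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


# Hypothetical univariate p-values for one variant across three phenotypes.
trait_p = simes_p([0.01, 0.04, 0.30])
print(trait_p)
```

With uncorrelated components the plain procedure above is the natural baseline. With correlated components the multiplier m overstates the number of independent tests, which is the situation a correlation-aware extension is meant to address.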
Multivariate Longitudinal Analysis with Bivariate Correlation Test.

PubMed

Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory

2016-01-01

In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions for the estimators of the model's parameters. These estimators can be used in the framework of multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated.
Pain, pain intensity and pain disability in high school students are differently associated with physical activity, screening hours and sleep.

PubMed

Silva, Anabela G; Sa-Couto, Pedro; Queirós, Alexandra; Neto, Maritza; Rocha, Nelson P

2017-05-16

Studies exploring the association between physical activity, screen time and sleep and pain usually focus on a limited number of painful body sites. Nevertheless, pain at different body sites is likely to be of different nature. Therefore, this study aims to explore and compare the association between time spent in self-reported physical activity, in screen based activities and sleeping and i) pain presence in the last 7-days for 9 different body sites; ii) pain intensity at 9 different body sites and iii) global disability. Nine hundred sixty nine students completed a questionnaire on pain, time spent in moderate and vigorous physical activity, screen based time watching TV/DVD, playing, using mobile phones and computers and sleeping hours. Univariate and multivariate associations between pain presence, pain intensity and disability and physical activity, screen based time and sleeping hours were investigated. Pain presence: sleeping remained in the multivariable model for the neck, mid back, wrists, knees and ankles/feet (OR 1.17 to 2.11); moderate physical activity remained in the multivariate model for the neck, shoulders, wrists, hips and ankles/feet (OR 1.06 to 1.08); vigorous physical activity remained in the multivariate model for mid back, knees and ankles/feet (OR 1.05 to 1.09) and screen time remained in the multivariate model for the low back (OR = 2.34). Pain intensity: screen time and moderate physical activity remained in the multivariable model for pain intensity at the neck, mid back, low back, shoulder, knees and ankles/feet (Rp² 0.02 to 0.04) and at the wrists (Rp² = 0.04), respectively. Disability showed no association with sleeping, screen time or physical activity. This study suggests both similarities and differences in the patterns of association between time spent in physical activity, sleeping and in screen based activities and pain presence at 8 different body sites.
In addition, they also suggest that the factors associated with the presence of pain, pain intensity and pain associated disability are different.
A multivariable model for predicting the frictional behaviour and hydration of the human skin.

PubMed

Veijgen, N K; van der Heide, E; Masen, M A

2013-08-01

The frictional characteristics of skin-object interactions are important when handling objects, in the assessment of perception and comfort of products and materials and in the origins and prevention of skin injuries. In this study, based on statistical methods, a quantitative model is developed that describes the friction behaviour of human skin as a function of the subject characteristics, contact conditions, the properties of the counter material as well as environmental conditions. Although the frictional behaviour of human skin is a multivariable problem, in literature the variables that are associated with skin friction have been studied using univariable methods. In this work, multivariable models for the static and dynamic coefficients of friction as well as for the hydration of the skin are presented. A total of 634 skin-friction measurements were performed using a recently developed tribometer. Using a statistical analysis, previously defined potential influential variables were linked to the static and dynamic coefficient of friction and to the hydration of the skin, resulting in three predictive quantitative models that describe the friction behaviour and the hydration of human skin respectively. Increased dynamic coefficients of friction were obtained from older subjects, on the index finger, with materials with a higher surface energy at higher room temperatures, whereas lower dynamic coefficients of friction were obtained at lower skin temperatures, on the temple with rougher contact materials. The static coefficient of friction increased with higher skin hydration, increasing age, on the index finger, with materials with a higher surface energy and at higher ambient temperatures. The hydration of the skin was associated with the skin temperature, anatomical location, presence of hair on the skin and the relative air humidity. Predictive models have been derived for the static and dynamic coefficient of friction using a multivariable approach. These two coefficients of friction show a strong correlation. Consequently the two multivariable models resemble each other, with the static coefficient of friction being on average 18% lower than the dynamic coefficient of friction. The multivariable models in this study can be used to describe the data set that was the basis for this study. Care should be taken when generalising these results. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
A New Approach of Juvenile Age Estimation using Measurements of the Ilium and Multivariate Adaptive Regression Splines (MARS) Models for Better Age Prediction.

PubMed

Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal

2017-01-01

Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standard approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase of individual variability. MARS models can qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.
Parametric Cost Models for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Henrichs, Todd; Dollinger, Courtney

2010-01-01

Multivariable parametric cost models for space telescopes provide several benefits to designers and space system project managers. They identify major architectural cost drivers and allow high-level design trades. They enable cost-benefit analysis for technology development investment. And, they provide a basis for estimating total project cost. A survey of historical models found that there is no definitive space telescope cost model. In fact, published models vary greatly [1]. Thus, there is a need for parametric space telescope cost models. An effort is underway to develop single variable [2] and multi-variable [3] parametric space telescope cost models based on the latest available data and applying rigorous analytical techniques. Specific cost estimating relationships (CERs) have been developed which show that aperture diameter is the primary cost driver for large space telescopes; technology development as a function of time reduces cost at the rate of 50% per 17 years; it costs less per square meter of collecting aperture to build a large telescope than a small telescope; and increasing mass reduces cost.
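As a worked illustration of the stated rate of 50% cost reduction per 17 years, the sketch below assumes the reduction compounds smoothly as an exponential half-life. That functional form, and the function name `technology_cost_factor`, are assumptions for illustration only; the abstract gives the rate but not the form of the published cost estimating relationship.

```python
def technology_cost_factor(years_elapsed, halving_period_years=17.0):
    """Fraction of the original cost remaining after `years_elapsed` years.

    Assumes (hypothetically) that cost halves smoothly every
    `halving_period_years`; the 17-year figure is taken from the abstract.
    """
    return 0.5 ** (years_elapsed / halving_period_years)


if __name__ == "__main__":
    # Print the remaining cost fraction at a few illustrative horizons.
    for years in (0, 10, 17, 34):
        print(years, round(technology_cost_factor(years), 3))
```

Under this assumption, a telescope costed 34 years after the reference date would carry a quarter of the original technology-related cost, all else being equal.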
Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design.

PubMed

Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo

2017-03-15

Outcome-dependent sampling (ODS) is a cost-effective sampling scheme where one observes the exposure with a probability that depends on the outcome. Well-known designs of this kind are the case-control design for binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for the ODS with multivariate cases remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric where all the underlying distributions of covariates are modeled nonparametrically using the empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of association of polychlorinated biphenyl exposure to hearing loss in children born to the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.
Probability of detecting atrazine/desethyl-atrazine and elevated concentrations of nitrate in ground water in Colorado

USGS Publications Warehouse

Rupert, Michael G.

2003-01-01

Draft Federal regulations may require that each State develop a State Pesticide Management Plan for the herbicides atrazine, alachlor, metolachlor, and simazine. Maps were developed that the State of Colorado could use to predict the probability of detecting atrazine and desethyl-atrazine (a breakdown product of atrazine) in ground water in Colorado. These maps can be incorporated into the State Pesticide Management Plan and can help provide a sound hydrogeologic basis for atrazine management in Colorado. Maps showing the probability of detecting elevated nitrite plus nitrate as nitrogen (nitrate) concentrations in ground water in Colorado also were developed because nitrate is a contaminant of concern in many areas of Colorado. Maps showing the probability of detecting atrazine and(or) desethyl-atrazine (atrazine/DEA) at or greater than concentrations of 0.1 microgram per liter and nitrate concentrations in ground water greater than 5 milligrams per liter were developed as follows: (1) Ground-water quality data were overlaid with anthropogenic and hydrogeologic data using a geographic information system to produce a data set in which each well had corresponding data on atrazine use, fertilizer use, geology, hydrogeomorphic regions, land cover, precipitation, soils, and well construction. These data then were downloaded to a statistical software package for analysis by logistic regression. (2) Relations were observed between ground-water quality and the percentage of land-cover categories within circular regions (buffers) around wells. Several buffer sizes were evaluated; the buffer size that provided the strongest relation was selected for use in the logistic regression models. 
(3) Relations between concentrations of atrazine/DEA and nitrate in ground water and atrazine use, fertilizer use, geology, hydrogeomorphic regions, land cover, precipitation, soils, and well-construction data were evaluated, and several preliminary multivariate models with various combinations of independent variables were constructed. (4) The multivariate models that best predicted the presence of atrazine/DEA and elevated concentrations of nitrate in ground water were selected. (5) The accuracy of the multivariate models was confirmed by validating the models with an independent set of ground-water quality data. (6) The multivariate models were entered into a geographic information system and the probability maps were constructed.
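The final step of the procedure above, turning a fitted logistic regression into a probability for each map cell, can be sketched as follows. This is a minimal illustration only: the intercept, coefficients and predictor values are hypothetical placeholders, not values from the report, and the function name `detection_probability` is introduced here.

```python
import math


def detection_probability(intercept, coefficients, values):
    """Logistic-regression probability for one well or map cell.

    `coefficients` and `values` are parallel sequences of fitted slopes
    and predictor values (for example land-cover percentage in the buffer).
    """
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, values)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    # Hypothetical model with two predictors; numbers are illustrative only.
    p = detection_probability(-2.0, [0.03, 0.5], [40.0, 1.0])
    print(round(p, 3))
```

Evaluating such a function over every grid cell of the predictor layers is one plausible way the probability maps could be assembled in a geographic information system.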
Enhancing e-waste estimates: Improving data quality by multivariate Input–Output Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Feng, E-mail: fwang@unu.edu; Design for Sustainability Lab, Faculty of Industrial Design Engineering, Delft University of Technology, Landbergstraat 15, 2628CE Delft; Huisman, Jaco

2013-11-15

Highlights: • A multivariate Input–Output Analysis method for e-waste estimates is proposed. • Applying multivariate analysis to consolidate data can enhance e-waste estimates. • We examine the influence of model selection and data quality on e-waste estimates. • Datasets of all e-waste related variables in a Dutch case study have been provided. • Accurate modeling of time-variant lifespan distributions is critical for estimate. - Abstract: Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input–Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies.
Modeling rainfall-runoff relationship using multivariate GARCH model

NASA Astrophysics Data System (ADS)

Modarres, R.; Ouarda, T. B. M. J.

2013-08-01

The traditional hydrologic time series approaches are used for modeling, simulating and forecasting conditional mean of hydrologic variables but neglect their time varying variance or the second order moment. This paper introduces the multivariate Generalized Autoregressive Conditional Heteroscedasticity (MGARCH) modeling approach to show how the variance-covariance relationship between hydrologic variables varies in time. These approaches are also useful to estimate the dynamic conditional correlation between hydrologic variables. To illustrate the novelty and usefulness of MGARCH models in hydrology, two major types of MGARCH models, the bivariate diagonal VECH and constant conditional correlation (CCC) models are applied to show the variance-covariance structure and dynamic correlation in a rainfall-runoff process. The bivariate diagonal VECH-GARCH(1,1) and CCC-GARCH(1,1) models indicated both short-run and long-run persistency in the conditional variance-covariance matrix of the rainfall-runoff process. The conditional variance of rainfall appears to have a stronger persistency, especially long-run persistency, than the conditional variance of streamflow which shows a short-lived drastic increasing pattern and a stronger short-run persistency. The conditional covariance and conditional correlation coefficients have different features for each bivariate rainfall-runoff process with different degrees of stationarity and dynamic nonlinearity. The spatial and temporal pattern of variance-covariance features may reflect the signature of different physical and hydrological variables such as drainage area, topography, soil moisture and ground water fluctuations on the strength, stationarity and nonlinearity of the conditional variance-covariance for a rainfall-runoff process.
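The CCC-GARCH(1,1) structure named in the abstract can be sketched in a few lines: each series gets its own GARCH(1,1) conditional variance recursion, and the conditional covariance is a constant correlation times the product of the two conditional standard deviations. The parameter values and residual series below are hypothetical, and no estimation step is shown; this is a sketch of the textbook recursion, not the authors' fitted model.

```python
import math


def garch11_variances(residuals, omega, alpha, beta, initial_variance):
    """Conditional variances h_t = omega + alpha*e_{t-1}^2 + beta*h_{t-1}.

    Returns one variance per residual; the first equals `initial_variance`.
    """
    variances = [initial_variance]
    for eps in residuals[:-1]:
        variances.append(omega + alpha * eps ** 2 + beta * variances[-1])
    return variances


def ccc_covariances(variances_a, variances_b, rho):
    """Constant-conditional-correlation covariance: rho*sqrt(h_a*h_b)."""
    return [rho * math.sqrt(a * b) for a, b in zip(variances_a, variances_b)]


if __name__ == "__main__":
    # Hypothetical residuals for rainfall and runoff (illustrative only).
    rain = [1.0, 2.0, 0.0, -1.5]
    flow = [0.5, 1.0, -0.5, 0.2]
    h_rain = garch11_variances(rain, 0.1, 0.2, 0.7, 1.0)
    h_flow = garch11_variances(flow, 0.05, 0.3, 0.5, 0.5)
    print(ccc_covariances(h_rain, h_flow, rho=0.6))
```

In this notation `alpha` weights the most recent squared shock and `beta` carries the previous variance forward, which is one common way to read the short-run and long-run persistency discussed in the abstract.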
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models

ERIC Educational Resources Information Center

Price, Larry R.

2012-01-01

The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Characterizing multivariate decoding models based on correlated EEG spectral features.

PubMed

McFarland, Dennis J

2013-07-01

Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.

PubMed

Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin

2015-04-01

Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype model. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
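For orientation, the ordinary Simes procedure that MGAS builds on combines a set of p-values into a single one by taking the minimum of m * p(i) / i over the sorted p-values. The sketch below implements only that standard form; the abstract does not spell out the extension used by MGAS, so this is not MGAS itself, and the function name and example p-values are hypothetical.

```python
def simes_p_value(p_values):
    """Standard Simes combined p-value: min over i of m * p_(i) / i,
    where p_(1) <= ... <= p_(m) are the sorted input p-values."""
    ordered = sorted(p_values)
    m = len(ordered)
    if m == 0:
        raise ValueError("at least one p-value is required")
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(combined, 1.0)


if __name__ == "__main__":
    # Hypothetical SNP-by-phenotype p-values within one gene.
    print(simes_p_value([0.01, 0.04, 0.03]))
```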
Academic Misconduct among Nursing Students: A Multivariate Investigation.

ERIC Educational Resources Information Center

Daniel, Larry G.; And Others

1994-01-01

Using Maslow's Need-Goal Motivation Model, data from 190 nursing students showed moderately high correlation between perceptions of peers' maturity, commitment, and neutralizing attitude and perceptions of peers' engagement in academic misconduct. Neutralization (rationalizing behavior) was the strongest predictor. (SK)
Forecasting of municipal solid waste quantity in a developing country using multivariate grey models.

PubMed

Intharathirat, Rotchana; Abdul Salam, P; Kumar, S; Untong, Akarapong

2015-05-01

In order to plan, manage and use municipal solid waste (MSW) in a sustainable way, accurate forecasting of MSW generation and composition plays a key role. It is difficult to produce reliable estimates using the existing models due to the limited data available in developing countries. This study aims to forecast MSW collected in Thailand, with prediction intervals over the long term, by using an optimized multivariate grey model, a mathematical approach. For multivariate models, the representative factors of residential and commercial sectors affecting waste collected are identified, classified and quantified based on statistics and mathematics of grey system theory. Results show that GMC (1, 5), the grey model with convolution integral, is the most accurate with the least error of 1.16% MAPE. MSW collected would increase 1.40% per year from 43,435-44,994 tonnes per day in 2013 to 55,177-56,735 tonnes per day in 2030. This model also illustrates that population density is the most important factor affecting MSW collected, followed by urbanization, proportion employment and household size, respectively. These results mean that the representative factors of the commercial sector may affect MSW collected more than those of the residential sector. Results can help decision makers to develop long-term waste management measures and policies. Copyright © 2015 Elsevier Ltd. All rights reserved.
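The accuracy figure quoted above is a mean absolute percentage error (MAPE). A minimal sketch of how that metric is computed follows; the observed and forecast values are hypothetical and are not data from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes equal-length sequences and non-zero actual values.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical daily tonnages (observed vs. model forecast).
    observed = [100.0, 200.0, 400.0]
    predicted = [110.0, 190.0, 400.0]
    print(mape(observed, predicted))
```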
Mortality Prediction Model of Septic Shock Patients Based on Routinely Recorded Data

PubMed Central

Carrara, Marta; Baselli, Giuseppe; Ferrario, Manuela

2015-01-01

We studied the problem of mortality prediction in two datasets, the first composed of 23 septic shock patients and the second composed of 73 septic subjects selected from the public database MIMIC-II. For each patient we derived hemodynamic variables, laboratory results, and clinical information of the first 48 hours after shock onset and we performed univariate and multivariate analyses to predict mortality in the following 7 days. The results show interesting features that individually identify significant differences between survivors and nonsurvivors and features which gain importance only when considered together with the others in a multivariate regression model. This preliminary study on two small septic shock populations represents a novel contribution towards new personalized models for an integration of multiparameter patient information to improve critical care management of shock patients. PMID:26557154
Thermal signature identification system (TheSIS): a spread spectrum temperature cycling method

NASA Astrophysics Data System (ADS)

Merritt, Scott

2015-03-01

NASA GSFC's Thermal Signature Identification System (TheSIS) 1) measures the high order dynamic responses of optoelectronic components to direct sequence spread-spectrum temperature cycling, 2) estimates the parameters of multiple autoregressive moving average (ARMA) or other models of the responses, and 3) selects the most appropriate model using the Akaike Information Criterion (AIC). Using the AIC-tested model and parameter vectors from TheSIS, one can 1) select high-performing components on a multivariate basis, i.e., with multivariate Figures of Merit (FOMs), 2) detect subtle reversible shifts in performance, and 3) investigate irreversible changes in component or subsystem performance, e.g. aging. We show examples of the TheSIS methodology for passive and active components and systems, e.g. fiber Bragg gratings (FBGs) and DFB lasers with coupled temperature control loops, respectively.
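The model-selection step can be illustrated with the standard definition AIC = 2k - 2 ln L, where k is the number of fitted parameters and L the maximized likelihood; the candidate with the lowest AIC is preferred. The candidate names and log-likelihood values below are hypothetical, and the sketch does not reproduce how TheSIS fits its models.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_by_aic(candidates):
    """Return the name of the candidate with the lowest AIC.

    `candidates` maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical fits of two ARMA orders to the same response record.
    fits = {"ARMA(1,1)": (-100.0, 3), "ARMA(2,2)": (-99.5, 5)}
    print(select_by_aic(fits))
```

In this illustrative case the larger model improves the log-likelihood by only 0.5 while adding two parameters, so the penalty term favours the smaller one.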
Remote sensing estimation of the total phosphorus concentration in a large lake using band combinations and regional multivariate statistical modeling techniques.

PubMed

Gao, Yongnian; Gao, Junfeng; Yin, Hongbin; Liu, Chuansheng; Xia, Ting; Wang, Jing; Huang, Qi

2015-03-15

Remote sensing has been widely used for water quality monitoring, but most of these monitoring studies have only focused on a few water quality variables, such as chlorophyll-a, turbidity, and total suspended solids, which have typically been considered optically active variables. Remote sensing presents a challenge in estimating the phosphorus concentration in water. The total phosphorus (TP) in lakes has been estimated from remotely sensed observations, primarily using the simple individual band ratio or their natural logarithm and the statistical regression method based on the field TP data and the spectral reflectance. In this study, we investigated the possibility of establishing a spatial modeling scheme to estimate the TP concentration of a large lake from multi-spectral satellite imagery using band combinations and regional multivariate statistical modeling techniques, and we tested the applicability of the spatial modeling scheme. The results showed that HJ-1A CCD multi-spectral satellite imagery can be used to estimate the TP concentration in a lake. The correlation and regression analysis showed a highly significant positive relationship between the TP concentration and certain remotely sensed combination variables. The proposed modeling scheme had a higher accuracy for the TP concentration estimation in the large lake compared with the traditional individual band ratio method and the whole-lake scale regression-modeling scheme. The TP concentration values showed a clear spatial variability and were high in western Lake Chaohu and relatively low in eastern Lake Chaohu. The northernmost portion, the northeastern coastal zone and the southeastern portion of western Lake Chaohu had the highest TP concentrations, and the other regions had the lowest TP concentration values, except for the coastal zone of eastern Lake Chaohu. 
These results strongly suggested that the proposed modeling scheme, i.e., the band combinations and the regional multivariate statistical modeling techniques, demonstrated advantages for estimating the TP concentration in a large lake and had a strong potential for universal application for the TP concentration estimation in large lake waters worldwide. Copyright © 2014 Elsevier Ltd. All rights reserved.

Probabilistic, meso-scale flood loss modelling

NASA Astrophysics Data System (ADS)

Kreibich, Heidi; Botto, Anna; Schröter, Kai; Merz, Bruno

2016-04-01

Flood risk analyses are an important basis for decisions on flood risk management and adaptation. However, such analyses are associated with significant uncertainty, even more if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention during the last years, they are still not standard practice for flood risk assessments and even more for flood loss modelling. State of the art in flood loss modelling is still the use of simple, deterministic approaches like stage-damage functions. Novel probabilistic, multi-variate flood loss models have been developed and validated on the micro-scale using a data-mining approach, namely bagging decision trees (Merz et al. 2013). In this presentation we demonstrate and evaluate the upscaling of the approach to the meso-scale, namely on the basis of land-use units. The model is applied in 19 municipalities which were affected during the 2002 flood by the River Mulde in Saxony, Germany (Botto et al. submitted). The application of bagging decision tree based loss models provides a probability distribution of estimated loss per municipality. Validation is undertaken on the one hand via a comparison with eight deterministic loss models including stage-damage functions as well as multi-variate models. On the other hand the results are compared with official loss data provided by the Saxon Relief Bank (SAB). The results show that uncertainties of loss estimation remain high. Thus, the significant advantage of this probabilistic flood loss estimation approach is that it inherently provides quantitative information about the uncertainty of the prediction. References: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64. Botto A, Kreibich H, Merz B, Schröter K (submitted) Probabilistic, multi-variable flood loss modelling on the meso-scale with BT-FLEMO. Risk Analysis.
Mathematical models for exploring different aspects of genotoxicity and carcinogenicity databases.

PubMed

Benigni, R; Giuliani, A

1991-12-01

One great obstacle to understanding and using the information contained in the genotoxicity and carcinogenicity databases is the very size of such databases. Their vastness makes them difficult to read; this leads to inadequate exploitation of the information, which becomes costly in terms of time, labor, and money. In its search for adequate approaches to the problem, the scientific community has, curiously, almost entirely neglected an existent series of very powerful methods of data analysis: the multivariate data analysis techniques. These methods were specifically designed for exploring large data sets. This paper presents the multivariate techniques and reports a number of applications to genotoxicity problems. These studies show how biology and mathematical modeling can be combined and how successful this combination is.
Retention of community college students in online courses

NASA Astrophysics Data System (ADS)

Krajewski, Sarah

The issue of attrition in online courses at higher learning institutions remains a high priority in the United States. A recent rapid growth of online courses at community colleges has been instigated by student demand, as they meet the time constraints many nontraditional community college students have as a result of the need to work and care for dependents. Failure in an online course can cause students to become frustrated with the college experience, financially burdened, or to even give up and leave college. Attrition could be avoided by proper guidance of who is best suited for online courses. This study examined factors related to retention (i.e., course completion) and success (i.e., receiving a C or better) in an online biology course at a community college in the Midwest by operationalizing student characteristics (age, race, gender), student skills (whether or not the student met the criteria to be placed in an AFP course), and external factors (Pell recipient, full/part time status, first term) from the persistence model developed by Rovai. Internal factors from this model were not included in this study. Both univariate analyses and multivariate logistic regression were used to analyze the variables. Results suggest that race and Pell recipient were both predictive of course completion on univariate analyses. However, multivariate analyses showed that age, race, academic load and first term were predictive of completion and Pell recipient was no longer predictive. The univariate results for the C or better showed that age, race, Pell recipient, academic load, and meeting AFP criteria were predictive of success. Multivariate analyses showed that only age, race, and Pell recipient were significant predictors of success. Both regression models explained very little (<15%) of the variability within the outcome variables of retention and success. 
Therefore, although significant predictors were identified for course completion and retention, there are still many factors that remain unaccounted for in both regression models. Further research into the operationalization of Rovai's model, including internal factors, to predict completion and success is necessary.
Multivariate Models of Parent-Late Adolescent Gender Dyads: The Importance of Parenting Processes in Predicting Adjustment

ERIC Educational Resources Information Center

McKinney, Cliff; Renk, Kimberly

2008-01-01

Although parent-adolescent interactions have been examined, relevant variables have not been integrated into a multivariate model. As a result, this study examined a multivariate model of parent-late adolescent gender dyads in an attempt to capture important predictors in late adolescents' important and unique transition to adulthood. The sample…
Gene set analysis using variance component tests.

PubMed

Huang, Yen-Tsung; Lin, Xihong

2013-06-28

Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
Predicting microbiologically defined infection in febrile neutropenic episodes in children: global individual participant data multivariable meta-analysis

PubMed Central

Phillips, Robert S; Sung, Lillian; Amman, Roland A; Riley, Richard D; Castagnola, Elio; Haeusler, Gabrielle M; Klaassen, Robert; Tissing, Wim J E; Lehrnbecher, Thomas; Chisholm, Julia; Hakim, Hana; Ranasinghe, Neil; Paesmans, Marianne; Hann, Ian M; Stewart, Lesley A

2016-01-01

Background: Risk-stratified management of fever with neutropenia (FN) allows intensive management of high-risk cases and early discharge of low-risk cases. No single, internationally validated, prediction model of the risk of adverse outcomes exists for children and young people. An individual patient data (IPD) meta-analysis was undertaken to devise one. Methods: The ‘Predicting Infectious Complications in Children with Cancer’ (PICNICC) collaboration was formed by parent representatives and international clinical and methodological experts. Univariable and multivariable analyses, using random effects logistic regression, were undertaken to derive and internally validate a risk-prediction model for outcomes of episodes of FN based on clinical and laboratory data at presentation. Results: Data came from 22 different study groups in 15 countries, covering 5127 episodes of FN in 3504 patients. There were 1070 episodes in 616 patients from seven studies available for multivariable analysis. Univariable analyses showed associations with microbiologically defined infection (MDI) in many items, including higher temperature, lower white cell counts and acute myeloid leukaemia, but not age. Patients with osteosarcoma/Ewings sarcoma and those with more severe mucositis were associated with a decreased risk of MDI. The predictive model included: malignancy type, temperature, clinically ‘severely unwell’, haemoglobin, white cell count and absolute monocyte count. It showed moderate discrimination (AUROC 0.723, 95% confidence interval 0.711–0.759) and good calibration (calibration slope 0.95). The model was robust to bootstrap and cross-validation sensitivity analyses. Conclusions: This new prediction model for risk of MDI appears accurate. It requires prospective studies assessing implementation to assist clinicians and parents/patients in individualised decision making. PMID:26954719
A multivariate model and statistical method for validating tree grade lumber yield equations

Treesearch

Donald W. Seegrist

1975-01-01

Lumber yields within lumber grades can be described by a multivariate linear model. A method for validating lumber yield prediction equations when there are several tree grades is presented. The method is based on multivariate simultaneous test procedures.
Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data

PubMed Central

Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian

2015-01-01

In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of the RNA expression levels on DNA copy number alterations and the dependence of gene expression on DNA methylation through multivariate regression models and utilize boosting-type method to handle the high dimensionality as well as model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies. PMID:26609213
FGWAS: Functional genome wide association analysis.

PubMed

Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu

2017-10-01

Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.
Factorial Design Based Multivariate Modeling and Optimization of Tunable Bioresponsive Arginine Grafted Poly(cystaminebis(acrylamide)-diaminohexane) Polymeric Matrix Based Nanocarriers.

PubMed

Yang, Rongbing; Nam, Kihoon; Kim, Sung Wan; Turkson, James; Zou, Ye; Zuo, Yi Y; Haware, Rahul V; Chougule, Mahavir B

2017-01-03

Desired characteristics of nanocarriers are crucial to explore their therapeutic potential. This investigation aimed to develop tunable bioresponsive nanocarriers based on a newly synthesized arginine grafted poly(cystaminebis(acrylamide)-diaminohexane) [ABP] polymeric matrix by using an L9 Taguchi factorial design, a desirability function, and a multivariate method. The selected formulation and process parameters were ABP concentration, acetone concentration, the volume ratio of acetone to ABP solution, and drug concentration. The measured nanocarrier characteristics were particle size, polydispersity index, zeta potential, and percentage drug loading. Experimental validation of nanocarrier characteristics computed from the initially developed predictive model showed nonsignificant differences (p > 0.05). The multivariate modeling based optimized cationic nanocarrier formulation of <100 nm loaded with hydrophilic acetaminophen was readapted for loading hydrophobic etoposide without significant changes (p > 0.05) except for improved loading percentage. This is the first study focusing on ABP polymeric matrix based nanocarrier development. Nanocarrier particle size was stable in PBS 7.4 for 48 h. The increase of zeta potential at the lower pH of 6.4, compared to the physiological pH, showed possible endosomal escape capability. The glutathione triggered release at the physiological conditions indicated the competence of cytosolic targeting delivery of the loaded drug from bioresponsive nanocarriers. In conclusion, this systematic approach provides rational evaluation and prediction of a tunable bioresponsive ABP based matrix nanocarrier, built on a limited number of carefully selected experiments.
Tuning algorithms for fractional order internal model controllers for time delay processes

NASA Astrophysics Data System (ADS)

Muresan, Cristina I.; Dutta, Abhishek; Dulf, Eva H.; Pinar, Zehra; Maxim, Anca; Ionescu, Clara M.

2016-03-01

This paper presents two tuning algorithms for fractional-order internal model control (IMC) controllers for time delay processes. The two tuning algorithms are based on two specific closed-loop control configurations: the IMC control structure and the Smith predictor structure. In the latter, the equivalency between IMC and Smith predictor control structures is used to tune a fractional-order IMC controller as the primary controller of the Smith predictor structure. Fractional-order IMC controllers are designed in both cases in order to enhance the closed-loop performance and robustness of classical integer order IMC controllers. The tuning procedures are exemplified for both single-input-single-output as well as multivariable processes, described by first-order and second-order transfer functions with time delays. Different numerical examples are provided, including a general multivariable time delay process. Integer order IMC controllers are designed in each case, as well as fractional-order IMC controllers. The simulation results show that the proposed fractional-order IMC controller ensures an increased robustness to modelling uncertainties. Experimental results are also provided, for the design of a multivariable fractional-order IMC controller in a Smith predictor structure for a quadruple-tank system.
Predicting Outcomes After Chemo-Embolization in Patients with Advanced-Stage Hepatocellular Carcinoma: An Evaluation of Different Radiologic Response Criteria

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gunn, Andrew J., E-mail: agunn@uabmc.edu; Sheth, Rahul A.; Luber, Brandon

2017-01-15

Purpose: The purpose of this study was to evaluate the ability of various radiologic response criteria to predict patient outcomes after trans-arterial chemo-embolization with drug-eluting beads (DEB-TACE) in patients with advanced-stage (BCLC C) hepatocellular carcinoma (HCC). Materials and methods: Hospital records from 2005 to 2011 were retrospectively reviewed. Non-infiltrative lesions were measured at baseline and on follow-up scans after DEB-TACE according to various common radiologic response criteria, including guidelines of the World Health Organization (WHO), Response Evaluation Criteria in Solid Tumors (RECIST), the European Association for the Study of the Liver (EASL), and modified RECIST (mRECIST). Statistical analysis was performed to see which, if any, of the response criteria could be used as a predictor of overall survival (OS) or time-to-progression (TTP). Results: 75 patients met inclusion criteria. Median OS and TTP were 22.6 months (95 % CI 11.6–24.8) and 9.8 months (95 % CI 7.1–21.6), respectively. Univariate and multivariate Cox analyses revealed that none of the evaluated criteria had the ability to be used as a predictor for OS or TTP. Analysis of the C index in both univariate and multivariate models showed that the evaluated criteria were not accurate predictors of either OS (C-statistic range: 0.51–0.58 in the univariate model; range: 0.54–0.58 in the multivariate model) or TTP (C-statistic range: 0.55–0.59 in the univariate model; range: 0.57–0.61 in the multivariate model). Conclusion: Current response criteria are not accurate predictors of OS or TTP in patients with advanced-stage HCC after DEB-TACE.
Predicting Outcomes After Chemo-Embolization in Patients with Advanced-Stage Hepatocellular Carcinoma: An Evaluation of Different Radiologic Response Criteria.

PubMed

Gunn, Andrew J; Sheth, Rahul A; Luber, Brandon; Huynh, Minh-Huy; Rachamreddy, Niranjan R; Kalva, Sanjeeva P

2017-01-01

The purpose of this study was to evaluate the ability of various radiologic response criteria to predict patient outcomes after trans-arterial chemo-embolization with drug-eluting beads (DEB-TACE) in patients with advanced-stage (BCLC C) hepatocellular carcinoma (HCC). Hospital records from 2005 to 2011 were retrospectively reviewed. Non-infiltrative lesions were measured at baseline and on follow-up scans after DEB-TACE according to various common radiologic response criteria, including guidelines of the World Health Organization (WHO), Response Evaluation Criteria in Solid Tumors (RECIST), the European Association for the Study of the Liver (EASL), and modified RECIST (mRECIST). Statistical analysis was performed to see which, if any, of the response criteria could be used as a predictor of overall survival (OS) or time-to-progression (TTP). 75 patients met inclusion criteria. Median OS and TTP were 22.6 months (95 % CI 11.6-24.8) and 9.8 months (95 % CI 7.1-21.6), respectively. Univariate and multivariate Cox analyses revealed that none of the evaluated criteria had the ability to be used as a predictor for OS or TTP. Analysis of the C index in both univariate and multivariate models showed that the evaluated criteria were not accurate predictors of either OS (C-statistic range: 0.51-0.58 in the univariate model; range: 0.54-0.58 in the multivariate model) or TTP (C-statistic range: 0.55-0.59 in the univariate model; range: 0.57-0.61 in the multivariate model). Current response criteria are not accurate predictors of OS or TTP in patients with advanced-stage HCC after DEB-TACE.
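The C index reported above can be read as the fraction of comparable patient pairs that a predictor ranks in the right order, so values of 0.51 to 0.61 sit close to the 0.5 expected from chance. The sketch below is a minimal version of Harrell's concordance index for right-censored data. It is an illustration of the general statistic, not the study's analysis code, and the function and argument names are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which the higher risk score
    belongs to the subject with the earlier observed event.

    times: follow-up time per subject.
    events: 1 if the event was observed, 0 if the subject was censored.
    risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair is usable when subject i had an observed event
            # strictly before subject j's follow-up time
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

A predictor that carries no ranking information, for example one that assigns the same score to every patient, gets 0.5 from this function, which is the baseline the reported ranges should be compared against.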
Borrowing of strength and study weights in multivariate and network meta-analysis.

PubMed

Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

2017-12-01

Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of 'borrowing of strength'. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).
Borrowing of strength and study weights in multivariate and network meta-analysis

PubMed Central

Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D

2016-01-01

Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis). PMID:26546254
A refined method for multivariate meta-analysis and meta-regression

PubMed Central

Jackson, Daniel; Riley, Richard D

2014-01-01

Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351
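To make the idea of a scaling factor applied to the standard error concrete, the sketch below implements a univariate random-effects meta-analysis with a DerSimonian-Laird estimate of the between-study variance and a Hartung-Knapp type scaling of the standard error. Whether this matches the authors' exact formulation is an assumption, the multivariate extension described in the abstract is not attempted, and all names are hypothetical.

```python
import math

def random_effects_meta(effects, variances):
    """Return (pooled effect, tau^2, conventional SE, scaled SE).

    effects: study effect estimates.
    variances: within-study variances of those estimates.
    """
    k = len(effects)
    if k < 2:
        raise ValueError("need at least two studies")
    # Fixed-effect weights and Cochran's Q for the DerSimonian-Laird estimate
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q_stat = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q_stat - (k - 1)) / c)
    # Random-effects weights and pooled estimate
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_ws = sum(w_star)
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_ws
    se_conventional = math.sqrt(1.0 / sum_ws)
    # Scaling factor: weighted residual variance around the pooled estimate
    scale = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w_star, effects)) / (k - 1)
    se_scaled = math.sqrt(scale) * se_conventional
    return mu, tau2, se_conventional, se_scaled
```

In the Hartung-Knapp approach the scaled standard error is paired with a t distribution on k - 1 degrees of freedom rather than a normal reference distribution. The quantile lookup is omitted here to keep the sketch free of third-party dependencies.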
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

PubMed Central

Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.

2017-01-01

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571
Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

PubMed

Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

2017-02-14

Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.
Estimating Risk of Natural Gas Portfolios by Using GARCH-EVT-Copula Model.

PubMed

Tang, Jiechen; Zhou, Chao; Yuan, Xinyu; Sriboonchitta, Songsak

2015-01-01

This paper concentrates on estimating the risk of Title Transfer Facility (TTF) Hub natural gas portfolios by using the GARCH-EVT-copula model. We first use the univariate ARMA-GARCH model to model each natural gas return series. Second, the extreme value distribution (EVT) is fitted to the tails of the residuals to model marginal residual distributions. Third, multivariate Gaussian copula and Student t-copula are employed to describe the natural gas portfolio risk dependence structure. Finally, we simulate N portfolios and estimate value at risk (VaR) and conditional value at risk (CVaR). Our empirical results show that, for an equally weighted portfolio of five natural gases, the VaR and CVaR values obtained from the Student t-copula are larger than those obtained from the Gaussian copula. Moreover, when minimizing the portfolio risk, the optimal natural gas portfolio weights are found to be similar across the multivariate Gaussian copula and Student t-copula and different confidence levels.
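The final step described above, simulating portfolios and reading off VaR and CVaR, can be sketched as follows. The sketch assumes much more than the abstract states: the ARMA-GARCH filtering and EVT tail fitting are skipped, returns are drawn from a one-factor Gaussian dependence structure with equal pairwise correlation as a stand-in for the fitted copula, and the correlation, volatility, and function names are hypothetical placeholders.

```python
import math
import random

def var_cvar(losses, level=0.95):
    """Empirical VaR and CVaR of simulated losses at the given confidence level."""
    ordered = sorted(losses)
    # Index of the empirical quantile; losses from there upward form the tail
    k = min(int(math.ceil(level * len(ordered))) - 1, len(ordered) - 1)
    k = max(k, 0)
    var = ordered[k]
    tail = ordered[k:]
    cvar = sum(tail) / len(tail)
    return var, cvar

def simulate_portfolio_losses(n_sims, n_assets=5, rho=0.5, vol=0.02, seed=0):
    """Losses of an equally weighted portfolio under a one-factor Gaussian model."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_sims):
        common = rng.gauss(0.0, 1.0)  # shared factor induces correlation rho
        returns = [
            vol * (math.sqrt(rho) * common + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
            for _ in range(n_assets)
        ]
        losses.append(-sum(returns) / n_assets)  # loss is the negative return
    return losses
```

For example, `var_cvar(simulate_portfolio_losses(10000), 0.95)` returns the 95% VaR and the mean loss beyond it. CVaR is never smaller than VaR by construction, because it averages only losses at or above the VaR threshold.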
FT-IR/ATR univariate and multivariate calibration models for in situ monitoring of sugars in complex microalgal culture media.

PubMed

Girard, Jean-Michel; Deschênes, Jean-Sébastien; Tremblay, Réjean; Gagnon, Jonathan

2013-09-01

The objective of this work is to develop a quick and simple method for the in situ monitoring of sugars in biological cultures. A new technology based on Attenuated Total Reflectance-Fourier Transform Infrared (FT-IR/ATR) spectroscopy in combination with an external light guiding fiber probe was tested, first to build predictive models from solutions of pure sugars, and secondly to use those models to monitor the sugars in the complex culture medium of mixotrophic microalgae. Quantification results from the univariate model were correlated with the total dissolved solids content (R² = 0.74). A vector normalized multivariate model was used to proportionally quantify the different sugars present in the complex culture medium and showed a predictive accuracy of >90% for sugars representing >20% of the total. This method offers an alternative to conventional sugar monitoring assays and could be used at-line or on-line in commercial scale production systems. Copyright © 2013 Elsevier Ltd. All rights reserved.

Multivariate Heteroscedasticity Models for Functional Brain Connectivity.

PubMed

Seiler, Christof; Holmes, Susan

2017-01-01

Functional brain connectivity is the co-occurrence of brain activity in different areas during resting and while doing tasks. The data of interest are multivariate timeseries measured simultaneously across brain parcels using resting-state fMRI (rfMRI). We analyze functional connectivity using two heteroscedasticity models. Our first model is low-dimensional and scales linearly in the number of brain parcels. Our second model scales quadratically. We apply both models to data from the Human Connectome Project (HCP) comparing connectivity between short and conventional sleepers. We find stronger functional connectivity in short than conventional sleepers in brain areas consistent with previous findings. This might be due to subjects falling asleep in the scanner. Consequently, we recommend the inclusion of average sleep duration as a covariate to remove unwanted variation in rfMRI studies. A power analysis using the HCP data shows that a sample size of 40 detects 50% of the connectivity at a false discovery rate of 20%. We provide implementations using R and the probabilistic programming language Stan.
Estimating Risk of Natural Gas Portfolios by Using GARCH-EVT-Copula Model

PubMed Central

Tang, Jiechen; Zhou, Chao; Yuan, Xinyu; Sriboonchitta, Songsak

2015-01-01

This paper concentrates on estimating the risk of Title Transfer Facility (TTF) Hub natural gas portfolios by using the GARCH-EVT-copula model. We first use the univariate ARMA-GARCH model to model each natural gas return series. Second, the extreme value distribution (EVT) is fitted to the tails of the residuals to model marginal residual distributions. Third, multivariate Gaussian copula and Student t-copula are employed to describe the natural gas portfolio risk dependence structure. Finally, we simulate N portfolios and estimate value at risk (VaR) and conditional value at risk (CVaR). Our empirical results show that, for an equally weighted portfolio of five natural gases, the VaR and CVaR values obtained from the Student t-copula are larger than those obtained from the Gaussian copula. Moreover, when minimizing the portfolio risk, the optimal natural gas portfolio weights are found to be similar across the multivariate Gaussian copula and Student t-copula and different confidence levels. PMID:26351652
Multivariate Longitudinal Analysis with Bivariate Correlation Test

PubMed Central

Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory

2016-01-01

In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions for the estimators of the model’s parameters. These estimators can be used in the framework of multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated. PMID:27537692
Multivariable normal-tissue complication modeling of acute esophageal toxicity in advanced stage non-small cell lung cancer patients treated with intensity-modulated (chemo-)radiotherapy.

PubMed

Wijsman, Robin; Dankers, Frank; Troost, Esther G C; Hoffmann, Aswin L; van der Heijden, Erik H F M; de Geus-Oei, Lioe-Fee; Bussink, Johan

2015-10-01

The majority of normal-tissue complication probability (NTCP) models for acute esophageal toxicity (AET) in advanced stage non-small cell lung cancer (AS-NSCLC) patients treated with (chemo-)radiotherapy are based on three-dimensional conformal radiotherapy (3D-CRT). Due to distinct dosimetric characteristics of intensity-modulated radiation therapy (IMRT), 3D-CRT based models need revision. We established a multivariable NTCP model for AET in 149 AS-NSCLC patients undergoing IMRT. An established model selection procedure was used to develop an NTCP model for Grade ⩾2 AET (53 patients) including clinical and esophageal dose-volume histogram parameters. The NTCP model predicted an increased risk of Grade ⩾2 AET in case of: concurrent chemoradiotherapy (CCR) [adjusted odds ratio (OR) 14.08, 95% confidence interval (CI) 4.70-42.19; p<0.001], increasing mean esophageal dose [Dmean; OR 1.12 per Gy increase, 95% CI 1.06-1.19; p<0.001], female patients (OR 3.33, 95% CI 1.36-8.17; p=0.008), and ⩾cT3 (OR 2.7, 95% CI 1.12-6.50; p=0.026). The AUC was 0.82 and the model showed good calibration. A multivariable NTCP model including CCR, Dmean, clinical tumor stage and gender predicts Grade ⩾2 AET after IMRT for AS-NSCLC. Prior to clinical introduction, the model needs validation in an independent patient cohort. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
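The odds ratios reported above combine multiplicatively on the odds scale under the usual logistic-regression assumption of no interaction terms, which is an assumption here since the abstract does not state the model form in detail. The sketch below computes the resulting odds multiplier relative to a reference patient. The abstract gives no intercept, so absolute risks cannot be derived, and the function and parameter names are hypothetical.

```python
import math

# Adjusted odds ratios as reported in the abstract
ODDS_RATIOS = {"ccr": 14.08, "female": 3.33, "ct3_or_higher": 2.7}
OR_PER_GY = 1.12  # per 1 Gy increase in mean esophageal dose

def odds_multiplier(ccr, female, ct3_or_higher, dmean_increase_gy):
    """Multiplier on the odds of Grade >=2 AET relative to a reference patient.

    Assumes a logistic model without interactions, so log odds ratios add.
    """
    log_or = dmean_increase_gy * math.log(OR_PER_GY)
    if ccr:
        log_or += math.log(ODDS_RATIOS["ccr"])
    if female:
        log_or += math.log(ODDS_RATIOS["female"])
    if ct3_or_higher:
        log_or += math.log(ODDS_RATIOS["ct3_or_higher"])
    return math.exp(log_or)
```

For instance, a 10 Gy higher mean esophageal dose alone corresponds to an odds multiplier of 1.12 raised to the tenth power, roughly 3.1, which shows how a modest per-Gy odds ratio compounds over a clinically plausible dose difference.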
Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis.

PubMed

Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong

2018-02-27

Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.
Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis

PubMed Central

Ye, Lanhan; Song, Kunlin; Shen, Tingting

2018-01-01

Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445
Multivariate pattern dependence

PubMed Central

Saxe, Rebecca

2017-01-01

When we perform a cognitive task, multiple brain regions are engaged. Understanding how these regions interact is a fundamental step to uncover the neural bases of behavior. Most research on the interactions between brain regions has focused on the univariate responses in the regions. However, fine grained patterns of response encode important information, as shown by multivariate pattern analysis. In the present article, we introduce and apply multivariate pattern dependence (MVPD): a technique to study the statistical dependence between brain regions in humans in terms of the multivariate relations between their patterns of responses. MVPD characterizes the responses in each brain region as trajectories in region-specific multidimensional spaces, and models the multivariate relationship between these trajectories. We applied MVPD to the posterior superior temporal sulcus (pSTS) and to the fusiform face area (FFA), using a searchlight approach to reveal interactions between these seed regions and the rest of the brain. Across two different experiments, MVPD identified significant statistical dependence not detected by standard functional connectivity. Additionally, MVPD outperformed univariate connectivity in its ability to explain independent variance in the responses of individual voxels. In the end, MVPD uncovered different connectivity profiles associated with different representational subspaces of FFA: the first principal component of FFA shows differential connectivity with occipital and parietal regions implicated in the processing of low-level properties of faces, while the second and third components show differential connectivity with anterior temporal regions implicated in the processing of invariant representations of face identity. PMID:29155809
Improved estimation of PM2.5 using Lagrangian satellite-measured aerosol optical depth

NASA Astrophysics Data System (ADS)

Olivas Saunders, Rolando

Suspended particulate matter (aerosols) with aerodynamic diameters less than 2.5 μm (PM2.5) has negative effects on human health, plays an important role in climate change and also causes the corrosion of structures by acid deposition. Accurate estimates of PM2.5 concentrations are thus relevant in air quality, epidemiology, cloud microphysics and climate forcing studies. Aerosol optical depth (AOD) retrieved by the Moderate Resolution Imaging Spectroradiometer (MODIS) satellite instrument has been used as an empirical predictor to estimate ground-level concentrations of PM2.5. These estimates usually have large uncertainties and errors. The main objective of this work is to assess the value of using upwind (Lagrangian) MODIS-AOD as predictors in empirical models of PM2.5. The upwind locations of the Lagrangian AOD were estimated using modeled backward air trajectories. Since the specification of an arrival elevation is somewhat arbitrary, trajectories were calculated to arrive at four different elevations at ten measurement sites within the continental United States. A systematic examination revealed trajectory model calculations to be sensitive to starting elevation. With a 500 m difference in starting elevation, the 48-hr mean horizontal separation of trajectory endpoints was 326 km. When the difference in starting elevation was doubled and tripled to 1000 m and 1500 m, the mean horizontal separation of trajectory endpoints approximately doubled and tripled to 627 km and 886 km, respectively. A seasonal dependence of this sensitivity was also found: the smallest mean horizontal separation of trajectory endpoints was exhibited during the summer and the largest separations during the winter. A daily average AOD product was generated and coupled to the trajectory model in order to determine AOD values upwind of the measurement sites during the period 2003-2007.
Empirical models that included in situ AOD and upwind AOD as predictors of PM2.5 were generated by multivariate linear regressions using the least squares method. The multivariate models showed improved performance over the single variable regression (PM2.5 and in situ AOD) models. The statistical significance of the improvement of the multivariate models over the single variable regression models was tested using the extra sum of squares principle. In many cases, even when the R-squared was high for the multivariate models, the improvement over the single models was not statistically significant. The R-squared of these multivariate models varied with respect to seasons, with the best performance occurring during the summer months. A set of seasonal categorical variables was included in the regressions to exploit this variability. The multivariate regression models that included these categorical seasonal variables performed better than the models that didn't account for seasonal variability. Furthermore, 71% of these regressions exhibited improvement over the single variable models that was statistically significant at a 95% confidence level.
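The extra sum of squares principle mentioned above compares a reduced regression with a fuller one that nests it. A minimal sketch of the standard F statistic follows; the function name `extra_ss_f`, the residual sums of squares and the sample size are hypothetical and are not taken from the study.

```python
def extra_ss_f(rss_reduced, rss_full, p_reduced, p_full, n):
    """F statistic for nested least-squares models (extra sum of squares).

    rss_*: residual sums of squares of the reduced and full models.
    p_*:   number of fitted coefficients in each model, intercept included.
    n:     number of observations used in both fits.
    """
    df_num = p_full - p_reduced  # coefficients added by the full model
    df_den = n - p_full          # residual degrees of freedom of the full model
    if df_num <= 0 or df_den <= 0:
        raise ValueError("full model must add coefficients and leave residual df")
    return ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)


# Hypothetical example: reduced model = intercept + in situ AOD (2 coefficients),
# full model = reduced model + one upwind AOD predictor (3 coefficients).
f_stat = extra_ss_f(rss_reduced=100.0, rss_full=80.0, p_reduced=2, p_full=3, n=23)
print(f_stat)
```

The statistic is then compared with an F distribution on (df_num, df_den) degrees of freedom, for example with `scipy.stats.f.sf` if SciPy is available. A high R-squared in the full model does not by itself make this statistic large, which is consistent with the abstract's remark that improvements were often not significant.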
Sex Hormones and Sleep in Men and Women From the General Population: A Cross-Sectional Observational Study.

PubMed

Kische, Hanna; Ewert, Ralf; Fietze, Ingo; Gross, Stefan; Wallaschofski, Henri; Völzke, Henry; Dörr, Marcus; Nauck, Matthias; Obst, Anne; Stubbe, Beate; Penzel, Thomas; Haring, Robin

2016-11-01

Associations between sex hormones and sleep habits originate mainly from small and selected patient-based samples. We examined data from a population-based sample with various sleep characteristics and the major part of sex hormones measured by mass spectrometry. We used data from 204 men and 213 women of the cross-sectional Study of Health in Pomerania-TREND. Associations of total T (TT) and free T, androstenedione (ASD), estrone, estradiol (E2), dehydroepiandrosterone-sulphate, SHBG, and E2 to TT ratio with sleep measures (including total sleep time, sleep efficiency, wake after sleep onset, apnea-hypopnea index [AHI], Insomnia Severity Index, Epworth Sleepiness Scale, and Pittsburgh Sleep Quality Index) were assessed by sex-specific multivariable regression models. In men, age-adjusted associations of TT (odds ratio 0.62; 95% confidence interval (CI) 0.46-0.83), free T, and SHBG with AHI were rendered nonsignificant after multivariable adjustment. In multivariable analyses, ASD was associated with Epworth Sleepiness Scale (β-coefficient per SD increase in ASD: -0.71; 95% CI: -1.18 to -0.25). In women, multivariable analyses showed positive associations of dehydroepiandrosterone-sulphate with wake after sleep onset (β-coefficient: .16; 95% CI 0.03-0.28) and of E2 and E2 to TT ratio with Epworth Sleepiness Scale. Additionally, free T and SHBG were associated with AHI in multivariable models among premenopausal women. The present cross-sectional, population-based study observed sex-specific associations of androgens, E2, and SHBG with sleep apnea and daytime sleepiness. However, multivariable-adjusted analyses confirmed the impact of body composition and health-related lifestyle on the association between sex hormones and sleep.
Path analysis and multi-criteria decision making: an approach for multivariate model selection and analysis in health.

PubMed

Vasconcelos, A G; Almeida, R M; Nobre, F F

2001-08-01

This paper introduces an approach that includes non-quantitative factors for the selection and assessment of multivariate complex models in health. A goodness-of-fit based methodology combined with a fuzzy multi-criteria decision-making approach is proposed for model selection. Models were obtained using the Path Analysis (PA) methodology in order to explain the interrelationship between health determinants and the post-neonatal component of infant mortality in 59 municipalities of Brazil in the year 1991. Socioeconomic and demographic factors were used as exogenous variables, and environmental, health service and agglomeration as endogenous variables. Five PA models were developed and accepted by statistical criteria of goodness-of-fit. These models were then submitted to a group of experts, seeking to characterize their preferences, according to predefined criteria that tried to evaluate model relevance and plausibility. Fuzzy set techniques were used to rank the alternative models according to the number of times a model was superior to ("dominated") the others. The best-ranked model explained more than 90% of the variation in the endogenous variables, and showed the favorable influences of income and education levels on post-neonatal mortality. It also showed the unfavorable effect on mortality of fast population growth, through precarious dwelling conditions and decreased access to sanitation. It was possible to aggregate expert opinions in model evaluation. The proposed procedure for model selection allowed the inclusion of subjective information in a clear and systematic manner.
Risk factors for low receptive vocabulary abilities in the preschool and early school years in the longitudinal study of Australian children.

PubMed

Christensen, Daniel; Zubrick, Stephen R; Lawrence, David; Mitrou, Francis; Taylor, Catherine L

2014-01-01

Receptive vocabulary development is a component of the human language system that emerges in the first year of life and is characterised by onward expansion throughout life. Beginning in infancy, children's receptive vocabulary knowledge builds the foundation for oral language and reading skills. The foundations for success at school are built early, hence the public health policy focus on reducing developmental inequalities before children start formal school. The underlying assumption is that children's development is stable, and therefore predictable, over time. This study investigated this assumption in relation to children's receptive vocabulary ability. We investigated the extent to which low receptive vocabulary ability at 4 years was associated with low receptive vocabulary ability at 8 years, and the predictive utility of a multivariate model that included child, maternal and family risk factors measured at 4 years. The study sample comprised 3,847 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Multivariate logistic regression was used to investigate risks for low receptive vocabulary ability from 4-8 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. In the multivariate model, substantial risk factors for receptive vocabulary delay from 4-8 years, in order of descending magnitude, were low receptive vocabulary ability at 4 years, low maternal education, and low school readiness. Moderate risk factors, in order of descending magnitude, were low maternal parenting consistency, socio-economic area disadvantage, low temperamental persistence, and NESB status. The following risk factors were not significant: One or more siblings, low family income, not reading to the child, high maternal work hours, and Aboriginal or Torres Strait Islander ethnicity. 
The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude does not do particularly well in predicting low receptive vocabulary ability from 4-8 years.
A Robust Bayesian Approach for Structural Equation Models with Missing Data

ERIC Educational Resources Information Center

Lee, Sik-Yum; Xia, Ye-Mao

2008-01-01

In this paper, normal/independent distributions, including but not limited to the multivariate t distribution, the multivariate contaminated distribution, and the multivariate slash distribution, are used to develop a robust Bayesian approach for analyzing structural equation models with complete or missing data. In the context of a nonlinear…
Lameness detection in dairy cattle: single predictor v. multivariate analysis of image-based posture processing and behaviour and performance sensing.

PubMed

Van Hertem, T; Bahr, C; Schlageter Tello, A; Viazzi, S; Steensels, M; Romanini, C E B; Lokhorst, C; Maltz, E; Halachmi, I; Berckmans, D

2016-09-01

The objective of this study was to evaluate whether a multi-sensor system (milk, activity, body posture) was a better classifier for lameness than the single-sensor-based detection models. Between September 2013 and August 2014, 3629 cow observations were collected on a commercial dairy farm in Belgium. Human locomotion scoring was used as reference for the model development and evaluation. Cow behaviour and performance were measured with existing sensors that were already present at the farm. A prototype of a three-dimensional-based video recording system was used to automatically quantify the back posture of a cow. For the single predictor comparisons, a receiver operating characteristics curve was made. For the multivariate detection models, logistic regression and generalized linear mixed models (GLMM) were developed. The best lameness classification model was obtained by the multi-sensor analysis (area under the receiver operating characteristics curve (AUC)=0.757±0.029), containing a combination of milk and milking variables, activity and gait and posture variables from videos. Second, the multivariate video-based system (AUC=0.732±0.011) performed better than the multivariate milk sensors (AUC=0.604±0.026) and the multivariate behaviour sensors (AUC=0.633±0.018). The video-based system performed better than the combined behaviour and performance-based detection model (AUC=0.669±0.028), indicating that it is worthwhile to consider a video-based lameness detection system, regardless of the presence of other existing sensors in the farm. The results suggest that Θ2, the feature variable for the back curvature around the hip joints, with an AUC of 0.719 is the best single predictor variable for lameness detection based on locomotion scoring. In general, this study showed that the video-based back posture monitoring system outperforms the behaviour and performance sensing techniques for locomotion scoring-based lameness detection.
A GLMM with seven specific variables (walking speed, back posture measurement, daytime activity, milk yield, lactation stage, milk peak flow rate and milk peak conductivity) is the best combination of variables for lameness classification. The accuracy on four-level lameness classification was 60.3%. The accuracy improved to 79.8% for binary lameness classification. The binary GLMM obtained a sensitivity of 68.5% and a specificity of 87.6%, which both exceed the sensitivity (52.1%±4.7%) and specificity (83.2%±2.3%) of the multi-sensor logistic regression model. This shows that the repeated measures analysis in the GLMM, taking into account the individual history of the animal, outperforms the classification when thresholds based on herd level (a statistical population) are used.
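The accuracy, sensitivity and specificity figures quoted above all derive from a binary confusion matrix. A minimal sketch of those definitions follows; the function name `binary_metrics` and the counts are hypothetical and do not reproduce the study's data.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from binary confusion-matrix counts.

    tp: lame cows classified as lame        fn: lame cows classified as non-lame
    tn: non-lame cows classified non-lame   fp: non-lame cows classified as lame
    """
    sensitivity = tp / (tp + fn)  # share of truly lame cows that were detected
    specificity = tn / (tn + fp)  # share of non-lame cows correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sens, spec, acc = binary_metrics(tp=50, fn=50, tn=90, fp=10)
print(sens, spec, acc)
```

Accuracy depends on how many lame and non-lame observations are in the data, so it can sit anywhere between sensitivity and specificity, as it does in the figures reported in the abstract.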
Multivariate matrix model for source identification of inrush water: A case study from Renlou and Tongting coal mine in northern Anhui province, China

NASA Astrophysics Data System (ADS)

Zhang, Jun; Yao, Duoxi; Su, Yue

2018-02-01

Given current energy demand, coal will remain one of the major energy sources in China for some time, so the task of coal mine safety production remains arduous. In order to identify the water source of the mine accurately, this article takes the Renlou and Tongting coal mines in the northern Anhui mining area as an example. A total of 7 conventional water chemical indexes were selected, including Ca2+, Mg2+, Na+ + K+, Cl-, SO42-, HCO3- and TDS, to establish a multivariate matrix model for identifying the source of inrush water. The results show that the model is simple, is rarely limited by the quantity of water samples, and gives an ideal recognition effect, so it can be applied to the control and treatment of water inrush.
A comparison of multivariate and univariate time series approaches to modelling and forecasting emergency department demand in Western Australia.

PubMed

Aboagye-Sarfo, Patrick; Mai, Qun; Sanfilippo, Frank M; Preen, David B; Stewart, Louise M; Fatovich, Daniel M

2015-10-01

To develop multivariate vector-ARMA (VARMA) forecast models for predicting emergency department (ED) demand in Western Australia (WA) and compare them to the benchmark univariate autoregressive moving average (ARMA) and Winters' models. Seven-year monthly WA state-wide public hospital ED presentation data from 2006/07 to 2012/13 were modelled. Graphical and VARMA modelling methods were used for descriptive analysis and model fitting. The VARMA models were compared to the benchmark univariate ARMA and Winters' models to determine their accuracy to predict ED demand. The best models were evaluated by using error correction methods for accuracy. Descriptive analysis of all the dependent variables showed an increasing pattern of ED use with seasonal trends over time. The VARMA models provided a more precise and accurate forecast with smaller confidence intervals and better measures of accuracy in predicting ED demand in WA than the ARMA and Winters' method. VARMA models are a reliable forecasting method to predict ED demand for strategic planning and resource allocation. While the ARMA models are a closely competing alternative, they under-estimated future ED demand. Copyright © 2015 Elsevier Inc. All rights reserved.
Multivariate non-normally distributed random variables in climate research - introduction to the copula approach

NASA Astrophysics Data System (ADS)

Schölzel, C.; Friederichs, P.

2008-10-01

Probability distributions of multivariate random variables are generally more complex than their univariate counterparts, which is due to a possible nonlinear dependence between the random variables. One approach to this problem is the use of copulas, which have become popular over recent years, especially in fields like econometrics, finance, risk management, or insurance. Since this newly emerging field includes various practices, a controversial discussion, and a vast body of literature, it is difficult to get an overview. The aim of this paper is therefore to provide a brief overview of copulas for application in meteorology and climate research. We examine the advantages and disadvantages compared to alternative approaches such as mixture models, summarize the current problem of goodness-of-fit (GOF) tests for copulas, and discuss the connection with multivariate extremes. An application to station data shows the simplicity and the capabilities as well as the limitations of this approach. Observations of daily precipitation and temperature are fitted to a bivariate model and demonstrate that copulas are a valuable complement to the commonly used methods.
Evaluation of multivariate linear regression and artificial neural networks in prediction of water quality parameters

PubMed Central

2014-01-01

This paper examined the efficiency of multivariate linear regression (MLR) and artificial neural network (ANN) models in prediction of two major water quality parameters in a wastewater treatment plant. Biochemical oxygen demand (BOD) and chemical oxygen demand (COD) as well as indirect indicators of organic matter are representative parameters for sewer water quality. Performance of the ANN models was evaluated using coefficient of correlation (r), root mean square error (RMSE) and bias values. The values of BOD and COD computed by the model, ANN method and regression analysis were in close agreement with their respective measured values. Results showed that the ANN model performed better than the MLR model. Comparative indices of the optimized ANN with input values of temperature (T), pH, total suspended solid (TSS) and total suspended (TS) were RMSE = 25.1 mg/L, r = 0.83 for prediction of BOD and RMSE = 49.4 mg/L, r = 0.81 for prediction of COD. It was found that the ANN model could be employed successfully in estimating the BOD and COD in the inlet of wastewater biochemical treatment plants. Moreover, sensitivity analysis results showed that the pH parameter has more effect on BOD and COD prediction than the other parameters. Also, both implemented models predicted BOD better than COD. PMID:24456676
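The three evaluation indices named above (r, RMSE and bias) can be sketched as follows. The definitions are the common textbook ones, with bias taken as mean predicted minus mean measured value, which is an assumption since the abstract does not define it. The function names and the sample values are hypothetical.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def bias(measured, predicted):
    """Mean signed error (assumed convention: predicted minus measured)."""
    n = len(measured)
    return sum(p - m for m, p in zip(measured, predicted)) / n


def pearson_r(measured, predicted):
    """Pearson coefficient of correlation."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    ss_m = sum((m - mean_m) ** 2 for m in measured)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(ss_m * ss_p)


# Hypothetical BOD values in mg/L: every prediction is 1 mg/L too high.
measured = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
print(rmse(measured, predicted), bias(measured, predicted), pearson_r(measured, predicted))
```

The hypothetical data show why the indices are reported together: a constant offset leaves r at 1 while RMSE and bias both register the error.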
Application of multivariate Gaussian detection theory to known non-Gaussian probability density functions

NASA Astrophysics Data System (ADS)

Schwartz, Craig R.; Thelen, Brian J.; Kenton, Arthur C.

1995-06-01

A statistical parametric multispectral sensor performance model was developed by ERIM to support mine field detection studies, multispectral sensor design/performance trade-off studies, and target detection algorithm development. The model assumes target detection algorithms and their performance models which are based on data assumed to obey multivariate Gaussian probability distribution functions (PDFs). The applicability of these algorithms and performance models can be generalized to data having non-Gaussian PDFs through the use of transforms which convert non-Gaussian data to Gaussian (or near-Gaussian) data. An example of one such transform is the Box-Cox power law transform. In practice, such a transform can be applied to non-Gaussian data prior to the introduction of a detection algorithm that is formally based on the assumption of multivariate Gaussian data. This paper presents an extension of these techniques to the case where the joint multivariate probability density function of the non-Gaussian input data is known, and where the joint estimate of the multivariate Gaussian statistics, under the Box-Cox transform, is desired. The jointly estimated multivariate Gaussian statistics can then be used to predict the performance of a target detection algorithm which has an associated Gaussian performance model.
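For reference, the Box-Cox power law transform named above has a standard one-parameter form: (x^λ - 1)/λ for λ ≠ 0 and ln x for λ = 0, defined for positive x. The sketch below shows only that scalar transform, not the joint estimation procedure the paper describes; the function name `box_cox` and the sample values are hypothetical.

```python
from math import log


def box_cox(x, lam):
    """One-parameter Box-Cox transform of a single positive value."""
    if x <= 0:
        raise ValueError("Box-Cox transform requires x > 0")
    if lam == 0:
        return log(x)  # limiting case of (x**lam - 1) / lam as lam -> 0
    return (x ** lam - 1.0) / lam


# Hypothetical right-skewed data compressed toward symmetry with lam = 0.5.
data = [1.0, 4.0, 9.0, 100.0]
print([box_cox(x, 0.5) for x in data])
```

In practice λ is chosen so that the transformed data are as close to Gaussian as possible, after which a detection algorithm that assumes Gaussian statistics can be applied.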
A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution.

PubMed

Inouye, David; Yang, Eunho; Allen, Genevera; Ravikumar, Pradeep

2017-01-01

The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section.
Detecting event-related changes of multivariate phase coupling in dynamic brain networks.

PubMed

Canolty, Ryan T; Cadieu, Charles F; Koepsell, Kilian; Ganguly, Karunesh; Knight, Robert T; Carmena, Jose M

2012-04-01

Oscillatory phase coupling within large-scale brain networks is a topic of increasing interest within systems, cognitive, and theoretical neuroscience. Evidence shows that brain rhythms play a role in controlling neuronal excitability and response modulation (Haider B, McCormick D. Neuron 62: 171-189, 2009) and regulate the efficacy of communication between cortical regions (Fries P. Trends Cogn Sci 9: 474-480, 2005) and distinct spatiotemporal scales (Canolty RT, Knight RT. Trends Cogn Sci 14: 506-515, 2010). In this view, anatomically connected brain areas form the scaffolding upon which neuronal oscillations rapidly create and dissolve transient functional networks (Lakatos P, Karmos G, Mehta A, Ulbert I, Schroeder C. Science 320: 110-113, 2008). Importantly, testing these hypotheses requires methods designed to accurately reflect dynamic changes in multivariate phase coupling within brain networks. Unfortunately, phase coupling between neurophysiological signals is commonly investigated using suboptimal techniques. Here we describe how a recently developed probabilistic model, phase coupling estimation (PCE; Cadieu C, Koepsell K Neural Comput 44: 3107-3126, 2010), can be used to investigate changes in multivariate phase coupling, and we detail the advantages of this model over the commonly employed phase-locking value (PLV; Lachaux JP, Rodriguez E, Martinerie J, Varela F. Human Brain Map 8: 194-208, 1999). We show that the N-dimensional PCE is a natural generalization of the inherently bivariate PLV. Using simulations, we show that PCE accurately captures both direct and indirect (network mediated) coupling between network elements in situations where PLV produces erroneous results. We present empirical results on recordings from humans and nonhuman primates and show that the PCE-estimated coupling values are different from those using the bivariate PLV. 
Critically on these empirical recordings, PCE output tends to be sparser than the PLVs, indicating fewer significant interactions and perhaps a more parsimonious description of the data. Finally, the physical interpretation of PCE parameters is straightforward: the PCE parameters correspond to interaction terms in a network of coupled oscillators. Forward modeling of a network of coupled oscillators with parameters estimated by PCE generates synthetic data with statistical characteristics identical to empirical signals. Given these advantages over the PLV, PCE is a useful tool for investigating multivariate phase coupling in distributed brain networks.
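To make the bivariate baseline concrete, the phase-locking value can be sketched as the magnitude of the average unit phasor of the phase differences between two signals. This is a minimal illustration of PLV only; it is not the PCE model, and the averaging here runs over a list of paired phase samples, whereas published uses often average over trials. The function name `plv` and the phase values are hypothetical.

```python
import cmath
from math import pi


def plv(phases_a, phases_b):
    """Phase-locking value of two equally long sequences of phases (radians)."""
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("need two non-empty phase sequences of equal length")
    phasors = (cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(sum(phasors) / len(phases_a))


# Constant phase lag: the phase differences all agree, so PLV is close to 1.
locked = plv([0.1, 0.5, 1.2], [0.0, 0.4, 1.1])
# Phase differences spread evenly around the circle: PLV is close to 0.
unlocked = plv([0.0, pi / 2, pi, 3 * pi / 2], [0.0, 0.0, 0.0, 0.0])
print(locked, unlocked)
```

Because the measure only ever looks at one pair of signals, it cannot separate direct coupling from coupling mediated by a third signal, which is the limitation the abstract attributes to PLV.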

Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs

PubMed Central

Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin

2013-01-01

In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol-lowering drugs.

PubMed

Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin

2013-10-15

In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
Testing for significance of phase synchronisation dynamics in the EEG.

PubMed

Daly, Ian; Sweeney-Reed, Catherine M; Nasuto, Slawomir J

2013-06-01

A number of tests exist to check for statistical significance of phase synchronisation within the Electroencephalogram (EEG); however, the majority suffer from a lack of generality and applicability. They may also fail to account for temporal dynamics in the phase synchronisation, regarding synchronisation as a constant state instead of a dynamical process. Therefore, a novel test is developed for identifying the statistical significance of phase synchronisation based upon a combination of work characterising temporal dynamics of multivariate time-series and Markov modelling. We show how this method is better able to assess the significance of phase synchronisation than a range of commonly used significance tests. We also show how the method may be applied to identify and classify significantly different phase synchronisation dynamics in both univariate and multivariate datasets.
A mixed model for the relationship between climate and human cranial form.

PubMed

Katz, David C; Grote, Mark N; Weaver, Timothy D

2016-08-01

We expand upon a multivariate mixed model from quantitative genetics in order to estimate the magnitude of climate effects in a global sample of recent human crania. In humans, genetic distances are correlated with distances based on cranial form, suggesting that population structure influences both genetic and quantitative trait variation. Studies controlling for this structure have demonstrated significant underlying associations of cranial distances with ecological distances derived from climate variables. However, to assess the biological importance of an ecological predictor, estimates of effect size and uncertainty in the original units of measurement are clearly preferable to significance claims based on units of distance. Unfortunately, the magnitudes of ecological effects are difficult to obtain with distance-based methods, while models that produce estimates of effect size generally do not scale to high-dimensional data like cranial shape and form. Using recent innovations that extend quantitative genetics mixed models to highly multivariate observations, we estimate morphological effects associated with a climate predictor for a subset of the Howells craniometric dataset. Several measurements, particularly those associated with cranial vault breadth, show a substantial linear association with climate, and the multivariate model incorporating a climate predictor is preferred in model comparison. Previous studies demonstrated the existence of a relationship between climate and cranial form. The mixed model quantifies this relationship concretely. Evolutionary questions that require population structure and phylogeny to be disentangled from potential drivers of selection may be particularly well addressed by mixed models. Am J Phys Anthropol 160:593-603, 2016. © 2015 Wiley Periodicals, Inc.
Multivariate hydrological frequency analysis for extreme events using Archimedean copula. Case study: Lower Tunjuelo River basin (Colombia)

NASA Astrophysics Data System (ADS)

Gómez, Wilmar

2017-04-01

Analyzing the spatial and temporal variability of extreme precipitation events can help prevent or reduce the associated threat and risk. Many water resources projects require joint probability distributions of random variables such as precipitation intensity and duration, which cannot be assumed independent of each other. The problem of defining a probability model for observations of several dependent variables is greatly simplified by expressing the joint distribution in terms of its marginals through copulas. This document presents a general framework for bivariate and multivariate frequency analysis using Archimedean copulas for extreme hydroclimatological events such as severe storms. This analysis was conducted in the lower Tunjuelo River basin in Colombia for precipitation events. The results show that, for a joint study of intensity, duration and frequency, IDF curves can be obtained through copulas, thus providing more accurate and reliable information on design storms and associated risks. The study shows how the use of copulas greatly simplifies the study of multivariate distributions and introduces the concept of joint return period, which is used to properly represent the needs of hydrological design in frequency analysis.
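A minimal sketch of the joint return period idea follows, using the Gumbel-Hougaard copula as one example of an Archimedean family. The abstract does not say which family was fitted, so this choice, the function names, the parameter values and the mean inter-arrival time `mu` are all assumptions made for illustration.

```python
from math import exp, log


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) for 0 < u, v < 1 and theta >= 1."""
    if theta < 1:
        raise ValueError("Gumbel-Hougaard copula requires theta >= 1")
    s = (-log(u)) ** theta + (-log(v)) ** theta
    return exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, theta, mu=1.0):
    """Return (T_or, T_and) for marginal non-exceedance probabilities u and v.

    T_or:  return period of events where either variable exceeds its threshold.
    T_and: return period of events where both variables exceed their thresholds.
    mu:    mean inter-arrival time of events, in years (assumed value).
    """
    c = gumbel_copula(u, v, theta)
    t_or = mu / (1.0 - c)
    t_and = mu / (1.0 - u - v + c)
    return t_or, t_and


# Hypothetical thresholds at the 0.9 quantile of intensity and of duration.
print(joint_return_periods(0.9, 0.9, theta=1.0))  # theta = 1 means independence
print(joint_return_periods(0.9, 0.9, theta=2.0))  # stronger upper dependence
```

With theta = 1 the copula reduces to the product u·v, so the "both exceed" return period is the product of the marginal ones. Larger theta makes joint exceedance more likely, which shortens that return period.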
A Diagnostic Calculator for Detecting Glaucoma on the Basis of Retinal Nerve Fiber Layer, Optic Disc, and Retinal Ganglion Cell Analysis by Optical Coherence Tomography.

PubMed

Larrosa, José Manuel; Moreno-Montañés, Javier; Martinez-de-la-Casa, José María; Polo, Vicente; Velázquez-Villoria, Álvaro; Berrozpe, Clara; García-Granero, Marta

2015-10-01

The purpose of this study was to develop and validate a multivariate predictive model to detect glaucoma by using a combination of retinal nerve fiber layer (RNFL), retinal ganglion cell-inner plexiform (GCIPL), and optic disc parameters measured using spectral-domain optical coherence tomography (OCT). Five hundred eyes from 500 participants and 187 eyes of another 187 participants were included in the study and validation groups, respectively. Patients with glaucoma were classified in five groups based on visual field damage. Sensitivity and specificity of all glaucoma OCT parameters were analyzed. Receiver operating characteristic curves (ROC) and areas under the ROC (AUC) were compared. Three predictive multivariate models (quantitative, qualitative, and combined) that used a combination of the best OCT parameters were constructed. A diagnostic calculator was created using the combined multivariate model. The best AUC parameters were: inferior RNFL, average RNFL, vertical cup/disc ratio, minimal GCIPL, and inferior-temporal GCIPL. Comparisons among the parameters did not show that the GCIPL parameters were better than those of the RNFL in early and advanced glaucoma. The highest AUC was in the combined predictive model (0.937; 95% confidence interval, 0.911-0.957) and was significantly (P = 0.0001) higher than the other isolated parameters considered in early and advanced glaucoma. The validation group displayed similar results to those of the study group. Best GCIPL, RNFL, and optic disc parameters showed a similar ability to detect glaucoma. The combined predictive formula improved the glaucoma detection compared to the best isolated parameters evaluated. The diagnostic calculator obtained good classification from participants in both the study and validation groups.
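As a brief illustration of the AUC statistic used to compare the parameters above, the sketch below computes the area under the ROC curve as the probability that a randomly chosen diseased eye scores higher than a randomly chosen healthy eye (the Mann-Whitney formulation). The scores are hypothetical and the function name is illustrative; this is not the authors' analysis code.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model outputs for glaucomatous and healthy eyes.
glaucoma_scores = [0.9, 0.8, 0.4]
healthy_scores = [0.5, 0.3, 0.2]
print(auc(glaucoma_scores, healthy_scores))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.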
Semiparametric Thurstonian Models for Recurrent Choices: A Bayesian Analysis

ERIC Educational Resources Information Center

Ansari, Asim; Iyengar, Raghuram

2006-01-01

We develop semiparametric Bayesian Thurstonian models for analyzing repeated choice decisions involving multinomial, multivariate binary or multivariate ordinal data. Our modeling framework has multiple components that together yield considerable flexibility in modeling preference utilities, cross-sectional heterogeneity and parameter-driven…
Combined Prediction Model of Death Toll for Road Traffic Accidents Based on Independent and Dependent Variables

PubMed Central

Feng, Zhong-xiang; Lu, Shi-sheng; Zhang, Wei-hua; Zhang, Nan-nan

2014-01-01

To build a combined model that follows the variation pattern of death toll data for road traffic accidents, reflects the influence of multiple factors on traffic accidents, and improves accident prediction accuracy, a Verhulst model was built on the road traffic accident death toll in China from 2002 to 2011; car ownership, population, GDP, highway freight volume, highway passenger transportation volume, and highway mileage were chosen as the factors for a multivariate linear regression model of the death toll. The two models were then combined into a weighted prediction model, with the weight coefficients calculated by the Shapley value method according to each model's contribution. Finally, the combined model was used to recalculate the death toll from 2002 to 2011 and was compared with the Verhulst and multivariate linear regression models. The results showed that the new model could not only characterize the death toll data but also quantify the degree of influence of each factor on the death toll, and that it had high accuracy as well as strong practicability. PMID:25610454
Interaction of Soil Heavy Metal Pollution with Industrialisation and the Landscape Pattern in Taiyuan City, China

PubMed Central

Liu, Yong; Su, Chao; Zhang, Hong; Li, Xiaoting; Pei, Jingfei

2014-01-01

Many studies have indicated that industrialisation and urbanisation have caused serious soil heavy metal pollution since the industrial age. However, few previous studies have conducted a combined analysis of the landscape pattern, urbanisation, industrialisation, and heavy metal pollution. This paper explores the relationships of heavy metals in the soil (Pb, Cu, Ni, As, Cd, Cr, Hg, and Zn) with landscape pattern, industrialisation, and urbanisation in Taiyuan city using multivariate analysis. The multivariate analysis included correlation analysis, analysis of variance (ANOVA), independent-sample t test, and principal component analysis (PCA). A geographic information system (GIS) was also applied to determine the spatial distribution of the heavy metals. The spatial distribution maps showed that the heavy metal pollution of the soil was more serious in the centre of the study area. The results of the multivariate analysis indicated that the correlations among heavy metals were significant, and that industrialisation could significantly affect the concentrations of some heavy metals. Landscape diversity showed a significant negative correlation with the heavy metal concentrations. The PCA showed that a two-factor model for heavy metal pollution, industrialisation, and the landscape pattern could effectively demonstrate the relationships between these variables. The model explained 86.71% of the total variance of the data. Moreover, the first factor was mainly loaded with the comprehensive pollution index (P), and the second factor was primarily loaded with landscape diversity and dominance (H and D). An ordination of 80 samples could show the pollution pattern of all the samples. The results revealed that local industrialisation caused heavy metal pollution of the soil, but such pollution could respond negatively to the landscape pattern. The results of the study could provide a basis for agricultural, suburban, and urban planning. PMID:25251460
Multivariate Statistical Modelling of Drought and Heat Wave Events

NASA Astrophysics Data System (ADS)

Manning, Colin; Widmann, Martin; Vrac, Mathieu; Maraun, Douglas; Bevaqua, Emanuele

2016-04-01

Authors and affiliations: C. Manning (1, 2), M. Widmann (1), M. Vrac (2), D. Maraun (3), E. Bevaqua (2, 3). (1) School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham, UK; (2) Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL), Centre d'Etudes de Saclay, Gif-sur-Yvette, France; (3) Wegener Center for Climate and Global Change, University of Graz, Brandhofgasse 5, 8010 Graz, Austria.
Compound extreme events are a combination of two or more contributing events which in themselves may not be extreme but through their joint occurrence produce an extreme impact. Compound events are noted in the latest IPCC report as an important type of extreme event that has been given little attention so far. As part of the CE:LLO project (Compound Events: muLtivariate statisticaL mOdelling) we are developing a multivariate statistical model to gain an understanding of the dependence structure of certain compound events. One focus of this project is the interaction between drought and heat wave events. Soil moisture has both a local and a non-local effect on the occurrence of heat waves, as it strongly controls the latent heat flux and thereby affects the transfer of sensible heat to the atmosphere. These processes can create a feedback whereby a heat wave may be amplified or suppressed by the soil moisture preconditioning and, vice versa, the heat wave may in turn affect soil conditions. An aim of this project is to capture this dependence in order to correctly describe the joint probabilities of these conditions and the resulting probability of their compound impact. We will show an application of Pair Copula Constructions (PCCs) to study the aforementioned compound event. In theory, PCCs allow the formulation of multivariate dependence structures in any dimension; a PCC is a decomposition of a multivariate distribution into a product of bivariate components modelled using copulas.
A copula is a multivariate distribution function which allows one to model the dependence structure of given variables separately from their marginal behaviour. We first look at the structure of soil moisture drought over the whole of France using the SAFRAN dataset between 1959 and 2009. Soil moisture is represented using the Standardised Precipitation Evapotranspiration Index (SPEI). Drought characteristics are computed at grid point scale, where drought conditions are identified as those with an SPEI value below -1.0. We model the multivariate dependence structure of drought events defined by certain characteristics and compute return levels of these events. We initially find that drought characteristics such as duration, mean SPEI and the maximum contiguous area to a grid point all have positive correlations, though the degree to which they are correlated can vary considerably in space. A spatial representation of return levels may then provide insight into the areas most prone to drought conditions. As a next step, we analyse the dependence structure between soil moisture conditions preceding the onset of a heat wave and the heat wave itself.
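The drought identification step described above (SPEI below -1.0 at a grid point) can be sketched as a simple run-length extraction. This is a minimal illustration on a hypothetical SPEI series; the function name and the returned tuple layout are assumptions, and the study's treatment of contiguous area or event pooling is not reproduced.

```python
def drought_events(spei, threshold=-1.0):
    """Extract drought events from an SPEI time series at one grid point.

    An event is a maximal run of consecutive time steps with SPEI strictly
    below the threshold. Returns a list of (start_index, duration, mean_spei).
    """
    events = []
    start = None
    for i, value in enumerate(spei):
        if value < threshold:
            if start is None:
                start = i  # a new drought run begins here
        elif start is not None:
            run = spei[start:i]
            events.append((start, len(run), sum(run) / len(run)))
            start = None
    if start is not None:  # series ends while still in drought
        run = spei[start:]
        events.append((start, len(run), sum(run) / len(run)))
    return events


# Hypothetical monthly SPEI values.
series = [0.2, -1.2, -1.5, -0.8, -1.1, 0.3, -2.0, -1.4]
for start, duration, mean_spei in drought_events(series):
    print(start, duration, round(mean_spei, 2))
```

Characteristics such as duration and mean SPEI extracted this way are the kind of variables whose joint dependence a copula model would then describe.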
Development of a Multivariate Prognostic Model for Pain and Activity Limitation in People with Low Back Disorders Receiving Physiotherapy.

PubMed

Ford, Jon J; Richards, Matt C; Surkitt, Luke D; Chan, Alexander YP; Slater, Sarah L; Taylor, Nicholas F; Hahne, Andrew J

2018-05-28

Objective: To identify predictors for back pain, leg pain and activity limitation in patients with early persistent low back disorders. Design: Prospective inception cohort study. Setting: Primary care private physiotherapy clinics in Melbourne, Australia. Participants: 300 adults aged 18-65 years with low back and/or referred leg pain of ≥6 weeks and ≤6 months duration. Interventions: Not applicable. Main outcome measures: Numerical rating scales for back pain and leg pain as well as the Oswestry Disability Scale. Prognostic factors included sociodemographics, treatment related factors, subjective/physical examination, subgrouping factors and standardized questionnaires. Univariate analysis followed by generalized estimating equations was used to develop a multivariate prognostic model for back pain, leg pain and activity limitation. Results: Fifty-eight prognostic factors progressed to the multivariate stage, where 15 showed significant (p<0.05) associations with at least one of the three outcomes. There were five indicators of positive outcome (two types of low back disorder subgroups, paresthesia below waist, walking as an easing factor and low transversus abdominis tone) and 10 indicators of negative outcome (both parents born overseas, deep leg symptoms, longer sick leave duration, high multifidus tone, clinically determined inflammation, higher back and leg pain severity, lower lifting capacity, lower work capacity and higher pain drawing percentage coverage). The preliminary model identifying predictors of low back disorders explained up to 37% of the variance in outcome. Conclusions: This study evaluated a comprehensive range of prognostic factors reflective of both the biomedical and psychosocial domains of low back disorders. The preliminary multivariate model requires further validation before being considered for clinical use. Copyright © 2018. Published by Elsevier Inc.
Patterns and Predictors of Language and Literacy Abilities 4-10 Years in the Longitudinal Study of Australian Children.

PubMed

Zubrick, Stephen R; Taylor, Catherine L; Christensen, Daniel

2015-01-01

Oral language is the foundation of literacy. Naturally, policies and practices to promote children's literacy begin in early childhood and have a strong focus on developing children's oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children's progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children's oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed: a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, fewer than 1% of children fit a stable low pattern. These results challenged the view that children's progress along the oral to literate continuum is stable and predictable. Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years, and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years.
The following risk factors were not statistically significant in the multivariate model: low maternal consistency, low family income, health care card, child not read to at home, maternal smoking, maternal education, family structure, temperamental persistence, and socio-economic area disadvantage. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude did not do particularly well in predicting low literacy ability at 10 years.
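The distinction drawn above between a well-fitted model and a model with good predictive utility can be made concrete with sensitivity and specificity. The sketch below uses hypothetical outcome labels and predictions (1 = low literacy at 10 years); the function name is illustrative and the numbers are not from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Sensitivity: share of true cases the model flags (TP / (TP + FN)).
    Specificity: share of non-cases the model clears (TN / (TN + FP)).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 4 children with low literacy, 6 without.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(observed, predicted)
print(sens, round(spec, 3))
```

In this toy case the classifier clears most non-cases yet misses three of the four true cases, the kind of pattern in which sizeable risk estimates coexist with weak individual-level prediction.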
Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

PubMed

Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

2018-06-29

A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
Effects of Covariance Heterogeneity on Three Procedures for Analyzing Multivariate Repeated Measures Designs.

ERIC Educational Resources Information Center

Vallejo, Guillermo; Fidalgo, Angel; Fernandez, Paula

2001-01-01

Estimated empirical Type I error rate and power rate for three procedures for analyzing multivariate repeated measures designs: (1) the doubly multivariate model; (2) the Welch-James multivariate solution (H. Keselman, M. Carriere, and L. Lix, 1993); and (3) the multivariate version of the modified Brown-Forsythe procedure (M. Brown and A.…
Multivariate methods for indoor PM10 and PM2.5 modelling in naturally ventilated schools buildings

NASA Astrophysics Data System (ADS)

Elbayoumi, Maher; Ramli, Nor Azam; Md Yusof, Noor Faizah Fitri; Yahaya, Ahmad Shukri Bin; Al Madhoun, Wesam; Ul-Saufie, Ahmed Zia

2014-09-01

In this study, the concentrations of PM10, PM2.5, CO and CO2 and meteorological variables (wind speed, air temperature, and relative humidity) were used to predict the annual and seasonal indoor concentrations of PM10 and PM2.5 using multivariate statistical methods. The data were collected in twelve naturally ventilated schools in the Gaza Strip (Palestine) from October 2011 to May 2012 (academic year). The bivariate correlation analysis showed that indoor PM10 and PM2.5 were highly positively correlated with the outdoor concentrations of PM10 and PM2.5. Further, multiple linear regression (MLR) was used for modelling, and the R2 values were 0.62 and 0.84 for indoor PM10 and PM2.5, respectively. The performance indicators of the MLR models indicated that the annual models predicted PM10 and PM2.5 better than the seasonal models. To reduce the number of input variables, principal component analysis (PCA) and principal component regression (PCR) were applied to the annual data. The predicted R2 values were 0.40 and 0.73 for PM10 and PM2.5, respectively. The PM10 models (MLR and PCR) tend to underestimate indoor PM10 concentrations because they do not take into account the occupants' activities, which strongly affect indoor concentrations during class hours.
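The R2 values quoted above summarise how much of the variance in observed indoor concentrations a model reproduces. A minimal sketch of the usual coefficient of determination follows; the observed and predicted values are hypothetical, and the abstract does not state which exact R2 variant the authors reported.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical indoor concentrations versus model predictions.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(obs, pred), 3))
```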
On the Numerical Formulation of Parametric Linear Fractional Transformation (LFT) Uncertainty Models for Multivariate Matrix Polynomial Problems

NASA Technical Reports Server (NTRS)

Belcastro, Christine M.

1998-01-01

Robust control system analysis and design is based on an uncertainty description, called a linear fractional transformation (LFT), which separates the uncertain (or varying) part of the system from the nominal system. These models are also useful in the design of gain-scheduled control systems based on Linear Parameter Varying (LPV) methods. Low-order LFT models are difficult to form for problems involving nonlinear parameter variations. This paper presents a numerical computational method for constructing an LFT model for a given LPV model. The method is developed for multivariate polynomial problems, and uses simple matrix computations to obtain an exact low-order LFT representation of the given LPV system without the use of model reduction. Although the method is developed for multivariate polynomial problems, multivariate rational problems can also be solved using this method by reformulating the rational problem into a polynomial form.
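For readers unfamiliar with the LFT form, a small worked example may help. It assumes the common upper-LFT convention with the block partition shown below; the report's own notation may differ, and the matrices here are chosen by hand for illustration rather than produced by the paper's numerical method.

```latex
% Upper LFT of M with respect to the uncertainty block \Delta (assumed convention):
\[
  F_u(M,\Delta) \;=\; M_{22} + M_{21}\,\Delta\,(I - M_{11}\Delta)^{-1} M_{12},
  \qquad
  M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}.
\]
% Example: the quadratic g(\delta) = a_0 + a_1\delta + a_2\delta^2 with \Delta = \delta I_2.
\[
  M_{11} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\quad
  M_{12} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\quad
  M_{21} = \begin{bmatrix} a_2 & a_1 \end{bmatrix},\quad
  M_{22} = a_0 .
\]
% Since M_{11}^2 = 0, the inverse is (I - \delta M_{11})^{-1} = I + \delta M_{11}, so
\[
  \Delta\,(I - M_{11}\Delta)^{-1} M_{12}
  = \delta \begin{bmatrix} 1 & \delta \\ 0 & 1 \end{bmatrix}
    \begin{bmatrix} 0 \\ 1 \end{bmatrix}
  = \begin{bmatrix} \delta^{2} \\ \delta \end{bmatrix},
  \qquad
  F_u(M,\Delta) = a_0 + a_1\delta + a_2\delta^{2}.
\]
```

The parameter appears twice in the uncertainty block, matching the degree of the polynomial, so this representation is exact without any model reduction.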
External Influences on Modeled and Observed Cloud Trends

NASA Technical Reports Server (NTRS)

Marvel, Kate; Zelinka, Mark; Klein, Stephen A.; Bonfils, Celine; Caldwell, Peter; Doutriaux, Charles; Santer, Benjamin D.; Taylor, Karl E.

2015-01-01

Understanding the cloud response to external forcing is a major challenge for climate science. This crucial goal is complicated by intermodel differences in simulating present and future cloud cover and by observational uncertainty. This is the first formal detection and attribution study of cloud changes over the satellite era. Presented herein are CMIP5 (Coupled Model Intercomparison Project - Phase 5) model-derived fingerprints of externally forced changes to three cloud properties: the latitudes at which the zonally averaged total cloud fraction (CLT) is maximized or minimized, the zonal average CLT at these latitudes, and the height of high clouds at these latitudes. By considering simultaneous changes in all three properties, the authors define a coherent multivariate fingerprint of cloud response to external forcing and use models from phase 5 of CMIP (CMIP5) to calculate the average time to detect these changes. It is found that given perfect satellite cloud observations beginning in 1983, the models indicate that a detectable multivariate signal should have already emerged. A search is then made for signals of external forcing in two observational datasets: ISCCP (International Satellite Cloud Climatology Project) and PATMOS-x (Advanced Very High Resolution Radiometer (AVHRR) Pathfinder Atmospheres - Extended). The datasets are both found to show a poleward migration of the zonal CLT pattern that is incompatible with forced CMIP5 models. Nevertheless, a detectable multivariate signal is predicted by models over the PATMOS-x time period and is indeed present in the dataset. Despite persistent observational uncertainties, these results present a strong case for continued efforts to improve these existing satellite observations, in addition to planning for new missions.
Multivariate Methods for Meta-Analysis of Genetic Association Studies.

PubMed

Dimou, Niki L; Pantavou, Katerina G; Braliou, Georgia G; Bagos, Pantelis G

2018-01-01

Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention because it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we briefly present univariate methods for meta-analysis and then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg Equilibrium and gene-environment interactions are also presented. All available methods are enriched with practical applications, and methodologies that could be developed in the future are discussed. Links for all available software implementing multivariate meta-analysis methods are also provided.
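As a point of reference for the univariate methods mentioned above, the sketch below shows standard fixed-effect inverse-variance pooling of per-study effect estimates (for example, log odds ratios). The input numbers are hypothetical and the function name is illustrative; multivariate meta-analysis generalises the same idea by replacing the scalar variances with within-study covariance matrices.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log odds ratios and standard errors from two studies.
pooled, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(round(pooled, 3), round(pooled_se, 4))
```

The pooled standard error is smaller than that of either study, which is the gain in precision that motivates meta-analysis.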

A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research.

PubMed

Meeker, Daniella; Jiang, Xiaoqian; Matheny, Michael E; Farcas, Claudiu; D'Arcy, Michel; Pearlman, Laura; Nookala, Lavanya; Day, Michele E; Kim, Katherine K; Kim, Hyeoneui; Boxwala, Aziz; El-Kareh, Robert; Kuo, Grace M; Resnic, Frederic S; Kesselman, Carl; Ohno-Machado, Lucila

2015-11-01

Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner. The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding additional important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies. Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network. The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws. Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. 
Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
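The idea of estimating a model without moving patient-level data can be sketched with the simplest possible case: a one-predictor linear regression fitted from per-site summary statistics. This is an illustrative sketch only, not the SCANNER implementation; the function names and data are hypothetical, and the iterative models described in the abstract would exchange such summaries repeatedly rather than once.

```python
def site_summary(x, y):
    """Computed locally at each site; only these aggregates leave the site."""
    return {
        "n": len(x),
        "sx": sum(x),
        "sy": sum(y),
        "sxx": sum(a * a for a in x),
        "sxy": sum(a * b for a, b in zip(x, y)),
    }


def combine_and_fit(summaries):
    """Pool the site aggregates and solve ordinary least squares for y = a + b*x."""
    n = sum(s["n"] for s in summaries)
    sx = sum(s["sx"] for s in summaries)
    sy = sum(s["sy"] for s in summaries)
    sxx = sum(s["sxx"] for s in summaries)
    sxy = sum(s["sxy"] for s in summaries)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Hypothetical data held at two sites; rows never leave their site.
site_a = site_summary([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
site_b = site_summary([4.0, 5.0], [9.0, 11.0])
print(combine_and_fit([site_a, site_b]))
```

Because the sums are additive across sites, the pooled fit equals the fit that would be obtained on the centralised data.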
MULTIVARIATE RECEPTOR MODELS AND MODEL UNCERTAINTY. (R825173)

EPA Science Inventory

Abstract
Estimation of the number of major pollution sources, the source composition profiles, and the source contributions are the main interests in multivariate receptor modeling. Due to lack of identifiability of the receptor model, however, the estimation cannot be...
Prostate Health Index improves multivariable risk prediction of aggressive prostate cancer.

PubMed

Loeb, Stacy; Shin, Sanghyuk S; Broyles, Dennis L; Wei, John T; Sanda, Martin; Klee, George; Partin, Alan W; Sokoll, Lori; Chan, Daniel W; Bangma, Chris H; van Schaik, Ron H N; Slawin, Kevin M; Marks, Leonard S; Catalona, William J

2017-07-01

To examine the use of the Prostate Health Index (PHI) as a continuous variable in multivariable risk assessment for aggressive prostate cancer in a large multicentre US study. The study population included 728 men, with prostate-specific antigen (PSA) levels of 2-10 ng/mL and a negative digital rectal examination, enrolled in a prospective, multi-site early detection trial. The primary endpoint was aggressive prostate cancer, defined as biopsy Gleason score ≥7. First, we evaluated whether the addition of PHI improves the performance of currently available risk calculators (the Prostate Cancer Prevention Trial [PCPT] and European Randomised Study of Screening for Prostate Cancer [ERSPC] risk calculators). We also designed and internally validated a new PHI-based multivariable predictive model, and created a nomogram. Of 728 men undergoing biopsy, 118 (16.2%) had aggressive prostate cancer. The PHI predicted the risk of aggressive prostate cancer across the spectrum of values. Adding PHI significantly improved the predictive accuracy of the PCPT and ERSPC risk calculators for aggressive disease. A new model was created using age, previous biopsy, prostate volume, PSA and PHI, with an area under the curve of 0.746. The bootstrap-corrected model showed good calibration with observed risk for aggressive prostate cancer and had net benefit on decision-curve analysis. Using PHI as part of multivariable risk assessment leads to a significant improvement in the detection of aggressive prostate cancer, potentially reducing harms from unnecessary prostate biopsy and overdiagnosis. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution

PubMed Central

Inouye, David; Yang, Eunho; Allen, Genevera; Ravikumar, Pradeep

2017-01-01

The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section. PMID:28983398
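A classical instance of the first class (Poisson marginals) is the common-shock construction, sketched here for two variables. The symbols are introduced for illustration; the review itself covers more general constructions.

```latex
% Common-shock bivariate Poisson: Y_0, Y_1, Y_2 independent with Y_i ~ Poisson(\lambda_i).
\[
  X_1 = Y_1 + Y_0, \qquad X_2 = Y_2 + Y_0 .
\]
% Sums of independent Poisson variables are Poisson, so the marginals are
\[
  X_1 \sim \mathrm{Poisson}(\lambda_1 + \lambda_0), \qquad
  X_2 \sim \mathrm{Poisson}(\lambda_2 + \lambda_0),
\]
% and the dependence comes entirely from the shared term:
\[
  \operatorname{Cov}(X_1, X_2) = \operatorname{Var}(Y_0) = \lambda_0 \;\ge\; 0 .
\]
```

The construction can therefore represent only non-negative dependence, one of the limitations that motivates comparing it with other classes of models.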
An error bound for a discrete reduced order model of a linear multivariable system

NASA Technical Reports Server (NTRS)

Al-Saggaf, Ubaid M.; Franklin, Gene F.

1987-01-01

The design of feasible controllers for high dimension multivariable systems can be greatly aided by a method of model reduction. In order for the design based on the order reduction to include a guarantee of stability, it is sufficient to have a bound on the model error. Previous work has provided such a bound for continuous-time systems for algorithms based on balancing. In this note an L-infinity bound is derived for model error for a method of order reduction of discrete linear multivariable systems based on balancing.
Simulation Modeling to Compare High-Throughput, Low-Iteration Optimization Strategies for Metabolic Engineering

PubMed Central

Heinsch, Stephen C.; Das, Siba R.; Smanski, Michael J.

2018-01-01

Increasing the final titer of a multi-gene metabolic pathway can be viewed as a multivariate optimization problem. While numerous multivariate optimization algorithms exist, few are specifically designed to accommodate the constraints posed by genetic engineering workflows. We present a strategy for optimizing expression levels across an arbitrary number of genes that requires few design-build-test iterations. We compare the performance of several optimization algorithms on a series of simulated expression landscapes. We show that optimal experimental design parameters depend on the degree of landscape ruggedness. This work provides a theoretical framework for designing and executing numerical optimization on multi-gene systems. PMID:29535690
Improving the realism of hydrologic model through multivariate parameter estimation

NASA Astrophysics Data System (ADS)

Rakovec, Oldrich; Kumar, Rohini; Attinger, Sabine; Samaniego, Luis

2017-04-01

Increased availability and quality of near real-time observations should improve understanding of the predictive skill of hydrological models. Recent studies have shown the limited capability of river discharge data alone to adequately constrain different components of distributed model parameterizations. In this study, the GRACE satellite-based total water storage (TWS) anomaly is used to complement the discharge data with the aim of improving the fidelity of the mesoscale hydrologic model (mHM) through multivariate parameter estimation. The study is conducted in 83 European basins covering a wide range of hydro-climatic regimes. The model parameterization complemented with the TWS anomalies leads to statistically significant improvements in (1) discharge simulations during low-flow periods, and (2) evapotranspiration estimates, which are evaluated against independent (FLUXNET) data. Overall, there is no significant deterioration in model performance for the discharge simulations when complemented by information from the TWS anomalies. However, considerable changes in the partitioning of precipitation into runoff components are observed when TWS is included in or excluded from the parameter estimation. A cross-validation test carried out to assess the transferability and robustness of the calibrated parameters to other locations further confirms the benefit of complementary TWS data. In particular, the evapotranspiration estimates show more robust performance when TWS data are incorporated during the parameter estimation, in comparison with the benchmark model constrained against discharge only. This study highlights the value of incorporating multiple data sources during parameter estimation to improve the overall realism of hydrologic models and their applications over large domains. Rakovec, O., Kumar, R., Attinger, S. and Samaniego, L. (2016): Improving the realism of hydrologic model functioning through multivariate parameter estimation. Water Resour. Res., 52, http://dx.doi.org/10.1002/2016WR019430
The Contribution of Missed Clinic Visits to Disparities in HIV Viral Load Outcomes

PubMed Central

Westfall, Andrew O.; Gardner, Lytt I.; Giordano, Thomas P.; Wilson, Tracey E.; Drainoni, Mari-Lynn; Keruly, Jeanne C.; Rodriguez, Allan E.; Malitz, Faye; Batey, D. Scott; Mugavero, Michael J.

2015-01-01

Objectives. We explored the contribution of missed primary HIV care visits (“no-show”) to observed disparities in virological failure (VF) among Black persons and persons with injection drug use (IDU) history. Methods. We used patient-level data from 6 academic clinics, before the Centers for Disease Control and Prevention and Health Resources and Services Administration Retention in Care intervention. We employed staged multivariable logistic regression and multivariable models stratified by no-show visit frequency to evaluate the association of sociodemographic factors with VF. We used multiple imputations to assign missing viral load values. Results. Among 10 053 patients (mean age = 46 years; 35% female; 64% Black; 15% with IDU history), 31% experienced VF. Although Black patients and patients with IDU history were significantly more likely to experience VF in initial analyses, race and IDU parameter estimates were attenuated after sequential addition of no-show frequency. In stratified models, race and IDU were not statistically significantly associated with VF at any no-show level. Conclusions. Because missed clinic visits contributed to observed differences in viral load outcomes among Black and IDU patients, achieving an improved understanding of differential visit attendance is imperative to reducing disparities in HIV. PMID:26270301
Effects of climatological parameters in modeling and forecasting seasonal influenza transmission in Abidjan, Cote d'Ivoire.

PubMed

N'gattia, A K; Coulibaly, D; Nzussouo, N Talla; Kadjo, H A; Chérif, D; Traoré, Y; Kouakou, B K; Kouassi, P D; Ekra, K D; Dagnan, N S; Williams, T; Tiembré, I

2016-09-13

In temperate regions, influenza epidemics occur in the winter and correlate with certain climatological parameters. In African tropical regions, the effects of climatological parameters on influenza epidemics are not well defined. This study aims to identify and model the effects of climatological parameters on seasonal influenza activity in Abidjan, Cote d'Ivoire. We studied the effects of weekly rainfall, humidity, and temperature on laboratory-confirmed influenza cases in Abidjan from 2007 to 2010. We used the Box-Jenkins method with the autoregressive integrated moving average (ARIMA) process to create models using data from 2007-2010 and to assess the predictive value of the best model on data from 2011 to 2012. The weekly number of influenza cases showed significant cross-correlation with certain prior weeks for both rainfall and relative humidity. The best-fitting multivariate model (ARIMAX(2,0,0)_RF) included the number of influenza cases 1 week and 2 weeks prior, and the rainfall during the current week and 5 weeks prior. The performance of this model showed an increase of >3% for the Akaike Information Criterion (AIC) and 2.5% for the Bayesian Information Criterion (BIC) compared to the reference univariate ARIMA(2,0,0). The prediction of the weekly number of influenza cases during 2011-2012 with the best-fitting multivariate model (ARIMAX(2,0,0)_RF) showed that the observed values were within the 95% confidence interval of the predicted values during 97 of 104 weeks. Including rainfall improves the performance of the fitted and predictive models. The timing of influenza in Abidjan can be partially explained by the influence of rainfall, in a setting with little change in temperature throughout the year. These findings can help clinicians to anticipate influenza cases during the rainy season by implementing preventive measures.
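The structure of the best-fitting model (cases 1 and 2 weeks prior, plus rainfall in the current week and 5 weeks prior) can be pictured as a regression on lagged columns. The sketch below only builds those columns. It is a hypothetical illustration: the function name, default lags, and weekly values are assumptions, and the abstract does not state which software the authors used to fit the model.

```python
def arimax_design(cases, rainfall, ar_lags=(1, 2), rain_lags=(0, 5)):
    """Build (X, y) for regressing weekly cases on lagged cases and rainfall.

    With the default lags, each row of X is
    [1, cases[t-1], cases[t-2], rainfall[t], rainfall[t-5]]
    and y holds the matching cases[t].
    """
    if len(cases) != len(rainfall):
        raise ValueError("series must have the same length")
    start = max(max(ar_lags), max(rain_lags))  # first week with all lags available
    X, y = [], []
    for t in range(start, len(cases)):
        row = [1.0]  # intercept column
        row += [float(cases[t - k]) for k in ar_lags]
        row += [float(rainfall[t - k]) for k in rain_lags]
        X.append(row)
        y.append(float(cases[t]))
    return X, y


# Hypothetical weekly series (illustrative values only).
weekly_cases = [3, 5, 4, 6, 8, 7, 9, 12]
weekly_rain = [10, 0, 25, 40, 5, 0, 60, 30]
X, y = arimax_design(weekly_cases, weekly_rain)
print(X[0], y[0])
```

Fitting coefficients to these rows by least squares gives a conditional approximation to an ARIMAX(2,0,0) fit; a dedicated time series library would normally handle estimation and the AIC/BIC comparison.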
Predicting seasonal influenza transmission using functional regression models with temporal dependence.

PubMed

Oviedo de la Fuente, Manuel; Febrero-Bande, Manuel; Muñoz, María Pilar; Domínguez, Àngela

2018-01-01

This paper proposes a novel approach that uses meteorological information to predict the incidence of influenza in Galicia (Spain). It extends the Generalized Least Squares (GLS) methods in the multivariate framework to functional regression models with dependent errors. These kinds of models are useful when the recent history of the incidence of influenza is not readily available (for instance, because of delays in communication with health informants) and the prediction must be constructed by correcting the temporal dependence of the residuals and using more accessible variables. A simulation study shows that the GLS estimators render better estimations of the parameters associated with the regression model than the classical models do. They obtain extremely good results from the predictive point of view and are competitive with the classical time series approach for the incidence of influenza. An iterative version of the GLS estimator (called iGLS) is also proposed that can help to model complicated dependence structures. For constructing the model, the distance correlation measure [Formula: see text] was employed to select relevant information to predict the influenza rate, mixing multivariate and functional variables. These kinds of models are extremely useful to health managers in allocating resources in advance to manage influenza epidemics.
A symmetric multivariate leakage correction for MEG connectomes

PubMed Central

Colclough, G.L.; Brookes, M.J.; Smith, S.M.; Woolrich, M.W.

2015-01-01

Ambiguities in the source reconstruction of magnetoencephalographic (MEG) measurements can cause spurious correlations between estimated source time-courses. In this paper, we propose a symmetric orthogonalisation method to correct for these artificial correlations between a set of multiple regions of interest (ROIs). This process enables the straightforward application of network modelling methods, including partial correlation or multivariate autoregressive modelling, to infer connectomes, or functional networks, from the corrected ROIs. Here, we apply the correction to simulated MEG recordings of simple networks and to a resting-state dataset collected from eight subjects, before computing the partial correlations between power envelopes of the corrected ROI time-courses. We show accurate reconstruction of our simulated networks, and in the analysis of real MEG resting-state connectivity, we find dense bilateral connections within the motor and visual networks, together with longer-range direct fronto-parietal connections. PMID:25862259
Modelling lecturer performance index of private university in Tulungagung by using survival analysis with multivariate adaptive regression spline

NASA Astrophysics Data System (ADS)

Hasyim, M.; Prastyo, D. D.

2018-03-01

Survival analysis models the relationship between independent variables and survival time as the dependent variable. In practice, not all survival data can be recorded completely, for various reasons; such data are called censored data. Moreover, several models for survival analysis require assumptions. One approach in survival analysis is the nonparametric approach, which relaxes these assumptions. In this research, the nonparametric approach employed is Multivariate Adaptive Regression Splines (MARS). This study aims to measure the performance of lecturers at a private university. The survival time in this study is the time needed by a lecturer to obtain a professional certificate. The results show that research activity is a significant factor, along with developing course materials, good publications in international or national journals, and research collaboration activities.
[Academic performance in first year medical students: an explanatory multivariate model].

PubMed

Urrutia Aguilar, María Esther; Ortiz León, Silvia; Fouilloux Morales, Claudia; Ponce Rosas, Efrén Raúl; Guevara Guzmán, Rosalinda

2014-12-01

Current education is focused on intellectual, affective, and ethical aspects, thus acknowledging their significance in students' metacognition. Nowadays, it is known that an adequate and motivating environment together with a positive attitude towards studies is fundamental to induce learning. Medical students are subject to multiple stressful academic, personal, and vocational situations. To identify psychosocial, vocational, and academic variables of 2010-2011 first year medical students at UNAM that may help predict their academic performance. Academic surveys of psychological and vocational factors were applied; an academic follow-up was carried out to obtain a multivariate model. The data were analyzed using descriptive, comparative, correlative, and predictive statistics. The main variables that affect students' academic performance are related to previous knowledge and to psychological variables. The results show the significance of implementing institutional programs to support students throughout their college adaptation.
Multivariate Radiological-Based Models for the Prediction of Future Knee Pain: Data from the OAI

PubMed Central

Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Treviño, Victor; Tamez-Peña, José G.

2015-01-01

In this work, the potential of X-ray based multivariate prognostic models to predict the onset of chronic knee pain is presented. Using quantitative X-ray image assessments of joint-space-width (JSW) and paired semiquantitative central X-ray scores from the Osteoarthritis Initiative (OAI), a case-control study is presented. The pain assessments of the right knee at the baseline and the 60-month visits were used to screen for case/control subjects. Scores were analyzed at the time of pain incidence (T-0), the year prior to incidence (T-1), and two years before pain incidence (T-2). Multivariate models were created by a cross-validated elastic-net regularized generalized linear model feature selection tool. Univariate differences between cases and controls were reported by AUC, C-statistics, and odds ratios. Univariate analysis indicated that the medial osteophytes were significantly more prevalent in cases than controls: C-stat 0.62, 0.62, and 0.61, at T-0, T-1, and T-2, respectively. The multivariate JSW models significantly predicted pain: AUC = 0.695, 0.623, and 0.620, at T-0, T-1, and T-2, respectively. Semiquantitative multivariate models predicted pain with C-stat = 0.671, 0.648, and 0.645 at T-0, T-1, and T-2, respectively. Multivariate models derived from plain X-ray radiography assessments may be used to predict subjects that are at risk of developing knee pain. PMID:26504490
Estuarine Sediment Deposition during Wetland Restoration: A GIS and Remote Sensing Modeling Approach

NASA Technical Reports Server (NTRS)

Newcomer, Michelle; Kuss, Amber; Kentron, Tyler; Remar, Alex; Choksi, Vivek; Skiles, J. W.

2011-01-01

Restoration of the industrial salt flats in the San Francisco Bay, California is an ongoing wetland rehabilitation project. Remote sensing maps of suspended sediment concentration, and other GIS predictor variables were used to model sediment deposition within these recently restored ponds. Suspended sediment concentrations were calibrated to reflectance values from Landsat TM 5 and ASTER using three statistical techniques -- linear regression, multivariate regression, and an Artificial Neural Network (ANN), to map suspended sediment concentrations. Multivariate and ANN regressions using ASTER proved to be the most accurate methods, yielding r2 values of 0.88 and 0.87, respectively. Predictor variables such as sediment grain size and tidal frequency were used in the Marsh Sedimentation (MARSED) model for predicting deposition rates for three years. MARSED results for a fully restored pond show a root mean square deviation (RMSD) of 66.8 mm (<1) between modeled and field observations. This model was further applied to a pond breached in November 2010 and indicated that the recently breached pond will reach equilibrium levels after 60 months of tidal inundation.
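The root mean square deviation quoted for the MARSED results summarises how far modeled deposition falls from field observations. A minimal sketch of that metric, using hypothetical deposition values in millimetres rather than the study's data:

```python
import math


def rmsd(modeled, observed):
    """Root mean square deviation between paired modeled and observed values."""
    if len(modeled) != len(observed) or not modeled:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - o) ** 2 for m, o in zip(modeled, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical deposition depths in mm (illustrative values only).
modeled_mm = [100.0, 150.0, 200.0]
observed_mm = [110.0, 140.0, 230.0]
print(rmsd(modeled_mm, observed_mm))
```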
Preliminary Multi-Variable Parametric Cost Model for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Hendrichs, Todd

2010-01-01

This slide presentation reviews the creation of a preliminary multi-variable cost model for the contract costs of making a space telescope. It discusses the methodology for collecting the data, the definition of the statistical analysis methodology, single-variable model results, testing of historical models, and an introduction to the multi-variable models.
The source identification of ambient aerosols in Beijing, China by multivariate analysis coupled with 14C tracer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiaoyan Tang; Min Shao; Yuanhang Zhang

1996-12-31

Ambient aerosol is one of the most important pollutants in China. This paper presents the sources of aerosol in the Beijing area as revealed by a combination of multivariate analysis models and a 14C tracer measured by Accelerator Mass Spectrometry (AMS). The results indicated that the mass concentration of particulate matter (<100 µm) did not increase rapidly compared with economic development in Beijing city. The multivariate analysis showed that the predominant source was soil dust, which contributed more than 50% to atmospheric particles. However, it would be risky to conclude from this that aerosol pollution from anthropogenic sources was less important in Beijing city. Due to the lack of reliable tracers, it was very hard to distinguish coal burning from the soil source. Thus, it was suspected that the soil source above might be a mixture of soil dust and coal burning. The 14C measurement showed that the carbonaceous species of aerosol had quite different emission sources. For carbonaceous aerosols in Beijing, the contribution from fossil fuel to ambient particles was nearly 2/3; as man-made activities (coal burning, etc.) increased, the fossil part would contribute more to atmospheric carbonaceous particles. For example, in downtown Beijing during space-heating seasons, fossil fuel contributed more than 95% to carbonaceous particles, which would be potentially harmful to the population. By using multivariate analysis together with 14C data, two important sources of aerosols in Beijing (soil and coal combustion) were more reliably distinguished, which is critically important for the assessment of the aerosol problem in China.
Multivariate Models for Normal and Binary Responses in Intervention Studies

ERIC Educational Resources Information Center

Pituch, Keenan A.; Whittaker, Tiffany A.; Chang, Wanchen

2016-01-01

Use of multivariate analysis (e.g., multivariate analysis of variance) is common when normally distributed outcomes are collected in intervention research. However, when mixed responses--a set of normal and binary outcomes--are collected, standard multivariate analyses are no longer suitable. While mixed responses are often obtained in…
Pattern of Utilisation of Dental Health Care Among HIV-positive Adult Nigerians.

PubMed

Adedigba, Michael A; Adekanmbi, Victor T; Asa, Sola; Fakande, Ibiyemi

2016-01-01

To determine the pattern of dental care utilisation of people living with HIV (PLHIV). A cross-sectional questionnaire survey of 239 PLHIV patients in three care centres was done. Information on sociodemographics, dental visits, risk groups, living arrangements, medical insurance and need of dental care was recorded. The EC Clearinghouse and WHO clinical staging was used to determine the stage of HIV/AIDS infection following routine oral examinations under natural daylight. Multivariate logistic regression models were created after adjusting for all the covariates that were statistically significant at univariate/bivariate levels. The majority of subjects were younger than 50 years, about 93% had not seen a dentist before being diagnosed HIV positive and 92% reported no dental visit after contracting HIV. Among nonusers of dental care, 14.3% reported that they wanted care but were afraid to seek it. Other reasons included poor awareness, lack of money and stigmatisation. Multivariate analysis showed that lack of dental care was associated with employment status, living arrangements, educational status, income per annum and presenting with oral symptoms. The area under the receiver operating curve was 84% for multivariate logistic regression model 1, 70% for model 2, 67% for model 3 and 71% for model 4, which means that the predictive power of the models was good. Contrary to our expectations, dental utilisation was generally poor among this group of patients. There is a serious and immediate need to improve the awareness of PLHIV in African settings, and barriers to dental care utilisation should also be removed or reduced.
The influence of API concentration on the roller compaction process: modeling and prediction of the post compacted ribbon, granule and tablet properties using multivariate data analysis.

PubMed

Boersen, Nathan; Carvajal, M Teresa; Morris, Kenneth R; Peck, Garnet E; Pinal, Rodolfo

2015-01-01

While previous research has demonstrated that roller compaction operating parameters strongly influence the properties of the final product, a greater emphasis might be placed on the raw material attributes of the formulation. There were two main objectives to this study. The first was to assess the effects of different process variables on the properties of the obtained ribbons and the downstream granules produced from the roller compacted ribbons. The second was to establish whether models obtained with formulations of one active pharmaceutical ingredient (API) could predict the properties of formulations with similar excipients but a different API. Tolmetin and acetaminophen, chosen for their different compaction properties, were roller compacted on a Fitzpatrick roller compactor using the same formulation. Models were created using tolmetin and tested using acetaminophen. The physical properties of the blends, ribbon, granule and tablet were characterized. Multivariate analysis using partial least squares was used to analyze all data. Multivariate models showed that the operating parameters and raw material attributes were essential in the prediction of ribbon porosity and post-milled particle size. The post compacted ribbon and granule attributes also significantly contributed to the prediction of the tablet tensile strength. Models derived using tolmetin could reasonably predict the ribbon porosity of a second API. After further processing, the post-milled ribbon and granule properties, rather than the physical attributes of the formulation, were needed to predict downstream tablet properties. An understanding of the percolation threshold of the formulation significantly improved the predictive ability of the models.

Estimating multivariate similarity between neuroimaging datasets with sparse canonical correlation analysis: an application to perfusion imaging.

PubMed

Rosa, Maria J; Mehta, Mitul A; Pich, Emilio M; Risterucci, Celine; Zelaya, Fernando; Reinders, Antje A T S; Williams, Steve C R; Dazzan, Paola; Doyle, Orla M; Marquand, Andre F

2015-01-01

An increasing number of neuroimaging studies are based on either combining more than one data modality (inter-modal) or combining more than one measurement from the same modality (intra-modal). To date, most intra-modal studies using multivariate statistics have focused on differences between datasets, for instance relying on classifiers to differentiate between effects in the data. However, to fully characterize these effects, multivariate methods able to measure similarities between datasets are needed. One classical technique for estimating the relationship between two datasets is canonical correlation analysis (CCA). However, in the context of high-dimensional data the application of CCA is extremely challenging. A recent extension of CCA, sparse CCA (SCCA), overcomes this limitation, by regularizing the model parameters while yielding a sparse solution. In this work, we modify SCCA with the aim of facilitating its application to high-dimensional neuroimaging data and finding meaningful multivariate image-to-image correspondences in intra-modal studies. In particular, we show how the optimal subset of variables can be estimated independently and we look at the information encoded in more than one set of SCCA transformations. We illustrate our framework using Arterial Spin Labeling data to investigate multivariate similarities between the effects of two antipsychotic drugs on cerebral blood flow.
Association between Daily Hospital Outpatient Visits for Accidents and Daily Ambient Air Temperatures in an Industrial City

PubMed Central

Chau, Tang-Tat; Wang, Kuo-Ying

2016-01-01

An accident is an unwanted hazard to a person. However, accidents occur. In this work, we search for correlations between daily accident rates and environmental factors. To study daily hospital outpatients who were admitted for accidents during a 5-year period, 2007–2011, we analyzed data regarding 168,366 outpatients using univariate regression models; we also used multivariable regression models to account for confounding factors. Our analysis indicates that the number of male outpatients admitted for accidents was approximately 1.31 to 1.47 times the number of female outpatients (P < 0.0001). Of the 12 parameters (regarding air pollution and meteorology) considered, only daily temperature exhibited consistent and significant correlations with the daily number of hospital outpatient visits for accidents throughout the 5-year analysis period. The univariate regression models indicate that older people (greater than 66 years old) had the fewest accidents per 1-degree increase in temperature, followed by young people (0–15 years old). Middle-aged people (16–65 years old) were the group of outpatients most prone to accidents, with an increase in accident rates of 0.8–1.2 accidents per degree increase in temperature. The multivariable regression models also reveal that the temperature variation was the dominant factor in determining the daily number of outpatient visits for accidents. Our further multivariable model analysis of temperature with respect to air pollution variables shows that, through the increases in emissions and concentrations of CO, photochemical O3 production and NO2 loss in the ambient air, increases in vehicular emissions are associated with increases in temperatures. As such, increases in hospital visits for accidents are related to vehicular emissions and usage. This finding is consistent with clinical experience, which shows that about 60% to 80% of accidents are related to traffic, followed by accidents occurring in the workplace. PMID:26815039
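The "accidents per degree increase in temperature" figures are slopes from univariate regressions of daily visits on daily temperature. A minimal sketch of such a slope estimate by ordinary least squares, with hypothetical temperatures and visit counts (not the study's data) and a hypothetical function name:

```python
def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical daily mean temperatures (deg C) and accident visits.
temperature = [15.0, 20.0, 25.0, 30.0]
visits = [40.0, 45.0, 50.0, 55.0]
slope, intercept = ols_slope_intercept(temperature, visits)
print(slope, intercept)  # slope is the change in visits per degree
```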
Multivariate quantile mapping bias correction: an N-dimensional probability density function transform for climate model simulations of multiple variables

NASA Astrophysics Data System (ADS)

Cannon, Alex J.

2018-01-01

Most bias correction algorithms used in climatology, for example quantile mapping, are applied to univariate time series. They neglect the dependence between different variables. Those that are multivariate often correct only limited measures of joint dependence, such as Pearson or Spearman rank correlation. Here, an image processing technique designed to transfer colour information from one image to another—the N-dimensional probability density function transform—is adapted for use as a multivariate bias correction algorithm (MBCn) for climate model projections/predictions of multiple climate variables. MBCn is a multivariate generalization of quantile mapping that transfers all aspects of an observed continuous multivariate distribution to the corresponding multivariate distribution of variables from a climate model. When applied to climate model projections, changes in quantiles of each variable between the historical and projection period are also preserved. The MBCn algorithm is demonstrated on three case studies. First, the method is applied to an image processing example with characteristics that mimic a climate projection problem. Second, MBCn is used to correct a suite of 3-hourly surface meteorological variables from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) across a North American domain. Components of the Canadian Forest Fire Weather Index (FWI) System, a complicated set of multivariate indices that characterizes the risk of wildfire, are then calculated and verified against observed values. Third, MBCn is used to correct biases in the spatial dependence structure of CanRCM4 precipitation fields. Results are compared against a univariate quantile mapping algorithm, which neglects the dependence between variables, and two multivariate bias correction algorithms, each of which corrects a different form of inter-variable correlation structure. 
MBCn outperforms these alternatives, often by a large margin, particularly for annual maxima of the FWI distribution and spatiotemporal autocorrelation of precipitation fields.
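For context, the univariate quantile mapping that MBCn generalises can be sketched in a few lines: a model value is converted to its non-exceedance probability in the model's historical sample and then replaced by the observed quantile at that probability. This is a simplified empirical version of the baseline, not the MBCn algorithm itself; the function names and sample values are hypothetical.

```python
from bisect import bisect_right


def empirical_quantile(sorted_values, p):
    """Linearly interpolated quantile of sorted_values at probability p in [0, 1]."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def quantile_map(value, model_hist, observed):
    """Map one model value onto the observed distribution.

    The value's non-exceedance probability is read from the model's historical
    sample; the observed quantile at that probability is returned. Values
    outside the historical range are clamped to the observed extremes.
    """
    if len(model_hist) < 2 or len(observed) < 2:
        raise ValueError("need at least two values in each sample")
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(observed)
    rank = bisect_right(model_sorted, value)  # count of historical values <= value
    p = (rank - 1) / (len(model_sorted) - 1) if rank > 0 else 0.0
    return empirical_quantile(obs_sorted, min(max(p, 0.0), 1.0))


# Hypothetical samples: the model is biased low by a factor of ten.
model_history = [1.0, 2.0, 3.0, 4.0, 5.0]
observations = [10.0, 20.0, 30.0, 40.0, 50.0]
print(quantile_map(3.0, model_history, observations))
```

Applying this to each variable separately corrects the marginal distributions but leaves inter-variable dependence untouched, which is the gap the abstract says MBCn addresses.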
Partial Least Squares Calibration Modeling Towards the Multivariate Limit of Detection for Enriched Isotopic Mixtures via Laser Ablation Molecular Isotopic Spectroscopy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harris, Candace; Profeta, Luisa; Akpovo, Codjo

The pseudo-univariate limit of detection (LOD) was calculated for comparison with the multivariate interval. Compared with results from the pseudo-univariate LOD, the multivariate LOD includes other factors (i.e., signal uncertainties) and reveals the significance of creating models that use not only the analyte’s emission line but also its entire molecular spectrum.
Multiple imputation for handling missing outcome data when estimating the relative risk.

PubMed

Sullivan, Thomas R; Lee, Katherine J; Ryan, Philip; Salter, Amy B

2017-09-06

Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. 
However, fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
Piecewise multivariate modelling of sequential metabolic profiling data.

PubMed

Rantalainen, Mattias; Cloarec, Olivier; Ebbels, Timothy M D; Lundstedt, Torbjörn; Nicholson, Jeremy K; Holmes, Elaine; Trygg, Johan

2008-02-19

Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints. A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted. The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.
Analysis of algae growth mechanism and water bloom prediction under the effect of multi-affecting factor.

PubMed

Wang, Li; Wang, Xiaoyi; Jin, Xuebo; Xu, Jiping; Zhang, Huiyan; Yu, Jiabin; Sun, Qian; Gao, Chong; Wang, Lingbin

2017-03-01

Current methods describe the formation process of algae inaccurately and predict water blooms with low precision. In this paper, the chemical mechanism of algae growth is analyzed, and a correlation analysis of chlorophyll-a and algal density is conducted by chemical measurement. Taking into account the influence of multiple factors on algae growth and water blooms, a comprehensive prediction method that combines multivariate time series with an intelligent model is put forward. First, the main factors that affect the reproduction of algae are analyzed through the process of photosynthesis. A compensation prediction method for multivariate time series analysis, based on a neural network and a Support Vector Machine, is then proposed; it is combined with Kernel Principal Component Analysis to reduce the dimension of the factors influencing blooms. Next, a Genetic Algorithm is applied to improve the generalization ability of the BP network and the Least Squares Support Vector Machine. Experimental results show that this method better compensates the multivariate time series prediction model, making it an effective way to improve the description accuracy of algae growth and the prediction precision of water blooms.
Multivariate curve-resolution analysis of pesticides in water samples from liquid chromatographic-diode array data.

PubMed

Maggio, Rubén M; Damiani, Patricia C; Olivieri, Alejandro C

2011-01-30

Liquid chromatographic-diode array detection data recorded for aqueous mixtures of 11 pesticides show the combined presence of strongly coeluting peaks, distortions in the time dimension between experimental runs, and the presence of potential interferents not modeled by the calibration phase in certain test samples. Due to the complexity of these phenomena, data were processed by a second-order multivariate algorithm based on multivariate curve resolution and alternating least-squares, which allows one to successfully model both the spectral and retention time behavior for all sample constituents. This led to the accurate quantitation of all analytes in a set of validation samples: aldicarb sulfoxide, oxamyl, aldicarb sulfone, methomyl, 3-hydroxy-carbofuran, aldicarb, propoxur, carbofuran, carbaryl, 1-naphthol and methiocarb. Limits of detection in the range 0.1-2 μg mL⁻¹ were obtained. Additionally, the second-order advantage for several analytes was achieved in samples containing several uncalibrated interferences. The limits of detection for all analytes were decreased by solid phase pre-concentration to values compatible to those officially recommended, i.e., in the order of 5 ng mL⁻¹. Copyright © 2010 Elsevier B.V. All rights reserved.
A tridiagonal parsimonious higher order multivariate Markov chain model

NASA Astrophysics Data System (ADS)

Wang, Chao; Yang, Chuan-sheng

2017-09-01

In this paper, we present a tridiagonal parsimonious higher-order multivariate Markov chain model (TPHOMMCM). Moreover, an estimation method for the parameters in TPHOMMCM is given. Numerical experiments illustrate the effectiveness of TPHOMMCM.
MULTIVARIATE LINEAR MIXED MODELS FOR MULTIPLE OUTCOMES. (R824757)

EPA Science Inventory

We propose a multivariate linear mixed model (MLMM) for the analysis of multiple outcomes, which generalizes the latent variable model of Sammel and Ryan. The proposed model assumes a flexible correlation structure among the multiple outcomes, and allows a global test of the impact of ...
Determination of main fruits in adulterated nectars by ATR-FTIR spectroscopy combined with multivariate calibration and variable selection methods.

PubMed

Miaw, Carolina Sheng Whei; Assis, Camila; Silva, Alessandro Rangel Carolino Sales; Cunha, Maria Luísa; Sena, Marcelo Martins; de Souza, Scheilla Vitorino Carvalho

2018-07-15

Grape, orange, peach and passion fruit nectars were formulated and adulterated by dilution with syrup, apple and cashew juices at 10 levels for each adulterant. Attenuated total reflectance Fourier transform mid infrared (ATR-FTIR) spectra were obtained. Partial least squares (PLS) multivariate calibration models allied to different variable selection methods, such as interval partial least squares (iPLS), ordered predictors selection (OPS) and genetic algorithm (GA), were used to quantify the main fruits. PLS improved by iPLS-OPS variable selection showed the highest predictive capacity to quantify the main fruit contents. The selected variables in the final models varied from 72 to 100; the root mean square errors of prediction were estimated from 0.5 to 2.6%; the correlation coefficients of prediction ranged from 0.948 to 0.990; and, the mean relative errors of prediction varied from 3.0 to 6.7%. All of the developed models were validated. Copyright © 2018 Elsevier Ltd. All rights reserved.
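The two error measures reported in the abstract, the root mean square error of prediction (RMSEP) and the mean relative error of prediction, have simple standard definitions. The sketch below computes both for a validation set; the function names and the sample values are illustrative assumptions, not data from the study.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mean_relative_error(reference, predicted):
    """Mean absolute relative error, expressed in percent."""
    relative = [abs(p - r) / abs(r) for r, p in zip(reference, predicted)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical fruit contents (%): reference values versus model predictions.
reference = [10.0, 20.0, 40.0]
predicted = [11.0, 19.0, 42.0]
error_rmsep = rmsep(reference, predicted)
error_relative = mean_relative_error(reference, predicted)
```

RMSEP is in the units of the measured quantity, whereas the relative error is scale-free, which is presumably why the abstract quotes both.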
[A novel method of multi-channel feature extraction combining multivariate autoregression and multiple-linear principal component analysis].

PubMed

Wang, Jinjia; Zhang, Yanna

2015-02-01

Brain-computer interface (BCI) systems identify brain signals by extracting features from them. In view of the limitations of the autoregressive model feature extraction method and traditional principal component analysis in dealing with multichannel signals, this paper presents a multichannel feature extraction method that combines the multivariate autoregressive (MVAR) model with multiple-linear principal component analysis (MPCA), and applies it to the recognition of magnetoencephalography (MEG) and electroencephalograph (EEG) signals. First, we calculated the MVAR model coefficient matrix of the MEG/EEG signals, and then reduced its dimensions using MPCA. Finally, we recognized brain signals with a Bayes classifier. The key innovation of our investigation is the extension of the traditional single-channel feature extraction method to the multichannel case. We then carried out experiments using the data groups IV-III and IV-I. The experimental results showed that the method proposed in this paper is feasible.
Multivariate normative comparisons using an aggregated database

PubMed Central

Murre, Jaap M. J.; Huizenga, Hilde M.

2017-01-01

In multivariate normative comparisons, a patient’s profile of test scores is compared to those in a normative sample. Recently, it has been shown that these multivariate normative comparisons enhance the sensitivity of neuropsychological assessment. However, multivariate normative comparisons require multivariate normative data, which are often unavailable. In this paper, we show how a multivariate normative database can be constructed by combining healthy control group data from published neuropsychological studies. We show that three issues should be addressed to construct a multivariate normative database. First, the database may have a multilevel structure, with participants nested within studies. Second, not all tests are administered in every study, so many data may be missing. Third, a patient should be compared to controls of similar age, gender and educational background rather than to the entire normative sample. To address these issues, we propose a multilevel approach for multivariate normative comparisons that accounts for missing data and includes covariates for age, gender and educational background. Simulations show that this approach controls the number of false positives and has high sensitivity to detect genuine deviations from the norm. An empirical example is provided. Implications for other domains than neuropsychology are also discussed. To facilitate broader adoption of these methods, we provide code implementing the entire analysis in the open source software package R. PMID:28267796
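A basic multivariate normative comparison can be illustrated with a Hotelling-type statistic for a single case against a normative sample. The sketch below covers only the simple complete-data case with two tests; it does not reproduce the multilevel, missing-data, covariate-adjusted approach of the abstract, and the function name and toy scores are assumptions.

```python
import statistics


def normative_t2(norm_x, norm_y, patient_x, patient_y):
    """Hotelling-type T^2 and F statistic for one patient on two tests.

    T^2 = n / (n + 1) * d' S^-1 d, where d is the patient's deviation from
    the normative means and S is the normative sample covariance matrix.
    F = T^2 * (n - p) / (p * (n - 1)) is referred to F(p, n - p), with p = 2.
    """
    n = len(norm_x)
    mean_x, mean_y = statistics.mean(norm_x), statistics.mean(norm_y)
    s_xx = statistics.variance(norm_x)
    s_yy = statistics.variance(norm_y)
    s_xy = sum((a - mean_x) * (b - mean_y)
               for a, b in zip(norm_x, norm_y)) / (n - 1)
    det = s_xx * s_yy - s_xy ** 2
    dx, dy = patient_x - mean_x, patient_y - mean_y
    # Quadratic form with the explicit inverse of the 2x2 covariance matrix.
    quad = (s_yy * dx * dx - 2.0 * s_xy * dx * dy + s_xx * dy * dy) / det
    t2 = n / (n + 1) * quad
    p = 2
    f_stat = t2 * (n - p) / (p * (n - 1))
    return t2, f_stat


# Toy normative sample: the two tests are positively correlated.
norm_x = [1.0, 2.0, 3.0, 4.0, 5.0]
norm_y = [2.0, 1.0, 4.0, 3.0, 5.0]
discordant = normative_t2(norm_x, norm_y, 5.0, 1.0)  # high on one, low on other
concordant = normative_t2(norm_x, norm_y, 5.0, 5.0)  # high on both
```

In this toy sample the discordant profile yields a much larger statistic than the concordant one even though each score is equally far from its mean. That is the kind of profile deviation a multivariate comparison can detect and separate univariate comparisons can miss.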
Comparing Within-Person Effects from Multivariate Longitudinal Models

ERIC Educational Resources Information Center

Bainter, Sierra A.; Howard, Andrea L.

2016-01-01

Several multivariate models are motivated to answer similar developmental questions regarding within-person (intraindividual) effects between 2 or more constructs over time, yet the within-person effects tested by each model are distinct. In this article, the authors clarify the types of within-person inferences that can be made from each model.…
Circulating miRNAs in Pediatric Pulmonary Hypertension Show Promise as Biomarkers of Vascular Function

PubMed Central

Sucharov, Carmen C.; Truong, Uyen; Dunning, Jamie; Ivy, Dunbar; Miyamoto, Shelley; Shandas, Robin

2017-01-01

Background/Objectives The objective of this study was to evaluate the utility of circulating miRNAs as biomarkers of vascular function in pediatric pulmonary hypertension. Method Fourteen pediatric pulmonary arterial hypertension patients underwent simultaneous right heart catheterization (RHC) and blood biochemical analysis. Univariate and stepwise multivariate linear regression was used to identify and correlate measures of reactive and resistive afterload with circulating miRNA levels. Furthermore, circulating miRNA candidates that classified patients according to a 20% decrease in resistive afterload in response to oxygen (O2) or inhaled nitric oxide (iNO) were identified using receiver-operating curves. Results Thirty-two circulating miRNAs correlated with the pulmonary vascular resistance index (PVRi), pulmonary arterial distensibility, and PVRi decrease in response to O2 and/or iNO. Multivariate models, combining the predictive capability of multiple promising miRNA candidates, revealed a good correlation with resistive (r = 0.97, P2−tailed < 0.0001) and reactive (r = 0.86, P2−tailed < 0.005) afterloads. Bland-Altman plots showed that 95% of the differences between multivariate models and RHC would fall within 0.13 (mmHg−min/L)m2 and 0.0085/mmHg for resistive and reactive afterloads, respectively. Circulating miR-663 proved to be a good classifier for vascular responsiveness to acute O2 and iNO challenges. Conclusion This study suggests that circulating miRNAs may be biomarkers to phenotype vascular function in pediatric PAH. PMID:28819545
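The Bland-Altman statement in the Results refers to limits of agreement, conventionally computed as the mean difference between two methods plus or minus 1.96 standard deviations of the differences. The following is a minimal sketch under the usual assumption of approximately normal differences; the function name and the paired values are illustrative, not study data.

```python
import statistics


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) of agreement for paired data."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired estimates: model prediction versus catheter measurement.
model_values = [1.0, 2.0, 3.0, 4.0]
catheter_values = [0.9, 2.1, 2.8, 4.2]
bias, lower, upper = bland_altman_limits(model_values, catheter_values)
```

About 95% of differences are expected to fall between the lower and upper limits when the differences are roughly normal.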
Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners

PubMed Central

Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea

2014-01-01

In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to the one achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partner in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061
A Bayesian joint probability modeling approach for seasonal forecasting of streamflows at multiple sites

NASA Astrophysics Data System (ADS)

Wang, Q. J.; Robertson, D. E.; Chiew, F. H. S.

2009-05-01

Seasonal forecasting of streamflows can be highly valuable for water resources management. In this paper, a Bayesian joint probability (BJP) modeling approach for seasonal forecasting of streamflows at multiple sites is presented. A Box-Cox transformed multivariate normal distribution is proposed to model the joint distribution of future streamflows and their predictors such as antecedent streamflows and El Niño-Southern Oscillation indices and other climate indicators. Bayesian inference of model parameters and uncertainties is implemented using Markov chain Monte Carlo sampling, leading to joint probabilistic forecasts of streamflows at multiple sites. The model provides a parametric structure for quantifying relationships between variables, including intersite correlations. The Box-Cox transformed multivariate normal distribution has considerable flexibility for modeling a wide range of predictors and predictands. The Bayesian inference formulated allows the use of data that contain nonconcurrent and missing records. The model flexibility and data-handling ability means that the BJP modeling approach is potentially of wide practical application. The paper also presents a number of statistical measures and graphical methods for verification of probabilistic forecasts of continuous variables. Results for streamflows at three river gauges in the Murrumbidgee River catchment in southeast Australia show that the BJP modeling approach has good forecast quality and that the fitted model is consistent with observed data.
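The Box-Cox transformation at the core of the BJP approach can be written down compactly. The sketch below uses the standard one-parameter form and its inverse; the paper may use a different parameterization, and the function names are assumptions.

```python
import math


def box_cox(y, lam):
    """Standard one-parameter Box-Cox transform of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if abs(lam) < 1e-12:
        return math.log(y)  # limiting case as lambda approaches zero
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Back-transform from the transformed scale to the original scale."""
    if abs(lam) < 1e-12:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Hypothetical seasonal streamflow volume and transformation parameter.
flow = 4.0
transformed = box_cox(flow, 0.5)
recovered = inverse_box_cox(transformed, 0.5)
```

The transform is applied to each variable so that the joint distribution is closer to multivariate normal, and forecasts drawn on the transformed scale are back-transformed with the inverse.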
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging

PubMed Central

Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos

2015-01-01

Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning a high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is a lot less conservative as compared to weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913
Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

NASA Astrophysics Data System (ADS)

Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

2018-03-01

This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, as a method of mineral prospectivity mapping (MPM). The main target of this paper is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination (R²) and adjusted coefficient of determination (R²adj), the second regression model (which consists of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrop map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
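The model comparison above rests on the coefficient of determination and its adjusted form, which penalizes additional predictors. This sketch uses the standard definitions; the function names and numbers are illustrative assumptions rather than values from the paper.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R^2 for a linear model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical observed values and fitted values from a two-predictor model.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2 = r_squared(observed, fitted)
r2_adj = adjusted_r_squared(r2, n_samples=5, n_predictors=2)
```

Adjusted R² is the fairer criterion when models use different numbers of bands as predictors, since plain R² never decreases when a predictor is added.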

Quantifying the Value of Downscaled Climate Model Information for Adaptation Decisions: When is Downscaling a Smart Decision?

NASA Astrophysics Data System (ADS)

Terando, A. J.; Wootten, A.; Eaton, M. J.; Runge, M. C.; Littell, J. S.; Bryan, A. M.; Carter, S. L.

2015-12-01

Two types of decisions face society with respect to anthropogenic climate change: (1) whether to enact a global greenhouse gas abatement policy, and (2) how to adapt to the local consequences of current and future climatic changes. The practice of downscaling global climate models (GCMs) is often used to address (2) because GCMs do not resolve key features that will mediate global climate change at the local scale. In response, the development of downscaling techniques and models has accelerated to aid decision makers seeking adaptation guidance. However, quantifiable estimates of the value of information are difficult to obtain, particularly in decision contexts characterized by deep uncertainty and low system-controllability. Here we demonstrate a method to quantify the additional value that decision makers could expect if research investments are directed towards developing new downscaled climate projections. As a proof of concept we focus on a real-world management problem: whether to undertake assisted migration for an endangered tropical avian species. We also take advantage of recently published multivariate methods that account for three vexing issues in climate impacts modeling: maximizing climate model quality information, accounting for model dependence in ensembles of opportunity, and deriving probabilistic projections. We expand on these global methods by including regional (Caribbean Basin) and local (Puerto Rico) domains. In the local domain, we test whether a high resolution (2km) dynamically downscaled GCM reduces the multivariate error estimate compared to the original coarse-scale GCM. Initial tests show little difference between the downscaled and original GCM multivariate error. When propagated through to a species population model, the Value of Information analysis indicates that the expected utility that would accrue to the manager (and species) if this downscaling were completed may not justify the cost compared to alternative actions.
Meta-Analytic Structural Equation Modeling (MASEM): Comparison of the Multivariate Methods

ERIC Educational Resources Information Center

Zhang, Ying

2011-01-01

Meta-analytic Structural Equation Modeling (MASEM) has drawn interest from many researchers recently. In doing MASEM, researchers usually first synthesize correlation matrices across studies using meta-analysis techniques and then analyze the pooled correlation matrix using structural equation modeling techniques. Several multivariate methods of…
MULTIVARIATE RECEPTOR MODELS-CURRENT PRACTICE AND FUTURE TRENDS. (R826238)

EPA Science Inventory

Multivariate receptor models have been applied to the analysis of air quality data for some time. However, solving the general mixture problem is important in several other fields. This paper looks at the panoply of these models with a view of identifying common challenges and ...
Regional vertical total electron content (VTEC) modeling together with satellite and receiver differential code biases (DCBs) using semi-parametric multivariate adaptive regression B-splines (SP-BMARS)

NASA Astrophysics Data System (ADS)

Durmaz, Murat; Karslioglu, Mahmut Onur

2015-04-01

There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.
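The compactly supported B-spline basis functions referred to above can be evaluated with the Cox-de Boor recursion. The sketch below shows that recursion for a single univariate basis function; it is not the SP-BMARS algorithm, and the function name and uniform knot vector are assumptions for the example.

```python
def bspline_basis(i, degree, knots, x):
    """Value at x of the i-th B-spline basis function of the given degree."""
    if degree == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + degree] - knots[i]
    right_den = knots[i + degree + 1] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:  # skip terms that vanish at repeated knots
        left = ((x - knots[i]) / left_den
                * bspline_basis(i, degree - 1, knots, x))
    if right_den != 0:
        right = ((knots[i + degree + 1] - x) / right_den
                 * bspline_basis(i + 1, degree - 1, knots, x))
    return left + right


# Quadratic basis functions on a uniform knot vector.
knots = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
values = [bspline_basis(i, 2, knots, 2.5) for i in range(4)]
total = sum(values)
```

Each basis function is nonzero only on a few adjacent knot intervals, so a coefficient influences the fitted surface only locally. This locality is what makes such bases suited to regional modeling, where tensor products of the univariate functions cover latitude, longitude and time.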
Regression Models For Multivariate Count Data

PubMed Central

Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

2016-01-01

Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. PMID:28348500
Regression Models For Multivariate Count Data.

PubMed

Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

2017-01-01

Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.
Work and retirement among a cohort of older men in the United States, 1966-1983.

PubMed

Hayward, M D; Grady, W R

1990-08-01

Multivariate increment-decrement working life tables are estimated for a cohort of older men in the United States for the period 1966-1983. The approach taken allows multiple processes to be simultaneously incorporated into a single model, resulting in a more realistic portrayal of a cohort's late-life labor force behavior. In addition, because the life table model is developed from multivariate hazard equations, we identify the effects of sociodemographic characteristics on the potentially complex process by which the labor force career is ended. In contrast to the assumed homogeneity of previous working life table analyses, the present study shows marked differences in labor force mobility and working and nonworking life expectancy according to occupation, class of worker, education, race, and marital status. We briefly discuss the implications of these findings for inequities of access to retirement, private and public pension consumption, and future changes in the retirement process.
Elliptic supersymmetric integrable model and multivariable elliptic functions

NASA Astrophysics Data System (ADS)

Motegi, Kohei

2017-12-01

We investigate the elliptic integrable model introduced by Deguchi and Martin [Int. J. Mod. Phys. A 7, Suppl. 1A, 165 (1992)], which is an elliptic extension of the Perk-Schultz model. We introduce and study a class of partition functions of the elliptic model by using the Izergin-Korepin analysis. We show that the partition functions are expressed as a product of elliptic factors and elliptic Schur-type symmetric functions. This result resembles recent work by number theorists in which the correspondence between the partition functions of trigonometric models and the product of the deformed Vandermonde determinant and Schur functions were established.
Estimation of value at risk in currency exchange rate portfolio using asymmetric GJR-GARCH Copula

NASA Astrophysics Data System (ADS)

Nurrahmat, Mohamad Husein; Noviyanti, Lienda; Bachrudin, Achmad

2017-03-01

In this study, we discuss the problem of measuring the risk in a portfolio based on value at risk (VaR) using an asymmetric GJR-GARCH copula. The approach is based on the consideration that the assumption of normality over time for the returns cannot be fulfilled, and that there is non-linear correlation in the dependence structure among the variables, which leads to inaccurate VaR estimates. Moreover, the leverage effect causes an asymmetric effect on the dynamic variance and exposes the weakness of GARCH models, whose effect on the conditional variance is symmetric. Asymmetric GJR-GARCH models are used to filter the margins, while the copulas are used to link them together into a multivariate distribution. We then use copulas to construct flexible multivariate distributions with different marginal and dependence structures, so that the portfolio joint distribution does not depend on the assumptions of normality and linear correlation. The VaR obtained by the analysis at the 95% confidence level is 0.005586. This VaR is derived from the best copula model, the Student's t copula with t-distributed margins.
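The asymmetry that distinguishes GJR-GARCH from plain GARCH shows up directly in the conditional variance recursion. This is a minimal sketch of the GJR-GARCH(1,1) recursion with zero-mean residuals; the parameter values are arbitrary illustrations, not estimates from the study, and the copula step is not shown.

```python
def gjr_garch_variance(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances under GJR-GARCH(1,1).

    sigma2[t+1] = omega + (alpha + gamma * I[e_t < 0]) * e_t**2 + beta * sigma2[t]
    Returns len(residuals) + 1 values; the last is the one-step-ahead forecast.
    """
    variances = [initial_variance]
    for e in residuals:
        leverage = gamma if e < 0 else 0.0  # extra weight on negative shocks
        variances.append(omega + (alpha + leverage) * e * e + beta * variances[-1])
    return variances


# Arbitrary illustrative parameters.
params = dict(omega=0.1, alpha=0.1, gamma=0.2, beta=0.5, initial_variance=1.0)
path = gjr_garch_variance([1.0, -1.0], **params)
after_gain = gjr_garch_variance([1.0], **params)[-1]
after_loss = gjr_garch_variance([-1.0], **params)[-1]
```

With gamma greater than zero, a loss raises the next-period variance more than a gain of the same size. Setting gamma to zero recovers the symmetric GARCH(1,1) response.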
Obstructive urination problems after high-dose-rate brachytherapy boost treatment for prostate cancer are avoidable.

PubMed

Kragelj, Borut

2016-03-01

Aiming at improving treatment individualization in patients with prostate cancer treated with combination of external beam radiotherapy and high-dose-rate brachytherapy to boost the dose to prostate (HDRB-B), the objective was to evaluate factors that have potential impact on obstructive urination problems (OUP) after HDRB-B. In the follow-up study 88 patients consecutively treated with HDRB-B at the Institute of Oncology Ljubljana in the period 2006-2011 were included. The observed outcome was deterioration of OUP (DOUP) during the follow-up period longer than 1 year. Univariate and multivariate relationship analysis between DOUP and potential risk factors (treatment factors, patients' characteristics) was carried out by using binary logistic regression. ROC curve was constructed on predicted values and the area under the curve (AUC) calculated to assess the performance of the multivariate model. Analysis was carried out on 71 patients who completed 3 years of follow-up. DOUP was noted in 13/71 (18.3%) of them. The results of multivariate analysis showed statistically significant relationship between DOUP and anti-coagulation treatment (OR 4.86, 95% C.I. limits: 1.21-19.61, p = 0.026). Also minimal dose received by 90% of the urethra volume was close to statistical significance (OR = 1.23; 95% C.I. limits: 0.98-1.07, p = 0.099). The value of AUC was 0.755. The study emphasized the relationship between DOUP and anticoagulation treatment, and suggested the multivariate model with fair predictive performance. This model potentially enables a reduction of DOUP after HDRB-B. It supports the belief that further research should be focused on urethral sphincter as a critical structure for OUP.
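The area under the ROC curve used to assess the multivariate model equals the probability that a randomly chosen patient with the outcome receives a higher predicted value than a randomly chosen patient without it. The sketch below computes the AUC through that pairwise definition; the function name and predicted probabilities are made up for illustration.

```python
def auc(scores_positive, scores_negative):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted probabilities from a logistic regression model.
with_outcome = [0.9, 0.6, 0.4]
without_outcome = [0.5, 0.3, 0.2, 0.1]
model_auc = auc(with_outcome, without_outcome)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.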
A "Model" Multivariable Calculus Course.

ERIC Educational Resources Information Center

Beckmann, Charlene E.; Schlicker, Steven J.

1999-01-01

Describes a rich, investigative approach to multivariable calculus. Introduces a project in which students construct physical models of surfaces that represent real-life applications of their choice. The models, along with student-selected datasets, serve as vehicles to study most of the concepts of the course from both continuous and discrete…
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

ERIC Educational Resources Information Center

Culpepper, Steven Andrew; Park, Trevor

2017-01-01

A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
Part 2. Development of Enhanced Statistical Methods for Assessing Health Effects Associated with an Unknown Number of Major Sources of Multiple Air Pollutants.

PubMed

Park, Eun Sug; Symanski, Elaine; Han, Daikwon; Spiegelman, Clifford

2015-06-01

A major difficulty with assessing source-specific health effects is that source-specific exposures cannot be measured directly; rather, they need to be estimated by a source-apportionment method such as multivariate receptor modeling. The uncertainty in source apportionment (uncertainty in source-specific exposure estimates and model uncertainty due to the unknown number of sources and identifiability conditions) has been largely ignored in previous studies. Also, spatial dependence of multipollutant data collected from multiple monitoring sites has not yet been incorporated into multivariate receptor modeling. The objectives of this project are (1) to develop a multipollutant approach that incorporates both sources of uncertainty in source-apportionment into the assessment of source-specific health effects and (2) to develop enhanced multivariate receptor models that can account for spatial correlations in the multipollutant data collected from multiple sites. We employed a Bayesian hierarchical modeling framework consisting of multivariate receptor models, health-effects models, and a hierarchical model on latent source contributions. For the health model, we focused on the time-series design in this project. Each combination of number of sources and identifiability conditions (additional constraints on model parameters) defines a different model. We built a set of plausible models with extensive exploratory data analyses and with information from previous studies, and then computed posterior model probability to estimate model uncertainty. Parameter estimation and model uncertainty estimation were implemented simultaneously by Markov chain Monte Carlo (MCMC) methods. We validated the methods using simulated data. We illustrated the methods using PM2.5 (particulate matter ≤ 2.5 μm in aerodynamic diameter) speciation data and mortality data from Phoenix, Arizona, and Houston, Texas.
The Phoenix data included counts of cardiovascular deaths and daily PM2.5 speciation data from 1995-1997. The Houston data included respiratory mortality data and 24-hour PM2.5 speciation data sampled every six days from a region near the Houston Ship Channel in years 2002-2005. We also developed a Bayesian spatial multivariate receptor modeling approach that, while simultaneously dealing with the unknown number of sources and identifiability conditions, incorporated spatial correlations in the multipollutant data collected from multiple sites into the estimation of source profiles and contributions based on the discrete process convolution model for multivariate spatial processes. This new modeling approach was applied to 24-hour ambient air concentrations of 17 volatile organic compounds (VOCs) measured at nine monitoring sites in Harris County, Texas, during years 2000 to 2005. Simulation results indicated that our methods were accurate in identifying the true model and estimated parameters were close to the true values. The results from our methods agreed in general with previous studies on the source apportionment of the Phoenix data in terms of estimated source profiles and contributions. However, we had a greater number of statistically insignificant findings, which was likely a natural consequence of incorporating uncertainty in the estimated source contributions into the health-effects parameter estimation. For the Houston data, a model with five sources (that seemed to be Sulfate-Rich Secondary Aerosol, Motor Vehicles, Industrial Combustion, Soil/Crustal Matter, and Sea Salt) showed the highest posterior model probability among the candidate models considered when fitted simultaneously to the PM2.5 and mortality data. There was a statistically significant positive association between respiratory mortality and same-day PM2.5 concentrations attributed to one of the sources (probably industrial combustion). 
The Bayesian spatial multivariate receptor modeling approach applied to the VOC data led to a highest posterior model probability for a model with five sources (that seemed to be refinery, petrochemical production, gasoline evaporation, natural gas, and vehicular exhaust) among several candidate models, with the number of sources varying between three and seven and with different identifiability conditions. Our multipollutant approach assessing source-specific health effects is more advantageous than a single-pollutant approach in that it can estimate total health effects from multiple pollutants and can also identify emission sources that are responsible for adverse health effects. Our Bayesian approach can incorporate not only uncertainty in the estimated source contributions, but also model uncertainty that has not been addressed in previous studies on assessing source-specific health effects. The new Bayesian spatial multivariate receptor modeling approach enables predictions of source contributions at unmonitored sites, minimizing exposure misclassification and providing improved exposure estimates along with their uncertainty estimates, as well as accounting for uncertainty in the number of sources and identifiability conditions.
A Sandwich-Type Standard Error Estimator of SEM Models with Multivariate Time Series

ERIC Educational Resources Information Center

Zhang, Guangjian; Chow, Sy-Miin; Ong, Anthony D.

2011-01-01

Structural equation models are increasingly used as a modeling tool for multivariate time series data in the social and behavioral sciences. Standard error estimators of SEM models, originally developed for independent data, require modifications to accommodate the fact that time series data are inherently dependent. In this article, we extend a…
Multivariate Autoregressive Modeling and Granger Causality Analysis of Multiple Spike Trains

PubMed Central

Krumin, Michael; Shoham, Shy

2010-01-01

Recent years have seen the emergence of microelectrode arrays and optical methods allowing simultaneous recording of spiking activity from populations of neurons in various parts of the nervous system. The analysis of multiple neural spike train data could benefit significantly from existing methods for multivariate time-series analysis which have proven to be very powerful in the modeling and analysis of continuous neural signals like EEG signals. However, those methods have not generally been well adapted to point processes. Here, we use our recent results on correlation distortions in multivariate Linear-Nonlinear-Poisson spiking neuron models to derive generalized Yule-Walker-type equations for fitting "hidden" Multivariate Autoregressive models. We use this new framework to perform Granger causality analysis in order to extract the directed information flow pattern in networks of simulated spiking neurons. We discuss the relative merits and limitations of the new method. PMID:20454705
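For orientation, the ordinary Yule-Walker relation that the abstract above generalizes can be sketched for a continuous-valued first-order multivariate autoregressive process. This is the textbook estimator only, not the authors' generalized equations for spike trains; the function name and the first-order restriction are assumptions made for illustration.

```python
import numpy as np

def fit_var1_yule_walker(x):
    """Estimate A in x[t+1] = A @ x[t] + noise from a (time, channels) array.

    Uses the lag-0 and lag-1 covariances: C1 = A @ C0, hence A = C1 @ inv(C0).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=0)               # remove the channel means
    n = x.shape[0]
    c0 = x[:-1].T @ x[:-1] / (n - 1)     # lag-0 covariance, E[x_t x_t^T]
    c1 = x[1:].T @ x[:-1] / (n - 1)      # lag-1 covariance, E[x_{t+1} x_t^T]
    return c1 @ np.linalg.inv(c0)
```

In this simplified setting, an off-diagonal entry A[i, j] that is clearly non-zero indicates that the past of channel j helps predict channel i, which is the intuition behind a Granger-type directed influence from j to i.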
A joint modeling and estimation method for multivariate longitudinal data with mixed types of responses to analyze physical activity data generated by accelerometers.

PubMed

Li, Haocheng; Zhang, Yukun; Carroll, Raymond J; Keadle, Sarah Kozey; Sampson, Joshua N; Matthews, Charles E

2017-11-10

A mixed effect model is proposed to jointly analyze multivariate longitudinal data with continuous, proportion, count, and binary responses. The association of the variables is modeled through the correlation of random effects. We use a quasi-likelihood type approximation for nonlinear variables and transform the proposed model into a multivariate linear mixed model framework for estimation and inference. Via an extension to the EM approach, an efficient algorithm is developed to fit the model. The method is applied to physical activity data, which uses a wearable accelerometer device to measure daily movement and energy expenditure information. Our approach is also evaluated by a simulation study. Copyright © 2017 John Wiley & Sons, Ltd.
Development and evaluation of height diameter at breast models for native Chinese Metasequoia.

PubMed

Liu, Mu; Feng, Zhongke; Zhang, Zhixiang; Ma, Chenghui; Wang, Mingming; Lian, Bo-Ling; Sun, Renjie; Zhang, Li

2017-01-01

Accurate tree height and diameter at breast height (dbh) are important input variables for growth and yield models. A total of 5503 Chinese Metasequoia trees were used in this study. We studied 53 fitted models, of which 7 were linear models and 46 were non-linear models. These models were divided into two groups, single-variable models and multivariate models, according to the number of independent variables. The results show that the allometric equation of tree height with diameter at breast height as the independent variable better reflects the change in tree height; in addition, the prediction accuracy of the multivariate composite models is higher than that of the single-variable models. Although tree age is not the most important variable in the study of the relationship between tree height and dbh, considering tree age when choosing models and parameters in model selection can make the prediction of tree height more accurate. The amount of data is also an important parameter that can improve the reliability of models. Other variables such as tree height, main dbh and altitude can also affect models. In this study, the method of developing the recommended models for predicting the tree height of native Metasequoias aged 50-485 years is statistically reliable and can be used for reference in predicting the growth and production of mature native Metasequoia.
Development and evaluation of height diameter at breast models for native Chinese Metasequoia

PubMed Central

Feng, Zhongke; Zhang, Zhixiang; Ma, Chenghui; Wang, Mingming; Lian, Bo-ling; Sun, Renjie; Zhang, Li

2017-01-01

Accurate tree height and diameter at breast height (dbh) are important input variables for growth and yield models. A total of 5503 Chinese Metasequoia trees were used in this study. We studied 53 fitted models, of which 7 were linear models and 46 were non-linear models. These models were divided into two groups, single-variable models and multivariate models, according to the number of independent variables. The results show that the allometric equation of tree height with diameter at breast height as the independent variable better reflects the change in tree height; in addition, the prediction accuracy of the multivariate composite models is higher than that of the single-variable models. Although tree age is not the most important variable in the study of the relationship between tree height and dbh, considering tree age when choosing models and parameters in model selection can make the prediction of tree height more accurate. The amount of data is also an important parameter that can improve the reliability of models. Other variables such as tree height, main dbh and altitude can also affect models. In this study, the method of developing the recommended models for predicting the tree height of native Metasequoias aged 50–485 years is statistically reliable and can be used for reference in predicting the growth and production of mature native Metasequoia. PMID:28817600
Patterns and Predictors of Language and Literacy Abilities 4-10 Years in the Longitudinal Study of Australian Children

PubMed Central

Zubrick, Stephen R.; Taylor, Catherine L.; Christensen, Daniel

2015-01-01

Aims Oral language is the foundation of literacy. Naturally, policies and practices to promote children’s literacy begin in early childhood and have a strong focus on developing children’s oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children’s progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children’s oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed, a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children’s progress along the oral to literate continuum is stable and predictable. Findings Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years. 
The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card, child not read to at home, maternal smoking, maternal education, family structure, temperamental persistence, and socio-economic area disadvantage. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude did not do particularly well in predicting low literacy ability at 10 years. PMID:26352436
Load compensation in a lean burn natural gas vehicle

NASA Astrophysics Data System (ADS)

Gangopadhyay, Anupam

This research develops a new multivariable PI tuning technique intended primarily for regulation purposes. Design guidelines are developed based on closed-loop stability. The new multivariable design is applied in a natural gas vehicle to combine the idle and A/F ratio control loops. This results in better recovery during low idle operation of a vehicle under external step torques. A powertrain model of a natural gas engine is developed and validated for steady-state and transient operation. The nonlinear model has three states: engine speed, intake manifold pressure and fuel fraction in the intake manifold. The model includes the effect of fuel partial pressure in the intake manifold filling and emptying dynamics. Because fuel fraction is included as a state, fuel flow rate into the cylinders is also accurately modeled. A linear system identification is performed on the nonlinear model. The linear model structure is predicted analytically from the nonlinear model, and the coefficients of the predicted transfer function are shown to be functions of key physical parameters in the plant. In simulation, linear system and model parameter identification is shown to converge to the predicted values of the model coefficients. The multivariable controller developed in this research can be designed algebraically once the plant model is known. It is thus possible to implement the multivariable PI design adaptively, combining the controller with a plant model identified on-line. This results in a self-tuning regulator (STR) type controller whose underlying design criterion is the multivariable tuning technique developed in this research.

An efficient parallel sampling technique for Multivariate Poisson-Lognormal model: Analysis with two crash count datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhan, Xianyuan; Aziz, H. M. Abdul; Ukkusuri, Satish V.

Our study investigates the multivariate Poisson-lognormal (MVPLN) model that jointly models crash frequency and severity while accounting for correlations. Ordinary univariate count models analyze crashes of different severity levels separately, ignoring the correlations among severity levels. The MVPLN model can incorporate a general correlation structure and accounts for overdispersion in the data, which leads to a superior fit. However, the traditional estimation approach for the MVPLN model is computationally expensive, which often limits its use in practice. In this work, a parallel sampling scheme is introduced to improve the original Markov chain Monte Carlo (MCMC) estimation approach of the MVPLN model, which significantly reduces the model estimation time. Two MVPLN models are developed using pedestrian-vehicle crash data collected in New York City from 2002 to 2006 and highway-injury data from Washington State (5-year data from 1990 to 1994). The Deviance Information Criterion (DIC) is used to evaluate model fit. The estimation results show that the MVPLN models provide a superior fit over univariate Poisson-lognormal (PLN), univariate Poisson, and negative binomial models. Moreover, the correlations among the latent effects of different severity levels are found to be significant in both datasets, which justifies the importance of jointly modeling crash frequency and severity while accounting for correlations.
An efficient parallel sampling technique for Multivariate Poisson-Lognormal model: Analysis with two crash count datasets

DOE PAGES

Zhan, Xianyuan; Aziz, H. M. Abdul; Ukkusuri, Satish V.

2015-11-19

Our study investigates the multivariate Poisson-lognormal (MVPLN) model that jointly models crash frequency and severity while accounting for correlations. Ordinary univariate count models analyze crashes of different severity levels separately, ignoring the correlations among severity levels. The MVPLN model can incorporate a general correlation structure and accounts for overdispersion in the data, which leads to a superior fit. However, the traditional estimation approach for the MVPLN model is computationally expensive, which often limits its use in practice. In this work, a parallel sampling scheme is introduced to improve the original Markov chain Monte Carlo (MCMC) estimation approach of the MVPLN model, which significantly reduces the model estimation time. Two MVPLN models are developed using pedestrian-vehicle crash data collected in New York City from 2002 to 2006 and highway-injury data from Washington State (5-year data from 1990 to 1994). The Deviance Information Criterion (DIC) is used to evaluate model fit. The estimation results show that the MVPLN models provide a superior fit over univariate Poisson-lognormal (PLN), univariate Poisson, and negative binomial models. Moreover, the correlations among the latent effects of different severity levels are found to be significant in both datasets, which justifies the importance of jointly modeling crash frequency and severity while accounting for correlations.
Practical robustness measures in multivariable control system analysis. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Lehtomaki, N. A.

1981-01-01

The robustness of the stability of multivariable linear time invariant feedback control systems with respect to model uncertainty is considered using frequency domain criteria. Available robustness tests are unified under a common framework based on the nature and structure of model errors. These results are derived using a multivariable version of Nyquist's stability theorem in which the minimum singular value of the return difference transfer matrix is shown to be the multivariable generalization of the distance to the critical point on a single input, single output Nyquist diagram. Using the return difference transfer matrix, a very general robustness theorem is presented from which all of the robustness tests dealing with specific model errors may be derived. The robustness tests that explicitly utilize model error structure are able to guarantee feedback system stability in the face of model errors of larger magnitude than those robustness tests that do not. The robustness of linear quadratic Gaussian control systems is analyzed.
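The quantity named in the abstract above, the minimum singular value of the return difference matrix I + L(jω), can be evaluated numerically on a frequency grid. The sketch below assumes a loop transfer matrix supplied as a Python callable; the function names and the 2x2 example loop are hypothetical and are not taken from the thesis.

```python
import numpy as np

def min_return_difference_sv(loop_tf, omegas):
    """Smallest singular value of I + L(jw) at each frequency w.

    loop_tf: callable mapping a complex frequency s to the loop matrix L(s).
    """
    values = []
    for w in omegas:
        L = np.atleast_2d(loop_tf(1j * w))
        return_difference = np.eye(L.shape[0]) + L
        # Singular values only; the smallest measures distance to singularity.
        values.append(np.linalg.svd(return_difference, compute_uv=False).min())
    return np.array(values)

def example_loop(s):
    """Hypothetical diagonal 2x2 loop transfer matrix used for illustration."""
    return np.array([[1.0 / (s + 1.0), 0.0],
                     [0.0, 2.0 / (s + 1.0)]])
```

Calling min_return_difference_sv(example_loop, [0.0, 1.0]) gives 2.0 at ω = 0, where I + L is diag(2, 3). In the single input, single output case the same quantity reduces to |1 + L(jω)|, the distance from the Nyquist curve to the critical point −1.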
Prevalence and Determinants of Suicide Ideation among Lebanese Adolescents: Results of the GSHS Lebanon 2005

ERIC Educational Resources Information Center

Mahfoud, Ziyad R.; Afifi, Rema A.; Haddad, Pascale H.; DeJong, Jocelyn

2011-01-01

The current study examined prevalence and risk factors for suicide ideation in 5038 Lebanese adolescents using Global School Health Survey data. Around 16% of Lebanese adolescents thought of suicide. Multivariate logistic regression models showed that risk factors for suicide ideation included poor mental health (felt lonely, felt worried, felt…
The Roles of Victim and Offender Substance Use in Sexual Assault Outcomes

ERIC Educational Resources Information Center

Brecklin, Leanne R.; Ullman, Sarah E.

2010-01-01

The impact of victim and offender preassault substance use on the outcomes of sexual assault incidents was analyzed. Nine hundred and seventy female sexual assault victims were identified from the first wave of a longitudinal study based on a convenience sampling strategy. Multivariate models showed that victim injury was more likely in assaults…
A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression

PubMed Central

Jackson, Dan; White, Ian R; Riley, Richard D

2013-01-01

Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
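The univariate method of moments of DerSimonian and Laird, to which the abstract above says the new method reduces in a single dimension, can be sketched as follows. This covers only the standard univariate estimator of the between-study variance, not the matrix-based multivariate method; the function and variable names are assumptions for illustration.

```python
import numpy as np

def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of the between-study variance tau^2.

    effects:   study effect estimates y_i
    variances: within-study variances v_i (treated as known)
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # inverse-variance weights
    y_bar = np.sum(w * y) / np.sum(w)             # fixed-effect pooled mean
    q = np.sum(w * (y - y_bar) ** 2)              # Cochran's Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)    # scaling constant
    # Q has expectation (k - 1) + c * tau^2; solve and truncate at zero.
    return max(0.0, (q - (y.size - 1)) / c)
```

For three studies with effects 0, 2 and 4 and unit within-study variances, Q is 8 and the scaling constant is 2, so the estimate is (8 − 2) / 2 = 3. When Q falls below its degrees of freedom, the estimate is truncated at zero.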
Multivariate Phylogenetic Comparative Methods: Evaluations, Comparisons, and Recommendations.

PubMed

Adams, Dean C; Collyer, Michael L

2018-01-01

Recent years have seen increased interest in phylogenetic comparative analyses of multivariate data sets, but to date the varied proposed approaches have not been extensively examined. Here we review the mathematical properties required of any multivariate method, and specifically evaluate existing multivariate phylogenetic comparative methods in this context. Phylogenetic comparative methods based on the full multivariate likelihood are robust to levels of covariation among trait dimensions and are insensitive to the orientation of the data set, but display increasing model misspecification as the number of trait dimensions increases. This is because the expected evolutionary covariance matrix (V) used in the likelihood calculations becomes more ill-conditioned as trait dimensionality increases, and as evolutionary models become more complex. Thus, these approaches are only appropriate for data sets with few traits and many species. Methods that summarize patterns across trait dimensions treated separately (e.g., SURFACE) incorrectly assume independence among trait dimensions, resulting in nearly a 100% model misspecification rate. Methods using pairwise composite likelihood are highly sensitive to levels of trait covariation, the orientation of the data set, and the number of trait dimensions. The consequences of these debilitating deficiencies are that a user can arrive at differing statistical conclusions, and therefore biological inferences, simply from a dataspace rotation, like principal component analysis. By contrast, algebraic generalizations of the standard phylogenetic comparative toolkit that use the trace of covariance matrices are insensitive to levels of trait covariation, the number of trait dimensions, and the orientation of the data set. Further, when appropriate permutation tests are used, these approaches display acceptable Type I error and statistical power. 
We conclude that methods summarizing information across trait dimensions, as well as pairwise composite likelihood methods should be avoided, whereas algebraic generalizations of the phylogenetic comparative toolkit provide a useful means of assessing macroevolutionary patterns in multivariate data. Finally, we discuss areas in which multivariate phylogenetic comparative methods are still in need of future development; namely highly multivariate Ornstein-Uhlenbeck models and approaches for multivariate evolutionary model comparisons. © The Author(s) 2017. Published by Oxford University Press on behalf of the Systematic Biology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Advanced Multivariate Inversion Techniques for High Resolution 3D Geophysical Modeling

DTIC Science & Technology

2011-09-01

of seismic ambient noise – has been used to image crustal Vs variation with a lateral resolution upward of 100 km either on regional or on sub...to East Africa, we solve for velocity structure in an area with less lateral heterogeneity but great tectonic complexity. To increase the...demonstrate correlation with crustal geology. Figure 1 shows the 3D S-wave velocity model obtained from the joint inversion. The low-velocity anomaly
Decision-support models for empiric antibiotic selection in Gram-negative bloodstream infections.

PubMed

MacFadden, D R; Coburn, B; Shah, N; Robicsek, A; Savage, R; Elligsen, M; Daneman, N

2018-04-25

Early empiric antibiotic therapy in patients can improve clinical outcomes in Gram-negative bacteraemia. However, the widespread prevalence of antibiotic-resistant pathogens compromises our ability to provide adequate therapy while minimizing use of broad antibiotics. We sought to determine whether readily available electronic medical record data could be used to develop predictive models for decision support in Gram-negative bacteraemia. We performed a multi-centre cohort study, in Canada and the USA, of hospitalized patients with Gram-negative bloodstream infection from April 2010 to March 2015. We analysed multivariable models for prediction of antibiotic susceptibility at two empiric windows: Gram-stain-guided and pathogen-guided treatment. Decision-support models for empiric antibiotic selection were developed based on three clinical decision thresholds of acceptable adequate coverage (80%, 90% and 95%). A total of 1832 patients with Gram-negative bacteraemia were evaluated. Multivariable models showed good discrimination across countries and at both Gram-stain-guided (12 models, areas under the curve (AUCs) 0.68-0.89, optimism-corrected AUCs 0.63-0.85) and pathogen-guided (12 models, AUCs 0.75-0.98, optimism-corrected AUCs 0.64-0.95) windows. Compared to antibiogram-guided therapy, decision-support models of antibiotic selection incorporating individual patient characteristics and prior culture results have the potential to increase use of narrower-spectrum antibiotics (in up to 78% of patients) while reducing inadequate therapy. Multivariable models using readily available epidemiologic factors can be used to predict antimicrobial susceptibility in infecting pathogens with reasonable discriminatory ability. 
Implementation of sequential predictive models for real-time individualized empiric antibiotic decision-making has the potential to both optimize adequate coverage for patients while minimizing overuse of broad-spectrum antibiotics, and therefore requires further prospective evaluation. Readily available epidemiologic risk factors can be used to predict susceptibility of Gram-negative organisms among patients with bacteraemia, using automated decision-making models. Copyright © 2018 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Statistical mixture design and multivariate analysis of inkjet printed a-WO3/TiO2/WOX electrochromic films.

PubMed

Wojcik, Pawel Jerzy; Pereira, Luís; Martins, Rodrigo; Fortunato, Elvira

2014-01-13

An efficient mathematical strategy in the field of solution processed electrochromic (EC) films is outlined as a combination of an experimental work, modeling, and information extraction from massive computational data via statistical software. Design of Experiment (DOE) was used for statistical multivariate analysis and prediction of mixtures through a multiple regression model, as well as the optimization of a five-component sol-gel precursor subjected to complex constraints. This approach significantly reduces the number of experiments to be realized, from 162 in the full factorial (L=3) and 72 in the extreme vertices (D=2) approach down to only 30 runs, while still maintaining a high accuracy of the analysis. By carrying out a finite number of experiments, the empirical modeling in this study shows reasonably good prediction ability in terms of the overall EC performance. An optimized ink formulation was employed in a prototype of a passive EC matrix fabricated in order to test and trial this optically active material system together with a solid-state electrolyte for the prospective application in EC displays. Coupling of DOE with chromogenic material formulation shows the potential to maximize the capabilities of these systems and ensures increased productivity in many potential solution-processed electrochemical applications.
The lexical development of children with hearing impairment and associated factors.

PubMed

Penna, Leticia Macedo; Lemos, Stela Maris Aguiar; Alves, Cláudia Regina Lindgren

2014-01-01

This study aimed to analyze the association between the lexical development of children with hearing impairment and their psychosocial and socioeconomic characteristics and medical history. An analytical cross-sectional study was conducted in an Auditive Health Attention Service. One hundred and ten children from 6 to 10 years old using hearing aids and presenting hearing loss that ranged from mild to profound were evaluated. All children were subjected to oral language, written language and auditory perception tests. Parents answered a structured questionnaire to collect data on medical history and socioeconomic status, and questionnaires about the features of the family environment and psychosocial characteristics. Multivariate analysis was performed by logistic regression, with the initial model composed of variables with p<0.20 in the univariate analysis. In the final model, we adopted a significance level of 5%. The final model of the multivariate analysis showed an association between the performance on the vocabulary test and the results of the phonemic discrimination test (OR=0.81; 95%CI 0.73-0.89). The results show the importance of stimulating auditory processing, particularly the phonemic discrimination skill, throughout the rehabilitation process of children with hearing impairment. This stimulation can enhance lexical development and minimize the metalanguage and learning difficulties often observed in these children.
Describing the Elephant: Structure and Function in Multivariate Data.

ERIC Educational Resources Information Center

McDonald, Roderick P.

1986-01-01

There is a unity underlying the diversity of models for the analysis of multivariate data. Essentially, they constitute a family of models, most generally nonlinear, for structural/functional relations between variables drawn from a behavior domain. (Author)
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

NASA Technical Reports Server (NTRS)

Ulbrich, Norbert M.

2013-01-01

A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.
Uni- and multi-variable modelling of flood losses: experiences gained from the Secchia river inundation event.

NASA Astrophysics Data System (ADS)

Carisi, Francesca; Domeneghetti, Alessio; Kreibich, Heidi; Schröter, Kai; Castellarin, Attilio

2017-04-01

Flood risk is a function of flood hazard and vulnerability; therefore its accurate assessment depends on a reliable quantification of both factors. The scientific literature proposes a number of objective and reliable methods for assessing flood hazard, yet it highlights a limited understanding of the fundamental damage processes. Loss modelling is associated with large uncertainty which is, among other factors, due to a lack of standard procedures; for instance, flood losses are often estimated based on damage models derived in completely different contexts (i.e. different countries or geographical regions) without checking their applicability, or by considering only one explanatory variable (i.e. typically water depth). We consider the Secchia river flood event of January 2014, when a sudden levee breach caused the inundation of nearly 200 km² in Northern Italy. In the aftermath of this event, local authorities collected flood loss data, together with additional information on affected private households and industrial activities (e.g. building surface area and economic value, number of company employees and others). Based on these data we implemented and compared a quadratic-regression damage function, with water depth as the only explanatory variable, and a multi-variable model that combines multiple regression trees and considers several explanatory variables (i.e. bagging decision trees). Our results show the importance of data collection, revealing that (1) a simple quadratic regression damage function based on empirical data from the study area can be significantly more accurate than literature damage models derived for a different context and (2) multi-variable modelling may outperform the uni-variable approach, yet it is more difficult to develop and apply because it demands much more detailed data.
Clinical risk assessment of patients with chronic kidney disease by using clinical data and multivariate models.

PubMed

Chen, Zewei; Zhang, Xin; Zhang, Zhuoyong

2016-12-01

Timely risk assessment of chronic kidney disease (CKD) and proper community-based CKD monitoring are important to prevent patients with potential risk from further kidney injuries. As many symptoms are associated with the progressive development of CKD, evaluating the risk of CKD through a set of clinical symptom data coupled with multivariate models can be considered a viable method for prevention of CKD and would be useful for community-based CKD monitoring. Three commonly used multivariate models, i.e., K-nearest neighbor (KNN), support vector machine (SVM), and soft independent modeling of class analogy (SIMCA), were used to evaluate the risk of 386 patients based on a series of clinical data taken from the UCI machine learning repository. Different types of composite data, in which proportional disturbances were added to simulate measurement deviations caused by environmental and instrument noise, were also used to evaluate the feasibility and robustness of these models in risk assessment of CKD. For the original data set, the three multivariate models can differentiate patients with and without CKD with overall accuracies over 93%. KNN and SVM perform better than SIMCA in this study. For the composite data set, the SVM model has the best ability to tolerate noise disturbance and is thus more robust than the other two models. Using a clinical data set on symptoms coupled with multivariate models has proved to be a feasible approach for assessing patients with potential CKD risk. The SVM model can be used as a useful and robust tool in this study.
Cole-Cole, linear and multivariate modeling of capacitance data for on-line monitoring of biomass.

PubMed

Dabros, Michal; Dennewald, Danielle; Currie, David J; Lee, Mark H; Todd, Robert W; Marison, Ian W; von Stockar, Urs

2009-02-01

This work evaluates three techniques of calibrating capacitance (dielectric) spectrometers used for on-line monitoring of biomass: modeling of cell properties using the theoretical Cole-Cole equation, linear regression of dual-frequency capacitance measurements on biomass concentration, and multivariate (PLS) modeling of scanning dielectric spectra. The performance and robustness of each technique is assessed during a sequence of validation batches in two experimental settings of differing signal noise. In more noisy conditions, the Cole-Cole model had significantly higher biomass concentration prediction errors than the linear and multivariate models. The PLS model was the most robust in handling signal noise. In less noisy conditions, the three models performed similarly. Estimates of the mean cell size were done additionally using the Cole-Cole and PLS models, the latter technique giving more satisfactory results.
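The abstract does not reproduce the Cole-Cole equation. As a rough sketch, the standard Cole-Cole dispersion is commonly written for capacitance as C(f) = ΔC / (1 + (j·f/f_c)^(1−α)) + C∞, and the function below evaluates its real part. The parameter names (`delta_c`, `f_c`, `alpha`, `c_inf`) and the numeric values are assumptions chosen for illustration, not the authors' calibration.

```python
def cole_cole_capacitance(f, delta_c, f_c, alpha, c_inf=0.0):
    """Real part of a Cole-Cole dispersion at frequency f.

    delta_c: magnitude of the dispersion (low- minus high-frequency plateau)
    f_c:     characteristic frequency (same unit as f)
    alpha:   empirical broadening parameter, 0 <= alpha < 1
    c_inf:   high-frequency plateau
    """
    z = delta_c / (1 + (1j * f / f_c) ** (1 - alpha)) + c_inf
    return z.real

# Hypothetical values: 10 pF dispersion centred at 1 MHz on a 2 pF plateau.
at_fc = cole_cole_capacitance(1e6, delta_c=10.0, f_c=1e6, alpha=0.0, c_inf=2.0)
low = cole_cole_capacitance(1e3, delta_c=10.0, f_c=1e6, alpha=0.1, c_inf=2.0)
high = cole_cole_capacitance(1e9, delta_c=10.0, f_c=1e6, alpha=0.1, c_inf=2.0)
print(at_fc, low, high)
```

With alpha = 0 the expression reduces to a single Debye relaxation, so the value at f = f_c sits halfway between the two plateaus. In calibration work, ΔC is the quantity usually related to biomass.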
Multivariate regression model for predicting lumber grade volumes of northern red oak sawlogs

Treesearch

Daniel A. Yaussy; Robert L. Brisbin

1983-01-01

A multivariate regression model was developed to predict green board-foot yields for the seven common factory lumber grades processed from northern red oak (Quercus rubra L.) factory grade logs. The model uses the standard log measurements of grade, scaling diameter, length, and percent defect. It was validated with an independent data set. The model...
A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

DTIC Science & Technology

2017-09-01

efficacy of statistical post-processing methods downstream of these dynamical model components with a hierarchical multivariate Bayesian approach to...Bayesian hierarchical modeling, Markov chain Monte Carlo methods, Metropolis algorithm, machine learning, atmospheric prediction...scale processes. However, this dissertation explores the efficacy of statistical post-processing methods downstream of these dynamical model components
Predictive and mechanistic multivariate linear regression models for reaction development

PubMed Central

Santiago, Celine B.; Guo, Jing-Yao

2018-01-01

Multivariate Linear Regression (MLR) models utilizing computationally-derived and empirically-derived physical organic molecular descriptors are described in this review. Several reports demonstrating the effectiveness of this methodological approach towards reaction optimization and mechanistic interrogation are discussed. A detailed protocol to access quantitative and predictive MLR models is provided as a guide for model development and parameter analysis. PMID:29719711
Linear regression analysis and its application to multivariate chromatographic calibration for the quantitative analysis of two-component mixtures.

PubMed

Dinç, Erdal; Ozdemir, Abdil

2005-01-01

A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on linear regression equations constructed from the relationship between concentration and peak area at a set of five wavelengths. The algorithm of this mathematically simple calibration model is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and for eliminating fluctuations arising from instrumental and experimental conditions. The multivariate chromatographic calibration involves reducing the multivariate linear regression functions to a univariate data set. The model was validated by analyzing various synthetic binary mixtures and by using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The results were compared with those obtained by the classical HPLC method. The proposed multivariate chromatographic calibration was observed to give better results than classical HPLC.

Power of Models in Longitudinal Study: Findings from a Full-Crossed Simulation Design

ERIC Educational Resources Information Center

Fang, Hua; Brooks, Gordon P.; Rizzo, Maria L.; Espy, Kimberly Andrews; Barcikowski, Robert S.

2009-01-01

Because the power properties of traditional repeated measures and hierarchical multivariate linear models have not been clearly determined in the balanced design for longitudinal studies in the literature, the authors present a power comparison study of traditional repeated measures and hierarchical multivariate linear models under 3…
Species distribution modelling for plant communities: Stacked single species or multivariate modelling approaches?

Treesearch

Emilie B. Henderson; Janet L. Ohmann; Matthew J. Gregory; Heather M. Roberts; Harold S.J. Zald

2014-01-01

Landscape management and conservation planning require maps of vegetation composition and structure over large regions. Species distribution models (SDMs) are often used for individual species, but projects mapping multiple species are rarer. We compare maps of plant community composition assembled by stacking results from many SDMs with multivariate maps constructed...
IRT-ZIP Modeling for Multivariate Zero-Inflated Count Data

ERIC Educational Resources Information Center

Wang, Lijuan

2010-01-01

This study introduces an item response theory-zero-inflated Poisson (IRT-ZIP) model to investigate psychometric properties of multiple items and predict individuals' latent trait scores for multivariate zero-inflated count data. In the model, two link functions are used to capture two processes of the zero-inflated count data. Item parameters are…
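For orientation, the "two processes" of zero-inflated count data can be sketched with the ordinary zero-inflated Poisson probability mass function: a structural-zero process with probability π and a Poisson count process with mean λ. The code below shows only this generic building block. The item response part of the IRT-ZIP model, and the usual choice of a logit link for π and a log link for λ, are not taken from the abstract and should be read as assumptions.

```python
import math

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of observing count k.

    lam: mean of the Poisson count process (lam > 0)
    pi:  probability of a structural zero (0 <= pi <= 1)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1 - pi) * poisson

# Hypothetical parameters: 30% structural zeros, Poisson mean of 2.
p_zero = zip_pmf(0, lam=2.0, pi=0.3)
total = sum(zip_pmf(k, lam=2.0, pi=0.3) for k in range(60))
print(p_zero, total)
```

A zero can arise from either process, which is why P(Y = 0) = π + (1 − π)·exp(−λ) exceeds the plain Poisson zero probability whenever π > 0.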
A constrained multinomial Probit route choice model in the metro network: Formulation, estimation and application

PubMed Central

Zhang, Yongsheng; Wei, Heng; Zheng, Kangning

2017-01-01

Considering that metro network expansion brings more alternative routes, it is attractive to integrate the impacts of the route set and the interdependency among alternative routes on route choice probability into route choice modeling. Therefore, this paper presents the formulation, estimation and application of a constrained multinomial probit (CMNP) route choice model in the metro network. The utility function is formulated as three components: the compensatory component is a function of influencing factors; the non-compensatory component measures the impacts of the route set on utility; and the error component follows a multivariate normal distribution whose covariance is structured into three parts, representing the correlation among routes, the transfer variance of a route, and the unobserved variance, respectively. Because the model involves multidimensional integrals of the multivariate normal probability density function, the CMNP model is rewritten in hierarchical Bayes form, and a Markov chain Monte Carlo approach based on the Metropolis-Hastings (M-H) sampling algorithm is constructed to estimate all parameters. Based on Guangzhou Metro data, reliable estimation results are obtained. Furthermore, the proposed CMNP model also shows good forecasting performance for calculating route choice probabilities and good application performance for predicting transfer flow volume. PMID:28591188
A flexible model for multivariate interval-censored survival times with complex correlation structure.

PubMed

Falcaro, Milena; Pickles, Andrew

2007-02-10

We focus on the analysis of multivariate survival times with highly structured interdependency and subject to interval censoring. Such data are common in developmental genetics and genetic epidemiology. We propose a flexible mixed probit model that deals naturally with complex but uninformative censoring. The recorded ages of onset are treated as possibly censored ordinal outcomes with the interval censoring mechanism seen as arising from a coarsened measurement of a continuous variable observed as falling between subject-specific thresholds. This bypasses the requirement for the failure times to be observed as falling into non-overlapping intervals. The assumption of a normal age-of-onset distribution of the standard probit model is relaxed by embedding within it a multivariate Box-Cox transformation whose parameters are jointly estimated with the other parameters of the model. Complex decompositions of the underlying multivariate normal covariance matrix of the transformed ages of onset become possible. The new methodology is here applied to a multivariate study of the ages of first use of tobacco and first consumption of alcohol without parental permission in twins. The proposed model allows estimation of the genetic and environmental effects that are shared by both of these risk behaviours as well as those that are specific. 2006 John Wiley & Sons, Ltd.
Supporting inquiry learning by promoting normative understanding of multivariable causality

NASA Astrophysics Data System (ADS)

Keselman, Alla

2003-11-01

Early adolescents may lack the cognitive and metacognitive skills necessary for effective inquiry learning. In particular, they are likely to have a nonnormative mental model of multivariable causality in which effects of individual variables are neither additive nor consistent. Described here is a software-based intervention designed to facilitate students' metalevel and performance-level inquiry skills by enhancing their understanding of multivariable causality. Relative to an exploration-only group, sixth graders who practiced predicting an outcome (earthquake risk) based on multiple factors demonstrated increased attention to evidence, improved metalevel appreciation of effective strategies, and a trend toward consistent use of a controlled comparison strategy. Sixth graders who also received explicit instruction in making predictions based on multiple factors showed additional improvement in their ability to compare multiple instances as a basis for inferences and constructed the most accurate knowledge of the system. Gains were maintained in transfer tasks. The cognitive skills and metalevel understanding examined here are essential to inquiry learning.
Can multivariate models based on MOAKS predict OA knee pain? Data from the Osteoarthritis Initiative

NASA Astrophysics Data System (ADS)

Luna-Gómez, Carlos D.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Galván-Tejada, Carlos E.; Celaya-Padilla, José M.

2017-03-01

Osteoarthritis is the most common rheumatic disease in the world. Knee pain is the most disabling symptom of the disease, and the prediction of pain is one of the targets of preventive medicine, as it can be applied to new therapies or treatments. Using magnetic resonance imaging and the grading scales, a multivariate model based on genetic algorithms is presented. A predictive model can be useful for associating minor structural changes in the joint with future knee pain. Results suggest that multivariate models can be predictive of future chronic knee pain. All models (T0, T1 and T2) were statistically significant: all p values were < 0.05 and all AUC values were > 0.60.
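The AUC figures quoted above can be read as the probability that a randomly chosen patient with pain receives a higher model score than a randomly chosen patient without pain. A minimal sketch of that pairwise definition follows; the scores and labels are made up for illustration and are unrelated to the Osteoarthritis Initiative data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: model outputs, higher meaning more likely positive
    labels: 1 for positive cases, 0 for negative cases
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for four patients (two without pain, two with pain).
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(example_auc)
```

An AUC of 0.5 corresponds to chance-level ranking, so a threshold such as AUC > 0.60 indicates only modest discrimination.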
Multivariate-$t$ nonlinear mixed models with application to censored multi-outcome AIDS studies.

PubMed

Lin, Tsung-I; Wang, Wan-Lun

2017-10-01

In multivariate longitudinal HIV/AIDS studies, multi-outcome repeated measures on each patient over time may contain outliers, and the viral loads are often subject to an upper or lower limit of detection depending on the quantification assays. In this article, we consider an extension of the multivariate nonlinear mixed-effects model by adopting a joint multivariate-$t$ distribution for random effects and within-subject errors and taking the censoring information of multiple responses into account. The proposed model is called the multivariate-$t$ nonlinear mixed-effects model with censored responses (MtNLMMC), allowing for analyzing multi-outcome longitudinal data exhibiting nonlinear growth patterns with censorship and fat-tailed behavior. Utilizing the Taylor-series linearization method, a pseudo-data version of the expectation conditional maximization either (ECME) algorithm is developed for iteratively carrying out maximum likelihood estimation. We illustrate our techniques with two data examples from HIV/AIDS studies. Experimental results signify that the MtNLMMC performs favorably compared to its Gaussian analogue and some existing approaches. © The Author 2017. Published by Oxford University Press. All rights reserved.
Preliminary Multivariable Cost Model for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip

2010-01-01

Parametric cost models are routinely used to plan missions, compare concepts and justify technology investments. Previously, the authors published two single-variable cost models based on 19 flight missions. The current paper presents the development of a multi-variable space telescope cost model. The validity of previously published models is tested. Cost estimating relationships that are and are not significant cost drivers are identified, and interrelationships between variables are explored.
Scalable Joint Models for Reliable Uncertainty-Aware Event Prediction.

PubMed

Soleimani, Hossein; Hensman, James; Saria, Suchi

2017-08-21

Missing data and noisy observations pose significant challenges for reliably predicting events from irregularly sampled multivariate time series (longitudinal) data. Imputation methods, which are typically used for completing the data prior to event prediction, lack a principled mechanism to account for the uncertainty due to missingness. Alternatively, state-of-the-art joint modeling techniques can be used for jointly modeling the longitudinal and event data and compute event probabilities conditioned on the longitudinal observations. These approaches, however, make strong parametric assumptions and do not easily scale to multivariate signals with many observations. Our proposed approach consists of several key innovations. First, we develop a flexible and scalable joint model based upon sparse multiple-output Gaussian processes. Unlike state-of-the-art joint models, the proposed model can explain highly challenging structure including non-Gaussian noise while scaling to large data. Second, we derive an optimal policy for predicting events using the distribution of the event occurrence estimated by the joint model. The derived policy trades-off the cost of a delayed detection versus incorrect assessments and abstains from making decisions when the estimated event probability does not satisfy the derived confidence criteria. Experiments on a large dataset show that the proposed framework significantly outperforms state-of-the-art techniques in event prediction.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

NASA Technical Reports Server (NTRS)

Ulbrich, Norbert Manfred

2013-01-01

A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
DUALITY IN MULTIVARIATE RECEPTOR MODEL. (R831078)

EPA Science Inventory

Multivariate receptor models are used for source apportionment of multiple observations of compositional data of air pollutants that obey mass conservation. Singular value decomposition of the data leads to two sets of eigenvectors. One set of eigenvectors spans a space in whi...
Expression of p53, p21 and cyclin D1 in penile cancer: p53 predicts poor prognosis.

PubMed

Gunia, Sven; Kakies, Christoph; Erbersdobler, Andreas; Hakenberg, Oliver W; Koch, Stefan; May, Matthias

2012-03-01

To evaluate the role of p53, p21 and cyclin D1 expression in patients with penile cancer (PC). Paraffin-embedded tissues from PC specimens from six pathology departments were subjected to a central histopathological review performed by one pathologist. The tissue microarray technique was used for immunostaining which was evaluated by two independent pathologists and correlated with cancer-specific survival (CSS). κ-statistics were used to assess interobserver variability. Uni- and multivariable Cox proportional hazards analysis was applied to assess the independent effects of several prognostic factors on CSS over a median of 32 months (IQR 6-66 months). Specimens and clinical data from 110 men treated surgically for primary PC were collected. p53 staining was positive in 30 and negative in 62 specimens. κ-statistics showed substantial interobserver reproducibility of p53 staining evaluation (κ=0.73; p<0.001). The 5-year CSS rate for the entire study cohort was 74%. Five-year CSS was 84% in p53-negative and 51% in p53-positive PC patients (p=0.003). Multivariable analysis showed p53 (HR=3.20; p=0.041) and pT-stage (HR=4.29; p<0.001) as independent significant prognostic factors for CSS. Cyclin D1 and p21 expression were not correlated with survival. However, incorporating p21 into a multivariable Cox model did contribute to improved model quality for predicting CSS. In patients with PC, the expression of p53 in the primary tumour specimen can be reproducibly assessed and is negatively associated with cancer specific survival.
Multivariate meta-analysis: potential and promise.

PubMed

Jackson, Dan; Riley, Richard; White, Ian R

2011-09-10

The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day 'Multivariate meta-analysis' event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.
Multivariate meta-analysis: Potential and promise

PubMed Central

Jackson, Dan; Riley, Richard; White, Ian R

2011-01-01

The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052
Stress and Personal Resource as Predictors of the Adjustment of Parents to Autistic Children: A Multivariate Model

ERIC Educational Resources Information Center

Siman-Tov, Ayelet; Kaniel, Shlomo

2011-01-01

The research validates a multivariate model that predicts parental adjustment to coping successfully with an autistic child. The model comprises four elements: parental stress, parental resources, parental adjustment and the child's autism symptoms. 176 parents of children aged between 6 to 16 diagnosed with PDD answered several questionnaires…
Multivariate mixed linear model analysis of longitudinal data: an information-rich statistical technique for analyzing disease resistance data

USDA-ARS's Scientific Manuscript database

The mixed linear model (MLM) is currently among the most advanced and flexible statistical modeling techniques and its use in tackling problems in plant pathology has begun surfacing in the literature. The longitudinal MLM is a multivariate extension that handles repeatedly measured data, such as r...
Decomposing biodiversity data using the Latent Dirichlet Allocation model, a probabilistic multivariate statistical method

Treesearch

Denis Valle; Benjamin Baiser; Christopher W. Woodall; Robin Chazdon; Jerome Chave

2014-01-01

We propose a novel multivariate method to analyse biodiversity data based on the Latent Dirichlet Allocation (LDA) model. LDA, a probabilistic model, reduces assemblages to sets of distinct component communities. It produces easily interpretable results, can represent abrupt and gradual changes in composition, accommodates missing data and allows for coherent estimates...
Environmental Temperature and Thermal Indices: What Is the Most Effective Predictor of Heat-Related Mortality in Different Geographical Contexts?

PubMed Central

Morabito, Marco; Crisci, Alfonso; Messeri, Alessandro; Capecchi, Valerio; Modesti, Pietro Amedeo; Gensini, Gian Franco; Orlandini, Simone

2014-01-01

The aim of this study is to identify the most effective thermal predictor of heat-related very-elderly mortality in two cities located in different geographical contexts of central Italy. We tested the hypothesis that use of the state-of-the-art rational thermal indices, the Universal Thermal Climate Index (UTCI), might provide an improvement in predicting heat-related mortality with respect to other predictors. Data regarding very elderly people (≥75 years) who died in inland and coastal cities from 2006 to 2008 (May–October) and meteorological and air pollution were obtained from the regional mortality and environmental archives. Rational (UTCI) and direct thermal indices represented by a set of bivariate/multivariate apparent temperature indices were assessed. Correlation analyses and generalized additive models were applied. The Akaike weights were used for the best model selection. Direct multivariate indices showed the highest correlations with UTCI and were also selected as the best thermal predictors of heat-related mortality for both inland and coastal cities. Conversely, the UTCI was never identified as the best thermal predictor. The use of direct multivariate indices, which also account for the extra effect of wind speed and/or solar radiation, revealed the best fitting with all-cause, very-elderly mortality attributable to heat stress. PMID:24523657
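Akaike weights, used above for model selection, rescale a set of AIC values into relative support for each candidate model: w_i = exp(−Δ_i/2) / Σ_j exp(−Δ_j/2), where Δ_i is the difference between model i's AIC and the smallest AIC. The sketch below illustrates the calculation; the AIC values and the predictor names in the comments are hypothetical and are not results from the study.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC values into Akaike weights that sum to 1."""
    best = min(aic_values)
    # Relative likelihood of each model given the data.
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AICs for three candidate thermal predictors.
aics = [100.0, 102.0, 110.0]   # e.g. apparent temperature, UTCI, air temperature
weights = akaike_weights(aics)
print([round(w, 3) for w in weights])
```

Only differences between AIC values matter, so adding a constant to every AIC leaves the weights unchanged.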
Environmental temperature and thermal indices: what is the most effective predictor of heat-related mortality in different geographical contexts?

PubMed

Morabito, Marco; Crisci, Alfonso; Messeri, Alessandro; Capecchi, Valerio; Modesti, Pietro Amedeo; Gensini, Gian Franco; Orlandini, Simone

2014-01-01

The aim of this study is to identify the most effective thermal predictor of heat-related very-elderly mortality in two cities located in different geographical contexts of central Italy. We tested the hypothesis that use of the state-of-the-art rational thermal indices, the Universal Thermal Climate Index (UTCI), might provide an improvement in predicting heat-related mortality with respect to other predictors. Data regarding very elderly people (≥ 75 years) who died in inland and coastal cities from 2006 to 2008 (May-October) and meteorological and air pollution were obtained from the regional mortality and environmental archives. Rational (UTCI) and direct thermal indices represented by a set of bivariate/multivariate apparent temperature indices were assessed. Correlation analyses and generalized additive models were applied. The Akaike weights were used for the best model selection. Direct multivariate indices showed the highest correlations with UTCI and were also selected as the best thermal predictors of heat-related mortality for both inland and coastal cities. Conversely, the UTCI was never identified as the best thermal predictor. The use of direct multivariate indices, which also account for the extra effect of wind speed and/or solar radiation, revealed the best fitting with all-cause, very-elderly mortality attributable to heat stress.

Multivariate co-integration analysis of the Kaya factors in Ghana.

PubMed

Asumadu-Sarkodie, Samuel; Owusu, Phebe Asantewaa

2016-05-01

The fundamental goal of the Government of Ghana's development agenda, as enshrined in the Growth and Poverty Reduction Strategy, is to grow the economy to a middle-income status of US$1000 per capita by the end of 2015; this could be met by increasing the labour force, increasing energy supplies and expanding the energy infrastructure in order to achieve the sustainable development targets. In this study, a multivariate co-integration analysis of the Kaya factors, namely carbon dioxide, total primary energy consumption, population and GDP, was carried out for Ghana using a vector error correction model with data spanning 1980 to 2012. Our results show long-run causality running from population, GDP and total primary energy consumption to carbon dioxide emissions. However, there is evidence of short-run causality running from population to carbon dioxide emissions. There was bi-directional causality between carbon dioxide emissions and energy consumption. In other words, decreasing the primary energy consumption in Ghana will directly reduce carbon dioxide emissions. In addition, bi-directional causality between GDP and energy consumption exists in the multivariate model. It is plausible that access to energy has a relationship with increasing economic growth and productivity in Ghana.
Multivariate Regression Analysis and Slaughter Livestock,

DTIC Science & Technology

AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
Evaluating Measurement of Dynamic Constructs: Defining a Measurement Model of Derivatives

PubMed Central

Estabrook, Ryne

2015-01-01

While measurement evaluation has been embraced as an important step in psychological research, evaluating measurement structures with longitudinal data is fraught with limitations. This paper defines and tests a measurement model of derivatives (MMOD), which is designed to assess the measurement structure of latent constructs both for analyses of between-person differences and for the analysis of change. Simulation results indicate that MMOD outperforms existing models for multivariate analysis and provides equivalent fit to data generation models. Additional simulations show MMOD capable of detecting differences in between-person and within-person factor structures. Model features, applications and future directions are discussed. PMID:24364383
Militarism and mortality. An international analysis of arms spending and infant death rates.

PubMed

Woolhandler, S; Himmelstein, D U

1985-06-15

Examination of data from 141 countries showed that infant mortality rates for 1979 were positively correlated with the proportion of gross national product devoted to military spending (r = 0.23, p less than 0.01) and negatively correlated with indicators of economic development, health resources, and social spending. In a multivariate analysis controlling for per caput gross national product, arms spending remained a significant positive predictor of infant mortality rate (p less than 0.0001), while the proportion of the population with access to clean water, the number of teachers per head, and caloric consumption per head were negative predictors. The multivariate model accounted for much of the observed variance in infant mortality rate (R2 = 0.78, p less than 0.0001), and showed good fit to similar data for the year 1972 (R2 = 0.80, p less than 0.0001). The model was also predictive of infant mortality rates in subgroup analysis of underdeveloped, middle developed, and developed nations. Analysis of time trends confirmed that an increase in military spending presages a poor record of improvement in infant mortality rate. These findings support the hypothesis that arms spending is causally related to infant mortality.
Modeling stochastic frontier based on vine copulas

NASA Astrophysics Data System (ADS)

Constantino, Michel; Candido, Osvaldo; Tabak, Benjamin M.; da Costa, Reginaldo Brito

2017-11-01

This article models a production function and analyzes the technical efficiency of listed companies in the United States, Germany and England between 2005 and 2012 based on the vine copula approach. Traditional estimates of the stochastic frontier assume that data is multivariate normally distributed and there is no source of asymmetry. The proposed method based on vine copulas allows us to explore different types of asymmetry and multivariate distribution. Using data on product, capital and labor, we measure the relative efficiency of the vine production function and estimate the coefficient used in the stochastic frontier literature for comparison purposes. This production vine copula predicts the value added by firms with given capital and labor in a probabilistic way. It thereby stands in sharp contrast to the production function, where the output of firms is completely deterministic. The results show that, on average, S&P500 companies are more efficient than companies listed in England and Germany, which presented similar average efficiency coefficients. For comparative purposes, the traditional stochastic frontier was estimated and the results showed discrepancies between the coefficients obtained by the application of the two methods, traditional and frontier-vine, opening new paths of non-linear research.
[Influences of environmental factors and interaction of several chemokines gene-environmental on systemic lupus erythematosus].

PubMed

Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui

2004-11-01

To explore the impact of environmental factors, daily lifestyle, psycho-social factors and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out, and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to SLE were also analyzed with a logistic regression model. Nineteen factors were associated with SLE in univariate unconditional logistic regression. However, in multivariate unconditional logistic regression only five factors remained associated with the disease: drinking well water (OR=0.099) was a protective factor for SLE, while multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors. The unconditional logistic regression model showed an interaction between eating irritable food and the -2518MCP-1G/G genotype (OR=4.387). No interaction among environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritable food.
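
The odds ratios quoted in this abstract come from logistic regression, where an odds ratio is the exponential of a fitted coefficient. The sketch below shows only that conversion. The coefficient is back-computed from the reported OR for drinking well water, and the standard error is a made-up value because the abstract does not report one; the function names are assumptions for this example.

```python
import math

def odds_ratio(beta):
    """Convert a logistic-regression coefficient (log-odds) to an odds ratio."""
    return math.exp(beta)

def odds_ratio_ci(beta, se, z=1.96):
    """Approximate 95% Wald interval for the odds ratio."""
    return math.exp(beta - z * se), math.exp(beta + z * se)

# Coefficient back-computed from the reported OR for drinking well water.
beta_well_water = math.log(0.099)
# Hypothetical standard error; the abstract does not give one.
se_assumed = 0.5

or_well_water = odds_ratio(beta_well_water)
low, high = odds_ratio_ci(beta_well_water, se_assumed)
print(or_well_water, (low, high))
```

An odds ratio below 1 (a negative coefficient) indicates a protective factor, and one above 1 indicates a risk factor, which matches how the abstract labels its findings.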
Groundwater source contamination mechanisms: Physicochemical profile clustering, risk factor analysis and multivariate modelling

NASA Astrophysics Data System (ADS)

Hynds, Paul; Misstear, Bruce D.; Gill, Laurence W.; Murphy, Heather M.

2014-04-01

An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p = 0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p < 0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.
Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

PubMed

Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

2017-01-01

The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.
Adjustment of automatic control systems of production facilities at coal processing plants using multivariant physico- mathematical models

NASA Astrophysics Data System (ADS)

Evtushenko, V. F.; Myshlyaev, L. P.; Makarov, G. V.; Ivushkin, K. A.; Burkova, E. V.

2016-10-01

The structure of multi-variant physical and mathematical models of a control system is presented, together with its application to the adjustment of the automatic control system (ACS) of production facilities, using a coal processing plant as an example.
A simplified parsimonious higher order multivariate Markov chain model with new convergence condition

NASA Astrophysics Data System (ADS)

Wang, Chao; Yang, Chuan-sheng

2017-09-01

In this paper, we present a simplified parsimonious higher-order multivariate Markov chain model with a new convergence condition (TPHOMMCM-NCC). Moreover, an estimation method for the parameters in TPHOMMCM-NCC is given. Numerical experiments illustrate the effectiveness of TPHOMMCM-NCC.
Source apportionments of PM2.5 organic carbon using molecular marker Positive Matrix Factorization and comparison of results from different receptor models

NASA Astrophysics Data System (ADS)

Heo, Jongbae; Dulger, Muaz; Olson, Michael R.; McGinnis, Jerome E.; Shelton, Brandon R.; Matsunaga, Aiko; Sioutas, Constantinos; Schauer, James J.

2013-07-01

Four hundred fine particulate matter (PM2.5) samples collected over a 1-year period at two sites in the Los Angeles Basin were analyzed for organic carbon (OC), elemental carbon (EC), water soluble organic carbon (WSOC) and organic molecular markers. The results were used in a Positive Matrix Factorization (PMF) receptor model to obtain daily, monthly and annual average source contributions to PM2.5 OC. Results of the PMF model showed similar source categories with comparable year-long contributions to PM2.5 OC across the sites. Five source categories providing reasonably stable profiles were identified: mobile, wood smoke, primary biogenic, and two types of secondary organic carbon (SOC) (i.e., anthropogenic and biogenic emissions). Total primary emission factors and total SOC factors contributed approximately 60% and 40%, respectively, to the annual-average OC concentrations. Primary sources showed strong seasonal patterns with high winter peaks and low summer peaks, while SOC showed a reverse pattern with highs in the spring and summer in the region. Interestingly, smoke from forest fires which occurred episodically in California during the summer and fall of 2009 was identified and combined with the primary biogenic source as one distinct factor to the OC budget. The PMF-resolved factors were further investigated and compared to a chemical mass balance (CMB) model and a second multi-variant receptor model (UNMIX) using molecular markers considered in the PMF. Good agreement among the three models was obtained for the source contributions from mobile sources and biomass burning, providing additional weight of evidence that these source apportionment techniques are sufficiently accurate for policy development. However, the CMB model did not quantify primary biogenic emissions, which were included in other sources with the SOC. Both multivariate receptor models, the PMF and the UNMIX, were unable to separate source contributions from diesel and gasoline engines.
Various forms of indexing HDMR for modelling multivariate classification problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aksu, Çağrı; Tunga, M. Alper

2014-12-10

The Indexing HDMR method was recently developed for modelling multivariate interpolation problems. The method uses the Plain HDMR philosophy in partitioning the given multivariate data set into less variate data sets and then constructing an analytical structure through these partitioned data sets to represent the given multidimensional problem. Indexing HDMR makes HDMR applicable to classification problems having real-world data. Mostly, we do not know all possible class values in the domain of the given problem, that is, we have a non-orthogonal data structure. However, Plain HDMR needs an orthogonal data structure in the given problem to be modelled. In this sense, the main idea of this work is to offer various forms of Indexing HDMR to successfully model these real-life classification problems. To test these different forms, several well-known multivariate classification problems given in the UCI Machine Learning Repository were used, and it was observed that the accuracy results lie between 80% and 95%, which is very satisfactory.
Multivariate generalized multifactor dimensionality reduction to detect gene-gene interactions

PubMed Central

2013-01-01

Background Recently, one of the greatest challenges in genome-wide association studies is to detect gene-gene and/or gene-environment interactions for common complex human diseases. Ritchie et al. (2001) proposed the multifactor dimensionality reduction (MDR) method for interaction analysis. MDR is a combinatorial approach to reduce multi-locus genotypes into high-risk and low-risk groups. Although MDR has been widely used for case-control studies with binary phenotypes, several extensions have been proposed. One of these methods, a generalized MDR (GMDR) proposed by Lou et al. (2007), allows adjusting for covariates and applying to both dichotomous and continuous phenotypes. GMDR uses the residual score of a generalized linear model of phenotypes to assign either high-risk or low-risk group, while MDR uses the ratio of cases to controls. Methods In this study, we propose multivariate GMDR, an extension of GMDR for multivariate phenotypes. Jointly analysing correlated multivariate phenotypes may have more power to detect susceptible genes and gene-gene interactions. We construct generalized estimating equations (GEE) with multivariate phenotypes to extend generalized linear models. Using the score vectors from GEE we discriminate high-risk from low-risk groups. We applied the multivariate GMDR method to the blood pressure data of the 7,546 subjects from the Korean Association Resource study: systolic blood pressure (SBP) and diastolic blood pressure (DBP). We compare the results of multivariate GMDR for SBP and DBP to the results from separate univariate GMDR for SBP and DBP, respectively. We also applied the multivariate GMDR method to the repeatedly measured hypertension status from 5,466 subjects and compared its result with those of univariate GMDR at each time point. Results Results from the univariate GMDR and multivariate GMDR in two-locus model with both blood pressures and hypertension phenotypes indicate best combinations of SNPs whose interaction has significant association with risk for high blood pressures or hypertension. Although the test balanced accuracy (BA) of multivariate analysis was not always greater than that of univariate analysis, the multivariate BAs were more stable with smaller standard deviations. Conclusions In this study, we have developed a multivariate GMDR method using the GEE approach. It is useful to apply multivariate GMDR to correlated multiple phenotypes of interest. PMID:24565370
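
To make the classic MDR step described in the Background concrete, here is a minimal sketch that labels each two-locus genotype cell as high-risk or low-risk from its case:control ratio and then computes balanced accuracy. It illustrates plain MDR only, not the GEE-based multivariate GMDR proposed in the paper. The toy data, the function names, and the choice of the overall case:control ratio as the threshold are assumptions, and the accuracy shown is computed on the training data rather than by cross-validation.

```python
from collections import Counter

def mdr_labels(genotypes, status, threshold=None):
    """Label each multi-locus genotype cell high-risk (True) or low-risk (False)."""
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    if threshold is None:
        # Assumption: use the overall case:control ratio as the cut-off.
        threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in set(genotypes):
        n_case, n_ctrl = cases[cell], controls[cell]
        # A cell with cases but no controls is treated as high-risk.
        labels[cell] = n_case > 0 if n_ctrl == 0 else n_case / n_ctrl >= threshold
    return labels

def balanced_accuracy(predicted, status):
    """Mean of sensitivity and specificity."""
    tp = sum(p and s == 1 for p, s in zip(predicted, status))
    tn = sum((not p) and s == 0 for p, s in zip(predicted, status))
    n_pos = sum(s == 1 for s in status)
    n_neg = len(status) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

# Toy data: each genotype is a pair of SNP codes (0, 1 or 2); status 1 = case.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 1), (1, 1), (1, 1), (2, 0)]
status = [0, 0, 0, 1, 1, 1, 0, 1]

labels = mdr_labels(genotypes, status)
predicted = [labels[g] for g in genotypes]
ba = balanced_accuracy(predicted, status)
print(labels, ba)
```

Balanced accuracy is used instead of raw accuracy because it stays interpretable when cases and controls are unequal in number.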
Pleiotropy Analysis of Quantitative Traits at Gene Level by Multivariate Functional Linear Models

PubMed Central

Wang, Yifan; Liu, Aiyi; Mills, James L.; Boehnke, Michael; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Xiong, Momiao; Wu, Colin O.; Fan, Ruzong

2015-01-01

In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks’s Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. PMID:25809955
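
The three multivariate test statistics named in this abstract are all functions of the eigenvalues of E⁻¹H, where H and E are the hypothesis and error sums-of-squares-and-cross-products matrices. The following minimal sketch computes the statistics from a given list of eigenvalues; it omits the approximate F transformations used in the paper, and the eigenvalues shown are hypothetical.

```python
import math

def manova_statistics(eigenvalues):
    """Compute three MANOVA statistics from the eigenvalues of E^{-1} H."""
    # Wilks's Lambda: product of 1 / (1 + lambda_i).
    wilks = math.prod(1.0 / (1.0 + lam) for lam in eigenvalues)
    # Pillai-Bartlett trace: sum of lambda_i / (1 + lambda_i).
    pillai = sum(lam / (1.0 + lam) for lam in eigenvalues)
    # Hotelling-Lawley trace: sum of lambda_i.
    hotelling_lawley = sum(eigenvalues)
    return {
        "wilks_lambda": wilks,
        "pillai_bartlett": pillai,
        "hotelling_lawley": hotelling_lawley,
    }

# Hypothetical eigenvalues for a two-trait analysis.
stats = manova_statistics([1.0, 0.25])
print(stats)
```

Small values of Wilks's Lambda and large values of the two traces both point toward an association between the traits and the predictors.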
Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models.

PubMed

Wang, Yifan; Liu, Aiyi; Mills, James L; Boehnke, Michael; Wilson, Alexander F; Bailey-Wilson, Joan E; Xiong, Momiao; Wu, Colin O; Fan, Ruzong

2015-05-01

In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. © 2015 WILEY PERIODICALS, INC.
Estimation and model selection of semiparametric multivariate survival functions under general censorship.

PubMed

Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

2010-07-01

We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.
Estimation and model selection of semiparametric multivariate survival functions under general censorship

PubMed Central

Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang

2013-01-01

We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided. PMID:24790286
Usual Dietary Intakes: SAS Macros for Fitting Multivariate Measurement Error Models & Estimating Multivariate Usual Intake Distributions

Cancer.gov

The following SAS macros can be used to create a multivariate usual intake distribution for multiple dietary components that are consumed nearly every day or episodically. A SAS macro for performing balanced repeated replication (BRR) variance estimation is also included.
Comparative Robustness of Recent Methods for Analyzing Multivariate Repeated Measures Designs

ERIC Educational Resources Information Center

Seco, Guillermo Vallejo; Gras, Jaime Arnau; Garcia, Manuel Ato

2007-01-01

This study evaluated the robustness of two recent methods for analyzing multivariate repeated measures when the assumptions of covariance homogeneity and multivariate normality are violated. Specifically, the authors' work compares the performance of the modified Brown-Forsythe (MBF) procedure and the mixed-model procedure adjusted by the…
Mapping eQTL Networks with Mixed Graphical Markov Models

PubMed Central

Tur, Inma; Roverato, Alberto; Castelo, Robert

2014-01-01

Expression quantitative trait loci (eQTL) mapping constitutes a challenging problem due to, among other reasons, the high-dimensional multivariate nature of gene-expression traits. Next to the expression heterogeneity produced by confounding factors and other sources of unwanted variation, indirect effects spread throughout genes as a result of genetic, molecular, and environmental perturbations. From a multivariate perspective one would like to adjust for the effect of all of these factors to end up with a network of direct associations connecting the path from genotype to phenotype. In this article we approach this challenge with mixed graphical Markov models, higher-order conditional independences, and q-order correlation graphs. These models show that additive genetic effects propagate through the network as function of gene–gene correlations. Our estimation of the eQTL network underlying a well-studied yeast data set leads to a sparse structure with more direct genetic and regulatory associations that enable a straightforward comparison of the genetic control of gene expression across chromosomes. Interestingly, it also reveals that eQTLs explain most of the expression variability of network hub genes. PMID:25271303

How to compare cross-lagged associations in a multilevel autoregressive model.

PubMed

Schuurman, Noémi K; Ferrer, Emilio; de Boer-Sonnenschein, Mieke; Hamaker, Ellen L

2016-06-01

By modeling variables over time it is possible to investigate the Granger-causal cross-lagged associations between variables. By comparing the standardized cross-lagged coefficients, the relative strength of these associations can be evaluated in order to determine important driving forces in the dynamic system. The aim of this study was twofold: first, to illustrate the added value of a multilevel multivariate autoregressive modeling approach for investigating these associations over more traditional techniques; and second, to discuss how the coefficients of the multilevel autoregressive model should be standardized for comparing the strength of the cross-lagged associations. The hierarchical structure of multilevel multivariate autoregressive models complicates standardization, because subject-based statistics or group-based statistics can be used to standardize the coefficients, and each method may result in different conclusions. We argue that in order to make a meaningful comparison of the strength of the cross-lagged associations, the coefficients should be standardized within persons. We further illustrate the bivariate multilevel autoregressive model and the standardization of the coefficients, and we show that disregarding individual differences in dynamics can prove misleading, by means of an empirical example on experienced competence and exhaustion in persons diagnosed with burnout. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
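
The within-person standardization argued for in this abstract can be sketched as rescaling each person's raw cross-lagged coefficient by the ratio of that person's own predictor and outcome standard deviations. The example below is a simplified illustration under stated assumptions: it uses sample standard deviations of each person's observed series rather than model-implied within-person variances, and the data, coefficient values and function name are hypothetical.

```python
import statistics

def standardize_within_person(phi, predictor_series, outcome_series):
    """Rescale a raw cross-lagged coefficient using one person's own SDs."""
    sd_pred = statistics.stdev(predictor_series)
    sd_out = statistics.stdev(outcome_series)
    return phi * sd_pred / sd_out

# Two hypothetical persons with the same raw coefficient (competence -> exhaustion).
raw_phi = 0.4
person_a = {"competence": [1, 2, 3, 4, 5], "exhaustion": [2, 4, 6, 8, 10]}
person_b = {"competence": [2, 4, 6, 8, 10], "exhaustion": [1, 2, 3, 4, 5]}

std_a = standardize_within_person(raw_phi, person_a["competence"], person_a["exhaustion"])
std_b = standardize_within_person(raw_phi, person_b["competence"], person_b["exhaustion"])
print(std_a, std_b)
```

The two persons share the same raw coefficient but end up with different standardized coefficients, which illustrates why the choice of standardization can change conclusions about relative strength.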
Landslide susceptibility modeling applying machine learning methods: A case study from Longju in the Three Gorges Reservoir area, China

NASA Astrophysics Data System (ADS)

Zhou, Chao; Yin, Kunlong; Cao, Ying; Ahmed, Bayes; Li, Yuanyao; Catani, Filippo; Pourghasemi, Hamid Reza

2018-03-01

Landslides are a common natural hazard and are responsible for extensive damage and losses in mountainous areas. In this study, Longju in the Three Gorges Reservoir area in China was taken as a case study for landslide susceptibility assessment in order to develop effective risk prevention and mitigation strategies. To begin, 202 landslides were identified, including 95 colluvial landslides and 107 rockfalls. Twelve landslide causal factor maps were prepared initially, and the relationship between these factors and each landslide type was analyzed using the information value model. Later, the unimportant factors were identified and eliminated using the information gain ratio technique. The landslide locations were randomly divided into two groups: 70% for training and 30% for verification. Two machine learning models, the support vector machine (SVM) and the artificial neural network (ANN), and a multivariate statistical model, logistic regression (LR), were applied for landslide susceptibility modeling (LSM) for each type. The LSM index maps, obtained by combining the assessment results of the two landslide types, were classified into five levels. The performance of the LSMs was evaluated using the receiver operating characteristic curve and the Friedman test. Results show that eliminating noise-generating factors and modeling each landslide type separately significantly increased the prediction accuracy. The machine learning models outperformed the multivariate statistical model, and the SVM model was found to be ideal for the case study area.
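
Two evaluation steps mentioned in this abstract, the random 70/30 split and the ROC-based performance measure, can be sketched in a few lines. This is a generic illustration, not the authors' workflow: the seed, the rounding of the split point, the function names and the toy susceptibility scores are all assumptions.

```python
import random

def split_70_30(items, seed=0):
    """Randomly split items into 70% training and 30% verification subsets."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 202 landslide locations, represented here only by hypothetical integer IDs.
train, verify = split_70_30(list(range(202)))

# Toy susceptibility scores for four verification cells (1 = landslide present).
auc = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
print(len(train), len(verify), auc)
```

An AUC of 0.5 corresponds to a model no better than chance and 1.0 to perfect ranking, which is why it is a common summary when comparing susceptibility models.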
Critical elements on fitting the Bayesian multivariate Poisson Lognormal model

NASA Astrophysics Data System (ADS)

Zamzuri, Zamira Hasanah binti

2015-10-01

Motivated by a problem on fitting multivariate models to traffic accident data, a detailed discussion of the Multivariate Poisson Lognormal (MPL) model is presented. This paper reveals three critical elements on fitting the MPL model: the setting of initial estimates, hyperparameters and tuning parameters. These issues have not been highlighted in the literature. Based on the simulation studies conducted, we show that when the Univariate Poisson Model (UPM) estimates are used as starting values, at least 20,000 iterations are needed to obtain reliable final estimates. We also illustrate the sensitivity of a specific hyperparameter which, if not given extra attention, may affect the final estimates. The last issue concerns the tuning parameters, which depend on the acceptance rate. Finally, a heuristic algorithm to fit the MPL model is presented. This acts as a guide to ensure that the model works satisfactorily given any data set.
Analysis/forecast experiments with a multivariate statistical analysis scheme using FGGE data

NASA Technical Reports Server (NTRS)

Baker, W. E.; Bloom, S. C.; Nestler, M. S.

1985-01-01

A three-dimensional, multivariate, statistical analysis method, optimal interpolation (OI), is described for modeling meteorological data from widely dispersed sites. The model was developed to analyze FGGE data at the NASA-Goddard Laboratory of Atmospherics. The model features a multivariate surface analysis over the oceans, including maintenance of the Ekman balance and a geographically dependent correlation function. Preliminary comparisons are made between the OI model and similar schemes employed at the European Center for Medium Range Weather Forecasts and the National Meteorological Center. The OI scheme is used to provide input to a GCM, and model error correlations are calculated for forecasts of 500 mb vertical water mixing ratios and the wind profiles. Comparisons are made between the predictions and measured data. The model is shown to be as accurate as a successive corrections model out to 4.5 days.
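The core of an optimal interpolation analysis is a weighted correction of a background field by the observation-minus-background difference, with weights set by the background and observation error covariances. The sketch below shows the textbook update for a single observation of one variable and how a cross-covariance lets that observation also correct a second, unobserved variable (the "multivariate" aspect). It is not the scheme described in the abstract: the function name `oi_update` and all numbers are hypothetical.

```python
def oi_update(background, B, obs, obs_var, obs_index=0):
    """One optimal-interpolation update from a single scalar observation.

    background : list of background (first-guess) values
    B          : background error covariance matrix (list of lists)
    obs        : observed value of variable `obs_index`
    obs_var    : observation error variance
    """
    innovation = obs - background[obs_index]
    denom = B[obs_index][obs_index] + obs_var
    # Gain for each variable: covariance with the observed variable,
    # normalised by total (background + observation) error variance.
    gain = [B[i][obs_index] / denom for i in range(len(background))]
    analysis = [xb + k * innovation for xb, k in zip(background, gain)]
    return analysis, gain


# Two correlated variables; only the first one is observed (toy numbers).
analysis, gain = oi_update(
    background=[10.0, 5.0],
    B=[[1.0, 0.5], [0.5, 1.0]],
    obs=12.0,
    obs_var=1.0,
)
print(gain)      # [0.5, 0.25]
print(analysis)  # [11.0, 5.5]
```

With equal background and observation error variances the observed variable moves halfway to the observation, and the unobserved variable moves by half of that because of the 0.5 cross-covariance.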
Multivariate Bayesian modeling of known and unknown causes of events--an application to biosurveillance.

PubMed

Shen, Yanna; Cooper, Gregory F

2012-09-01

This paper investigates Bayesian modeling of known and unknown causes of events in the context of disease-outbreak detection. We introduce a multivariate Bayesian approach that models multiple evidential features of every person in the population. This approach models and detects (1) known diseases (e.g., influenza and anthrax) by using informative prior probabilities and (2) unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities. We report the results of simulation experiments which support that this modeling method can improve the detection of new disease outbreaks in a population. A contribution of this paper is that it introduces a multivariate Bayesian approach for jointly modeling both known and unknown causes of events. Such modeling has general applicability in domains where the space of known causes is incomplete. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
FACTOR ANALYTIC MODELS OF CLUSTERED MULTIVARIATE DATA WITH INFORMATIVE CENSORING

EPA Science Inventory

This paper describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censorin...
Predictive value of sperm morphology and progressively motile sperm count for pregnancy outcomes in intrauterine insemination.

PubMed

Lemmens, Louise; Kos, Snjezana; Beijer, Cornelis; Brinkman, Jacoline W; van der Horst, Frans A L; van den Hoven, Leonie; Kieslinger, Dorit C; van Trooyen-van Vrouwerff, Netty J; Wolthuis, Albert; Hendriks, Jan C M; Wetzels, Alex M M

2016-06-01

Objective: To investigate the value of sperm parameters to predict an ongoing pregnancy outcome in couples treated with intrauterine insemination (IUI), during a methodologically stable period of time. Design: Retrospective, observational study with logistic regression analyses. Setting: University hospital. Patient(s): A total of 1,166 couples visiting the fertility laboratory for their first IUI episode, including 4,251 IUI cycles. Intervention(s): None. Main outcome measure(s): Sperm morphology, total progressively motile sperm count (TPMSC), and number of inseminated progressively motile spermatozoa (NIPMS); odds ratios (ORs) of the sperm parameters after the first IUI cycle and the first finished IUI episode; discriminatory accuracy of the multivariable model. Result(s): None of the sperm parameters was of predictive value for pregnancy after the first IUI cycle. In the first finished IUI episode, a positive relationship was found for ≤4% of morphologically normal spermatozoa (OR 1.39) and a moderate NIPMS (5-10 million; OR 1.73). Low NIPMS showed a negative relation (≤1 million; OR 0.42). The TPMSC had no predictive value. The multivariable model (i.e., sperm morphology, NIPMS, female age, male age, and the number of cycles in the episode) had a moderate discriminatory accuracy (area under the curve 0.73). Conclusion(s): Intrauterine insemination is especially relevant for couples with moderate male factor infertility (sperm morphology ≤4%, NIPMS 5-10 million). In the multivariable model, however, the predictive power of these sperm parameters is rather low. Copyright © 2016 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.
Enhancing e-waste estimates: improving data quality by multivariate Input-Output Analysis.

PubMed

Wang, Feng; Huisman, Jaco; Stevels, Ab; Baldé, Cornelis Peter

2013-11-01

Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams and encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to a lack of high-quality data related to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input-Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from the available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying a multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
A Competing Risk Model of First Failure Site after Definitive Chemoradiation Therapy for Locally Advanced Non-Small Cell Lung Cancer.

PubMed

Nygård, Lotte; Vogelius, Ivan R; Fischer, Barbara M; Kjær, Andreas; Langer, Seppo W; Aznar, Marianne C; Persson, Gitte F; Bentzen, Søren M

2018-04-01

The aim of the study was to build a model of first failure site- and lesion-specific failure probability after definitive chemoradiotherapy for inoperable NSCLC. We retrospectively analyzed 251 patients receiving definitive chemoradiotherapy for NSCLC at a single institution between 2009 and 2015. All patients were scanned by fludeoxyglucose positron emission tomography/computed tomography for radiotherapy planning. Clinical patient data and fludeoxyglucose positron emission tomography standardized uptake values from primary tumor and nodal lesions were analyzed by using multivariate cause-specific Cox regression. In patients experiencing locoregional failure, multivariable logistic regression was applied to assess risk of each lesion being the first site of failure. The two models were used in combination to predict probability of lesion failure accounting for competing events. Adenocarcinoma had a lower hazard ratio (HR) of locoregional failure than squamous cell carcinoma (HR = 0.45, 95% confidence interval [CI]: 0.26-0.76, p = 0.003). Distant failures were more common in the adenocarcinoma group (HR = 2.21, 95% CI: 1.41-3.48, p < 0.001). Multivariable logistic regression of individual lesions at the time of first failure showed that primary tumors were more likely to fail than lymph nodes (OR = 12.8, 95% CI: 5.10-32.17, p < 0.001). Increasing peak standardized uptake value was significantly associated with lesion failure (OR = 1.26 per unit increase, 95% CI: 1.12-1.40, p < 0.001). The electronic model is available at http://bit.ly/LungModelFDG. We developed a failure site-specific competing risk model based on patient- and lesion-level characteristics. Failure patterns differed between adenocarcinoma and squamous cell carcinoma, illustrating the limitation of aggregating them into NSCLC. Failure site-specific models add complementary information to conventional prognostic models. Copyright © 2018 International Association for the Study of Lung Cancer. 
Published by Elsevier Inc. All rights reserved.
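As a worked reading of the per-unit odds ratio reported above: in a logistic regression the log-odds are linear in the covariate, so an odds ratio of 1.26 per unit increase in peak standardized uptake value compounds multiplicatively over larger differences. The 5-unit difference below is a hypothetical example chosen here, not a figure from the study.

```latex
\mathrm{OR}(\Delta) = 1.26^{\Delta},
\qquad
\mathrm{OR}(5) = 1.26^{5} \approx 3.18
```

Under this assumption, a lesion whose peak uptake value is 5 units higher would have roughly three times the odds of being the first site of failure, all other covariates being equal. Odds convert to probability through $p = \mathrm{odds} / (1 + \mathrm{odds})$, so the change in probability depends on the baseline odds.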
An Examination of the Domain of Multivariable Functions Using the Pirie-Kieren Model

ERIC Educational Resources Information Center

Sengul, Sare; Yildiz, Sevda Goktepe

2016-01-01

The aim of this study is to employ the Pirie-Kieren model so as to examine the understandings relating to the domain of multivariable functions held by primary school mathematics preservice teachers. The data obtained were categorized according to the Pirie-Kieren model and presented visually in tables and bar charts. The study group consisted of…
Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs

Treesearch

Andrew F. Howard; Daniel A. Yaussy

1986-01-01

A multivariate regression model was developed to predict green board-foot yields for the common grades of factory lumber processed from yellow birch factory-grade logs. The model incorporates the standard log measurements of scaling diameter, length, proportion of scalable defects, and the assigned USDA Forest Service log grade. Differences in yields between band and...
A Multivariate Model for the Meta-Analysis of Study Level Survival Data at Multiple Times

ERIC Educational Resources Information Center

Jackson, Dan; Rollins, Katie; Coughlin, Patrick

2014-01-01

Motivated by our meta-analytic dataset involving survival rates after treatment for critical leg ischemia, we develop and apply a new multivariate model for the meta-analysis of study level survival data at multiple times. Our data set involves 50 studies that provide mortality rates at up to seven time points, which we model simultaneously, and…
Analytical framework for reconstructing heterogeneous environmental variables from mammal community structure.

PubMed

Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C

2015-01-01

We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.
Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD): The TRIPOD Statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-06-01

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). 
Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

USGS Publications Warehouse

Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

2008-01-01

Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
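A logistic regression model of this kind turns a weighted sum of basin variables into a probability between 0 and 1. The sketch below shows only the functional form. The intercept, coefficients, and variable names are hypothetical placeholders invented for illustration; they are not the coefficients of any of the five published models.

```python
import math


def debris_flow_probability(intercept, coefficients, values):
    """Logistic model: P = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, for illustration only.
coefficients = {
    "pct_high_burn_severity": 0.03,      # per percentage point of basin area
    "peak_3h_rain_intensity": 0.20,      # per mm/h
}
intercept = -3.0

dry_basin = {"pct_high_burn_severity": 40.0, "peak_3h_rain_intensity": 2.0}
wet_basin = {"pct_high_burn_severity": 40.0, "peak_3h_rain_intensity": 12.0}

print(round(debris_flow_probability(intercept, coefficients, dry_basin), 3))
print(round(debris_flow_probability(intercept, coefficients, wet_basin), 3))
```

Evaluating the same function for every basin or grid cell is what allows the fitted model to be mapped in a geographic information system.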
Multivariate-Statistical Assessment of Heavy Metals for Agricultural Soils in Northern China

PubMed Central

Yang, Pingguo; Yang, Miao; Mao, Renzhao; Shao, Hongbo

2014-01-01

The study evaluated the content of eight heavy metals and the soil pollution status of agricultural soils in northern China. Multivariate and geostatistical analysis approaches were used to determine the anthropogenic and natural contributions to soil heavy metal concentrations. The single pollution index and the integrated pollution index can be used to evaluate soil heavy metal risk. The results show that the first factor explains 27.3% of the variance in the eight soil heavy metals, with strong positive loadings on Cu, Zn, and Cd, which indicates that Cu, Zn, and Cd are associated with and controlled by anthropogenic activities. The average heavy metal values are lower than the second-grade standard values of the soil environmental quality standards in China. The single pollution index is lower than 1, and the Nemerow integrated pollution index is 0.305, which means that the study area has not been polluted. The semivariograms of the soil heavy metal single pollution index were fitted with spherical and exponential models. The variable ratio of the single pollution index showed moderate spatial dependence. Heavy metal contents indicate that the study area is relatively safe. PMID:24892058
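The two indices named in this abstract have simple closed forms. The single pollution index is the measured concentration divided by the reference standard, and the Nemerow integrated index, in its commonly used form, combines the mean and the maximum of the single indices as sqrt((mean^2 + max^2) / 2). The sketch below assumes that common form; the concentrations and standards are made-up numbers, not data from the study.

```python
import math


def single_pollution_index(concentration, standard):
    """P_i = C_i / S_i; values below 1 mean the standard is not exceeded."""
    return concentration / standard


def nemerow_index(single_indices):
    """Nemerow integrated index: sqrt((mean(P)^2 + max(P)^2) / 2)."""
    mean_p = sum(single_indices) / len(single_indices)
    max_p = max(single_indices)
    return math.sqrt((mean_p ** 2 + max_p ** 2) / 2.0)


# Hypothetical (concentration, standard) pairs in mg/kg for two metals.
measurements = {"Cu": (20.0, 100.0), "Zn": (100.0, 250.0)}
indices = [single_pollution_index(c, s) for c, s in measurements.values()]

print(indices)                           # [0.2, 0.4]
print(round(nemerow_index(indices), 4))  # 0.3536
```

Because the maximum enters the formula directly, one strongly elevated metal raises the integrated index even when the average is low.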
A Multivariate Descriptive Model of Motivation for Orthodontic Treatment.

ERIC Educational Resources Information Center

Hackett, Paul M. W.; And Others

1993-01-01

Motivation for receiving orthodontic treatment was studied among 109 young adults, and a multivariate model of the process is proposed. The combination of smallest space analysis and Partial Order Scalogram Analysis by base Coordinates (POSAC) illustrates an interesting methodology for health treatment studies and explores motivation for dental…
Mathematical Formulation of Multivariate Euclidean Models for Discrimination Methods.

ERIC Educational Resources Information Center

Mullen, Kenneth; Ennis, Daniel M.

1987-01-01

Multivariate models for the triangular and duo-trio methods are described, and theoretical methods are compared to a Monte Carlo simulation. Implications are discussed for a new theory of multidimensional scaling which challenges the traditional assumption that proximity measures and perceptual distances are monotonically related. (Author/GDC)
A Multivariate Model of Parent-Adolescent Relationship Variables in Early Adolescence

ERIC Educational Resources Information Center

McKinney, Cliff; Renk, Kimberly

2011-01-01

Given the importance of predicting outcomes for early adolescents, this study examines a multivariate model of parent-adolescent relationship variables, including parenting, family environment, and conflict. Participants, who completed measures assessing these variables, included 710 culturally diverse 11-14-year-olds who were attending a middle…
Identification of flexible structures by frequency-domain observability range context

NASA Astrophysics Data System (ADS)

Hopkins, M. A.

2013-04-01

The well-known frequency-domain observability range space extraction (FORSE) algorithm provides a powerful and inherently flexible multivariable system-identification tool for creating state-space models from frequency-response data (FRD). This paper presents a method of using FORSE to create "context models" of a lightly damped system, from which models of individual resonant modes can be extracted. Further, it shows how to combine the extracted models of many individual modes into one large state-space model. Using this method, the author has created very high-order state-space models that accurately match measured FRD over very broad bandwidths, i.e., resonant peaks spread across five orders of magnitude of frequency bandwidth.

Classical least squares multivariate spectral analysis

DOEpatents

Haaland, David M.

2002-01-01

An improved classical least squares (CLS) multivariate spectral analysis method is described that adds spectral shapes describing non-calibrated components and system effects (other than baseline corrections) present in the analyzed mixture to the prediction phase of the method. These improvements decrease or eliminate many of the restrictions of CLS-type methods and greatly extend their capabilities, accuracy, and precision. One new application of this prediction-augmented CLS (PACLS) method is the ability to accurately predict unknown sample concentrations when new unmodeled spectral components are present in the unknown samples. Other applications of PACLS include the incorporation of spectrometer drift into the quantitative multivariate model and the maintenance of a calibration on a drifting spectrometer. Finally, the ability of PACLS to transfer a multivariate model between spectrometers is demonstrated.
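For orientation, the standard CLS calibration and prediction steps, and the kind of augmentation the abstract describes, can be written as follows. The notation is an assumption introduced here, not the patent's: $A$ holds calibration spectra in rows, $C$ the known concentrations, $K$ the pure-component spectra, $a$ a new sample spectrum, and $P$ the added spectral shapes.

```latex
\begin{aligned}
A &= C K + E_A
  && \text{(calibration model)} \\
\hat{K} &= (C^{\mathsf{T}} C)^{-1} C^{\mathsf{T}} A
  && \text{(estimated pure-component spectra)} \\
\hat{c} &= a \hat{K}^{\mathsf{T}} (\hat{K} \hat{K}^{\mathsf{T}})^{-1}
  && \text{(CLS prediction)} \\
\tilde{K} &= \begin{bmatrix} \hat{K} \\ P \end{bmatrix},
\qquad
\tilde{c} = a \tilde{K}^{\mathsf{T}} (\tilde{K} \tilde{K}^{\mathsf{T}})^{-1}
  && \text{(prediction with added shapes)}
\end{aligned}
```

In this sketch the leading entries of $\tilde{c}$ are the concentration estimates for the calibrated components, while the remaining entries absorb the contribution of the added shapes (for example an unmodeled component or spectrometer drift) that would otherwise bias $\hat{c}$.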
Robust tests for multivariate factorial designs under heteroscedasticity.

PubMed

Vallejo, Guillermo; Ato, Manuel

2012-06-01

The question of how to analyze several multivariate normal mean vectors when normality and covariance homogeneity assumptions are violated is considered in this article. For the two-way MANOVA layout, we address this problem adapting results presented by Brunner, Dette, and Munk (BDM; 1997) and Vallejo and Ato (modified Brown-Forsythe [MBF]; 2006) in the context of univariate factorial and split-plot designs and a multivariate version of the linear model (MLM) to accommodate heterogeneous data. Furthermore, we compare these procedures with the Welch-James (WJ) approximate degrees of freedom multivariate statistics based on ordinary least squares via Monte Carlo simulation. Our numerical studies show that of the methods evaluated, only the modified versions of the BDM and MBF procedures were robust to violations of underlying assumptions. The MLM approach was only occasionally liberal, and then by only a small amount, whereas the WJ procedure was often liberal if the interactive effects were involved in the design, particularly when the number of dependent variables increased and total sample size was small. On the other hand, it was also found that the MLM procedure was uniformly more powerful than its most direct competitors. The overall success rate was 22.4% for the BDM, 36.3% for the MBF, and 45.0% for the MLM.
Hierarchical Bayesian spatial models for predicting multiple forest variables using waveform LiDAR, hyperspectral imagery, and large inventory datasets

USGS Publications Warehouse

Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.

2013-01-01

In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.
Multivariate missing data in hydrology - Review and applications

NASA Astrophysics Data System (ADS)

Ben Aissia, Mohamed-Aymen; Chebana, Fateh; Ouarda, Taha B. M. J.

2017-12-01

Water resources planning and management require complete data sets of a number of hydrological variables, such as flood peaks and volumes. However, hydrologists are often faced with the problem of missing data (MD) in hydrological databases. Several methods are used to deal with the imputation of MD. During the last decade, multivariate approaches have gained popularity in the field of hydrology, especially in hydrological frequency analysis (HFA). However, the treatment of MD remains neglected in the multivariate HFA literature, where the focus has been mainly on the modeling component. For a complete analysis and in order to optimize the use of data, MD should also be treated in the multivariate setting prior to modeling and inference. Imputation of MD in the multivariate hydrological framework can have direct implications on the quality of the estimation. Indeed, the dependence between the series represents important additional information that can be included in the imputation process. The objective of the present paper is to highlight the importance of treating MD in multivariate hydrological frequency analysis by reviewing and applying multivariate imputation methods and by comparing univariate and multivariate imputation methods. An application is carried out for multiple flood attributes at three sites in order to evaluate the performance of the different methods based on the leave-one-out procedure. The results indicate that the performance of imputation methods can be improved by adopting the multivariate setting, compared to mean substitution and interpolation methods, especially when using the copula-based approach.
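The leave-one-out comparison of univariate and multivariate imputation can be sketched in a few lines. Here simple linear regression on a second flood attribute stands in for the multivariate methods; it is not the copula-based approach the paper favours, and the flood peaks and volumes are invented numbers used only to make the example run.

```python
def mean_impute(xs, ys, i):
    """Univariate imputation: replace ys[i] by the mean of the other ys."""
    others = [y for j, y in enumerate(ys) if j != i]
    return sum(others) / len(others)


def regression_impute(xs, ys, i):
    """Bivariate imputation: predict ys[i] from xs[i] by least squares
    fitted on all other (x, y) pairs."""
    px = [x for j, x in enumerate(xs) if j != i]
    py = [y for j, y in enumerate(ys) if j != i]
    mx, my = sum(px) / len(px), sum(py) / len(py)
    sxx = sum((x - mx) ** 2 for x in px)
    sxy = sum((x - mx) * (y - my) for x, y in zip(px, py))
    return my + (sxy / sxx) * (xs[i] - mx)


def leave_one_out_rmse(impute, xs, ys):
    """Hide each y in turn, impute it, and report the root mean squared error."""
    errors = [(impute(xs, ys, i) - ys[i]) ** 2 for i in range(len(ys))]
    return (sum(errors) / len(errors)) ** 0.5


# Hypothetical flood peaks (m^3/s) and volumes (10^6 m^3).
peaks = [100.0, 150.0, 200.0, 250.0, 300.0]
volumes = [10.0, 16.0, 19.0, 26.0, 30.0]

print(round(leave_one_out_rmse(mean_impute, peaks, volumes), 2))
print(round(leave_one_out_rmse(regression_impute, peaks, volumes), 2))
```

When the two attributes are strongly dependent, as in this toy series, using the companion variable gives a much smaller leave-one-out error than mean substitution, which is the qualitative point the abstract makes.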
A novel latent Gaussian copula framework for modeling spatial correlation in quantized SAR imagery with applications to ATR

NASA Astrophysics Data System (ADS)

Thelen, Brian T.; Xique, Ismael J.; Burns, Joseph W.; Goley, G. Steven; Nolan, Adam R.; Benson, Jonathan W.

2017-04-01

With all of the new remote sensing modalities available, and with ever increasing capabilities and frequency of collection, there is a desire to fundamentally understand/quantify the information content in the collected image data relative to various exploitation goals, such as detection/classification. A fundamental approach for this is the framework of Bayesian decision theory, but a daunting challenge is to have significantly flexible and accurate multivariate models for the features and/or pixels that capture a wide assortment of distributions and dependencies. In addition, data can come in the form of both continuous and discrete representations, where the latter is often generated based on considerations of robustness to imaging conditions and occlusions/degradations. In this paper we propose a novel suite of "latent" models fundamentally based on multivariate Gaussian copula models that can be used for quantized data from SAR imagery. For this Latent Gaussian Copula (LGC) model, we derive an approximate, maximum-likelihood estimation algorithm and demonstrate very reasonable estimation performance even for the larger images with many pixels. However, applying these LGC models to large dimensions/images within a Bayesian decision/classification theory is infeasible due to the computational/numerical issues in evaluating the true full likelihood, and we propose an alternative class of novel pseudo-likelihood detection statistics that are computationally feasible. We show in a few simple examples that these statistics have the potential to provide very good and robust detection/classification performance. All of this framework is demonstrated on a simulated SLICY data set, and the results show the importance of modeling the dependencies, and of utilizing the pseudo-likelihood methods.
A k-omega multivariate beta PDF for supersonic turbulent combustion

NASA Technical Reports Server (NTRS)

Alexopoulos, G. A.; Baurle, R. A.; Hassan, H. A.

1993-01-01

In a recent attempt by the authors at predicting measurements in coaxial supersonic turbulent reacting mixing layers involving H2 and air, a number of discrepancies involving the concentrations and their variances were noted. The turbulence model employed was a one-equation model based on the turbulent kinetic energy. This required the specification of a length scale. In an attempt at detecting the cause of the discrepancy, a coupled k-omega joint probability density function (PDF) is employed in conjunction with a Navier-Stokes solver. The results show that improvements resulting from a k-omega model are quite modest.
A Multivariate Statistical Model Describing the Compound Nature of Soil Moisture Drought

NASA Astrophysics Data System (ADS)

Manning, Colin; Widmann, Martin; Bevacqua, Emanuele; Maraun, Douglas; Van Loon, Anne; Vrac, Mathieu

2017-04-01

Soil moisture in Europe acts to partition incoming energy into sensible and latent heat fluxes, thereby exerting a large influence on temperature variability. Soil moisture is predominantly controlled by precipitation and evapotranspiration. When these meteorological variables are accumulated over different timescales, their joint multivariate distribution and dependence structure can be used to provide information about soil moisture. We therefore consider soil moisture drought as a compound event of meteorological drought (deficits of precipitation) and heat waves, or more specifically, periods of high Potential Evapotranspiration (PET). We present here a statistical model of soil moisture based on Pair Copula Constructions (PCC) that can describe the dependence amongst soil moisture and its contributing meteorological variables. The model is designed in such a way that it can account for concurrences of meteorological drought and heat waves and describe the dependence between these conditions at a local level. The model is composed of four variables: daily soil moisture (h); a short-term and a long-term accumulated precipitation variable (Y_1 and Y_2) that account for the propagation of meteorological drought to soil moisture drought; and accumulated PET (Y_3), calculated using the Penman-Monteith equation, which can represent the effect of a heat wave on soil conditions. Copulas are multivariate distribution functions that allow one to model the dependence structure of given variables separately from their marginal behaviour. PCCs then allow, in theory, for the formulation of a multivariate distribution of any dimension, where the multivariate distribution is decomposed into a product of marginal probability density functions and two-dimensional copulas, of which some are conditional.
We apply PCC here in such a way that allows us to provide estimates of h and their uncertainty through conditioning on the Y variables, in the form h = h | y_1, y_2, y_3 (Eq. 1). Applying the model to various Fluxnet sites across Europe, we find that the model has good skill and captures periods of low soil moisture particularly well. We illustrate the relevance of the dependence structure of these Y variables to soil moisture and show how it may be generalised to offer information about soil moisture on a widespread scale where few observations of soil moisture exist. We then present results from a validation study of a selection of EURO-CORDEX climate models, in which we demonstrate the skill of these models in representing these dependencies and so offer insight into the skill seen in the representation of soil moisture in these models.
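To make the pair-copula idea concrete, the standard three-variable decomposition is shown below. It is a generic illustration with variables $x_1, x_2, x_3$, not the specific four-variable structure used for $(h, Y_1, Y_2, Y_3)$ in this study, whose ordering of variables is not given in the abstract.

```latex
f(x_1, x_2, x_3)
  = f_1(x_1)\, f_2(x_2)\, f_3(x_3)
    \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
    \cdot c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
    \cdot c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
```

Here $f_i$ and $F_i$ are the marginal densities and distribution functions, $c_{12}$ and $c_{23}$ are unconditional bivariate copula densities, and $c_{13|2}$ is the conditional pair copula. Each factor can be chosen from a different copula family, which is what allows the dependence structure to be modelled separately from the marginal behaviour.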
Integrated environmental monitoring and multivariate data analysis-A case study.

PubMed

Eide, Ingvar; Westad, Frank; Nilssen, Ingunn; de Freitas, Felipe Sales; Dos Santos, Natalia Gomes; Dos Santos, Francisco; Cabral, Marcelo Montenegro; Bicego, Marcia Caruso; Figueira, Rubens; Johnsen, Ståle

2017-03-01

The present article describes integration of environmental monitoring and discharge data and interpretation using multivariate statistics, principal component analysis (PCA), and partial least squares (PLS) regression. The monitoring was carried out at the Peregrino oil field off the coast of Brazil. One sensor platform and 3 sediment traps were placed on the seabed. The sensors measured current speed and direction, turbidity, temperature, and conductivity. The sediment trap samples were used to determine suspended particulate matter that was characterized with respect to a number of chemical parameters (26 alkanes, 16 PAHs, N, C, calcium carbonate, and Ba). Data on discharges of drill cuttings and water-based drilling fluid were provided on a daily basis. The monitoring was carried out during 7 campaigns from June 2010 to October 2012, each lasting 2 to 3 months due to the capacity of the sediment traps. The data from the campaigns were preprocessed, combined, and interpreted using multivariate statistics. No systematic difference could be observed between campaigns or traps despite the fact that the first campaign was carried out before drilling, and 1 of 3 sediment traps was located in an area not expected to be influenced by the discharges. There was a strong covariation between suspended particulate matter and total N and organic C suggesting that the majority of the sediment samples had a natural and biogenic origin. Furthermore, the multivariate regression showed no correlation between discharges of drill cuttings and sediment trap or turbidity data taking current speed and direction into consideration. Because of this lack of correlation with discharges from the drilling location, a more detailed evaluation of chemical indicators providing information about origin was carried out in addition to numerical modeling of dispersion and deposition. 
The chemical indicators and the modeling of dispersion and deposition support the conclusions from the multivariate statistics. Integr Environ Assess Manag 2017;13:387-395. © 2016 SETAC.
Multivariate quadrature for representing cloud condensation nuclei activity of aerosol populations

DOE PAGES

Fierce, Laura; McGraw, Robert L.

2017-07-26

Sparse representations of atmospheric aerosols are needed for efficient regional- and global-scale chemical transport models. Here we introduce a new framework for representing aerosol distributions, based on the quadrature method of moments. Given a set of moment constraints, we show how linear programming, combined with an entropy-inspired cost function, can be used to construct optimized quadrature representations of aerosol distributions. The sparse representations derived from this approach accurately reproduce cloud condensation nuclei (CCN) activity for realistically complex distributions simulated by a particle-resolved model. Additionally, the linear programming techniques described in this study can be used to bound key aerosol properties, such as the number concentration of CCN. Unlike the commonly used sparse representations, such as modal and sectional schemes, the maximum-entropy approach described here is not constrained to pre-determined size bins or assumed distribution shapes. This study is a first step toward a particle-based aerosol scheme that will track multivariate aerosol distributions with sufficient computational efficiency for large-scale simulations.
[A correction method of baseline drift of discrete spectrum of NIR].

PubMed

Hu, Ai-Qin; Yuan, Hong-Fu; Song, Chun-Feng; Li, Xiao-Yu

2014-10-01

In the present paper, a new method for correcting baseline drift in discrete spectra is proposed, combining cubic spline interpolation with the first-order derivative. A fitted spectrum is constructed by cubic spline interpolation, using the data points of the discrete spectrum as interpolation nodes. The fitted spectrum is differentiable. The first-order derivative of the fitted spectrum is computed to obtain a derivative spectrum. The values of the derivative spectrum at the wavelengths of the original discrete spectrum are then extracted to form the first-derivative spectrum of the discrete spectrum, thereby correcting its baseline drift. The effect of the new method was demonstrated by comparing the performance of multivariate models built using the original spectra, direct differential spectra, and spectra pretreated by the new method. The results show that the negative effects of baseline drift in discrete spectra on the performance of multivariate models can be effectively eliminated by the new method.
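The procedure described in this record (fit a cubic spline through the discrete spectrum, differentiate it, and read the derivative back at the original wavelengths) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, and natural boundary conditions (zero second derivative at both ends) are an assumption, since the abstract does not state which end conditions were used.

```python
def natural_spline_second_derivatives(x, y):
    """Second derivatives M[i] of the natural cubic spline through (x[i], y[i])."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    m = [0.0] * n  # natural boundary (assumed): M[0] = M[n-1] = 0
    if n < 3:
        return m
    # Tridiagonal system for the interior second derivatives M[1..n-2].
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
           for i in range(1, n - 1)]
    # Thomas algorithm: row k has sub-diagonal h[k] and super-diagonal h[k + 1].
    for k in range(1, len(diag)):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    interior = [0.0] * len(diag)
    interior[-1] = rhs[-1] / diag[-1]
    for k in range(len(diag) - 2, -1, -1):
        interior[k] = (rhs[k] - h[k + 1] * interior[k + 1]) / diag[k]
    m[1:n - 1] = interior
    return m


def first_derivative_at_nodes(x, y):
    """First derivative of the fitted spline, evaluated at the original nodes."""
    m = natural_spline_second_derivatives(x, y)
    deriv = []
    for i in range(len(x) - 1):
        h = x[i + 1] - x[i]
        # Derivative at the left end of interval i.
        deriv.append((y[i + 1] - y[i]) / h - h * (2.0 * m[i] + m[i + 1]) / 6.0)
    h = x[-1] - x[-2]
    # Derivative at the right end of the last interval.
    deriv.append((y[-1] - y[-2]) / h + h * (2.0 * m[-1] + m[-2]) / 6.0)
    return deriv


# Made-up example: five unevenly spaced wavelengths (nm) and absorbances.
wavelengths = [1000.0, 1010.0, 1025.0, 1040.0, 1060.0]
absorbance = [0.10, 0.35, 0.80, 0.40, 0.15]
derivative_spectrum = first_derivative_at_nodes(wavelengths, absorbance)
```

Because a constant offset vanishes under differentiation, a spectrum and the same spectrum shifted by a constant baseline give the same derivative spectrum. The spline step is what allows this to work on unevenly spaced wavelengths.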
Effect of abdominopelvic abscess drain size on drainage time and probability of occlusion

PubMed Central

Rotman, Jessica A.; Getrajdman, George I.; Maybody, Majid; Erinjeri, Joseph P.; Yarmohammadi, Hooman; Sofocleous, Constantinos T.; Solomon, Stephen B.; Boas, F. Edward

2016-01-01

Background The purpose of this study is to determine whether larger abdominopelvic abscess drains reduce the time required for abscess resolution, or the probability of tube occlusion. Methods 144 consecutive patients who underwent abscess drainage at a single institution were reviewed retrospectively. Results: Larger initial drain size did not reduce drainage time, drain occlusion, or drain exchanges (p>0.05). Subgroup analysis did not find any type of collection that benefitted from larger drains. A multivariate model predicting drainage time showed that large collections (>200 ml) required 16 days longer drainage time than small collections (<50 ml). Collections with a fistula to bowel required 17 days longer drainage time than collections without a fistula. Initial drain size and the viscosity of the fluid in the collection had no significant effect on drainage time in the multivariate model. Conclusions 8 F drains are adequate for initial drainage of most serous and serosanguineous collections. 10 F drains are adequate for initial drainage of most purulent or bloody collections. PMID:27634422
A mixed-effects regression model for longitudinal multivariate ordinal data.

PubMed

Liu, Li C; Hedeker, Donald

2006-03-01

A mixed-effects item response theory model that allows for three-level multivariate ordinal outcomes and accommodates multiple random subject effects is proposed for analysis of multivariate ordinal outcomes in longitudinal studies. This model allows for the estimation of different item factor loadings (item discrimination parameters) for the multiple outcomes. The covariates in the model do not have to follow the proportional odds assumption and can be at any level. Assuming either a probit or logistic response function, maximum marginal likelihood estimation is proposed utilizing multidimensional Gauss-Hermite quadrature for integration of the random effects. An iterative Fisher scoring solution, which provides standard errors for all model parameters, is used. An analysis of a longitudinal substance use data set, where four items of substance use behavior (cigarette use, alcohol use, marijuana use, and getting drunk or high) are repeatedly measured over time, is used to illustrate application of the proposed model.
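To make the quadrature step in this record concrete, the sketch below integrates a logistic response over a single normally distributed random effect using a three-point Gauss-Hermite rule. It is only a one-dimensional illustration under stated assumptions: the paper uses multidimensional quadrature with multiple random effects and ordinal outcomes, whereas here there is one random intercept and a binary response, and the function names are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2).
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def expit(x):
    """Logistic response function."""
    return 1.0 / (1.0 + math.exp(-x))


def normal_expectation(g, sigma):
    """Approximate E[g(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature.

    The change of variable b = sqrt(2) * sigma * x maps the normal density
    onto the exp(-x^2) weight, leaving a factor 1 / sqrt(pi).
    """
    total = sum(w * g(math.sqrt(2.0) * sigma * x)
                for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)


def marginal_probability(beta, sigma):
    """Marginal P(y = 1) for a random-intercept logistic model (illustrative)."""
    return normal_expectation(lambda b: expit(beta + b), sigma)


p = marginal_probability(beta=1.0, sigma=1.0)
```

A three-point rule is exact for polynomials up to degree five, so it is coarse for the logistic function. In practice more nodes, or adaptive quadrature, would be used.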
A multivariate spatial mixture model for areal data: examining regional differences in standardized test scores

PubMed Central

Neelon, Brian; Gelfand, Alan E.; Miranda, Marie Lynn

2013-01-01

Summary Researchers in the health and social sciences often wish to examine joint spatial patterns for two or more related outcomes. Examples include infant birth weight and gestational length, psychosocial and behavioral indices, and educational test scores from different cognitive domains. We propose a multivariate spatial mixture model for the joint analysis of continuous individual-level outcomes that are referenced to areal units. The responses are modeled as a finite mixture of multivariate normals, which accommodates a wide range of marginal response distributions and allows investigators to examine covariate effects within subpopulations of interest. The model has a hierarchical structure built at the individual level (i.e., individuals are nested within areal units), and thus incorporates both individual- and areal-level predictors as well as spatial random effects for each mixture component. Conditional autoregressive (CAR) priors on the random effects provide spatial smoothing and allow the shape of the multivariate distribution to vary flexibly across geographic regions. We adopt a Bayesian modeling approach and develop an efficient Markov chain Monte Carlo model fitting algorithm that relies primarily on closed-form full conditionals. We use the model to explore geographic patterns in end-of-grade math and reading test scores among school-age children in North Carolina. PMID:26401059
Graphical Models for Ordinal Data

PubMed Central

Guo, Jian; Levina, Elizaveta; Michailidis, George; Zhu, Ji

2014-01-01

A graphical model for ordinal variables is considered, where it is assumed that the data are generated by discretizing the marginal distributions of a latent multivariate Gaussian distribution. The relationships between these ordinal variables are then described by the underlying Gaussian graphical model and can be inferred by estimating the corresponding concentration matrix. Direct estimation of the model is computationally expensive, but an approximate EM-like algorithm is developed to provide an accurate estimate of the parameters at a fraction of the computational cost. Numerical evidence based on simulation studies shows the strong performance of the algorithm, which is also illustrated on data sets on movie ratings and an educational survey. PMID:26120267
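The data-generating assumption in this record, ordinal variables obtained by cutting a latent Gaussian at thresholds, can be sketched for two variables as follows. The thresholds, the correlation value, and the function names are illustrative assumptions. The paper's estimation algorithm (the approximate EM-like procedure) is not reproduced here.

```python
import bisect
import math
import random


def discretize(z, thresholds):
    """Map a latent value z to an ordinal level in 0..len(thresholds).

    thresholds must be sorted in increasing order.
    """
    return bisect.bisect_right(thresholds, z)


def sample_ordinal_pair(rho, thresholds_a, thresholds_b, rng):
    """Draw one pair of ordinal values from a latent bivariate standard normal."""
    z1 = rng.gauss(0.0, 1.0)
    # Construct z2 so that corr(z1, z2) = rho.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return discretize(z1, thresholds_a), discretize(z2, thresholds_b)


rng = random.Random(0)
cuts = [-1.0, 0.0, 1.0]  # four ordinal levels: 0, 1, 2, 3
pairs = [sample_ordinal_pair(0.6, cuts, cuts, rng) for _ in range(1000)]
```

In the model described above, dependence between the observed ordinal variables is inherited from the latent correlation, which is why the concentration matrix of the latent Gaussian describes the graph.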
Data driven discrete-time parsimonious identification of a nonlinear state-space model for a weakly nonlinear system with short data record

NASA Astrophysics Data System (ADS)

Relan, Rishi; Tiels, Koen; Marconato, Anna; Dreesen, Philippe; Schoukens, Johan

2018-05-01

Many real world systems exhibit a quasi linear or weakly nonlinear behavior during normal operation, and a hard saturation effect for high peaks of the input signal. In this paper, a methodology to identify a parsimonious discrete-time nonlinear state space model (NLSS) for the nonlinear dynamical system with relatively short data record is proposed. The capability of the NLSS model structure is demonstrated by introducing two different initialisation schemes, one of them using multivariate polynomials. In addition, a method using first-order information of the multivariate polynomials and tensor decomposition is employed to obtain the parsimonious decoupled representation of the set of multivariate real polynomials estimated during the identification of NLSS model. Finally, the experimental verification of the model structure is done on the cascaded water-benchmark identification problem.
Multivariate meta-analysis of individual participant data helped externally validate the performance and implementation of a prediction model.

PubMed

Snell, Kym I E; Hua, Harry; Debray, Thomas P A; Ensor, Joie; Look, Maxime P; Moons, Karel G M; Riley, Richard D

2016-01-01

Our aim was to improve meta-analysis methods for summarizing a prediction model's performance when individual participant data are available from multiple studies for external validation. We suggest multivariate meta-analysis for jointly synthesizing calibration and discrimination performance, while accounting for their correlation. The approach estimates a prediction model's average performance, the heterogeneity in performance across populations, and the probability of "good" performance in new populations. This allows different implementation strategies (e.g., recalibration) to be compared. Application is made to a diagnostic model for deep vein thrombosis (DVT) and a prognostic model for breast cancer mortality. In both examples, multivariate meta-analysis reveals that calibration performance is excellent on average but highly heterogeneous across populations unless the model's intercept (baseline hazard) is recalibrated. For the cancer model, the probability of "good" performance (defined by C statistic ≥0.7 and calibration slope between 0.9 and 1.1) in a new population was 0.67 with recalibration but 0.22 without recalibration. For the DVT model, even with recalibration, there was only a 0.03 probability of "good" performance. Multivariate meta-analysis can be used to externally validate a prediction model's calibration and discrimination performance across multiple populations and to evaluate different implementation strategies. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
Multivariate Formation Pressure Prediction with Seismic-derived Petrophysical Properties from Prestack AVO inversion and Poststack Seismic Motion Inversion

NASA Astrophysics Data System (ADS)

Yu, H.; Gu, H.

2017-12-01

A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. 
The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then to calculate formation pressure with OBP. Application of the proposed methodology to a research area in the East China Sea has proved that the method can bridge the gap between seismic and well log pressure prediction and give predicted pressure values close to pressure measurements from well testing.
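The last step of this workflow, combining overburden pressure with effective stress, can be sketched as below. The sketch assumes the standard Terzaghi relation (formation pressure = overburden pressure minus effective stress, with a Biot coefficient of 1), which the abstract implies but does not state. The multivariate effective stress model itself is not reproduced, and all function names and numbers are hypothetical.

```python
G = 9.80665  # standard gravity, m/s^2


def overburden_pressure(depths_m, densities_kg_m3, surface_pressure_pa=0.0):
    """Overburden pressure (Pa) at each depth by trapezoidal integration of rho * g.

    Assumes the first density sample also applies from the surface down to
    the first depth.
    """
    obp = [surface_pressure_pa + densities_kg_m3[0] * G * depths_m[0]]
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        mean_rho = 0.5 * (densities_kg_m3[i] + densities_kg_m3[i - 1])
        obp.append(obp[-1] + mean_rho * G * dz)
    return obp


def formation_pressure(obp_pa, effective_stress_pa):
    """Terzaghi relation (assumed): pore pressure = overburden - effective stress."""
    return [p - s for p, s in zip(obp_pa, effective_stress_pa)]


# Made-up profile: three depths with a density log and effective stresses.
depths = [0.0, 1000.0, 2000.0]
densities = [2000.0, 2200.0, 2400.0]
obp = overburden_pressure(depths, densities)
sigma_eff = [0.0, 9.0e6, 2.0e7]  # would come from the regression model
pore_pressure = formation_pressure(obp, sigma_eff)
```

In the methodology described above, the effective stress at each trace would come from the calibrated multivariate model of velocity, porosity, and shale volume rather than from fixed numbers.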
Time Series Model Identification by Estimating Information.

DTIC Science & Technology

1982-11-01

principle, Applications of Statistics, P. R. Krishnaiah, ed., North-Holland: Amsterdam, 27-41. Anderson, T. W. (1971). The Statistical Analysis of Time Series...E. (1969). Multiple Time Series Modeling, Multivariate Analysis II, edited by P. Krishnaiah, Academic Press: New York, 389-409. Parzen, E. (1981...Newton, H. J. (1980). Multiple Time Series Modeling, II Multivariate Analysis - V, edited by P. Krishnaiah, North Holland: Amsterdam, 181-197. Shibata, R
Determining the Relationship Between Moral Waivers and Marine Corps Unsuitability Attrition

DTIC Science & Technology

2008-03-01

observed characteristics. However, econometric research indicates that the magnitude of interaction effects estimated via probit or logit models may...files from fiscal years 1997 to 2005. Multivariate probit models were used to analyze the effects of moral waivers on unsatisfactory service separations.

Use of metabolomics and lipidomics to evaluate the hypocholesterolemic effect of proanthocyanidins from grape seed in a pig model.

PubMed

Quifer-Rada, Paola; Choy, Ying Yng; Calvert, Christopher C; Waterhouse, Andrew L; Lamuela-Raventos, Rosa M

2016-10-01

This work aims to evaluate changes in the fecal metabolomic profile due to grape seed extract (GSE) intake by untargeted and targeted analysis using high resolution mass spectrometry in conjunction with multivariate statistics. An intervention study with six crossbred female pigs was performed. The pigs followed a standard diet for 3 days, then they were fed with a supplemented diet containing 1% (w/w) of MegaNatural® Gold grape seed extract for 6 days. Fresh pig fecal samples were collected daily. A combination of untargeted high resolution mass spectrometry, multivariate analysis (PLS-DA), data-dependent MS/MS scan, and accurate mass database matching was used to measure the effect of the treatment on fecal composition. The resultant PLS-DA models showed a good discrimination among classes with great robustness and predictability. A total of 14 metabolites related to the GSE consumption were identified including biliary acid, dicarboxylic fatty acid, cholesterol metabolites, purine metabolites, and eicosanoid metabolites among others. Moreover, targeted metabolomics using GC-MS showed that fecal excretion of cholesterol and its metabolites was increased due to the proanthocyanidins from grape seed extract. The results show that oligomeric procyanidins from GSE modify bile acid and steroid excretion, which could exert a hypocholesterolemic effect. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Status quo of German-speaking medical students' attitudes toward and knowledge about central aspects of forensic psychiatry across four European countries.

PubMed

Warnke, Ingeborg; Gamma, Alex; Buadze, Anna; Schleifer, Roman; Canela, Carlos; Rüsch, Nicolas; Rössler, Wulf; Strebel, Bernd; Tényi, Tamás; Liebrenz, Michael

While forensic psychiatry is of increasing importance in mental health care, limited available evidence shows that attitudes toward the discipline are contradictory and that knowledge about it seems to be limited in medical students. We aimed to shed light on this subject by analyzing medical students' central attitudes toward and their association with knowledge about forensic psychiatry as well as with socio-demographic and education-specific predictor variables. We recruited N = 1345 medical students from 45 universities with a German language curriculum across four European countries (Germany, Switzerland, Austria and Hungary) by using an innovative approach, namely snowball sampling via Facebook. Students completed an online questionnaire, and data were analyzed descriptively and multivariably by linear mixed effects models and multinomial regression. The results showed overall neutral to positive attitudes toward forensic psychiatry, with indifferent attitudes toward the treatment of sex offenders, and forensic psychiatrists' expertise in the media. Whereas medical students knew about the term 'forensic psychiatry', they showed a lack of specific medico-legal knowledge. Multivariable models on predictor variables revealed statistically significant findings with, however, small estimates and variance explanation. Therefore, further research is required along with the development of a refined assessment instrument for medical students to explore both attitudes and knowledge in forensic psychiatry. Copyright © 2018 Elsevier Ltd. All rights reserved.
Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra

NASA Astrophysics Data System (ADS)

Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong

2017-08-01

Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility of classifying samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were performed to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove the systematic errors. The performance of the model was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that, compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and a higher r for all four analytes. PLS coupled with proper pretreatments showed good performance in both the fitting and predicting results. Furthermore, the areas of origin of the Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
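The figures of merit named in this record share one formula applied to different sample sets. A minimal sketch, assuming the plain 1/n form of the root mean square error (some chemometrics software divides RMSEC by a degrees-of-freedom term instead) and using made-up numbers rather than data from the study:

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions.

    Applied to calibration samples it gives RMSEC, to an independent test set
    RMSEP, and to cross-validated predictions RMSECV.
    """
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical reference (HPLC) values and NIR predictions, in mg/g.
hplc = [12.1, 15.4, 9.8, 20.3, 17.6]
nir = [12.5, 15.0, 10.4, 19.7, 17.9]
rmsep = rmse(hplc, nir)
r = pearson_r(hplc, nir)
```

A lower RMSE together with a higher r on the prediction set is the pattern the abstract reports for PLS relative to PCR and SMLR.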
Using Multivariate Adaptive Regression Spline and Artificial Neural Network to Simulate Urbanization in Mumbai, India

NASA Astrophysics Data System (ADS)

Ahmadlou, M.; Delavar, M. R.; Tayyebi, A.; Shafizadeh-Moghadam, H.

2015-12-01

Land use change (LUC) models used for modelling urban growth differ in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure, or the model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the modelling process and they are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model called multivariate adaptive regression spline (MARS) and a global parametric model called artificial neural network (ANN) to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used the receiver operating characteristic (ROC) to compare the power of both models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhood, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS in simulating urban areas in Mumbai, India.
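The comparison metric in this record, the area under the ROC curve, can be computed directly from its rank interpretation: the probability that a randomly chosen changed cell receives a higher score than a randomly chosen unchanged cell. The sketch below is a generic illustration with made-up labels and scores, not the authors' code, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half).

    labels: 1 for cells that urbanized, 0 for cells that did not.
    scores: model-predicted suitability or probability for each cell.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: four cells scored by a hypothetical model.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of cells. For full raster data sets a sort-based computation of the same quantity would be preferable.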
Selection Indices and Multivariate Analysis Show Similar Results in the Evaluation of Growth and Carcass Traits in Beef Cattle

PubMed Central

Brito Lopes, Fernando; da Silva, Marcelo Corrêa; Magnabosco, Cláudio Ulhôa; Goncalves Narciso, Marcelo; Sainz, Roberto Daniel

2016-01-01

This research evaluated a multivariate approach as an alternative tool for the purpose of selection regarding expected progeny differences (EPDs). Data were fitted using a multi-trait model and consisted of growth traits (birth weight and weights at 120, 210, 365 and 450 days of age) and carcass traits (longissimus muscle area (LMA), back-fat thickness (BF), and rump fat thickness (RF)), registered over 21 years in extensive breeding systems of Polled Nellore cattle in Brazil. Multivariate analyses were performed using standardized (zero mean and unit variance) EPDs. The k mean method revealed that the best fit of data occurred using three clusters (k = 3) (P < 0.001). Estimates of genetic correlation among growth and carcass traits and the estimates of heritability were moderate to high, suggesting that a correlated response approach is suitable for practical decision making. Estimates of correlation between selection indices and the multivariate index (LD1) were moderate to high, ranging from 0.48 to 0.97. This reveals that both types of indices give similar results and that the multivariate approach is reliable for the purpose of selection. The alternative tool seems very handy when economic weights are not available or in cases where more rapid identification of the best animals is desired. Interestingly, multivariate analysis allowed forecasting information based on the relationships among breeding values (EPDs). Also, it enabled fine discrimination, rapid data summarization after genetic evaluation, and permitted accounting for maternal ability and the genetic direct potential of the animals. In addition, we recommend the use of longissimus muscle area and subcutaneous fat thickness as selection criteria, to allow estimation of breeding values before the first mating season in order to accelerate the response to individual selection. PMID:26789008
A statistical approach for segregating cognitive task stages from multivariate fMRI BOLD time series.

PubMed

Demanuele, Charmaine; Bähner, Florian; Plichta, Michael M; Kirsch, Peter; Tost, Heike; Meyer-Lindenberg, Andreas; Durstewitz, Daniel

2015-01-01

Multivariate pattern analysis can reveal new information from neuroimaging data to illuminate human cognition and its disturbances. Here, we develop a methodological approach, based on multivariate statistical/machine learning and time series analysis, to discern cognitive processing stages from functional magnetic resonance imaging (fMRI) blood oxygenation level dependent (BOLD) time series. We apply this method to data recorded from a group of healthy adults whilst performing a virtual reality version of the delayed win-shift radial arm maze (RAM) task. This task has been frequently used to study working memory and decision making in rodents. Using linear classifiers and multivariate test statistics in conjunction with time series bootstraps, we show that different cognitive stages of the task, as defined by the experimenter, namely, the encoding/retrieval, choice, reward and delay stages, can be statistically discriminated from the BOLD time series in brain areas relevant for decision making and working memory. Discrimination of these task stages was significantly reduced during poor behavioral performance in dorsolateral prefrontal cortex (DLPFC), but not in the primary visual cortex (V1). Experimenter-defined dissection of time series into class labels based on task structure was confirmed by an unsupervised, bottom-up approach based on Hidden Markov Models. Furthermore, we show that different groupings of recorded time points into cognitive event classes can be used to test hypotheses about the specific cognitive role of a given brain region during task execution. We found that whilst the DLPFC strongly differentiated between task stages associated with different memory loads, but not between different visual-spatial aspects, the reverse was true for V1. 
Our methodology illustrates how different aspects of cognitive information processing during one and the same task can be separated and attributed to specific brain regions based on information contained in multivariate patterns of voxel activity.
A Primer on Multivariate Analysis of Variance (MANOVA) for Behavioral Scientists

ERIC Educational Resources Information Center

Warne, Russell T.

2014-01-01

Reviews of statistical procedures (e.g., Bangert & Baumberger, 2005; Kieffer, Reese, & Thompson, 2001; Warne, Lazo, Ramos, & Ritter, 2012) show that one of the most common multivariate statistical methods in psychological research is multivariate analysis of variance (MANOVA). However, MANOVA and its associated procedures are often not…
An analytics approach to designing patient centered medical homes.

PubMed

Ajorlou, Saeede; Shams, Issac; Yang, Kai

2015-03-01

Recently the patient centered medical home (PCMH) model has become a popular team based approach focused on delivering more streamlined care to patients. In current practices of medical homes, a clinical based prediction frame is recommended because it can help match the portfolio capacity of PCMH teams with the actual load generated by a set of patients. Without such balances in clinical supply and demand, issues such as excessive under and over utilization of physicians, long waiting time for receiving the appropriate treatment, and non-continuity of care will eliminate many advantages of the medical home strategy. In this paper, by using the hierarchical generalized linear model with multivariate responses, we develop a clinical workload prediction model for care portfolio demands in a Bayesian framework. The model allows for heterogeneous variances and unstructured covariance matrices for nested random effects that arise through complex hierarchical care systems. We show that using a multivariate approach substantially enhances the precision of workload predictions at both primary and non primary care levels. We also demonstrate that care demands depend not only on patient demographics but also on other utilization factors, such as length of stay. Our analyses of recent data from the Veterans Health Administration further indicate that risk adjustment for patient health conditions can considerably improve the predictive power of the model.
Bias correction in the hierarchical likelihood approach to the analysis of multivariate survival data.

PubMed

Jeon, Jihyoun; Hsu, Li; Gorfine, Malka

2012-07-01

Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
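The multiplicative action of the frailty on the hazard, as described in the abstract above, is usually written in the following form. The notation is an assumption added for illustration and is not taken from the paper: w_i is the frailty shared by cluster i, x_ij the covariate vector of member j in that cluster, and lambda_0 an unspecified baseline hazard.

```latex
\lambda_{ij}(t \mid w_i) = w_i \, \lambda_0(t) \, \exp\!\left(\beta^{\top} x_{ij}\right)
```

Under this form, members of a cluster with w_i > 1 all carry a proportionally higher hazard than the baseline, which is what makes the frailty a cluster-specific risk measure.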
Probabilistic estimates of drought impacts on agricultural production

NASA Astrophysics Data System (ADS)

Madadgar, Shahrbanou; AghaKouchak, Amir; Farahmand, Alireza; Davis, Steven J.

2017-08-01

Increases in the severity and frequency of drought in a warming climate may negatively impact agricultural production and food security. Unlike previous studies that have estimated agricultural impacts of climate condition using single-crop yield distributions, we develop a multivariate probabilistic model that uses projected climatic conditions (e.g., precipitation amount or soil moisture) throughout a growing season to estimate the probability distribution of crop yields. We demonstrate the model by an analysis of the historical period 1980-2012, including the Millennium Drought in Australia (2001-2009). We find that precipitation and soil moisture deficit in dry growing seasons reduced the average annual yield of the five largest crops in Australia (wheat, broad beans, canola, lupine, and barley) by 25-45% relative to the wet growing seasons. Our model can thus produce region- and crop-specific agricultural sensitivities to climate conditions and variability. Probabilistic estimates of yield may help decision-makers in government and business to quantitatively assess the vulnerability of agriculture to climate variations. We develop a multivariate probabilistic model that uses precipitation to estimate the probability distribution of crop yields. The proposed model shows how the probability distribution of crop yield changes in response to droughts. During Australia's Millennium Drought precipitation and soil moisture deficit reduced the average annual yield of the five largest crops.
Recognition and source memory as multivariate decision processes.

PubMed

Banks, W P

2000-07-01

Recognition memory, source memory, and exclusion performance are three important domains of study in memory, each with its own findings, its specific theoretical developments, and its separate research literature. It is proposed here that results from all three domains can be treated with a single analytic model. This article shows how to generate a comprehensive memory representation based on multidimensional signal detection theory and how to make predictions for each of these paradigms using decision axes drawn through the space. The detection model is simpler than the comparable multinomial model, it is more easily generalizable, and it does not make threshold assumptions. An experiment using the same memory set for all three tasks demonstrates the analysis and tests the model. The results show that some seemingly complex relations between the paradigms derive from an underlying simplicity of structure.
Generating level-dependent models of cervical and thoracic spinal cord injury: Exploring the interplay of neuroanatomy, physiology, and function.

PubMed

Wilcox, Jared T; Satkunendrarajah, Kajana; Nasirzadeh, Yasmin; Laliberte, Alex M; Lip, Alyssa; Cadotte, David W; Foltz, Warren D; Fehlings, Michael G

2017-09-01

The majority of spinal cord injuries (SCI) occur at the cervical level, which results in significant impairment. Neurologic level and severity of injury are primary endpoints in clinical trials; however, how level-specific damages relate to behavioural performance in cervical injury is incompletely understood. We hypothesized that ascending level of injury leads to worsening forelimb performance, and correlates with loss of neural tissue and muscle-specific neuron pools. A direct comparison of multiple models was made with injury realized at the C5, C6, C7 and T7 vertebral levels using clip compression with sham-operated controls. Animals were assessed for 10 weeks post-injury with numerous (40) outcome measures, including: classic behavioural tests, CatWalk, non-invasive MRI, electrophysiology, histologic lesion morphometry, neuron counts, and motor compartment quantification, and multivariate statistics on the total dataset. Histologic staining and T1-weighted MR imaging revealed similar structural changes and distinct tissue loss with cystic cavitation across all injuries. Forelimb tests, including grip strength, F-WARP motor scale, Inclined Plane, and forelimb ladder walk, exhibited stratification between all groups and marked impairment with C5 and C6 injuries. Classic hindlimb tests including BBB, hindlimb ladder walk, bladder recovery, and mortality were not different between cervical and thoracic injuries. CatWalk multivariate gait analysis showed reciprocal and progressive changes in forelimb and hindlimb function with ascending level of injury. Electrophysiology revealed poor forelimb axonal conduction in cervical C5 and C6 groups alone. The cervical enlargement (C5-T2) showed progressive ventral horn atrophy and loss of specific motor neuron populations with ascending injury. 
Multivariate statistics revealed a robust dataset, rank-order contribution of outcomes, and allowed prediction of injury level with single-level discrimination using forelimb performance and neuron counts. Level-dependent models were generated using clip-compression SCI, with marked and reliable differences in forelimb performance and specific neuron pool loss. Copyright © 2017 Elsevier Inc. All rights reserved.
A novel multi-variant epitope ensemble vaccine against avian leukosis virus subgroup J.

PubMed

Wang, Xiaoyu; Zhou, Defang; Wang, Guihua; Huang, Libo; Zheng, Qiankun; Li, Chengui; Cheng, Ziqiang

2017-12-04

The hypervariable antigenicity and immunosuppressive features of avian leukosis virus subgroup J (ALV-J) have made it very challenging to develop effective vaccines. Epitope vaccines are a promising approach. Previously, we identified a variant antigenic neutralizing epitope in hypervariable region 1 (hr1) of ALV-J, N-LRDFIA/E/TKWKS/GDDL/HLIRPYVNQS-C. BLAST analysis showed that the A, E, T and H mutations in this epitope cover 79% of all ALV-J strains. Based on these data, we designed a multi-variant epitope ensemble vaccine comprising the four mutation variants linked with glycine and serine. The recombinant multi-variant epitope gene was expressed in Escherichia coli BL21. The expressed protein of the multi-variant epitope gene reacted with ALV-J-positive sera and monoclonal antibodies, but not with ALV-J-negative sera. The multi-variant epitope vaccine conjugated with Freund's complete/incomplete adjuvant showed high immunogenicity, reaching a titer of 1:64,000 at 42 days post immunization and maintaining immunity for at least 126 days in SPF chickens. Further, we demonstrated that the antibody induced by the multi-variant epitope ensemble vaccine recognized and neutralized different ALV-J strains (NX0101, TA1, WS1, BZ1224 and BZ4). A protection experiment, evaluated by clinical symptoms, viral shedding, weight gain, and gross and histopathology, showed that 100% of chickens inoculated with the multi-epitope vaccine were well protected against ALV-J challenge. The results indicate a promising multi-variant epitope ensemble vaccine against hypervariable viruses in animals. Copyright © 2017 Elsevier Ltd. All rights reserved.
A General Multivariate Latent Growth Model with Applications to Student Achievement

ERIC Educational Resources Information Center

Bianconcini, Silvia; Cagnone, Silvia

2012-01-01

The evaluation of the formative process in the University system has been assuming an ever increasing importance in the European countries. Within this context, the analysis of student performance and capabilities plays a fundamental role. In this work, the authors propose a multivariate latent growth model for studying the performances of a…
Bayesian Estimation of Random Coefficient Dynamic Factor Models

ERIC Educational Resources Information Center

Song, Hairong; Ferrer, Emilio

2012-01-01

Dynamic factor models (DFMs) have typically been applied to multivariate time series data collected from a single unit of study, such as a single individual or dyad. The goal of DFMs application is to capture dynamics of multivariate systems. When multiple units are available, however, DFMs are not suited to capture variations in dynamics across…
Rotation in the Dynamic Factor Modeling of Multivariate Stationary Time Series.

ERIC Educational Resources Information Center

Molenaar, Peter C. M.; Nesselroade, John R.

2001-01-01

Proposes a special rotation procedure for the exploratory dynamic factor model for stationary multivariate time series. The rotation procedure applies separately to each univariate component series of a q-variate latent factor series and transforms such a component, initially represented as white noise, into a univariate moving-average.…
Modeling Associations among Multivariate Longitudinal Categorical Variables in Survey Data: A Semiparametric Bayesian Approach

ERIC Educational Resources Information Center

Tchumtchoua, Sylvie; Dey, Dipak K.

2012-01-01

This paper proposes a semiparametric Bayesian framework for the analysis of associations among multivariate longitudinal categorical variables in high-dimensional data settings. This type of data is frequent, especially in the social and behavioral sciences. A semiparametric hierarchical factor analysis model is developed in which the…
Parametric Cost Models for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip

2010-01-01

A study is in progress to develop a multivariable parametric cost model for space telescopes. Cost and engineering parametric data have been collected on 30 different space telescopes. Statistical correlations have been developed between 19 of the 59 variables sampled. Single Variable and Multi-Variable Cost Estimating Relationships have been developed. Results are being published.
A Dynamic Intrusion Detection System Based on Multivariate Hotelling's T2 Statistics Approach for Network Environments

PubMed Central

Avalappampatty Sivasamy, Aneetha; Sundan, Bose

2015-01-01

The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T2 method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T2 statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668
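The T-square distance at the core of this approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the baseline is any list of numeric feature vectors taken from normal traffic, and the threshold is left to the caller because the abstract does not give the values derived from the central limit theorem.

```python
from statistics import mean


def baseline_profile(rows):
    """Mean vector and sample covariance matrix of baseline (normal) traffic."""
    n, p = len(rows), len(rows[0])
    mu = [mean(r[j] for r in rows) for j in range(p)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return mu, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def t_squared(x, mu, cov):
    """Hotelling-style distance (x - mu)' S^-1 (x - mu) for one observation."""
    d = [xi - mi for xi, mi in zip(x, mu)]
    return sum(di * zi for di, zi in zip(d, solve(cov, d)))


def is_attack(x, mu, cov, threshold):
    """Flag an observation whose distance exceeds a caller-supplied threshold."""
    return t_squared(x, mu, cov) > threshold
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch short and avoids an explicit inverse; a production system would more likely rely on a numerical library.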
The choice of prior distribution for a covariance matrix in multivariate meta-analysis: a simulation study.

PubMed

Hurtado Rúa, Sandra M; Mazumdar, Madhu; Strawderman, Robert L

2015-12-30

Bayesian meta-analysis is an increasingly important component of clinical research, with multivariate meta-analysis a promising tool for studies with multiple endpoints. Model assumptions, including the choice of priors, are crucial aspects of multivariate Bayesian meta-analysis (MBMA) models. In a given model, two different prior distributions can lead to different inferences about a particular parameter. A simulation study was performed in which the impact of families of prior distributions for the covariance matrix of a multivariate normal random effects MBMA model was analyzed. Inferences about effect sizes were not particularly sensitive to prior choice, but the related covariance estimates were. A few families of prior distributions with small relative biases, tight mean squared errors, and close to nominal coverage for the effect size estimates were identified. Our results demonstrate the need for sensitivity analysis and suggest some guidelines for choosing prior distributions in this class of problems. The MBMA models proposed here are illustrated in a small meta-analysis example from the periodontal field and a medium meta-analysis from the study of stroke. Copyright © 2015 John Wiley & Sons, Ltd.
[Value of the albumin to globulin ratio in predicting severity and prognosis in myasthenia gravis patients].

PubMed

Yang, D H; Su, Z Q; Chen, Y; Chen, Z B; Ding, Z N; Weng, Y Y; Li, J; Li, X; Tong, Q L; Han, Y X; Zhang, X

2016-03-08

To assess the predictive value of the albumin to globulin ratio (AGR) in evaluation of disease severity and prognosis in myasthenia gravis patients. A total of 135 myasthenia gravis (MG) patients were enrolled between February 2009 and March 2015. The AGR was measured on the first day of hospitalization and ranked from lowest to highest, and the patients were divided into tertiles according to the AGR values: T1 (AGR<1.34), T2 (1.34≤AGR≤1.53) and T3 (AGR>1.53). The Kaplan-Meier curve was used to evaluate the prognostic value of AGR. Cox model analysis was used to evaluate the relevant factors. Multivariate logistic regression analysis was used to find the predictors of myasthenia crisis during hospitalization. The median length of hospital stay was 21 days (15-35.5) for T1, 18 days (14-27.5) for T2, and 16 days (12-22.5) for T3 (P<0.01), and Kaplan-Meier curves showed a significant difference among the three groups. In the univariate model, serum albumin, creatinine, AGR and MGFA clinical classification were related to prognosis of myasthenia gravis. In the multivariate Cox regression analysis, the AGR (P<0.001) and MGFA clinical classification (P<0.001) were independent predictive factors of disease severity and prognosis in myasthenia gravis patients. The hazard ratios (HR) were 4.655 (95% CI: 2.355-9.202) and 0.596 (95% CI: 0.492-0.723), respectively. Multivariate logistic regression analysis showed that the AGR (P<0.001) and MGFA clinical classification were related to myasthenia crisis. The AGR may represent a simple, potentially useful predictive biomarker for evaluating the disease severity and prognosis of patients with myasthenia gravis.
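For readers unfamiliar with the Kaplan-Meier curves mentioned in this abstract, the product-limit estimator can be sketched as below. This is a generic illustration, not the study's code: the function name and the toy inputs are hypothetical, and "event" here stands for whatever endpoint is being timed (in a length-of-stay analysis, discharge), with False marking a censored observation.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct event time.

    durations: time until the event or until censoring, one per patient.
    observed:  True if the event was seen, False if the patient was censored.
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)          # everyone leaving the risk set at t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve
```

Computing one such curve per AGR tertile and comparing them is the kind of group comparison the abstract describes; the significance test itself is not shown here.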
Comparison of univariate and multivariate calibration for the determination of micronutrients in pellets of plant materials by laser induced breakdown spectrometry

NASA Astrophysics Data System (ADS)

Braga, Jez Willian Batista; Trevizan, Lilian Cristina; Nunes, Lidiane Cristina; Rufini, Iolanda Aparecida; Santos, Dário, Jr.; Krug, Francisco José

2010-01-01

The application of laser induced breakdown spectrometry (LIBS) to the direct analysis of plant materials is a great challenge that still needs efforts for its development and validation. To this end, a series of experimental approaches has been carried out in order to show that LIBS can be used as an alternative to methods based on wet acid digestion for the analysis of agricultural and environmental samples. The large amount of information provided by LIBS spectra for these complex samples increases the difficulty of selecting the most appropriate wavelengths for each analyte. Some applications have suggested that improvements in both accuracy and precision can be achieved by applying multivariate calibration to LIBS data when compared to univariate regression developed with line emission intensities. In the present work, the performance of univariate and multivariate calibration, based on partial least squares regression (PLSR), was compared for analysis of pellets of plant materials made from an appropriate mixture of cryogenically ground samples with cellulose as the binding agent. The development of a specific PLSR model for each analyte and the selection of spectral regions containing only lines of the analyte of interest were the best conditions for the analysis. In this particular application, these models showed a similar performance, but PLSR seemed to be more robust due to a lower occurrence of outliers in comparison to the univariate method. Data suggest that efforts dealing with sample presentation and fitness of standards for LIBS analysis must be made in order to fulfill the boundary conditions for matrix independent development and validation.
Multivariate curve resolution based chromatographic peak alignment combined with parallel factor analysis to exploit second-order advantage in complex chromatographic measurements.

PubMed

Parastar, Hadi; Akvan, Nadia

2014-03-13

In the present contribution, a new combination of multivariate curve resolution-correlation optimized warping (MCR-COW) with trilinear parallel factor analysis (PARAFAC) is developed to exploit second-order advantage in complex chromatographic measurements. In MCR-COW, the complexity of the chromatographic data is reduced by arranging the data in a column-wise augmented matrix, analyzing using MCR bilinear model and aligning the resolved elution profiles using COW in a component-wise manner. The aligned chromatographic data is then decomposed using trilinear model of PARAFAC in order to exploit pure chromatographic and spectroscopic information. The performance of this strategy is evaluated using simulated and real high-performance liquid chromatography-diode array detection (HPLC-DAD) datasets. The obtained results showed that the MCR-COW can efficiently correct elution time shifts of target compounds that are completely overlapped by coeluted interferences in complex chromatographic data. In addition, the PARAFAC analysis of aligned chromatographic data has the advantage of unique decomposition of overlapped chromatographic peaks to identify and quantify the target compounds in the presence of interferences. Finally, to confirm the reliability of the proposed strategy, the performance of the MCR-COW-PARAFAC is compared with the frequently used methods of PARAFAC, COW-PARAFAC, multivariate curve resolution-alternating least squares (MCR-ALS), and MCR-COW-MCR. In general, in most of the cases the MCR-COW-PARAFAC showed an improvement in terms of lack of fit (LOF), relative error (RE) and spectral correlation coefficients in comparison to the PARAFAC, COW-PARAFAC, MCR-ALS and MCR-COW-MCR results. Copyright © 2014 Elsevier B.V. All rights reserved.
Multivariable Parametric Cost Model for Ground Optical Telescope Assembly

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia

2005-01-01

A parametric cost model for ground-based telescopes is developed using multivariable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction-limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature are examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived.
A general framework for multivariate multi-index drought prediction based on Multivariate Ensemble Streamflow Prediction (MESP)

NASA Astrophysics Data System (ADS)

Hao, Zengchao; Hao, Fanghua; Singh, Vijay P.

2016-08-01

Drought is among the costliest natural hazards worldwide and extreme drought events in recent years have caused huge losses to various sectors. Drought prediction is therefore critically important for providing early warning information to aid decision making to cope with drought. Due to the complicated nature of drought, it has been recognized that the univariate drought indicator may not be sufficient for drought characterization and hence multivariate drought indices have been developed for drought monitoring. Alongside the substantial effort in drought monitoring with multivariate drought indices, it is of equal importance to develop a drought prediction method with multivariate drought indices to integrate drought information from various sources. This study proposes a general framework for multivariate multi-index drought prediction that is capable of integrating complementary prediction skills from multiple drought indices. The Multivariate Ensemble Streamflow Prediction (MESP) is employed to sample from historical records for obtaining statistical prediction of multiple variables, which is then used as inputs to achieve multivariate prediction. The framework is illustrated with a linearly combined drought index (LDI), which is a commonly used multivariate drought index, based on climate division data in California and New York in the United States with different seasonality of precipitation. The predictive skill of LDI (represented with persistence) is assessed by comparison with the univariate drought index and results show that the LDI prediction skill is less affected by seasonality than the meteorological drought prediction based on SPI. Prediction results from the case study show that the proposed multivariate drought prediction outperforms the persistence prediction, implying a satisfactory performance of multivariate drought prediction. 
The proposed method would be useful for drought prediction to integrate drought information from various sources for early drought warning.
Family-Based Rare Variant Association Analysis: A Fast and Efficient Method of Multivariate Phenotype Association Analysis.

PubMed

Wang, Longfei; Lee, Sungyoung; Gim, Jungsoo; Qiao, Dandi; Cho, Michael; Elston, Robert C; Silverman, Edwin K; Won, Sungho

2016-09-01

Family-based designs have been repeatedly shown to be powerful in detecting the significant rare variants associated with human diseases. Furthermore, human diseases are often defined by the outcomes of multiple phenotypes, and thus we expect multivariate family-based analyses may be very efficient in detecting associations with rare variants. However, few statistical methods implementing this strategy have been developed for family-based designs. In this report, we describe one such implementation: the multivariate family-based rare variant association tool (mFARVAT). mFARVAT is a quasi-likelihood-based score test for rare variant association analysis with multiple phenotypes, and tests both homogeneous and heterogeneous effects of each variant on multiple phenotypes. Simulation results show that the proposed method is generally robust and efficient for various disease models, and we identify some promising candidate genes associated with chronic obstructive pulmonary disease. The software of mFARVAT is freely available at http://healthstat.snu.ac.kr/software/mfarvat/, implemented in C++ and supported on Linux and MS Windows. © 2016 WILEY PERIODICALS, INC.
Effects of Flavor and Texture on the Sensory Perception of Gouda-Type Cheese Varieties during Ripening Using Multivariate Analysis.

PubMed

Shiota, Makoto; Iwasawa, Ai; Suzuki-Iwashima, Ai; Iida, Fumiko

2015-12-01

The impact of flavor composition, texture, and other factors on the desirability of Gouda-type cheese from different commercial sources was investigated using multivariate analyses based on sensory and instrumental analyses. Volatile aroma compounds were measured using headspace solid-phase microextraction gas chromatography/mass spectrometry (GC/MS) and steam distillation extraction (SDE)-GC/MS, and fatty acid composition, low-molecular-weight compounds, including amino acids, and organic acids, as well as pH, texture, and color were measured to determine their relationship with sensory perception. Orthogonal partial least squares-discriminant analysis (OPLS-DA) was performed to discriminate between 2 different ripening periods in 7 sample sets, revealing that ethanol, ethyl acetate, hexanoic acid, and octanoic acid increased with increasing sensory attribute scores for sweetness, fruity, and sulfurous. A partial least squares (PLS) regression model was constructed to predict the desirability of cheese using these parameters. Using these multivariate analyses, we showed that texture and buttery flavors are important factors affecting the desirability of Gouda-type cheeses for Japanese consumers. © 2015 Institute of Food Technologists®
Two models for evaluating landslide hazards

USGS Publications Warehouse

Davis, J.C.; Chung, C.-J.; Ohlmacher, G.C.

2006-01-01

Two alternative procedures for estimating landslide hazards were evaluated using data on topographic digital elevation models (DEMs) and bedrock lithologies in an area adjacent to the Missouri River in Atchison County, Kansas, USA. The two procedures are based on the likelihood ratio model but utilize different assumptions. The empirical likelihood ratio model is based on non-parametric empirical univariate frequency distribution functions under an assumption of conditional independence while the multivariate logistic discriminant model assumes that likelihood ratios can be expressed in terms of logistic functions. The relative hazards of occurrence of landslides were estimated by an empirical likelihood ratio model and by multivariate logistic discriminant analysis. Predictor variables consisted of grids containing topographic elevations, slope angles, and slope aspects calculated from a 30-m DEM. An integer grid of coded bedrock lithologies taken from digitized geologic maps was also used as a predictor variable. Both statistical models yield relative estimates in the form of the proportion of total map area predicted to already contain or to be the site of future landslides. The stabilities of estimates were checked by cross-validation of results from random subsamples, using each of the two procedures. Cell-by-cell comparisons of hazard maps made by the two models show that the two sets of estimates are virtually identical. This suggests that the empirical likelihood ratio and the logistic discriminant analysis models are robust with respect to the conditional independence assumption and the logistic function assumption, respectively, and that either model can be used successfully to evaluate landslide hazards. © 2006.
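The empirical likelihood ratio idea can be illustrated with a short sketch. It assumes, for simplicity, that every predictor has already been binned into discrete classes (for example slope-angle classes or coded lithologies); the function names, the binning, and the toy data are hypothetical and are not taken from the paper. Under conditional independence the per-predictor ratios are simply multiplied for each grid cell.

```python
from collections import Counter
from math import prod


def class_likelihood_ratios(classes, is_landslide):
    """Ratio of class frequency among landslide cells to that among stable cells.

    Assumes both groups are non-empty. Classes that never occur in stable
    cells are left out, so looking them up later raises KeyError.
    """
    slide = Counter(c for c, s in zip(classes, is_landslide) if s)
    stable = Counter(c for c, s in zip(classes, is_landslide) if not s)
    n_slide, n_stable = sum(slide.values()), sum(stable.values())
    return {c: (slide[c] / n_slide) / (stable[c] / n_stable) for c in stable}


def combined_ratio(cell, ratio_tables):
    """Multiply per-predictor ratios for one cell (conditional independence)."""
    return prod(table[value] for value, table in zip(cell, ratio_tables))
```

A cell with a combined ratio above 1 has predictor values that are more typical of landslide cells than of stable ones; ranking cells by this value gives a relative hazard map in the sense used in the abstract.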
Comparison of sensitivity to artificial spectral errors and multivariate LOD in NIR spectroscopy - Determining the performance of miniaturizations on melamine in milk powder.

PubMed

Henn, Raphael; Kirchler, Christian G; Grossgut, Maria-Elisabeth; Huck, Christian W

2017-05-01

This study compared three commercially available spectrometers, two of which were miniaturized, in terms of their ability to predict melamine in milk powder (infant formula). All spectra were split into calibration and validation sets using the Kennard-Stone and Duplex algorithms for comparison. For each instrument the three best performing PLSR models were constructed using SNV and Savitzky-Golay derivatives. The best RMSEP values were 0.28 g/100 g, 0.33 g/100 g and 0.27 g/100 g for the NIRFlex N-500, the microPHAZIR and the microNIR2200, respectively. Furthermore, the multivariate LOD interval [LOD_min, LOD_max] was calculated for all the PLSR models, unveiling significant differences among the spectrometers, with values of 0.20-0.27 g/100 g, 0.28-0.54 g/100 g and 0.44-1.01 g/100 g for the NIRFlex N-500, the microPHAZIR and the microNIR2200, respectively. To assess the robustness of all models, white noise, baseline shift, multiplicative effect, spectral shrink and stretch, stray light and spectral shift were artificially introduced. Monitoring the RMSEP as a function of the perturbation gave an indication of the robustness of the models and helped to compare the performance of the spectrometers. Without the additional information from the LOD calculations, one could falsely assume that all the spectrometers perform equally well, which is not the case when the multivariate evaluation and robustness data are considered. Copyright © 2017 Elsevier B.V. All rights reserved.
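The Kennard-Stone split named in this abstract selects calibration samples so that they cover the measured space evenly. The sketch below follows the textbook description of the algorithm and assumes Euclidean distance on the raw feature vectors; the function name is hypothetical, and the study's own preprocessing and the Duplex variant are not reproduced.

```python
from itertools import combinations
from math import dist


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    if not 2 <= n_select <= len(samples):
        raise ValueError("n_select must be between 2 and len(samples)")
    # Start with the two samples that are farthest apart.
    first, second = max(combinations(range(len(samples)), 2),
                        key=lambda pair: dist(samples[pair[0]], samples[pair[1]]))
    selected = [first, second]
    remaining = set(range(len(samples))) - set(selected)
    while len(selected) < n_select:
        # Add the sample whose nearest selected neighbour is farthest away.
        nxt = max(sorted(remaining),
                  key=lambda i: min(dist(samples[i], samples[j]) for j in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```

The indices returned form the calibration set and the rest form the validation set. The exhaustive pair search is quadratic in the number of spectra, which is acceptable for a sketch but slow for large data sets.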
A multivariate model exploring the predictive value of demographic, adolescent, and family factors on glycemic control in adolescents with type 1 diabetes.

PubMed

Agarwal, Shivani; Jawad, Abbas F; Miller, Victoria A

2016-11-01

The current study examined how a comprehensive set of variables from multiple domains, including at the adolescent and family level, were predictive of glycemic control in adolescents with type 1 diabetes (T1D). Participants included 100 adolescents with T1D ages 10-16 yrs and their parents. Participants were enrolled in a longitudinal study about youth decision-making involvement in chronic illness management of which the baseline data were available for analysis. Bivariate associations with glycemic control (HbA1C) were tested. Hierarchical linear regression was implemented to inform the predictive model. In bivariate analyses, race, family structure, household income, insulin regimen, adolescent-reported adherence to diabetes self-management, cognitive development, adolescent responsibility for T1D management, and parent behavior during the illness management discussion were associated with HbA1c. In the multivariate model, the only significant predictors of HbA1c were race and insulin regimen, accounting for 17% of the variance. Caucasians had better glycemic control than other racial groups. Participants using pre-mixed insulin therapy and basal-bolus insulin had worse glycemic control than those on insulin pumps. This study shows that despite associations of adolescent and family-level variables with glycemic control at the bivariate level, only race and insulin regimen are predictive of glycemic control in hierarchical multivariate analyses. This model offers an alternative way to examine the relationship of demographic and psychosocial factors on glycemic control in adolescents with T1D. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Faà di Bruno's formula and the distributions of random partitions in population genetics and physics.

PubMed

Hoppe, Fred M

2008-06-01

We show that the formula of Faà di Bruno for the derivative of a composite function gives, in special cases, the sampling distributions in population genetics that are due to Ewens and to Pitman. The composite function is the same in each case. Other sampling distributions also arise in this way, such as those arising from Dirichlet, multivariate hypergeometric, and multinomial models, special cases of which correspond to Bose-Einstein, Fermi-Dirac, and Maxwell-Boltzmann distributions in physics. Connections are made to compound sampling models.
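For orientation, the two objects this abstract connects can be written out in their standard textbook forms. The notation below is an assumption and may differ from the paper's; the point is only the shared combinatorial structure of a sum or product indexed by partitions of n.

```latex
% Faà di Bruno's formula for the n-th derivative of a composite function
\frac{d^n}{dx^n} f\bigl(g(x)\bigr)
  = \sum_{\substack{m_1,\dots,m_n \ge 0 \\ \sum_j j\,m_j = n}}
    \frac{n!}{\prod_{j=1}^{n} m_j!\,(j!)^{m_j}}\;
    f^{(m_1+\cdots+m_n)}\bigl(g(x)\bigr)
    \prod_{j=1}^{n} \bigl(g^{(j)}(x)\bigr)^{m_j}

% Ewens sampling formula: a_j = number of allelic types seen exactly j times
P(a_1,\dots,a_n)
  = \frac{n!}{\theta(\theta+1)\cdots(\theta+n-1)}
    \prod_{j=1}^{n} \frac{\theta^{a_j}}{j^{a_j}\,a_j!},
  \qquad \sum_{j=1}^{n} j\,a_j = n
```

Both expressions range over the same index set, the non-negative integer vectors with \sum_j j m_j = n, which is what allows a suitable choice of composite function to generate the sampling distribution.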
Validation of the IHC4 Breast Cancer Prognostic Algorithm Using Multiple Approaches on the Multinational TEAM Clinical Trial.

PubMed

Bartlett, John M S; Christiansen, Jason; Gustavson, Mark; Rimm, David L; Piper, Tammy; van de Velde, Cornelis J H; Hasenburg, Annette; Kieback, Dirk G; Putter, Hein; Markopoulos, Christos J; Dirix, Luc Y; Seynaeve, Caroline; Rea, Daniel W

2016-01-01

Hormone receptors, HER2/neu, and Ki-67 are markers of residual risk in early breast cancer. An algorithm (IHC4) combining these markers may provide additional information on residual risk of recurrence in patients treated with hormone therapy. To independently validate the IHC4 algorithm in the multinational Tamoxifen Versus Exemestane Adjuvant Multicenter Trial (TEAM) cohort, originally developed on the trans-ATAC (Arimidex, Tamoxifen, Alone or in Combination Trial) cohort, by comparing 2 methodologies. The IHC4 biomarker expression was quantified on TEAM cohort samples (n = 2919) by using 2 independent methodologies (conventional 3,3'-diaminobenzidine [DAB] immunohistochemistry with image analysis and standardized quantitative immunofluorescence [QIF] by AQUA technology). The IHC4 scores were calculated by using the same previously established coefficients and then compared with recurrence-free and distant recurrence-free survival, using multivariate Cox proportional hazards modeling. The QIF model was highly significant for prediction of residual risk (P < .001), with continuous model scores showing a hazard ratio (HR) of 1.012 (95% confidence interval [95% CI]: 1.010-1.014), which was significantly higher than that for the DAB model (HR: 1.008, 95% CI: 1.006-1.009; P < .001). Each model added significant prognostic value in addition to recognized clinical prognostic factors, including nodal status, in multivariate analyses. Quantitative immunofluorescence, however, showed more accuracy with respect to overall residual risk assessment than the DAB model. The use of the IHC4 algorithm was validated on the TEAM trial for predicting residual risk in patients with breast cancer. These data support the use of the IHC4 algorithm clinically, but quantitative and standardized approaches need to be used.
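A hazard ratio of 1.012 for a continuous score looks small because it applies to a one-point change. In a Cox model the log hazard is linear in the covariate, so the ratio for a larger difference is the per-unit ratio raised to that power. The sketch below applies this to the two ratios quoted above. The 50-point gap is an arbitrary illustration, and the two scores may not share a scale, so the results should not be read as a direct comparison.

```python
import math

def hazard_ratio_for_difference(hr_per_unit, units):
    """Scale a per-unit hazard ratio to a difference of `units` score points."""
    return math.exp(units * math.log(hr_per_unit))

# Per-unit hazard ratios quoted in the abstract; the 50-point gap is hypothetical.
qif_50 = hazard_ratio_for_difference(1.012, 50)
dab_50 = hazard_ratio_for_difference(1.008, 50)
```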
Space-time variation of respiratory cancers in South Carolina: a flexible multivariate mixture modeling approach to risk estimation.

PubMed

Carroll, Rachel; Lawson, Andrew B; Kirby, Russell S; Faes, Christel; Aregay, Mehreteab; Watjou, Kevin

2017-01-01

Many types of cancer have an underlying spatiotemporal distribution. Spatiotemporal mixture modeling can offer a flexible approach to risk estimation via the inclusion of latent variables. In this article, we examine the application and benefits of using four different spatiotemporal mixture modeling methods in the modeling of cancer of the lung and bronchus as well as "other" respiratory cancer incidences in the state of South Carolina. Of the methods tested, no single method outperforms the other methods; which method is best depends on the cancer under consideration. The lung and bronchus cancer incidence outcome is best described by the univariate modeling formulation, whereas the "other" respiratory cancer incidence outcome is best described by the multivariate modeling formulation. Spatiotemporal multivariate mixture methods can aid in the modeling of cancers with small and sparse incidences when including information from a related, more common type of cancer. Copyright © 2016 Elsevier Inc. All rights reserved.
The extension of total gain (TG) statistic in survival models: properties and applications.

PubMed

Choodari-Oskooei, Babak; Royston, Patrick; Parmar, Mahesh K B

2015-07-01

The results of multivariable regression models are usually summarized in the form of parameter estimates for the covariates, goodness-of-fit statistics, and the relevant p-values. These statistics do not inform us about whether covariate information will lead to any substantial improvement in prediction. Predictive ability measures can be used for this purpose since they provide important information about the practical significance of prognostic factors. R²-type indices are the most familiar forms of such measures in survival models, but they all have limitations and none is widely used. In this paper, we extend the total gain (TG) measure, proposed for a logistic regression model, to survival models and explore its properties using simulations and real data. TG is based on the binary regression quantile plot, otherwise known as the predictiveness curve. Standardised TG ranges from 0 (no explanatory power) to 1 ('perfect' explanatory power). The results of our simulations show that unlike many of the other R²-type predictive ability measures, TG is independent of random censoring. It increases as the effect of a covariate increases and can be applied to different types of survival models, including models with time-dependent covariate effects. We also apply TG to quantify the predictive ability of multivariable prognostic models developed in several disease areas. Overall, TG performs well in our simulation studies and can be recommended as a measure to quantify the predictive ability in survival models.
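The 0-to-1 range of standardised TG can be checked on the binary-outcome version of the measure. The sketch below assumes the usual definition for that case: TG is the area between the predictiveness curve and the horizontal line at the mean risk, which equals the mean absolute deviation of the predicted risks, and it is standardised by its maximum 2ρ(1 − ρ). The survival extension developed in the paper is not reproduced, and the function names and risk values are hypothetical.

```python
def total_gain(risks):
    """Total gain: mean absolute deviation of predicted risks from their average."""
    rho = sum(risks) / len(risks)
    return sum(abs(r - rho) for r in risks) / len(risks)

def standardised_total_gain(risks):
    """Divide by 2*rho*(1 - rho), the value reached when every risk is 0 or 1."""
    rho = sum(risks) / len(risks)
    return total_gain(risks) / (2 * rho * (1 - rho))

useless = standardised_total_gain([0.3] * 10)             # every subject gets the same risk
perfect = standardised_total_gain([1.0] * 3 + [0.0] * 7)  # risks separate cases perfectly
```

A model that assigns everyone the same risk scores 0, and a model whose risks are all 0 or 1 scores 1, matching the range stated in the abstract.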
Multivariable Parametric Cost Model for Ground Optical: Telescope Assembly

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia

2004-01-01

A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature were examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e. multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter were derived.
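The abstract does not state the functional form of the single-variable models. A common choice in parametric cost modelling is a power law, cost = a · D^b, fitted by least squares on log-transformed data. The sketch below assumes that form and uses synthetic numbers, not real telescope costs.

```python
import math

def fit_power_law(diameters, costs):
    """Least-squares fit of log(cost) = log(a) + b * log(diameter); returns (a, b)."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(c) for c in costs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - b * mx)
    return a, b

# Synthetic data generated from cost = 2 * D**2.5; the exponent is a made-up value.
diam = [1.0, 2.0, 4.0, 8.0]
cost = [2.0 * d ** 2.5 for d in diam]
a, b = fit_power_law(diam, cost)
```

A multivariable version would add further log-transformed predictors, such as diffraction-limited wavelength, to the same regression.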
Quantum attack-resistent certificateless multi-receiver signcryption scheme.

PubMed

Li, Huixian; Chen, Xubao; Pang, Liaojun; Shi, Weisong

2013-01-01

The existing certificateless signcryption schemes were designed mainly on the basis of traditional public key cryptography, whose security relies on hard problems such as integer factorization and the discrete logarithm. However, these problems can be solved efficiently by quantum computing, so the existing certificateless signcryption schemes are vulnerable to quantum attack. Multivariate public key cryptography (MPKC), which can resist quantum attack, is one of the alternative solutions to guarantee the security of communications in the post-quantum age. Motivated by these concerns, we proposed a new construction of the certificateless multi-receiver signcryption scheme (CLMSC) based on MPKC. The new scheme inherits the security of MPKC, which can withstand quantum attack. Multivariate quadratic polynomial operations, which have lower computation complexity than bilinear pairing operations, are employed in signcrypting a message for a certain number of receivers in our scheme. Security analysis shows that our scheme is a secure MPKC-based scheme. We proved its security under the hardness of the Multivariate Quadratic (MQ) problem and its unforgeability under the Isomorphism of Polynomials (IP) assumption in the random oracle model. The analysis results show that our scheme also has the security properties of non-repudiation, perfect forward secrecy, perfect backward secrecy and public verifiability. Compared with the existing schemes in terms of computation complexity and ciphertext length, our scheme is more efficient, which makes it suitable for terminals with low computation capacity like smart cards.
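The cheap operation at the core of MPKC is evaluating a system of quadratic polynomials. The MQ problem is the reverse task of finding an input that produces a given output. The sketch below evaluates a toy system over GF(2). It is not the CLMSC scheme, real schemes use many more variables and often larger fields, and the polynomials and function names are hypothetical.

```python
def eval_quadratic(coeffs_quad, coeffs_lin, const, x):
    """Evaluate one quadratic polynomial over GF(2) at the bit vector x."""
    n = len(x)
    total = const
    for i in range(n):
        total ^= coeffs_lin[i] & x[i]                 # linear term
        for j in range(i, n):
            total ^= coeffs_quad[i][j] & x[i] & x[j]  # quadratic term x_i * x_j
    return total

def eval_mq_map(system, x):
    """Apply a list of quadratic polynomials; inverting this map is the MQ problem."""
    return [eval_quadratic(q, l, c, x) for q, l, c in system]

# Toy system in 3 variables: p1 = x0*x1 + x2, p2 = x0*x2 + x1 + 1.
p1 = ([[0, 1, 0], [0, 0, 0], [0, 0, 0]], [0, 0, 1], 0)
p2 = ([[0, 0, 1], [0, 0, 0], [0, 0, 0]], [0, 1, 0], 1)
image = eval_mq_map([p1, p2], [1, 1, 0])
```

Evaluation needs only AND and XOR operations, which is consistent with the abstract's point about low computation cost compared with bilinear pairings.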
Evaluating the predictive power of multivariate tensor-based morphometry in Alzheimer's disease progression via convex fused sparse group Lasso

NASA Astrophysics Data System (ADS)

Tsao, Sinchai; Gajawelli, Niharika; Zhou, Jiayu; Shi, Jie; Ye, Jieping; Wang, Yalin; Lepore, Natasha

2014-03-01

Prediction of Alzheimer's disease (AD) progression based on baseline measures allows us to understand disease progression and has implications in decisions concerning treatment strategy. To this end we combine a predictive multi-task machine learning method [1] with a novel MR-based multivariate morphometric surface map of the hippocampus [2] to predict future cognitive scores of patients. Previous work by Zhou et al. [1] has shown that a multi-task learning framework that performs prediction of all future time points (or tasks) simultaneously can be used to encode both sparsity as well as temporal smoothness. They showed that this can be used in predicting cognitive outcomes of Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects based on FreeSurfer-based baseline MRI features, MMSE score, demographic information and ApoE status. Whilst volumetric information may hold generalized information on brain status, we hypothesized that hippocampus specific information may be more useful in predictive modeling of AD. To this end, we applied Shi et al.'s [2] recently developed multivariate tensor-based morphometry (mTBM) parametric surface analysis method to extract features from the hippocampal surface. We show that by combining the power of the multi-task framework with the sensitivity of mTBM features of the hippocampus surface, we are able to significantly improve predictive performance of ADAS cognitive scores 6, 12, 24, 36 and 48 months from baseline.
A Multivariate Multilevel Approach to the Modeling of Accuracy and Speed of Test Takers

ERIC Educational Resources Information Center

Klein Entink, R. H.; Fox, J. P.; van der Linden, W. J.

2009-01-01

Response times on test items are easily collected in modern computerized testing. When collecting both (binary) responses and (continuous) response times on test items, it is possible to measure the accuracy and speed of test takers. To study the relationships between these two constructs, the model is extended with a multivariate multilevel…

Multivariate regression model for partitioning tree volume of white oak into round-product classes

Treesearch

Daniel A. Yaussy; David L. Sonderman

1984-01-01

Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...
The Dirichlet-Multinomial Model for Multivariate Randomized Response Data and Small Samples

ERIC Educational Resources Information Center

Avetisyan, Marianna; Fox, Jean-Paul

2012-01-01

In survey sampling the randomized response (RR) technique can be used to obtain truthful answers to sensitive questions. Although the individual answers are masked due to the RR technique, individual (sensitive) response rates can be estimated when observing multivariate response data. The beta-binomial model for binary RR data will be generalized…
Tracking Problem Solving by Multivariate Pattern Analysis and Hidden Markov Model Algorithms

ERIC Educational Resources Information Center

Anderson, John R.

2012-01-01

Multivariate pattern analysis can be combined with Hidden Markov Model algorithms to track the second-by-second thinking as people solve complex problems. Two applications of this methodology are illustrated with a data set taken from children as they interacted with an intelligent tutoring system for algebra. The first "mind reading" application…
Four Families of Multi-Variant Issues in Graduate-Level Asynchronous Online Courses

ERIC Educational Resources Information Center

Gisburne, Jaclyn M.; Fairchild, Patricia J.

2004-01-01

This is the first of several papers developed from a faculty and student perspective describing a new distance learning (DL) model. Integral to the model are four interrelated families of multi-variant issues, referred to here as (a) the academic divide, (b) student misalignment, (c) administrative influences, and (d) the use of student…
Assessing Reliability of Student Ratings of Advisor: A Comparison of Univariate and Multivariate Generalizability Approaches.

ERIC Educational Resources Information Center

Sun, Anji; Valiga, Michael J.

In this study, the reliability of the American College Testing (ACT) Program's "Survey of Academic Advising" (SAA) was examined using both univariate and multivariate generalizability theory approaches. The primary purpose of the study was to compare the results of three generalizability theory models (a random univariate model, a mixed…
Web-Based Tools for Modelling and Analysis of Multivariate Data: California Ozone Pollution Activity

ERIC Educational Resources Information Center

Dinov, Ivo D.; Christou, Nicolas

2011-01-01

This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting…
Multivariate Generalizations of Student's t-Distribution. ONR Technical Report. [Biometric Lab Report No. 90-3.]

ERIC Educational Resources Information Center

Gibbons, Robert D.; And Others

In the process of developing a conditionally-dependent item response theory (IRT) model, the problem arose of modeling an underlying multivariate normal (MVN) response process with general correlation among the items. Without the assumption of conditional independence, for which the underlying MVN cdf takes on comparatively simple forms and can be…
Bias and Precision of Measures of Association for a Fixed-Effect Multivariate Analysis of Variance Model

ERIC Educational Resources Information Center

Kim, Soyoung; Olejnik, Stephen

2005-01-01

The sampling distributions of five popular measures of association with and without two bias adjusting methods were examined for the single factor fixed-effects multivariate analysis of variance model. The number of groups, sample sizes, number of outcomes, and the strength of association were manipulated. The results indicate that all five…
Social Context and Dental Pain in Adults of Colombian Ethnic Minority Groups: A Multilevel Cross-Sectional Study.

PubMed

Ardila, Carlos M; Agudelo-Suárez, Andrés A

2016-01-01

To estimate the effect of social context on dental pain in adults of Colombian ethnic minority groups (CEGs). Information from 34,843 participants was used. A multilevel model was constructed that had ethnic groups (ie, CEGs and non-CEGs) at level 1 and Colombian states at level 2. Contextual variables included gross domestic product (GDP), Human Development Index (HDI), and Unmet Basic Needs Index (UBNI). Dental pain was observed in 12.3% of 6,440 CEGs. In an unadjusted logistic regression model, dental pain was associated with being a CEG (odds ratio [95% confidence interval], 1.34 [1.22-1.46]; P = .0001). This association remained significant after adjusting for possible confounding variables. An unconditional multilevel analysis showed that the variance in dental pain was statistically significant at the ethnic group level (β = 0.047 ± 0.015; P = .0009) and at the state level (β = 0.038 ± 0.019; P = .02) and that the variation between ethnic groups was higher than the variation between states (55% vs 45%, respectively). In a multivariate model, the variance in dental pain was also statistically significant at the ethnic group level (β = 0.029 ± 0.012; P = .007) and the state level (β = 0.042 ± .019; P = .01), but the variation between states was higher (40% vs 60%). The results of multilevel multivariate analyses showed that dental pain was associated with increasing age (β = 0.009 ± 0.001; P = .0001), lower education level (β = 0.302 ± 0.103; P = .0001), female sex (β = 0.031 ± 0.069; P = .003), GDP (β = 5.136 ± 2.009; P = .002) and HDI (β = 6.862 ± 5.550; P = .004); however, UBNI was not associated with dental pain. The variance in dental pain was higher between states than between ethnic groups in the multivariate multilevel model. Dental pain in CEGs was associated with contextual and individual factors. Considering contextual factors, GDP and HDI may play a major role in dental pain prevalence.
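The percentage splits between ethnic-group and state levels appear to be each level's share of the summed variance estimates. That reading is an assumption, but it approximately reproduces the reported figures: about 55/45 for the unconditional model and about 41/59 for the multivariate model, against the reported 55/45 and 40/60.

```python
def variance_shares(group_var, state_var):
    """Share of the summed level-specific variance attributed to each level."""
    total = group_var + state_var
    return group_var / total, state_var / total

# Variance estimates quoted in the abstract.
null_group, null_state = variance_shares(0.047, 0.038)  # unconditional model
adj_group, adj_state = variance_shares(0.029, 0.042)    # multivariate model
```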
Time-Domain Nuclear Magnetic Resonance (TD-NMR) and Chemometrics for Determination of Fat Content in Commercial Products of Milk Powder.

PubMed

Nascimento, Paloma Andrade Martins; Barsanelli, Paulo Lopes; Rebellato, Ana Paula; Pallone, Juliana Azevedo Lima; Colnago, Luiz Alberto; Pereira, Fabíola Manhas Verbi

2017-03-01

This study shows the use of time-domain (TD)-NMR transverse relaxation (T2) data and chemometrics in the nondestructive determination of fat content for powdered food samples such as commercial dried milk products. Most proposed NMR spectroscopy methods for measuring fat content correlate free induction decay or echo intensities with the sample's mass. The need for the sample's mass limits the analytical frequency of NMR determination, because weighing the samples is an additional step in this procedure. Therefore, the method proposed here is based on a multivariate model of T2 decay, measured with Carr-Purcell-Meiboom-Gill pulse sequence and reference values of fat content. The TD-NMR spectroscopy method shows high correlation (r = 0.95) with the lipid content, determined by the standard extraction method of Bligh and Dyer. For comparison, fat content determination was also performed using a multivariate model with near-IR (NIR) spectroscopy, which is also a nondestructive method. The advantages of the proposed TD-NMR method are that it (1) minimizes toxic residue generation, (2) performs measurements with high analytical frequency (a few seconds per analysis), and (3) does not require sample preparation (such as pelleting, needed for NIR spectroscopy analyses) or weighing the samples.
Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.

PubMed

Sztepanacz, Jacqueline L; Blows, Mark W

2017-07-01

The distribution of genetic variance in multivariate phenotypes is characterized by the empirical spectral distribution of the eigenvalues of the genetic covariance matrix. Empirical estimates of genetic eigenvalues from random effects linear models are known to be overdispersed by sampling error, where large eigenvalues are biased upward, and small eigenvalues are biased downward. The overdispersion of the leading eigenvalues of sample covariance matrices has been demonstrated to conform to the Tracy-Widom (TW) distribution. Here we show that genetic eigenvalues estimated using restricted maximum likelihood (REML) in a multivariate random effects model with an unconstrained genetic covariance structure will also conform to the TW distribution after empirical scaling and centering. However, where estimation procedures using either REML or MCMC impose boundary constraints, the resulting genetic eigenvalues tend not to be TW distributed. We show how using confidence intervals from sampling distributions of genetic eigenvalues without reference to the TW distribution is insufficient protection against mistaking sampling error as genetic variance, particularly when eigenvalues are small. By scaling such sampling distributions to the appropriate TW distribution, the critical value of the TW statistic can be used to determine if the magnitude of a genetic eigenvalue exceeds the sampling error for each eigenvalue in the spectral distribution of a given genetic covariance matrix. Copyright © 2017 by the Genetics Society of America.
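The overdispersion described here can be seen in a small simulation. The sketch below uses ordinary sample covariance matrices of two independent unit-variance traits, so both true eigenvalues equal 1. It does not involve REML or a genetic covariance structure, and the sample size and replicate count are arbitrary choices.

```python
import math
import random

def sample_cov_eigenvalues(n, rng):
    """Eigenvalues of the 2x2 sample covariance of n draws from two independent N(0, 1) traits."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [rng.gauss(0, 1) for _ in range(n)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return half_trace + spread, half_trace - spread

rng = random.Random(0)
draws = [sample_cov_eigenvalues(20, rng) for _ in range(2000)]
mean_largest = sum(d[0] for d in draws) / len(draws)
mean_smallest = sum(d[1] for d in draws) / len(draws)
```

Averaged over replicates, the larger eigenvalue sits above 1 and the smaller sits below it, even though the traits have equal variance and no covariance. This is the bias the abstract warns against reading as genetic signal.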
Exploratory multivariate modeling and prediction of the physico-chemical properties of surface water and groundwater

NASA Astrophysics Data System (ADS)

Ayoko, Godwin A.; Singh, Kirpal; Balerea, Steven; Kokot, Serge

2007-03-01

Summary: Physico-chemical properties of surface water and groundwater samples from some developing countries have been subjected to multivariate analyses by the non-parametric multi-criteria decision-making methods, PROMETHEE and GAIA. Complete ranking information necessary to select one source of water in preference to all others was obtained, and this enabled relationships between the physico-chemical properties and water quality to be assessed. Thus, the ranking of the quality of the water bodies was found to be strongly dependent on the total dissolved solid, phosphate, sulfate, ammonia-nitrogen, calcium, iron, chloride, magnesium, zinc, nitrate and fluoride contents of the waters. However, potassium, manganese and zinc composition showed the least influence in differentiating the water bodies. To model and predict the water quality influencing parameters, partial least squares analyses were carried out on a matrix made up of the results of water quality assessment studies carried out in Nigeria, Papua New Guinea, Egypt, Thailand and India/Pakistan. The results showed that the total dissolved solid, calcium, sulfate, sodium and chloride contents can be used to predict a wide range of physico-chemical characteristics of water. The potential implications of these observations on the financial and opportunity costs associated with elaborate water quality monitoring are discussed.
Quantitative investigation of inappropriate regression model construction and the importance of medical statistics experts in observational medical research: a cross-sectional study.

PubMed

Nojima, Masanori; Tokunaga, Mutsumi; Nagamura, Fumitaka

2018-05-05

To investigate under what circumstances inappropriate use of 'multivariate analysis' is likely to occur and to identify the population that needs more support with medical statistics. The frequency of inappropriate regression model construction in multivariate analysis and related factors were investigated in observational medical research publications. The inappropriate algorithm of using only variables that were significant in univariate analysis was estimated to occur at 6.4% (95% CI 4.8% to 8.5%). This was observed in 1.1% of the publications with a medical statistics expert (hereinafter 'expert') as the first author, 3.5% if an expert was included as coauthor and in 12.2% if experts were not involved. In the publications where the number of cases was 50 or less and the study did not include experts, inappropriate algorithm usage was observed with a high proportion of 20.2%. The OR of the involvement of experts for this outcome was 0.28 (95% CI 0.15 to 0.53). A further nation-level analysis showed that the involvement of experts and the implementation of unfavourable multivariate analysis were negatively correlated (R = -0.652). Based on the results of this study, the benefit of participation of medical statistics experts is obvious. Experts should be involved for proper confounding adjustment and interpretation of statistical models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Multivariate Bayesian variable selection exploiting dependence structure among outcomes: Application to air pollution effects on DNA methylation.

PubMed

Lee, Kyu Ha; Tadesse, Mahlet G; Baccarelli, Andrea A; Schwartz, Joel; Coull, Brent A

2017-03-01

The analysis of multiple outcomes is becoming increasingly common in modern biomedical studies. It is well-known that joint statistical models for multiple outcomes are more flexible and more powerful than fitting a separate model for each outcome; they yield more powerful tests of exposure or treatment effects by taking into account the dependence among outcomes and pooling evidence across outcomes. It is, however, unlikely that all outcomes are related to the same subset of covariates. Therefore, there is interest in identifying exposures or treatments associated with particular outcomes, which we term outcome-specific variable selection. In this work, we propose a variable selection approach for multivariate normal responses that incorporates not only information on the mean model, but also information on the variance-covariance structure of the outcomes. The approach effectively leverages evidence from all correlated outcomes to estimate the effect of a particular covariate on a given outcome. To implement this strategy, we develop a Bayesian method that builds a multivariate prior for the variable selection indicators based on the variance-covariance of the outcomes. We show via simulation that the proposed variable selection strategy can boost power to detect subtle effects without increasing the probability of false discoveries. We apply the approach to the Normative Aging Study (NAS) epigenetic data and identify a subset of five genes in the asthma pathway for which gene-specific DNA methylations are associated with exposures to either black carbon, a marker of traffic pollution, or sulfate, a marker of particles generated by power plants. © 2016, The International Biometric Society.
Modeling the High Speed Research Cycle 2B Longitudinal Aerodynamic Database Using Multivariate Orthogonal Functions

NASA Technical Reports Server (NTRS)

Morelli, E. A.; Proffitt, M. S.

1999-01-01

The data for longitudinal non-dimensional, aerodynamic coefficients in the High Speed Research Cycle 2B aerodynamic database were modeled using polynomial expressions identified with an orthogonal function modeling technique. The discrepancy between the tabular aerodynamic data and the polynomial models was tested and shown to be less than 15 percent for drag, lift, and pitching moment coefficients over the entire flight envelope. Most of this discrepancy was traced to smoothing local measurement noise and to the omission of mass case 5 data in the modeling process. A simulation check case showed that the polynomial models provided a compact and accurate representation of the nonlinear aerodynamic dependencies contained in the HSR Cycle 2B tabular aerodynamic database.
[Influence of sample surface roughness on mathematical model of NIR quantitative analysis of wood density].

PubMed

Huang, An-Min; Fei, Ben-Hua; Jiang, Ze-Hui; Hse, Chung-Yun

2007-09-01

Near infrared spectroscopy is widely used as a quantitative method, and the main multivariate techniques consist of regression methods used to build prediction models; however, the accuracy of analysis results can be affected by many factors. In the present paper, the influence of different sample roughness on the mathematical model of NIR quantitative analysis of wood density was studied. The experimental results showed that if the roughness of the predicted samples was consistent with that of the calibration samples, the result was good; otherwise the error was much higher. The roughness-mixed model was more flexible and adaptable to different sample roughness. The prediction ability of the roughness-mixed model was much better than that of the single-roughness model.
MULTIVARIATE ANALYSES (CANONICAL CORRELATION AND PARTIAL LEAST SQUARE, PLS) TO MODEL AND ASSESS THE ASSOCIATION OF LANDSCAPE METRICS TO SURFACE WATER CHEMICAL AND BIOLOGICAL PROPERTIES USING SAVANNAH RIVER BASIN DATA.

EPA Science Inventory

Many multivariate methods are used in describing and predicting relation; each has its unique usage of categorical and non-categorical data. In multivariate analysis of variance (MANOVA), many response variables (y's) are related to many independent variables that are categorical...
Post-processing of multi-hydrologic model simulations for improved streamflow projections

NASA Astrophysics Data System (ADS)

Khajehei, Sepideh; Ahmadalipour, Ali; Moradkhani, Hamid

2016-04-01

Hydrologic model outputs are prone to bias and uncertainty due to knowledge deficiency in model and data. Uncertainty in hydroclimatic projections arises from uncertainty in the hydrologic model as well as from epistemic or aleatory uncertainties in GCM parameterization and development. This study is conducted to: 1) evaluate the recently developed multi-variate post-processing method for historical simulations and 2) assess the effect of post-processing on uncertainty and reliability of future streamflow projections in both high-flow and low-flow conditions. The first objective is performed for the historical period of 1970-1999. Future streamflow projections are generated for 10 statistically downscaled GCMs from two widely used downscaling methods: Bias Corrected Statistically Downscaled (BCSD) and Multivariate Adaptive Constructed Analogs (MACA), over the period of 2010-2099 for two representative concentration pathways, RCP4.5 and RCP8.5. Three semi-distributed hydrologic models were employed and calibrated at 1/16 degree latitude-longitude resolution for over 100 points across the Columbia River Basin (CRB) in the Pacific Northwest, USA. Streamflow outputs are post-processed through a Bayesian framework based on copula functions. The post-processing approach relies on a transfer function developed from the bivariate joint distribution between the observation and simulation in the historical period. Results show that application of the post-processing technique leads to considerably higher accuracy in historical simulations and also reduces model uncertainty in future streamflow projections.
Determination of Leaf Water Content by Visible and Near-Infrared Spectrometry and Multivariate Calibration in Miscanthus

DOE PAGES

Jin, Xiaoli; Shi, Chunhai; Yu, Chang Yeon; ...

2017-05-19

Leaf water content is one of the most common physiological parameters limiting efficiency of photosynthesis and biomass productivity in plants including Miscanthus. Therefore, it is of great significance to determine or predict the water content quickly and non-destructively. In this study, we explored the relationship between leaf water content and diffuse reflectance spectra in Miscanthus. Three multivariate calibrations including partial least squares (PLS), least squares support vector machine regression (LSSVR), and radial basis function (RBF) neural network (NN) were developed for the models of leaf water content determination. The non-linear models including RBF_LSSVR and RBF_NN showed higher accuracy than the PLS and Lin_LSSVR models. Moreover, 75 sensitive wavelengths were identified to be closely associated with the leaf water content in Miscanthus. The RBF_LSSVR and RBF_NN models for predicting leaf water content, based on 75 characteristic wavelengths, obtained high determination coefficients of 0.9838 and 0.9899, respectively. The results indicated the non-linear models were more accurate than the linear models using both wavelength intervals. These results demonstrated that visible and near-infrared (VIS/NIR) spectroscopy combined with RBF_LSSVR or RBF_NN is a useful, non-destructive tool for determinations of the leaf water content in Miscanthus, and thus very helpful for development of drought-resistant varieties in Miscanthus.
Combinations of NIR, Raman spectroscopy and physicochemical measurements for improved monitoring of solvent extraction processes using hierarchical multivariate analysis models

DOE PAGES

Nee, K.; Bryan, S.; Levitskaia, T.; ...

2017-12-28

The reliability of chemical processes can be greatly improved by implementing inline monitoring systems. Combining multivariate analysis with non-destructive sensors can enhance the process without interfering with the operation. Here, we present hierarchical models using both principal component analysis and partial least squares analysis developed for different chemical components representative of solvent extraction process streams. A training set of 380 samples and an external validation set of 95 samples were prepared, and near-infrared and Raman spectral data as well as conductivity under variable temperature conditions were collected. The results from the models indicate that careful selection of the spectral range is important. By compressing the data through Principal Component Analysis (PCA), we lower the rank of the data set to its most dominant features while maintaining the key principal components to be used in the regression analysis. Within the studied data set, concentrations of five chemical components were modeled: total nitrate (NO3-), total acid (H+), neodymium (Nd3+), sodium (Na+), and ionic strength (I.S.). The best overall model prediction for each of the species studied used a combined data set comprised of complementary techniques including NIR, Raman, and conductivity. Finally, our study shows that chemometric models are powerful but require a significant amount of carefully analyzed data to capture variations in the chemistry.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models

USGS Publications Warehouse

Anderson, Ryan; Clegg, Samuel M.; Frydenvang, Jens; Wiens, Roger C.; McLennan, Scott M.; Morris, Richard V.; Ehlmann, Bethany L.; Dyar, M. Darby

2017-01-01

Accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. The sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
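The abstract above does not say how the sub-models are blended, so the following is only a minimal sketch under stated assumptions: a full-range model gives a first-pass estimate, and that estimate selects a low-range or high-range sub-model, with a linear ramp between two hypothetical overlap bounds (`lo_edge`, `hi_edge`). The function name, the ramp, and the numbers are illustrative and are not taken from the ChemCam calibration.

```python
def blend_submodels(full_estimate, low_pred, high_pred, lo_edge, hi_edge):
    """Blend predictions from a low-range and a high-range sub-model.

    full_estimate: first-pass prediction from a full-range model, used only
                   to decide how much to trust each sub-model (assumption).
    low_pred, high_pred: predictions of the two sub-models for the same target.
    lo_edge, hi_edge: hypothetical bounds of the overlap region.
    """
    if full_estimate <= lo_edge:
        return low_pred          # clearly in the low-range sub-model's domain
    if full_estimate >= hi_edge:
        return high_pred         # clearly in the high-range sub-model's domain
    # Inside the overlap: weight moves linearly from the low to the high model.
    weight_high = (full_estimate - lo_edge) / (hi_edge - lo_edge)
    return (1.0 - weight_high) * low_pred + weight_high * high_pred


if __name__ == "__main__":
    # Illustrative numbers only (e.g. wt% of an oxide), not instrument data.
    print(blend_submodels(40.0, 38.0, 44.0, 30.0, 50.0))
```

The ramp avoids a discontinuity in the final prediction when the first-pass estimate crosses from one composition range to the next; other weighting schemes would serve the same purpose.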
Toward the Multivariate Modeling of Achievement, Aptitude, and Personality.

ERIC Educational Resources Information Center

Foshay, Wellesley R.; Misanchuk, Earl R.

1981-01-01

A multivariate investigation of the dynamics of cumulative achievement studied the influence of course grades, personality traits, environmental variables, and previous performance. The latter was the best single predictor of performance. (CJ)
Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.

PubMed

Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao

2016-11-30

Genome-wide association study (GWAS) has been successful in identifying obesity risk genes by single-variant association analysis. For this study, we designed a stepwise analysis strategy aimed at identifying multi-variant effects on obesity risk among candidate genes. Our analyses were focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. Single-variant association tests showed that no SNPs reached significance after multiple testing adjustment. The following gene-gene interaction analysis, which was focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of the genetic risk score showed that 4 multi-variant associations had significant additive effects of risk SNPs; and the haplotype association test showed that all multi-variant associations contained one or several combinations of particular alleles or haplotypes associated with increased obesity risk. Our study showed that obesity risk genes generated multi-variant effects, which can be additive or non-linear interactions, and that multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.
Multitrophic functional diversity predicts ecosystem functioning in experimental assemblages of estuarine consumers.

PubMed

Lefcheck, Jonathan S; Duffy, J Emmett

2015-11-01

The use of functional traits to explain how biodiversity affects ecosystem functioning has attracted intense interest, yet few studies have a priori altered functional diversity, especially in multitrophic communities. Here, we manipulated multivariate functional diversity of estuarine grazers and predators within multiple levels of species richness to test how species richness and functional diversity predicted ecosystem functioning in a multitrophic food web. Community functional diversity was a better predictor than species richness for the majority of ecosystem properties, based on generalized linear mixed-effects models. Combining inferences from eight traits into a single multivariate index increased prediction accuracy of these models relative to any individual trait. Structural equation modeling revealed that functional diversity of both grazers and predators was important in driving final biomass within trophic levels, with stronger effects observed for predators. We also show that different species drove different ecosystem responses, with evidence for both sampling effects and complementarity. Our study extends experimental investigations of functional trait diversity to a multilevel food web, and demonstrates that functional diversity can be more accurate and effective than species richness in predicting community biomass in a food web context.
A review of multivariate methods in brain imaging data fusion

NASA Astrophysics Data System (ADS)

Sui, Jing; Adali, Tülay; Li, Yi-Ou; Yang, Honghui; Calhoun, Vince D.

2010-03-01

On joint analysis of multi-task brain imaging data sets, a variety of multivariate methods have shown their strengths and been applied to achieve different purposes based on their respective assumptions. In this paper, we provide a comprehensive review on optimization assumptions of six data fusion models, including 1) four blind methods: joint independent component analysis (jICA), multimodal canonical correlation analysis (mCCA), CCA on blind source separation (sCCA) and partial least squares (PLS); 2) two semi-blind methods: parallel ICA and coefficient-constrained ICA (CC-ICA). We also propose a novel model for joint blind source separation (BSS) of two datasets using a combination of sCCA and jICA, i.e., 'CCA+ICA', which, compared with other joint BSS methods, can achieve higher decomposition accuracy as well as the correct automatic source link. Applications of the proposed model to real multitask fMRI data are compared to joint ICA and mCCA; CCA+ICA further shows its advantages in capturing both shared and distinct information, differentiating groups, and interpreting duration of illness in schizophrenia patients, hence promising applicability to a wide variety of medical imaging problems.
Development of Raman microspectroscopy for automated detection and imaging of basal cell carcinoma

NASA Astrophysics Data System (ADS)

Larraona-Puy, Marta; Ghita, Adrian; Zoladek, Alina; Perkins, William; Varma, Sandeep; Leach, Iain H.; Koloydenko, Alexey A.; Williams, Hywel; Notingher, Ioan

2009-09-01

We investigate the potential of Raman microspectroscopy (RMS) for automated evaluation of excised skin tissue during Mohs micrographic surgery (MMS). The main aim is to develop an automated method for imaging and diagnosis of basal cell carcinoma (BCC) regions. Selected Raman bands responsible for the largest spectral differences between BCC and normal skin regions and linear discriminant analysis (LDA) are used to build a multivariate supervised classification model. The model is based on 329 Raman spectra measured on skin tissue obtained from 20 patients. BCC is discriminated from healthy tissue with 90 ± 9% sensitivity and 85 ± 9% specificity in a 70% to 30% split cross-validation algorithm. This multivariate model is then applied on tissue sections from new patients to image tumor regions. The RMS images show excellent correlation with the gold standard of histopathology sections, BCC being detected in all positive sections. We demonstrate the potential of RMS as an automated objective method for tumor evaluation during MMS. The replacement of current histopathology during MMS by a "generalization" of the proposed technique may improve the feasibility and efficacy of MMS, leading to a wider use according to clinical need.
Modelling and Optimization of Polycaprolactone Ultrafine-Fibres Electrospinning Process Using Response Surface Methodology

PubMed Central

Ruys, Andrew J.

2018-01-01

Electrospun fibres have gained broad interest in biomedical applications, including tissue engineering scaffolds, due to their potential in mimicking extracellular matrix and producing structures favourable for cell and tissue growth. The development of scaffolds often involves multivariate production parameters and multiple output characteristics to define product quality. In this study on electrospinning of polycaprolactone (PCL), response surface methodology (RSM) was applied to investigate the determining parameters and find optimal settings to achieve the desired properties of fibrous scaffold for acetabular labrum implant. The results showed that solution concentration influenced fibre diameter, while elastic modulus was determined by solution concentration, flow rate, temperature, collector rotation speed, and interaction between concentration and temperature. Relationships between these variables and outputs were modelled, followed by an optimization procedure. Using the optimized setting (solution concentration of 10% w/v, flow rate of 4.5 mL/h, temperature of 45 °C, and collector rotation speed of 1500 RPM), a target elastic modulus of 25 MPa could be achieved at a minimum possible fibre diameter (1.39 ± 0.20 µm). This work demonstrated that multivariate factors of production parameters and multiple responses can be investigated, modelled, and optimized using RSM. PMID:29562614
The study of combining Latin Hypercube Sampling method and LU decomposition method (LULHS method) for constructing spatial random field

NASA Astrophysics Data System (ADS)

WANG, P. T.

2015-12-01

Groundwater modeling requires assigning hydrogeological properties to every numerical grid. Due to the lack of detailed information and the inherent spatial heterogeneity, geological properties can be treated as random variables. The hydrogeological property is assumed to follow a multivariate distribution with spatial correlations. By sampling random numbers from a given statistical distribution and assigning a value to each grid, a random field for modeling can be completed. Therefore, statistical sampling plays an important role in the efficiency of the modeling procedure. Latin Hypercube Sampling (LHS) is a stratified random sampling procedure that provides an efficient way to sample variables from their multivariate distributions. This study combines the stratified random procedure from LHS and simulation using LU decomposition to form LULHS. Both conditional and unconditional simulations of LULHS were developed. The simulation efficiency and spatial correlation of LULHS are compared to three other simulation methods. The results show that for both conditional and unconditional simulation, the LULHS method is more efficient in terms of computational effort. Fewer realizations are required to achieve the required statistical accuracy and spatial correlation.
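The general idea of pairing stratified sampling with a triangular factorization can be sketched as follows. This is an unconditional, one-dimensional illustration under assumptions that are not in the abstract: an exponential covariance model with a hypothetical correlation length, a Cholesky factor standing in for the LU decomposition (the two coincide in role for a symmetric positive-definite covariance), and standard-normal marginals. It is not the authors' LULHS algorithm, and multiplying by the factor only preserves exact stratification at the first grid point.

```python
import math
import random
from statistics import NormalDist


def exp_covariance(xs, corr_len):
    """Exponential covariance matrix for grid coordinates xs (assumed model)."""
    return [[math.exp(-abs(a - b) / corr_len) for b in xs] for a in xs]


def cholesky(c):
    """Lower-triangular factor `low` with low * low^T == c."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low


def lhs_normal(n_real, n_points, rng):
    """Latin hypercube sample of independent standard normals.

    Each grid point gets exactly one draw from each of n_real
    equal-probability strata, in shuffled order.
    """
    inv = NormalDist().inv_cdf
    columns = []
    for _ in range(n_points):
        strata = list(range(n_real))
        rng.shuffle(strata)
        col = []
        for s in strata:
            p = (s + rng.random()) / n_real
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf well defined
            col.append(inv(p))
        columns.append(col)
    # Transpose so that each row is one realization.
    return [[columns[p][r] for p in range(n_points)] for r in range(n_real)]


def lhs_cholesky_fields(xs, corr_len, n_real, seed=0):
    """Correlated Gaussian fields: stratified normals times the Cholesky factor."""
    rng = random.Random(seed)
    low = cholesky(exp_covariance(xs, corr_len))
    z = lhs_normal(n_real, len(xs), rng)
    return [
        [sum(low[i][k] * zr[k] for k in range(i + 1)) for i in range(len(xs))]
        for zr in z
    ]


if __name__ == "__main__":
    fields = lhs_cholesky_fields([0.0, 1.0, 2.0, 3.0], corr_len=2.0, n_real=10)
    print(len(fields), "realizations of", len(fields[0]), "grid values")
```

Because every stratum is sampled once, the sample mean and variance at a grid point settle with fewer realizations than plain random sampling would need, which is the efficiency argument the abstract makes.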
The microbiological profile and presence of bloodstream infection influence mortality rates in necrotizing fasciitis

PubMed Central

2011-01-01

Introduction Necrotizing fasciitis (NF) is a life threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp. in wound cultures, and presence of Streptococcus group A in blood cultures were associated with a high risk of mortality while debridement ≥ 3 was associated with improved survival. Conclusions Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053
Bayesian transformation cure frailty models with multivariate failure time data.

PubMed

Yin, Guosheng

2008-12-10

We propose a class of transformation cure frailty models to accommodate a survival fraction in multivariate failure time data. Established through a general power transformation, this family of cure frailty models includes the proportional hazards and the proportional odds modeling structures as two special cases. Within the Bayesian paradigm, we obtain the joint posterior distribution and the corresponding full conditional distributions of the model parameters for the implementation of Gibbs sampling. Model selection is based on the conditional predictive ordinate statistic and deviance information criterion. As an illustration, we apply the proposed method to a real data set from dentistry.
Understanding Activity Engagement Across Weekdays and Weekend Days: A Multivariate Multiple Discrete-Continuous Modeling Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Garikapati, Venu; Astroza, Sebastian; Bhat, Prerna C.

This paper is motivated by the increasing recognition that modeling activity-travel demand for a single day of the week, as is done in virtually all travel forecasting models, may be inadequate in capturing underlying processes that govern activity-travel scheduling behavior. The considerable variability in daily travel suggests that there are important complementary relationships and competing tradeoffs involved in scheduling and allocating time to various activities across days of the week. Both limited survey data availability and methodological challenges in modeling week-long activity-travel schedules have precluded the development of multi-day activity-travel demand models. With passive and technology-based data collection methods increasingly in vogue, the collection of multi-day travel data may become increasingly commonplace in the years ahead. This paper addresses the methodological challenge associated with modeling multi-day activity-travel demand by formulating a multivariate multiple discrete-continuous probit (MDCP) model system. The comprehensive framework ties together two MDCP model components, one corresponding to weekday time allocation and the other to weekend activity-time allocation. By tying the two MDCP components together, the model system also captures relationships in activity-time allocation between weekdays on the one hand and weekend days on the other. Model estimation on a week-long travel diary data set from the United Kingdom shows that there are significant inter-relationships between weekdays and weekend days in activity-travel scheduling behavior. The model system presented in this paper may serve as a higher-level multi-day activity scheduler in conjunction with existing daily activity-based travel models.
Chemotherapy effectiveness and mortality prediction in surgically treated osteosarcoma dogs: A validation study.

PubMed

Schmidt, A F; Nielen, M; Withrow, S J; Selmic, L E; Burton, J H; Klungel, O H; Groenwold, R H H; Kirpensteijn, J

2016-03-01

Canine osteosarcoma is the most common bone cancer, and an important cause of mortality and morbidity, in large purebred dogs. Previously we constructed two multivariable models to predict a dog's 5-month or 1-year mortality risk after surgical treatment for osteosarcoma. According to the 5-month model, dogs with a relatively low risk of 5-month mortality benefited most from additional chemotherapy treatment. In the present study, we externally validated these results using an independent cohort study of 794 dogs. External performance of our prediction models showed some disagreement between observed and predicted risk, mean difference: -0.11 (95% confidence interval [95% CI] -0.29; 0.08) for 5-month risk and 0.25 (95% CI 0.10; 0.40) for 1-year mortality risk. After updating the intercept, agreement improved: -0.0004 (95% CI -0.16; 0.16) and -0.002 (95% CI -0.15; 0.15). The chemotherapy by predicted mortality risk interaction (P-value=0.01) showed that the chemotherapy compared to no chemotherapy effectiveness was modified by 5-month mortality risk: dogs with a relatively lower risk of mortality benefited most from additional chemotherapy. Chemotherapy effectiveness on 1-year mortality was not significantly modified by predicted risk (P-value=0.28). In conclusion, this external validation study confirmed that our multivariable risk prediction models can predict a patient's mortality risk and that dogs with a relatively lower risk of 5-month mortality seem to benefit most from chemotherapy. Copyright © 2016 Elsevier B.V. All rights reserved.
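The abstract mentions "updating the intercept" without giving the procedure. One common way to do this for a logistic risk model, shown here purely as an assumption about what such an update can look like, is to add a single constant to every patient's linear predictor so that the mean predicted risk matches the observed event rate in the validation cohort. The function names and the toy data are hypothetical, and the observed event rate must lie strictly between 0 and 1 for the search to be meaningful.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def update_intercept(predicted, observed, tol=1e-10):
    """Find the shift delta so that mean(expit(logit(p) + delta)) equals
    the observed event rate. Solved by bisection, since the mean risk is
    monotonically increasing in delta."""
    target = sum(observed) / len(observed)
    linear = [logit(p) for p in predicted]
    lo, hi = -20.0, 20.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        mean_risk = sum(expit(x + mid) for x in linear) / len(linear)
        if mean_risk < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def recalibrate(predicted, delta):
    """Apply the intercept shift to each predicted risk."""
    return [expit(logit(p) + delta) for p in predicted]


if __name__ == "__main__":
    # Toy data: the model under-predicts (mean 0.40 vs. observed rate 0.75).
    predicted = [0.2, 0.3, 0.5, 0.6]
    observed = [1, 0, 1, 1]
    delta = update_intercept(predicted, observed)
    print(delta, recalibrate(predicted, delta))
```

Shifting only the intercept leaves the ranking of patients unchanged, so discrimination is unaffected while the average predicted risk is brought in line with the new cohort.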
Nomogram Prediction of Overall Survival After Curative Irradiation for Uterine Cervical Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Seo, YoungSeok; Yoo, Seong Yul; Kim, Mi-Sook

Purpose: The purpose of this study was to develop a nomogram capable of predicting the probability of 5-year survival after radical radiotherapy (RT) without chemotherapy for uterine cervical cancer. Methods and Materials: We retrospectively analyzed 549 patients that underwent radical RT for uterine cervical cancer between March 1994 and April 2002 at our institution. Multivariate analysis using Cox proportional hazards regression was performed and this Cox model was used as the basis for the devised nomogram. The model was internally validated for discrimination and calibration by bootstrap resampling. Results: By multivariate regression analysis, the model showed that age, hemoglobin level before RT, Federation Internationale de Gynecologie Obstetrique (FIGO) stage, maximal tumor diameter, lymph node status, and RT dose at Point A significantly predicted overall survival. The survival prediction model demonstrated good calibration and discrimination. The bootstrap-corrected concordance index was 0.67. The predictive ability of the nomogram proved to be superior to FIGO stage (p = 0.01). Conclusions: The devised nomogram offers a significantly better level of discrimination than the FIGO staging system. In particular, it improves predictions of survival probability and could be useful for counseling patients, choosing treatment modalities and schedules, and designing clinical trials. However, before this nomogram is used clinically, it should be externally validated.
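For readers unfamiliar with the concordance index reported above, the sketch below computes a basic Harrell-style version for right-censored data. It is a simplified illustration, not the authors' bootstrap-corrected calculation: pairs with tied survival times are skipped, tied risk scores count as half-concordant, and the variable names and toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by risk.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if the patient was censored
    risks:  model risk score (higher means shorter expected survival)

    A pair is comparable when the patient with the shorter time had an
    observed event; it is concordant when that patient also has the
    higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [5, 10, 15]
    events = [1, 1, 0]
    print(concordance_index(times, events, [0.9, 0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.67 in context.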
A New Multivariate Approach for Prognostics Based on Extreme Learning Machine and Fuzzy Clustering.

PubMed

Javed, Kamran; Gouriveau, Rafael; Zerhouni, Noureddine

2015-12-01

Prognostics is a core process of prognostics and health management (PHM) discipline, that estimates the remaining useful life (RUL) of a degrading machinery to optimize its service delivery potential. However, machinery operates in a dynamic environment and the acquired condition monitoring data are usually noisy and subject to a high level of uncertainty/unpredictability, which complicates prognostics. The complexity further increases, when there is absence of prior knowledge about ground truth (or failure definition). For such issues, data-driven prognostics can be a valuable solution without deep understanding of system physics. This paper contributes a new data-driven prognostics approach namely, an "enhanced multivariate degradation modeling," which enables modeling degrading states of machinery without assuming a homogeneous pattern. In brief, a predictability scheme is introduced to reduce the dimensionality of the data. Following that, the proposed prognostics model is achieved by integrating two new algorithms namely, the summation wavelet-extreme learning machine and subtractive-maximum entropy fuzzy clustering to show evolution of machine degradation by simultaneous predictions and discrete state estimation. The prognostics model is equipped with a dynamic failure threshold assignment procedure to estimate RUL in a realistic manner. To validate the proposition, a case study is performed on turbofan engines data from PHM challenge 2008 (NASA), and results are compared with recent publications.
Genetic correlations between wellbeing, depression and anxiety symptoms and behavioral responses to the emotional faces task in healthy twins.

PubMed

Routledge, Kylie M; Williams, Leanne M; Harris, Anthony W F; Schofield, Peter R; Clark, C Richard; Gatt, Justine M

2018-06-01

Currently there is a very limited understanding of how mental wellbeing versus anxiety and depression symptoms are associated with emotion processing behaviour. For the first time, we examined these associations using a behavioural emotion task of positive and negative facial expressions in 1668 healthy adult twins. Linear mixed model results suggested that faster reaction times to happy facial expressions were associated with higher wellbeing scores, and slower reaction times with higher depression and anxiety scores. Multivariate twin modelling identified a significant genetic correlation between depression and anxiety symptoms and reaction time to happy facial expressions, in the absence of any significant correlations with wellbeing. We also found a significant negative phenotypic relationship between depression and anxiety symptoms and accuracy for identifying neutral emotions, although the genetic or environment correlations were not significant in the multivariate model. Overall, the phenotypic relationships between speed of identifying happy facial expressions and wellbeing on the one hand, versus depression and anxiety symptoms on the other, were in opposing directions. Twin modelling revealed a small common genetic correlation between response to happy faces and depression and anxiety symptoms alone, suggesting that wellbeing and depression and anxiety symptoms show largely independent relationships with emotion processing at the behavioral level. Copyright © 2018 Elsevier B.V. All rights reserved.
Differential flatness properties and multivariable adaptive control of ovarian system dynamics

NASA Astrophysics Data System (ADS)

Rigatos, Gerasimos

2016-12-01

The ovarian system exhibits nonlinear dynamics which are modeled by a set of coupled nonlinear differential equations. The paper proposes adaptive fuzzy control based on differential flatness theory for the complex dynamics of the ovarian system. It is proven that the dynamic model of the ovarian system, having as state variables the LH and the FSH hormones and their derivatives, is a differentially flat one. This means that all its state variables and its control inputs can be described as differential functions of the flat output. By exploiting differential flatness properties the system's dynamic model is written in the multivariable linear canonical (Brunovsky) form, for which the design of a state feedback controller becomes possible. After this transformation, the new control inputs of the system contain unknown nonlinear parts, which are identified with the use of neurofuzzy approximators. The learning procedure for these estimators is determined by the requirement that the first derivative of the closed-loop system's Lyapunov function be negative. Moreover, Lyapunov stability analysis shows that H-infinity tracking performance is achieved for the feedback control loop, and this assures improved robustness to the aforementioned model uncertainty as well as to external perturbations. The efficiency of the proposed adaptive fuzzy control scheme is confirmed through simulation experiments.
Non-targeted 1H NMR fingerprinting and multivariate statistical analyses for the characterisation of the geographical origin of Italian sweet cherries.

PubMed

Longobardi, F; Ventrella, A; Bianco, A; Catucci, L; Cafagna, I; Gallo, V; Mastrorilli, P; Agostiano, A

2013-12-01

In this study, non-targeted 1H NMR fingerprinting was used in combination with multivariate statistical techniques for the classification of Italian sweet cherries based on their different geographical origins (Emilia Romagna and Puglia). As classification techniques, Soft Independent Modelling of Class Analogy (SIMCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Linear Discriminant Analysis (LDA) were carried out and the results were compared. For LDA, before performing a refined selection of the number/combination of variables, two different strategies for a preliminary reduction of the variable number were tested. The best average recognition and CV prediction abilities (both 100.0%) were obtained for all the LDA models, although PLS-DA also showed remarkable performance (94.6%). All the statistical models were validated by observing the prediction abilities with respect to an external set of cherry samples. The best result (94.9%) was obtained with LDA by performing a best subset selection procedure on a set of 30 principal components previously selected by a stepwise decorrelation. The metabolites that contributed most to the classification performance of this LDA model were found to be malate, glucose, fructose, glutamine and succinate. Copyright © 2013 Elsevier Ltd. All rights reserved.
The classification of secondary colorectal liver cancer in human biopsy samples using angular dispersive x-ray diffraction and multivariate analysis

NASA Astrophysics Data System (ADS)

Theodorakou, Chrysoula; Farquharson, Michael J.

2009-08-01

The motivation behind this study is to assess whether angular dispersive x-ray diffraction (ADXRD) data, processed using multivariate analysis techniques, can be used for classifying secondary colorectal liver cancer tissue and normal surrounding liver tissue in human liver biopsy samples. The ADXRD profiles from a total of 60 samples of normal liver tissue and colorectal liver metastases were measured using a synchrotron radiation source. The data were analysed for 56 samples using nonlinear peak-fitting software. Four peaks were fitted to all of the ADXRD profiles, and the amplitude, area, amplitude and area ratios for three of the four peaks were calculated and used for the statistical and multivariate analysis. The statistical analysis showed that there are significant differences between all the peak-fitting parameters and ratios between the normal and the diseased tissue groups. The technique of soft independent modelling of class analogy (SIMCA) was used to classify normal liver tissue and colorectal liver metastases resulting in 67% of the normal tissue samples and 60% of the secondary colorectal liver tissue samples being classified correctly. This study has shown that the ADXRD data of normal and secondary colorectal liver cancer are statistically different and x-ray diffraction data analysed using multivariate analysis have the potential to be used as a method of tissue classification.

Multivariate exploration of non-intrusive load monitoring via spatiotemporal pattern network

DOE PAGES

Liu, Chao; Akintayo, Adedotun; Jiang, Zhanhong; ...

2017-12-18

Non-intrusive load monitoring (NILM) of electrical demand for the purpose of identifying load components has thus far mostly been studied using univariate data, e.g., using only whole building electricity consumption time series to identify a certain type of end-use such as lighting load. However, using additional variables in the form of multivariate time series data may provide more information in terms of extracting distinguishable features in the context of energy disaggregation. In this work, a novel probabilistic graphical modeling approach, namely the spatiotemporal pattern network (STPN) is proposed for energy disaggregation using multivariate time-series data. The STPN framework is shown to be capable of handling diverse types of multivariate time-series to improve the energy disaggregation performance. The technique outperforms the state of the art factorial hidden Markov models (FHMM) and combinatorial optimization (CO) techniques in multiple real-life test cases. Furthermore, based on two homes' aggregate electric consumption data, a similarity metric is defined for the energy disaggregation of one home using a trained model based on the other home (i.e., out-of-sample case). The proposed similarity metric allows us to enhance scalability via learning supervised models for a few homes and deploying such models to many other similar but unmodeled homes with significantly high disaggregation accuracy.
A simple prognostic model for overall survival in metastatic renal cell carcinoma.

PubMed

Assi, Hazem I; Patenaude, Francois; Toumishey, Ethan; Ross, Laura; Abdelsalam, Mahmoud; Reiman, Tony

2016-01-01

The primary purpose of this study was to develop a simpler prognostic model to predict overall survival for patients treated for metastatic renal cell carcinoma (mRCC) by examining variables shown in the literature to be associated with survival. We conducted a retrospective analysis of patients treated for mRCC at two Canadian centres. All patients who started first-line treatment were included in the analysis. A multivariate Cox proportional hazards regression model was constructed using a stepwise procedure. Patients were assigned to risk groups depending on how many of the three risk factors from the final multivariate model they had. There were three risk factors in the final multivariate model: hemoglobin, prior nephrectomy, and time from diagnosis to treatment. Patients in the high-risk group (two or three risk factors) had a median survival of 5.9 months, while those in the intermediate-risk group (one risk factor) had a median survival of 16.2 months, and those in the low-risk group (no risk factors) had a median survival of 50.6 months. In multivariate analysis, shorter survival times were associated with hemoglobin below the lower limit of normal, absence of prior nephrectomy, and initiation of treatment within one year of diagnosis.
A simple prognostic model for overall survival in metastatic renal cell carcinoma

PubMed Central

Assi, Hazem I.; Patenaude, Francois; Toumishey, Ethan; Ross, Laura; Abdelsalam, Mahmoud; Reiman, Tony

2016-01-01

Introduction: The primary purpose of this study was to develop a simpler prognostic model to predict overall survival for patients treated for metastatic renal cell carcinoma (mRCC) by examining variables shown in the literature to be associated with survival. Methods: We conducted a retrospective analysis of patients treated for mRCC at two Canadian centres. All patients who started first-line treatment were included in the analysis. A multivariate Cox proportional hazards regression model was constructed using a stepwise procedure. Patients were assigned to risk groups depending on how many of the three risk factors from the final multivariate model they had. Results: There were three risk factors in the final multivariate model: hemoglobin, prior nephrectomy, and time from diagnosis to treatment. Patients in the high-risk group (two or three risk factors) had a median survival of 5.9 months, while those in the intermediate-risk group (one risk factor) had a median survival of 16.2 months, and those in the low-risk group (no risk factors) had a median survival of 50.6 months. Conclusions: In multivariate analysis, shorter survival times were associated with hemoglobin below the lower limit of normal, absence of prior nephrectomy, and initiation of treatment within one year of diagnosis. PMID:27217858
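The risk grouping described in this record can be illustrated with a minimal sketch. It assumes each of the three factors is already coded as a yes/no flag (the function and argument names are illustrative, not taken from the paper); the median survival figures are the ones quoted in the abstract.

```python
# Median overall survival (months) per risk group, as quoted in the abstract.
MEDIAN_SURVIVAL_MONTHS = {"low": 50.6, "intermediate": 16.2, "high": 5.9}


def risk_group(low_hemoglobin: bool,
               no_prior_nephrectomy: bool,
               treated_within_one_year: bool) -> str:
    """Count the adverse factors present and map the count to a risk group."""
    n_factors = sum([low_hemoglobin, no_prior_nephrectomy, treated_within_one_year])
    if n_factors == 0:
        return "low"
    if n_factors == 1:
        return "intermediate"
    return "high"  # two or three risk factors


# Hypothetical patient: low hemoglobin, prior nephrectomy done, early treatment.
group = risk_group(True, False, True)
print(group, MEDIAN_SURVIVAL_MONTHS[group])
```

Counting factors instead of using the fitted Cox coefficients trades some precision for a rule that can be applied at the bedside, which appears to be the sense in which the abstract calls the model simple.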
Multivariate exploration of non-intrusive load monitoring via spatiotemporal pattern network

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Chao; Akintayo, Adedotun; Jiang, Zhanhong

Non-intrusive load monitoring (NILM) of electrical demand for the purpose of identifying load components has thus far mostly been studied using univariate data, e.g., using only whole building electricity consumption time series to identify a certain type of end-use such as lighting load. However, using additional variables in the form of multivariate time series data may provide more information in terms of extracting distinguishable features in the context of energy disaggregation. In this work, a novel probabilistic graphical modeling approach, namely the spatiotemporal pattern network (STPN) is proposed for energy disaggregation using multivariate time-series data. The STPN framework is shown to be capable of handling diverse types of multivariate time-series to improve the energy disaggregation performance. The technique outperforms the state of the art factorial hidden Markov models (FHMM) and combinatorial optimization (CO) techniques in multiple real-life test cases. Furthermore, based on two homes' aggregate electric consumption data, a similarity metric is defined for the energy disaggregation of one home using a trained model based on the other home (i.e., out-of-sample case). The proposed similarity metric allows us to enhance scalability via learning supervised models for a few homes and deploying such models to many other similar but unmodeled homes with significantly high disaggregation accuracy.
A High-Dimensional, Multivariate Copula Approach to Modeling Multivariate Agricultural Price Relationships and Tail Dependencies

Treesearch

Xuan Chi; Barry Goodwin

2012-01-01

Spatial and temporal relationships among agricultural prices have been an important topic of applied research for many years. Such research is used to investigate the performance of markets and to examine linkages up and down the marketing chain. This research has empirically evaluated price linkages by using correlation and regression models and, later, linear and...
Validation of cross-sectional time series and multivariate adaptive regression splines models for the prediction of energy expenditure in children and adolescents using doubly labeled water

USDA-ARS's Scientific Manuscript database

Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...
Copula-based prediction of economic movements

NASA Astrophysics Data System (ADS)

García, J. E.; González-López, V. A.; Hirsh, I. D.

2016-06-01

In this paper we model the discretized returns of two paired time series, the BM&FBOVESPA Dividend Index and the BM&FBOVESPA Public Utilities Index, using multivariate Markov models. The discretization corresponds to three categories: high losses, high profits and the complementary periods of the series. In technical terms, the maximal memory that can be considered for a Markov model can be derived from the size of the alphabet and dataset. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the two marginal processes and the partition corresponding to the multivariate Markov chain. In order to estimate the transition probabilities, all the partitions are linked using a copula. In our application this strategy provides a significant improvement in the movement predictions.
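The exponential growth in parameters mentioned in this record can be made concrete with a short count. The sketch assumes a full, unrestricted chain in which every joint history has its own transition distribution; the function name is illustrative. With a three-letter alphabet and two series, as in the abstract, the joint state space has 3^2 = 9 symbols.

```python
def n_free_parameters(alphabet_size: int, dimension: int, order: int) -> int:
    """Free transition probabilities of an unrestricted multivariate Markov chain.

    Each of the (alphabet_size ** dimension) ** order possible histories needs a
    probability vector over the joint states, with one entry fixed by the
    sum-to-one constraint.
    """
    joint_states = alphabet_size ** dimension
    histories = joint_states ** order
    return histories * (joint_states - 1)


# Three categories, two paired series, increasing memory.
for order in (1, 2, 3):
    print(order, n_free_parameters(3, 2, order))
```

Under this assumption the count is 72 at order 1, 648 at order 2 and 5832 at order 3, which shows why a dataset of moderate size cannot support a consistent estimate of a high-order chain without some reduction of the state space.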
Cross-country transferability of multi-variable damage models

NASA Astrophysics Data System (ADS)

Wagenaar, Dennis; Lüdtke, Stefan; Kreibich, Heidi; Bouwer, Laurens

2017-04-01

Flood damage assessment is often done with simple damage curves based only on flood water depth. Additionally, damage models are often transferred in space and time, e.g. from region to region or from one flood event to another. Validation has shown that depth-damage curve estimates are associated with high uncertainties, particularly when applied in regions outside the area where the data for curve development was collected. Recently, progress has been made with multi-variable damage models created with data-mining techniques, i.e. Bayesian Networks and random forest. However, it is still unknown to what extent and under which conditions model transfers are possible and reliable. Model validations in different countries will provide valuable insights into the transferability of multi-variable damage models. In this study we compare multi-variable models developed on the basis of flood damage datasets from Germany as well as from The Netherlands. Data from several German floods was collected using computer aided telephone interviews. Data from the 1993 Meuse flood in the Netherlands is available, based on compensations paid by the government. The Bayesian network and random forest based models are applied and validated in both countries on the basis of the individual datasets. A major challenge was the harmonization of the variables between both datasets due to factors like differences in variable definitions, and regional and temporal differences in flood hazard and exposure characteristics. Results of model validations and comparisons in both countries are discussed, particularly with respect to encountered challenges and possible solutions for an improvement of model transferability.
Multivariate Prediction Equations for HbA1c Lowering, Weight Change, and Hypoglycemic Events Associated with Insulin Rescue Medication in Type 2 Diabetes Mellitus: Informing Economic Modeling.

PubMed

Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt

2017-03-01

Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. The objective was to estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics and treatment effects and adverse event rates were extracted and the data used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R²) for both outcomes was generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Pretreatment Nomogram to Predict the Risk of Acute Urinary Retention After I-125 Prostate Brachytherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Roeloffzen, Ellen M., E-mail: e.m.a.roeloffzen@umcutrecht.nl; Vulpen, Marco van; Battermann, Jan J.

Purpose: Acute urinary retention (AUR) after iodine-125 (I-125) prostate brachytherapy negatively influences long-term quality of life and therefore should be prevented. We aimed to develop a nomogram to preoperatively predict the risk of AUR. Methods: Using the preoperative data of 714 consecutive patients who underwent I-125 prostate brachytherapy between 2005 and 2008 at our department, we modeled the probability of AUR. Multivariate logistic regression analysis was used to assess the predictive ability of a set of pretreatment predictors and the additional value of a new risk factor (the extent of prostate protrusion into the bladder). The performance of the final model was assessed with calibration and discrimination measures. Results: Of the 714 patients, 57 patients (8.0%) developed AUR after implantation. Multivariate analysis showed that the combination of prostate volume, IPSS score, neoadjuvant hormonal treatment and the extent of prostate protrusion contribute to the prediction of AUR. The discriminative value (receiver operator characteristic area, ROC) of the basic model (including prostate volume, International Prostate Symptom Score, and neoadjuvant hormonal treatment) to predict the development of AUR was 0.70. The addition of prostate protrusion significantly increased the discriminative power of the model (ROC 0.82). Calibration of this final model was good. The nomogram showed that among patients with a low sum score (<18 points), the risk of AUR was only 0%-5%. However, in patients with a high sum score (>35 points), the risk of AUR was more than 20%. Conclusion: This nomogram is a useful tool for physicians to predict the risk of AUR after I-125 prostate brachytherapy. The nomogram can aid in individualized treatment decision-making and patient counseling.
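The discriminative value reported in this record (ROC area of 0.70 rising to 0.82) has a simple rank interpretation: it is the probability that a randomly chosen patient who developed AUR was assigned a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. A minimal sketch of that calculation follows; the predicted risks are made-up illustrative values, not data from the study.

```python
def roc_area(scores_events, scores_nonevents):
    """ROC area as the fraction of (event, non-event) pairs ranked correctly.

    Ties contribute one half, which matches the trapezoidal ROC area.
    """
    wins = 0.0
    for e in scores_events:
        for n in scores_nonevents:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(scores_events) * len(scores_nonevents))


# Hypothetical predicted risks for patients with and without the outcome.
with_aur = [0.30, 0.12, 0.25]
without_aur = [0.05, 0.10, 0.12, 0.02]
print(round(roc_area(with_aur, without_aur), 3))
```

The pairwise loop is quadratic in the number of patients, which is fine for a cohort of a few hundred; a rank-based formula would be preferable for large datasets.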
Perspectives of family medicine physicians on the importance of adolescent preventive care: a multivariate analysis.

PubMed

Taylor, Jaime L; Aalsma, Matthew C; Gilbert, Amy L; Hensel, Devon J; Rickert, Vaughn I

2016-01-20

The study objective was to identify commonalities amongst family medicine physicians who endorse annual adolescent visits. A nationally weighted representative on-line survey was used to explore pediatricians' (N = 204) and family medicine physicians' (N = 221) beliefs and behaviors surrounding adolescent wellness. Our primary outcome was endorsement that adolescents should receive annual preventive care visits. Pediatricians were significantly more likely (p < .01) to endorse annual well visits. Among family medicine physicians, bivariate comparisons were conducted between those who endorsed an annual visit (N = 164) and those who did not (N = 57), with significant predictors combined into two multivariate logistic regression models. Model 1 controlled for: patient race, proportion of 13-17 year olds in provider's practice, discussion beliefs scale and discussion behaviors with parents scale. Model 2 controlled for the same first three variables as well as discussion behaviors with adolescents scale. Model 1 showed that for each discussion beliefs scale topic selected, family medicine physicians had 1.14 increased odds of endorsing annual visits (p < .001) and had 1.11 greater odds of endorsing annual visits with each one-point increase in discussion behaviors with parents scale (p = .51). Model 2 showed that for each discussion beliefs scale topic selected, family medicine physicians had 1.15 increased odds of also endorsing the importance of annual visits (p < .001). Family medicine physicians who endorse annual visits are significantly more likely to affirm they hold strong beliefs about topics that should be discussed during the annual exam. They also act on these beliefs by talking to parents of teens about these topics. This group appears to focus on quality of care in thought and deed.
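Per-unit odds figures such as those in this record compound multiplicatively in a logistic regression. The sketch below assumes the reported 1.14 is an odds ratio per additional topic selected (the abstract's wording suggests this but does not say so explicitly) and uses a hypothetical baseline odds of 1.0 purely for illustration.

```python
def odds_multiplier(odds_ratio_per_unit: float, units: int) -> float:
    """Factor by which the odds change after `units` one-unit increases."""
    return odds_ratio_per_unit ** units


def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability."""
    return odds / (1.0 + odds)


# Five additional topics selected, assuming an odds ratio of 1.14 per topic.
factor = odds_multiplier(1.14, 5)

# Hypothetical baseline odds of 1.0 (a 50% chance of endorsing annual visits).
baseline_odds = 1.0
print(round(factor, 2), round(odds_to_probability(baseline_odds * factor), 2))
```

The multiplier applies to odds, not to probabilities, so the change in probability depends on the baseline that is assumed.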
TH-E-BRF-03: A Multivariate Interaction Model for Assessment of Hippocampal Vascular Dose-Response and Early Prediction of Radiation-Induced Neurocognitive Dysfunction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Farjam, R; Pramanik, P; Srinivasan, A

Purpose: Vascular injury could be a cause of hippocampal dysfunction leading to late neurocognitive decline in patients receiving brain radiotherapy (RT). Hence, our aim was to develop a multivariate interaction model for characterization of hippocampal vascular dose-response and early prediction of radiation-induced late neurocognitive impairments. Methods: 27 patients (17 males and 10 females, age 31–80 years) were enrolled in an IRB-approved prospective longitudinal study. All patients were diagnosed with a low-grade glioma or benign tumor and treated by 3-D conformal or intensity-modulated RT with a median dose of 54 Gy (50.4–59.4 Gy in 1.8-Gy fractions). Six DCE-MRI scans were performed from pre-RT to 18 months post-RT. DCE data were fitted to the modified Tofts model to obtain the transfer constant of gadolinium influx from the intravascular space into the extravascular extracellular space, Ktrans, and the fraction of blood plasma volume, Vp. The hippocampus vascular property alterations after starting RT were characterized by changes in the hippocampal mean values of μh(Ktrans)τ and μh(Vp)τ. The dose-response, Δμh(Ktrans/Vp)pre->τ, was modeled using a multivariate linear regression considering interactions of dose with age, sex, hippocampal laterality and presence of tumor/edema near a hippocampus. Finally, the early vascular dose-response in hippocampus was correlated with neurocognitive decline 6 and 18 months post-RT. Results: The μh(Ktrans) increased significantly from pre-RT to 1 month post-RT (p<0.0004). The multivariate model showed that the dose effect on Δμh(Ktrans)pre->1M post-RT interacted with sex (p<0.0007) and age (p<0.00004), with the dose-response more pronounced in older females. Also, the vascular dose-response in the left hippocampus of females was significantly correlated with memory function decline at 6 (r = −0.95, p<0.0006) and 18 (r = −0.88, p<0.02) months post-RT. Conclusion: The hippocampal vascular response to radiation could be sex and age dependent. The early hippocampal vascular dose-response could predict late neurocognitive dysfunction. (Support: NIH-RO1NS064973)
A Web-based nomogram predicting para-aortic nodal metastasis in incompletely staged patients with endometrial cancer: a Korean Multicenter Study.

PubMed

Kang, Sokbom; Lee, Jong-Min; Lee, Jae-Kwan; Kim, Jae-Weon; Cho, Chi-Heum; Kim, Seok-Mo; Park, Sang-Yoon; Park, Chan-Yong; Kim, Ki-Tae

2014-03-01

The purpose of this study is to develop a Web-based nomogram for predicting the individualized risk of para-aortic nodal metastasis in incompletely staged patients with endometrial cancer. From 8 institutions, the medical records of 397 patients who underwent pelvic and para-aortic lymphadenectomy as a surgical staging procedure were retrospectively reviewed. A multivariate logistic regression model was created and internally validated by rigorous bootstrap resampling methods. Finally, the model was transformed into a user-friendly Web-based nomogram (http://www.kgog.org/nomogram/empa001.html). The rate of para-aortic nodal metastasis was 14.4% (57/397 patients). Using a stepwise variable selection, 4 variables including deep myometrial invasion, non-endometrioid subtype, lymphovascular space invasion, and log-transformed CA-125 levels were finally adopted. After 1000 repetitions of bootstrapping, all of these 4 variables retained a significant association with para-aortic nodal metastasis in the multivariate analysis: deep myometrial invasion (P = 0.001), non-endometrioid histologic subtype (P = 0.034), lymphovascular space invasion (P = 0.003), and log-transformed serum CA-125 levels (P = 0.004). The model showed good discrimination (C statistic = 0.87; 95% confidence interval, 0.82-0.92) and accurate calibration (Hosmer-Lemeshow P = 0.74). This nomogram showed good performance in predicting para-aortic metastasis in patients with endometrial cancer. The tool may be useful in determining the extent of lymphadenectomy after incomplete surgery.
Diagnosing perforated appendicitis in pediatric patients: a new model.

PubMed

van den Bogaard, Veerle A B; Euser, Sjoerd M; van der Ploeg, Tjeerd; de Korte, Niels; Sanders, Dave G M; de Winter, Derek; Vergroesen, Diederik; van Groningen, Krijn; de Winter, Peter

2016-03-01

Studies have investigated sensitivity and specificity of symptoms and tests for diagnosing appendicitis in children. Less is known with regard to the predictive value of these symptoms and tests with respect to the severity of appendicitis. The aim of this study was to determine the predictive value of patient characteristics and tests for discriminating between perforated and nonperforated appendicitis in children. Pediatric patients who underwent an appendectomy at Spaarne Hospital Hoofddorp, the Netherlands, between January 1, 2009 and December 31, 2013, were included. Baseline patient characteristics, history, physical examination, laboratory data and results of ultrasounds were collected. Univariate and multivariate logistic regressions were used to determine predictors of perforation. In total, 375 patients were included in this study, of whom 97 children (25.9%) had significant signs of perforation. Univariate analysis showed that age, duration of complaints, temperature, vomiting, CRP, WBC, different findings on ultrasound and the diameter of the appendix were good predictors of a perforated appendicitis. The final multivariate prediction model included temperature, CRP, clearly visible appendix and free fluids on ultrasound and diameter of the appendix and resulted in an area under the curve (AUC) of 0.91, with sensitivity and specificity of 85.2% and 81.2%, respectively. This prediction model can be used for identification of 'high-risk' children for a perforated appendicitis and might be helpful to prevent complications and longer hospitalization by bringing these children to theater earlier. Copyright © 2016 Elsevier Inc. All rights reserved.
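Sensitivity and specificity figures like those in this record depend on the probability cut-off applied to the model output, which the abstract does not state. The sketch below shows how the two quantities are obtained once a cut-off is chosen; the predicted probabilities, labels and the 0.5 threshold are all hypothetical.

```python
def sensitivity_specificity(probabilities, perforated, threshold):
    """Return (sensitivity, specificity) for a given probability cut-off."""
    tp = fn = tn = fp = 0
    for p, is_perforated in zip(probabilities, perforated):
        predicted_perforated = p >= threshold
        if is_perforated and predicted_perforated:
            tp += 1
        elif is_perforated:
            fn += 1
        elif predicted_perforated:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical model outputs for six children (three perforated, three not).
predicted = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
observed = [True, True, True, False, False, False]
print(sensitivity_specificity(predicted, observed, threshold=0.5))
```

Lowering the threshold raises sensitivity at the cost of specificity, so a single reported pair corresponds to one point on the ROC curve whose area the abstract gives as 0.91.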
A Machine Learning Approach to Automated Gait Analysis for the Noldus Catwalk System.

PubMed

Frohlich, Holger; Claes, Kasper; De Wolf, Catherine; Van Damme, Xavier; Michel, Anne

2018-05-01

Gait analysis of animal disease models can provide valuable insights into in vivo compound effects and thus help in preclinical drug development. The purpose of this paper is to establish a computational gait analysis approach for the Noldus Catwalk system, in which footprints are automatically captured and stored. We present what is, to our knowledge, the first machine learning based approach for the Catwalk system, which comprises a step decomposition, definition and extraction of meaningful features, multivariate step sequence alignment, feature selection, and training of different classifiers (gradient boosting machine, random forest, and elastic net). Using animal-wise leave-one-out cross validation we demonstrate that with our method we can reliably separate movement patterns of a putative Parkinson's disease animal model and several control groups. Furthermore, we show that we can predict the time point after and the type of different brain lesions and can even forecast the brain region where the intervention was applied. We provide an in-depth analysis of the features involved in our classifiers via statistical techniques for model interpretation. A machine learning method for automated analysis of data from the Noldus Catwalk system was established. Our work shows the ability of machine learning to discriminate pharmacologically relevant animal groups based on their walking behavior in a multivariate manner. Further interesting aspects of the approach include the ability to learn from past experiments, improve with more data arriving and to make predictions for single animals in future studies.
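Animal-wise leave-one-out cross validation, as mentioned in this record, holds out every recording of one animal at a time so that no animal contributes to both training and testing. A minimal sketch of the splitting step is given below; the animal identifiers are hypothetical and the classifiers themselves are omitted.

```python
def leave_one_animal_out(animal_ids):
    """Yield (held_out_animal, train_indices, test_indices) for each animal."""
    for held_out in sorted(set(animal_ids)):
        test = [i for i, a in enumerate(animal_ids) if a == held_out]
        train = [i for i, a in enumerate(animal_ids) if a != held_out]
        yield held_out, train, test


# Hypothetical run labels: several recorded runs per animal.
runs = ["rat1", "rat1", "rat2", "rat3", "rat3"]
for animal, train_idx, test_idx in leave_one_animal_out(runs):
    print(animal, train_idx, test_idx)
```

Splitting by animal instead of by run avoids the optimistic bias that arises when repeated runs of the same animal appear on both sides of a split.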
A multivariate mixed model system for wood specific gravity and moisture content of planted loblolly pine stands in the southern United States

Treesearch

Finto Antony; Laurence R. Schimleck; Alex Clark; Richard F. Daniels

2012-01-01

Specific gravity (SG) and moisture content (MC) both have a strong influence on the quantity and quality of wood fiber. We proposed a multivariate mixed model system to model the two properties simultaneously. Disk SG and MC at different height levels were measured from 3 trees in 135 stands across the natural range of loblolly pine and the stand level values were used...
Attitudes and exercise adherence: test of the Theories of Reasoned Action and Planned Behaviour.

PubMed

Smith, R A; Biddle, S J

1999-04-01

Three studies of exercise adherence and attitudes are reported that tested the Theory of Reasoned Action and the Theory of Planned Behaviour. In a prospective study of adherence to a private fitness club, structural equation modelling path analysis showed that attitudinal and social normative components of the Theory of Reasoned Action accounted for 13.1% of the variance in adherence 4 months later, although only social norm significantly predicted intention. In a second study, the Theory of Planned Behaviour was used to predict both physical activity and sedentary behaviour. Path analyses showed that attitude and perceived control, but not social norm, predicted total physical activity. Physical activity was predicted from intentions and control over sedentary behaviour. Finally, an intervention study with previously sedentary adults showed that intentions to be active measured at the start and end of a 10-week intervention were associated with the planned behaviour variables. A multivariate analysis of variance revealed no significant multivariate effects for time on the planned behaviour variables measured before and after intervention. Qualitative data provided evidence that participants had a positive experience on the intervention programme and supported the role of social normative factors in the adherence process.
Modeling and managing risk early in software development

NASA Technical Reports Server (NTRS)

Briand, Lionel C.; Thomas, William M.; Hetmanski, Christopher J.

1993-01-01

In order to improve the quality of the software development process, we need to be able to build empirical multivariate models based on data collectable early in the software process. These models need to be both useful for prediction and easy to interpret, so that remedial actions may be taken in order to control and optimize the development process. We present an automated modeling technique which can be used as an alternative to regression techniques. We show how it can be used to facilitate the identification and aid the interpretation of the significant trends which characterize 'high risk' components in several Ada systems. Finally, we evaluate the effectiveness of our technique based on a comparison with logistic regression based models.
About the dark and bright sides of self-efficacy: workaholism and work engagement.

PubMed

Del Líbano, Mario; Llorens, Susana; Salanova, Marisa; Schaufeli, Wilmar B

2012-07-01

Taking the Resources-Experiences-Demands Model (RED Model) by Salanova and colleagues as our starting point, we tested how work self-efficacy relates positively to negative (i.e., work overload and work-family conflict) and positive outcomes (i.e., job satisfaction and organizational commitment), through the mediating role of workaholism (health impairment process) and work engagement (motivational process). In a sample of 386 administrative staff from a Spanish University (65% women), Structural Equation Modeling provided full evidence for the research model. In addition, Multivariate Analyses of Variance showed that self-efficacy was only related positively to one of the two dimensions of workaholism, namely, working excessively. Finally, we discuss the theoretical and practical contributions in terms of the RED Model.
Multivariate curve resolution-alternating least squares and kinetic modeling applied to near-infrared data from curing reactions of epoxy resins: mechanistic approach and estimation of kinetic rate constants.

PubMed

Garrido, M; Larrechi, M S; Rius, F X

2006-02-01

This study describes the combination of multivariate curve resolution-alternating least squares with a kinetic modeling strategy for obtaining the kinetic rate constants of a curing reaction of epoxy resins. The reaction between phenyl glycidyl ether and aniline is monitored by near-infrared spectroscopy under isothermal conditions for several initial molar ratios of the reagents. The data for all experiments, arranged in a column-wise augmented data matrix, are analyzed using multivariate curve resolution-alternating least squares. The concentration profiles recovered are fitted to a chemical model proposed for the reaction. The selection of the kinetic model is assisted by the information contained in the recovered concentration profiles. The nonlinear fitting provides the kinetic rate constants. The optimized rate constants are in agreement with values reported in the literature.

The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

PubMed

Warton, David I; Thibaut, Loïc; Wang, Yi Alice

2017-01-01

Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping); common examples include logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
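The resampling idea described in this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' implementation: it assumes Poisson marginals with fitted means already in hand, and the helper names (`poisson_cdf`, `poisson_quantile`, `pit_residual`, `pit_trap_resample`) and the toy data are hypothetical. Each count is mapped to a randomized PIT-residual, whole rows of residuals are resampled with replacement, and the residuals are mapped back to counts through the fitted marginal of the receiving row.

```python
import math
import random


def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def poisson_quantile(u, mu, max_count=10_000):
    """Smallest k with P(Y <= k) >= u (capped at max_count as a safety guard)."""
    k = 0
    term = math.exp(-mu)
    total = term
    while total < u and k < max_count:
        k += 1
        term *= mu / k
        total += term
    return k


def pit_residual(y, mu, rng):
    """Randomized PIT-residual: uniform draw between F(y - 1) and F(y)."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    return lower + rng.random() * (upper - lower)


def pit_trap_resample(counts, means, rng):
    """One bootstrap data set: resample rows of PIT-residuals, then invert."""
    n = len(counts)
    residuals = [
        [pit_residual(y, mu, rng) for y, mu in zip(row_y, row_mu)]
        for row_y, row_mu in zip(counts, means)
    ]
    resampled = []
    for i in range(n):
        donor = residuals[rng.randrange(n)]  # a whole row keeps cross-column correlation
        resampled.append([poisson_quantile(u, mu) for u, mu in zip(donor, means[i])])
    return resampled


# Hypothetical abundance data: 3 sites (rows) by 2 species (columns).
rng = random.Random(0)
counts = [[0, 3], [2, 1], [5, 0]]
means = [[1.0, 2.0], [1.5, 2.0], [3.0, 0.5]]  # assumed fitted Poisson means
boot = pit_trap_resample(counts, means, rng)
```

Resampling entire rows, rather than individual residuals, is what lets the correlation between responses carry over to the bootstrap sample without being modelled explicitly.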
The PIT-trap—A “model-free” bootstrap procedure for inference about regression models with discrete, multivariate responses

PubMed Central

Warton, David I.; Thibaut, Loïc; Wang, Yi Alice

2017-01-01

Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)—common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of “model-free bootstrap”, adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071
Synthesis of a control model for a liquid nitrogen cooled, closed circuit, cryogenic nitrogen wind tunnel and its validation

NASA Technical Reports Server (NTRS)

Balakrishna, S.; Goglia, G. L.

1979-01-01

The details of the efforts to synthesize a control-compatible multivariable model of a liquid nitrogen cooled, gaseous nitrogen operated, closed circuit, cryogenic pressure tunnel are presented. The synthesized model was transformed into a real-time cryogenic tunnel simulator, and this model is validated by comparing the model responses to the actual tunnel responses of the 0.3 m transonic cryogenic tunnel, using the quasi-steady-state and the transient responses of the model and the tunnel. The global nature of the simple, explicit, lumped multivariable model of a closed circuit cryogenic tunnel is demonstrated.
Creation of mortality risk charts using 123I meta-iodobenzylguanidine heart-to-mediastinum ratio in patients with heart failure: 2- and 5-year risk models.

PubMed

Nakajima, Kenichi; Nakata, Tomoaki; Matsuo, Shinro; Jacobson, Arnold F

2016-10-01

123I meta-iodobenzylguanidine (MIBG) imaging has been extensively used for prognostication in patients with chronic heart failure (CHF). The purpose of this study was to create mortality risk charts for short-term (2 years) and long-term (5 years) prediction of cardiac mortality. Using a pooled database of 1322 CHF patients, multivariate analysis, including 123I-MIBG late heart-to-mediastinum ratio (HMR), left ventricular ejection fraction (LVEF), and clinical factors, was performed to determine optimal variables for the prediction of 2- and 5-year mortality risk using subsets of the patients (n = 1280 and 933, respectively). Multivariate logistic regression analysis was performed to create risk charts. Cardiac mortality was 10% and 22% in the subpopulations used for the 2- and 5-year analyses, respectively. A four-parameter multivariate logistic regression model including age, New York Heart Association (NYHA) functional class, LVEF, and HMR was used. Annualized mortality rate was <1% in patients with NYHA Class I-II and HMR ≥ 2.0, irrespective of age and LVEF. In patients with NYHA Class III-IV, mortality rate was 4-6 times higher for HMR < 1.40 compared with HMR ≥ 2.0 in all LVEF classes. Among the subset of patients with b-type natriuretic peptide (BNP) results (n = 491 and 359 for 2- and 5-year models, respectively), the 5-year model showed incremental value of HMR in addition to BNP. Both 2- and 5-year risk prediction models with 123I-MIBG HMR can be used to identify low-risk as well as high-risk patients, which can be effective for further risk stratification of CHF patients even when BNP is available. © The Author 2015. Published by Oxford University Press on behalf of the European Society of Cardiology.
Predictors of psychiatric readmission among patients with bipolar disorder at an academic safety-net hospital.

PubMed

Hamilton, Jane E; Passos, Ives C; de Azevedo Cardoso, Taiane; Jansen, Karen; Allen, Melissa; Begley, Charles E; Soares, Jair C; Kapczinski, Flavio

2016-06-01

Even with treatment, approximately one-third of patients with bipolar disorder relapse into depression or mania within 1 year. Unfavorable clinical outcomes for patients with bipolar disorder include increased rates of psychiatric hospitalization and functional impairment. However, only a few studies have examined predictors of psychiatric hospital readmission in a sample of patients with bipolar disorder. The purpose of this study was to examine predictors of psychiatric readmission within 30 days, 90 days and 1 year of discharge among patients with bipolar disorder using a conceptual model adapted from Andersen's Behavioral Model of Health Service Use. In this retrospective study, univariate and multivariate logistic regression analyses were conducted in a sample of 2443 adult patients with bipolar disorder who were consecutively admitted to a public psychiatric hospital in the United States from 1 January to 31 December 2013. In the multivariate models, several enabling and need factors were significantly associated with an increased risk of readmission across all time periods examined, including being uninsured, having ⩾3 psychiatric hospitalizations and having a lower Global Assessment of Functioning score. Additional factors associated with psychiatric readmission within 30 and 90 days of discharge included patient homelessness. Patient race/ethnicity, bipolar disorder type or a current manic episode did not significantly predict readmission across all time periods examined; however, patients who were male were more likely to readmit within 1 year. The 30-day and 1-year multivariate models showed the best model fit. 
Our study found enabling and need factors to be the strongest predictors of psychiatric readmission, suggesting that the prevention of psychiatric readmission for patients with bipolar disorder at safety-net hospitals may be best achieved by developing and implementing innovative transitional care initiatives that address the issues of multiple psychiatric hospitalizations, housing instability, insurance coverage and functional impairment. © The Royal Australian and New Zealand College of Psychiatrists 2015.
Forecasting daily source air quality using multivariate statistical analysis and radial basis function networks.

PubMed

Sun, Gang; Hoff, Steven J; Zelle, Brian C; Nelson, Minda A

2008-12-01

It is vital to forecast gas and particulate matter concentrations and emission rates (GPCER) from livestock production facilities to assess the impact of airborne pollutants on human health, ecological environment, and global warming. Modeling source air quality is a complex process because of abundant nonlinear interactions between GPCER and other factors. The objective of this study was to introduce statistical methods and a radial basis function (RBF) neural network to predict daily source air quality in Iowa swine deep-pit finishing buildings. The results show that four variables (outdoor and indoor temperature, animal units, and ventilation rates) were identified as relatively important model inputs using statistical methods. It can be further demonstrated that only two factors, the environment factor and the animal factor, were capable of explaining more than 94% of the total variability after performing principal component analysis. The introduction of fewer uncorrelated variables to the neural network would reduce the complexity of the model structure, minimize computation cost, and eliminate model overfitting problems. The obtained results of RBF network prediction were in good agreement with the actual measurements, with values of the correlation coefficient between 0.741 and 0.995 and very low values of systemic performance indexes for all the models. The good results indicated the RBF network could be trained to model these highly nonlinear relationships. Thus, the RBF neural network technology combined with multivariate statistical methods is a promising tool for air pollutant emissions modeling.
The genetic basis for cognitive ability, memory, and depression symptomatology in middle-aged and elderly Chinese twins.

PubMed

Xu, Chunsheng; Sun, Jianping; Ji, Fuling; Tian, Xiaocao; Duan, Haiping; Zhai, Yaoming; Wang, Shaojie; Pang, Zengchang; Zhang, Dongfeng; Zhao, Zhongtang; Li, Shuxia; Hjelmborg, Jacob V B; Christensen, Kaare; Tan, Qihua

2015-02-01

The genetic influences on aging-related phenotypes, including cognition and depression, have been well confirmed in Western populations. We performed the first twin-based analysis on cognitive performance, memory and depression status in middle-aged and elderly Chinese twins, representing the world's largest and most rapidly aging population. The sample consisted of 384 twin pairs with a median age of 50 years. Cognitive function was measured using the Montreal Cognitive Assessment (MoCA) scale; memory was assessed using the revised Wechsler Adult Intelligence scale; depression symptomatology was evaluated by the self-reported 30-item Geriatric Depression (GDS-30) scale. Both univariate and multivariate twin models were fitted to the three phenotypes with full and nested models and compared to select the best fitting models. Univariate analysis showed moderate-to-high genetic influences with heritability 0.44 for cognition and 0.56 for memory. Multivariate analysis by the reduced Cholesky model estimated significant genetic (rG = 0.69) and unique environmental (rE = 0.25) correlation between cognitive ability and memory. The model also estimated weak but significant inverse genetic correlation for depression with cognition (-0.31) and memory (-0.28). No significant unique environmental correlation was found for depression with the other two phenotypes. In conclusion, there can be a common genetic architecture for cognitive ability and memory that weakly correlates with depression symptomatology, but in the opposite direction.
Spatial land-use inventory, modeling, and projection/Denver metropolitan area, with inputs from existing maps, airphotos, and LANDSAT imagery

NASA Technical Reports Server (NTRS)

Tom, C.; Miller, L. D.; Christenson, J. W.

1978-01-01

A landscape model was constructed with 34 land-use, physiographic, socioeconomic, and transportation maps. A simple Markov land-use trend model was constructed from observed rates of change and nonchange from photointerpreted 1963 and 1970 airphotos. Seven multivariate land-use projection models predicting 1970 spatial land-use changes achieved accuracies from 42 to 57 percent. A final modeling strategy was designed, which combines both Markov trend and multivariate spatial projection processes. Landsat-1 image preprocessing included geometric rectification/resampling, spectral-band, and band/insolation ratioing operations. A new, systematic grid-sampled point training-set approach proved to be useful when tested on the four original MSS bands, ten image bands and ratios, and all 48 image and map variables (less land use). Ten-variable accuracy was raised by over 15 percentage points, from 38.4 to 53.9 percent, with the use of the 31 ancillary variables. A land-use classification map was produced with an optimal ten-channel subset of four image bands and six ancillary map variables. Point-by-point verification of 331,776 points against a 1972/1973 U.S. Geological Survey (USGS) land-use map prepared with airphotos and the same classification scheme showed average first-, second-, and third-order accuracies of 76.3, 58.4, and 33.0 percent, respectively.
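A Markov land-use trend model of the kind mentioned in this abstract can be sketched as follows. This is an assumption-laden illustration, not the report's procedure: the two tiny "maps", the class names, and the functions `estimate_transitions` and `project` are hypothetical. Transition probabilities are estimated from the observed change and nonchange between two dates, then applied once to project class totals one interval ahead.

```python
from collections import Counter


def estimate_transitions(before, after, classes):
    """Row-normalised transition probabilities between two co-registered maps."""
    counts = Counter(zip(before, after))
    matrix = {}
    for src in classes:
        row_total = sum(counts[(src, dst)] for dst in classes)
        if row_total == 0:
            # Class absent at the first date: assume it would stay unchanged.
            matrix[src] = {dst: 1.0 if dst == src else 0.0 for dst in classes}
        else:
            matrix[src] = {dst: counts[(src, dst)] / row_total for dst in classes}
    return matrix


def project(cell_counts, matrix, classes):
    """Apply one Markov step to the per-class cell counts."""
    return {
        dst: sum(cell_counts[src] * matrix[src][dst] for src in classes)
        for dst in classes
    }


# Hypothetical six-cell maps photointerpreted at two dates.
classes = ["agriculture", "urban"]
map_1963 = ["agriculture"] * 4 + ["urban"] * 2
map_1970 = ["agriculture"] * 3 + ["urban"] * 3

P = estimate_transitions(map_1963, map_1970, classes)
counts_1970 = {c: map_1970.count(c) for c in classes}
projected = project(counts_1970, P, classes)  # expected counts one interval later
```

A trend model like this gives only class totals; it says nothing about where change occurs, which is presumably why the abstract pairs it with spatial multivariate projection.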
A Regularized Linear Dynamical System Framework for Multivariate Time Series Analysis.

PubMed

Liu, Zitao; Hauskrecht, Milos

2015-01-01

Linear Dynamical System (LDS) is an elegant mathematical framework for modeling and learning Multivariate Time Series (MTS). However, in general, it is difficult to set the dimension of an LDS's hidden state space. A small number of hidden states may not be able to model the complexities of a MTS, while a large number of hidden states can lead to overfitting. In this paper, we study learning methods that impose various regularization penalties on the transition matrix of the LDS model and propose a regularized LDS learning framework (rLDS) which aims to (1) automatically shut down LDSs' spurious and unnecessary dimensions, and consequently, address the problem of choosing the optimal number of hidden states; (2) prevent the overfitting problem given a small amount of MTS data; and (3) support accurate MTS forecasting. To learn the regularized LDS from data we incorporate a second order cone program and a generalized gradient descent method into the Maximum a Posteriori framework and use Expectation Maximization to obtain a low-rank transition matrix of the LDS model. We propose two priors for modeling the matrix which lead to two instances of our rLDS. We show that our rLDS is able to recover well the intrinsic dimensionality of the time series dynamics and it improves the predictive performance when compared to baselines on both synthetic and real-world MTS datasets.
The effect of physician and health plan market concentration on prices in commercial health insurance markets.

PubMed

Schneider, John E; Li, Pengxiang; Klepser, Donald G; Peterson, N Andrew; Brown, Timothy T; Scheffler, Richard M

2008-03-01

The objective of this paper is to describe the market structure of health plans (HPs) and physician organizations (POs) in California, a state with high levels of managed care penetration and selective contracting. First we calculate Herfindahl-Hirschman (HHI) concentration indices for HPs and POs in 42 California counties. We then estimate a multivariable regression model to examine the relationship between concentration measures and the prices paid by HPs to POs. Price data are from Medstat MarketScan databases. The findings show that many California counties exhibit what the Department of Justice would consider high HHI concentration measures, in excess of 1,800. More than three quarters of California counties exhibit HP concentration indices over 1,800, and 83% of counties have PO concentration levels in excess of 1,800. Half of the study counties exhibited PO concentration levels in excess of 3,600, compared to only 24% for plans. Multivariate price models suggest that PO concentration is associated with higher physician prices (p ≤ 0.05), whereas HP concentration does not appear to be significantly associated with higher outpatient commercial payer prices.
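For readers unfamiliar with the index, the Herfindahl-Hirschman calculation referred to above is the sum of squared market shares expressed in percent, so it ranges from near 0 to 10,000. The sketch below is illustrative only: the function names and the market shares are hypothetical, and the 1,800 cutoff is simply the threshold quoted in the abstract.

```python
def hhi(market_shares_percent):
    """Herfindahl-Hirschman index: sum of squared market shares (in percent)."""
    return sum(share ** 2 for share in market_shares_percent)


def is_highly_concentrated(index, threshold=1800):
    """Compare an HHI value with the threshold quoted in the abstract."""
    return index > threshold


# Hypothetical county with four physician organizations.
po_shares = [40, 30, 20, 10]
county_hhi = hhi(po_shares)  # 1600 + 900 + 400 + 100
concentrated = is_highly_concentrated(county_hhi)
```

Ten equally sized organizations would give an index of 1,000, below the cutoff, while a single monopolist gives 10,000.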
Alignment-Independent Comparisons of Human Gastrointestinal Tract Microbial Communities in a Multidimensional 16S rRNA Gene Evolutionary Space

PubMed Central

Rudi, Knut; Zimonja, Monika; Kvenshagen, Bente; Rugtveit, Jarle; Midtvedt, Tore; Eggesbø, Merete

2007-01-01

We present a novel approach for comparing 16S rRNA gene clone libraries that is independent of both DNA sequence alignment and definition of bacterial phylogroups. These steps are the major bottlenecks in current microbial comparative analyses. We used direct comparisons of taxon density distributions in an absolute evolutionary coordinate space. The coordinate space was generated by using alignment-independent bilinear multivariate modeling. Statistical analyses for clone library comparisons were based on multivariate analysis of variance, partial least-squares regression, and permutations. Clone libraries from both adult and infant gastrointestinal tract microbial communities were used as biological models. We reanalyzed a library consisting of 11,831 clones covering complete colons from three healthy adults in addition to a smaller 390-clone library from infant feces. We show that it is possible to extract detailed information about microbial community structures using our alignment-independent method. Our density distribution analysis is also very efficient with respect to computer operation time, meeting the future requirements of large-scale screenings to understand the diversity and dynamics of microbial communities. PMID:17337554
Effect of abdominopelvic abscess drain size on drainage time and probability of occlusion.

PubMed

Rotman, Jessica A; Getrajdman, George I; Maybody, Majid; Erinjeri, Joseph P; Yarmohammadi, Hooman; Sofocleous, Constantinos T; Solomon, Stephen B; Boas, F Edward

2017-04-01

The purpose of this study is to determine whether larger abdominopelvic abscess drains reduce the time required for abscess resolution or the probability of tube occlusion. 144 consecutive patients who underwent abscess drainage at a single institution were reviewed retrospectively. Larger initial drain size did not reduce drainage time, drain occlusion, or drain exchanges (P > .05). Subgroup analysis did not find any type of collection that benefitted from larger drains. A multivariate model predicting drainage time showed that large collections (>200 mL) required 16 days longer drainage time than small collections (<50 mL). Collections with a fistula to bowel required 17 days longer drainage time than collections without a fistula. Initial drain size and the viscosity of the fluid in the collection had no significant effect on drainage time in the multivariate model. 8 F drains are adequate for initial drainage of most serous and serosanguineous collections. 10 F drains are adequate for initial drainage of most purulent or bloody collections. Copyright © 2016 Elsevier Inc. All rights reserved.
An Investigation of Multivariate Adaptive Regression Splines for Modeling and Analysis of Univariate and Semi-Multivariate Time Series Systems

DTIC Science & Technology

1991-09-01

However, there is no guarantee that this would work; for instance if the data were generated by an ARCH model (Tong, 1990 pp. 116-117) then a simple...Hill, R., Griffiths, W., Lutkepohl, H., and Lee, T., Introduction to the Theory and Practice of Econometrics, 2nd ed., Wiley, 1985. Kendall, M., Stuart
Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes.

PubMed

Yates, Katherine L; Mellin, Camille; Caley, M Julian; Radford, Ben T; Meeuwig, Jessica J

2016-01-01

Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability.
Models of Marine Fish Biodiversity: Assessing Predictors from Three Habitat Classification Schemes

PubMed Central

Yates, Katherine L.; Mellin, Camille; Caley, M. Julian; Radford, Ben T.; Meeuwig, Jessica J.

2016-01-01

Prioritising biodiversity conservation requires knowledge of where biodiversity occurs. Such knowledge, however, is often lacking. New technologies for collecting biological and physical data coupled with advances in modelling techniques could help address these gaps and facilitate improved management outcomes. Here we examined the utility of environmental data, obtained using different methods, for developing models of both uni- and multivariate biodiversity metrics. We tested which biodiversity metrics could be predicted best and evaluated the performance of predictor variables generated from three types of habitat data: acoustic multibeam sonar imagery, predicted habitat classification, and direct observer habitat classification. We used boosted regression trees (BRT) to model metrics of fish species richness, abundance and biomass, and multivariate regression trees (MRT) to model biomass and abundance of fish functional groups. We compared model performance using different sets of predictors and estimated the relative influence of individual predictors. Models of total species richness and total abundance performed best; those developed for endemic species performed worst. Abundance models performed substantially better than corresponding biomass models. In general, BRT and MRTs developed using predicted habitat classifications performed less well than those using multibeam data. The most influential individual predictor was the abiotic categorical variable from direct observer habitat classification and models that incorporated predictors from direct observer habitat classification consistently outperformed those that did not. Our results show that while remotely sensed data can offer considerable utility for predictive modelling, the addition of direct observer habitat classification data can substantially improve model performance. 
Thus it appears that there are aspects of marine habitats that are important for modelling metrics of fish biodiversity that are not fully captured by remotely sensed data. As such, the use of remotely sensed data to model biodiversity represents a compromise between model performance and data availability. PMID:27333202
Establishment of a mathematic model for predicting malignancy in solitary pulmonary nodules.

PubMed

Zhang, Man; Zhuo, Na; Guo, Zhanlin; Zhang, Xingguang; Liang, Wenhua; Zhao, Sheng; He, Jianxing

2015-10-01

The aim of this study was to establish a model for predicting the probability of malignancy in solitary pulmonary nodules (SPNs) and provide guidance for the diagnosis and follow-up intervention of SPNs. We retrospectively analyzed the clinical data and computed tomography (CT) images of 294 patients with a clear pathological diagnosis of SPN. Multivariate logistic regression analysis was used to screen independent predictors of the probability of malignancy in the SPN and to establish a model for predicting malignancy in SPNs. Then, another 120 SPN patients who did not participate in the model establishment were chosen as group B and used to verify the accuracy of the prediction model. Multivariate logistic regression analysis showed that there were significant differences in age, smoking history, maximum diameter of nodules, spiculation, clear borders, and Cyfra21-1 levels between subgroups with benign and malignant SPNs (P<0.05). These factors were identified as independent predictors of malignancy in SPNs. The area under the curve (AUC) was 0.910 [95% confidence interval (CI), 0.857-0.963] for the model with Cyfra21-1, significantly better than 0.812 (95% CI, 0.763-0.861) for the model without Cyfra21-1 (P=0.008). The area under the receiver operating characteristic (ROC) curve of our model was significantly higher than those of the Mayo model, the VA model and the Peking University People's Hospital (PKUPH) model. The difference between our model (AUC = 0.910) and the Brock model (AUC = 0.878) was not statistically significant (P=0.350). Adding Cyfra21-1 to the model could improve prediction. The prediction model established in this study can be used to assess the probability of malignancy in SPNs, thereby providing help for the diagnosis of SPNs and the selection of follow-up interventions.
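The AUC values compared in this abstract have a simple interpretation: the probability that a randomly chosen malignant case receives a higher predicted probability than a randomly chosen benign case, with ties counted as one half. The sketch below computes that quantity directly from pairwise comparisons; it is not the study's code, and the function name `auc` and the predicted probabilities are hypothetical.

```python
def auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical predicted malignancy probabilities from some fitted model.
malignant = [0.9, 0.8, 0.4]
benign = [0.5, 0.3, 0.2]
model_auc = auc(malignant, benign)  # 8 of the 9 pairs are ordered correctly
```

The pairwise form is quadratic in sample size, which is fine for cohorts of a few hundred patients; rank-based formulas give the same value faster on large data sets.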
MEMD-enhanced multivariate fuzzy entropy for the evaluation of complexity in biomedical signals.

PubMed

Azami, Hamed; Smith, Keith; Escudero, Javier

2016-08-01

Multivariate multiscale entropy (mvMSE) has been proposed as a combination of the coarse-graining process and multivariate sample entropy (mvSE) to quantify the irregularity of multivariate signals. However, both the coarse-graining process and mvSE may not be reliable for short signals. Although the coarse-graining process can be replaced with multivariate empirical mode decomposition (MEMD), the relative instability of mvSE for short signals remains a problem. Here, we address this issue by proposing the multivariate fuzzy entropy (mvFE) with a new fuzzy membership function. The results using white Gaussian noise show that the mvFE leads to more reliable and stable results, especially for short signals, in comparison with mvSE. Accordingly, we propose MEMD-enhanced mvFE to quantify the complexity of signals. The characteristics of brain regions influenced by partial epilepsy are investigated by focal and non-focal electroencephalogram (EEG) time series. In this sense, the proposed MEMD-enhanced mvFE and mvSE are employed to discriminate focal EEG signals from non-focal ones. The results demonstrate the MEMD-enhanced mvFE values have a smaller coefficient of variation in comparison with those obtained by the MEMD-enhanced mvSE, even for long signals. The results also show that the MEMD-enhanced mvFE has better performance to quantify focal and non-focal signals compared with multivariate multiscale permutation entropy.
The association between social marginalisation and the injecting of alcohol amongst IDUs in Budapest, Hungary.

PubMed

Gyarmathy, V Anna; Neaigus, Alan

2011-09-01

Alcohol injecting may cause intense irritation, serious vein damage, and additional risks. What little is known about alcohol injecting points to the potential role of social marginalisation. Injecting drug users (N=215) were recruited between October 2005 and December 2006 in Budapest, Hungary from non-treatment settings. Multivariate logistic regression models identified correlates of lifetime alcohol injecting. About a quarter (23%) reported ever injecting alcohol; only 3% reported injecting alcohol in the past 30 days. In multivariate analysis, six variables were statistically significantly associated with ever injecting alcohol: male gender, being homeless, ever sharing cookers or filters and injecting mostly in public places showed a positive association, whilst Roma ethnicity and working at least part time showed a negative association. Our study suggests that alcohol injecting is more of a rare event than a so far undiscovered research and prevention priority. Still, providers of harm reduction services should be aware that alcohol injecting happens, albeit rarely, especially amongst socially marginalised IDUs, who should be counselled about the risks of and discouraged from alcohol injecting. Copyright © 2011 Elsevier B.V. All rights reserved.
Higher clinical performance during a surgical clerkship is independently associated with matriculation of medical students into general surgery.

PubMed

Daly, Shaun C; Deal, Rebecca A; Rinewalt, Daniel E; Francescatti, Amanda B; Luu, Minh B; Millikan, Keith W; Anderson, Mary C; Myers, Jonathan A

2014-04-01

The purpose of our study was to determine the predictive impact of individual academic measures for the matriculation of senior medical students into a general surgery residency. Academic records were evaluated for third-year medical students (n = 781) at a single institution between 2004 and 2011. Cohorts were defined by student matriculation into either a general surgery residency program (n = 58) or a non-general surgery residency program (n = 723). Multivariate logistic regression was performed to evaluate independently significant academic measures. Clinical evaluation raw scores were predictive of general surgery matriculation (P = .014). In addition, multivariate modeling showed lower United States Medical Licensing Examination Step 1 scores to be independently associated with matriculation into general surgery (P = .007). Superior clinical aptitude is independently associated with general surgical matriculation. This is in contrast to the negative correlation United States Medical Licensing Examination Step 1 scores have on general surgery matriculation. Recognizing this, surgical clerkship directors can offer opportunities for continued surgical education to students showing high clinical aptitude, increasing their likelihood of surgical matriculation. Copyright © 2014 Elsevier Inc. All rights reserved.
Locating the Seventh Cervical Spinous Process: Development and Validation of a Multivariate Model Using Palpation and Personal Information.

PubMed

Ferreira, Ana Paula A; Póvoa, Luciana C; Zanier, José F C; Ferreira, Arthur S

2017-02-01

The aim of this study was to develop and validate a multivariate prediction model, guided by palpation and personal information, for locating the seventh cervical spinous process (C7SP). A single-blinded, cross-sectional study at a primary to tertiary health care center was conducted for model development and temporal validation. One-hundred sixty participants were prospectively included for model development (n = 80) and time-split validation stages (n = 80). The C7SP was located using the thorax-rib static method (TRSM). Participants underwent chest radiography for assessment of the inner body structure located with TRSM and using radio-opaque markers placed over the skin. Age, sex, height, body mass, body mass index, and vertex-marker distance ($D_{V-M}$) were used to predict the distance from the C7SP to the vertex ($D_{V-C7}$). Multivariate linear regression modeling, limits of agreement plot, histogram of residues, receiver operating characteristic curves, and confusion tables were analyzed. The multivariate linear prediction model for $D_{V-C7}$ (in centimeters) was $D_{V-C7} = 0.986\,D_{V-M} + 0.018\,(\text{mass}) + 0.014\,(\text{age}) - 1.008$. Receiver operating characteristic curves had better discrimination of $D_{V-C7}$ (area under the curve = 0.661; 95% confidence interval = 0.541-0.782; P = .015) than $D_{V-M}$ (area under the curve = 0.480; 95% confidence interval = 0.345-0.614; P = .761), with respective cutoff points at 23.40 cm (sensitivity = 41%, specificity = 63%) and 24.75 cm (sensitivity = 69%, specificity = 52%). The C7SP was correctly located more often when using predicted $D_{V-C7}$ in the validation sample than when using the TRSM in the development sample: n = 53 (66%) vs n = 32 (40%), P < .001. Better accuracy was obtained when locating the C7SP by use of a multivariate model that incorporates palpation and personal information. Copyright © 2016. Published by Elsevier Inc.
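As a worked illustration of the regression equation reported in this abstract, the following minimal Python sketch evaluates the prediction for one participant. The function name and the example inputs are hypothetical, and the abstract does not state units for mass and age, so kilograms and years are assumed here.

```python
def predict_d_v_c7(d_v_m_cm, mass, age):
    """Predicted vertex-to-C7SP distance (cm) using the coefficients quoted in the abstract.

    d_v_m_cm: vertex-marker distance in cm.
    mass, age: assumed to be in kg and years (units are not given in the abstract).
    """
    return 0.986 * d_v_m_cm + 0.018 * mass + 0.014 * age - 1.008


# Hypothetical participant: marker 24.0 cm from the vertex, 70 kg, 30 years old.
example = predict_d_v_c7(24.0, 70.0, 30.0)
print(round(example, 3))
```

With these made-up inputs the predicted distance is slightly larger than the palpated marker distance, because the mass and age terms together outweigh the negative intercept.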

DasPy – Open Source Multivariate Land Data Assimilation Framework with High Performance Computing

NASA Astrophysics Data System (ADS)

Han, Xujun; Li, Xin; Montzka, Carsten; Kollet, Stefan; Vereecken, Harry; Hendricks Franssen, Harrie-Jan

2015-04-01

Data assimilation has become a popular method to integrate observations from multiple sources with land surface models to improve predictions of the water and energy cycles of the soil-vegetation-atmosphere continuum. In recent years, several land data assimilation systems have been developed in different research agencies. Because of limited software availability or adaptability, these systems are not easy to apply for the purpose of multivariate land data assimilation research. Multivariate data assimilation refers to the simultaneous assimilation of observation data for multiple model state variables into a simulation model. Our main motivation was to develop an open source multivariate land data assimilation framework (DasPy), which is implemented in Python mixed with C++ and Fortran. This system has been evaluated in several soil moisture, L-band brightness temperature and land surface temperature assimilation studies. The implementation also allows parameter estimation (soil properties and/or leaf area index) on the basis of the joint state and parameter estimation approach. LETKF (Local Ensemble Transform Kalman Filter) is implemented as the main data assimilation algorithm, and uncertainties in the data assimilation can be represented by perturbed atmospheric forcings, perturbed soil and vegetation properties and model initial conditions. The CLM4.5 (Community Land Model) was integrated as the model operator. The CMEM (Community Microwave Emission Modelling Platform), COSMIC (COsmic-ray Soil Moisture Interaction Code) and the two-source formulation were integrated as observation operators for assimilation of L-band passive microwave, cosmic-ray soil moisture probe and land surface temperature measurements, respectively. DasPy is parallelized using the hybrid MPI (Message Passing Interface) and OpenMP (Open Multi-Processing) techniques. All the input and output data flow is organized efficiently using the commonly used NetCDF file format. Online 1D and 2D visualization of data assimilation results is also implemented to facilitate the post-simulation analysis. In summary, DasPy is a ready-to-use open source parallel multivariate land data assimilation framework.
Seizure-Onset Mapping Based on Time-Variant Multivariate Functional Connectivity Analysis of High-Dimensional Intracranial EEG: A Kalman Filter Approach.

PubMed

Lie, Octavian V; van Mierlo, Pieter

2017-01-01

The visual interpretation of intracranial EEG (iEEG) is the standard method used in complex epilepsy surgery cases to map the regions of seizure onset targeted for resection. Still, visual iEEG analysis is labor-intensive and biased due to interpreter dependency. Multivariate parametric functional connectivity measures using adaptive autoregressive (AR) modeling of the iEEG signals based on the Kalman filter algorithm have been used successfully to localize the electrographic seizure onsets. Due to their high computational cost, these methods have been applied to a limited number of iEEG time-series (<60). The aim of this study was to test two Kalman filter implementations, a well-known multivariate adaptive AR model (Arnold et al. 1998) and a simplified, computationally efficient derivation of it, for their potential application to connectivity analysis of high-dimensional (up to 192 channels) iEEG data. When used on simulated seizures together with a multivariate connectivity estimator, the partial directed coherence, the two AR models were compared for their ability to reconstitute the designed seizure signal connections from noisy data. Next, focal seizures from iEEG recordings (73-113 channels) in three patients rendered seizure-free after surgery were mapped with the outdegree, a graph-theory index of outward directed connectivity. Simulation results indicated high levels of mapping accuracy for the two models in the presence of low-to-moderate noise cross-correlation. Accordingly, both AR models correctly mapped the real seizure onset to the resection volume. This study supports the possibility of conducting fully data-driven multivariate connectivity estimations on high-dimensional iEEG datasets using the Kalman filter approach.
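The outdegree mentioned in this abstract can be illustrated with a small Python sketch. It assumes a directed connectivity matrix in which entry `conn[i][j]` is the strength of the connection from channel i to channel j, and it counts connections above a threshold; the matrix values, the threshold, and the thresholding step itself are illustrative assumptions, not details taken from the study.

```python
def outdegree(conn, threshold):
    """Count outgoing connections per channel.

    conn[i][j] is the assumed strength of the directed connection i -> j.
    Self-connections on the diagonal are ignored.
    """
    n = len(conn)
    return [
        sum(1 for j in range(n) if j != i and conn[i][j] > threshold)
        for i in range(n)
    ]


# Toy 3-channel connectivity matrix (made-up values).
conn = [
    [0.00, 0.80, 0.60],
    [0.10, 0.00, 0.70],
    [0.05, 0.20, 0.00],
]
degrees = outdegree(conn, threshold=0.5)
# The channel with the largest outdegree is the candidate onset channel in this toy example.
candidate = max(range(len(degrees)), key=degrees.__getitem__)
print(degrees, candidate)
```

A weighted variant would sum the connection strengths in each row instead of counting them; which form the authors used is not stated in the abstract.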
Quantitative monitoring of sucrose, reducing sugar and total sugar dynamics for phenotyping of water-deficit stress tolerance in rice through spectroscopy and chemometrics

NASA Astrophysics Data System (ADS)

Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini

2018-03-01

In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and a suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in the near infrared (NIR) range, viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI), sensitive to sucrose, reducing sugar and total sugar content were identified and subsequently calibrated and validated. The RSI and NDSI models had R² values of 0.65, 0.71 and 0.67 and RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively, for the validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be MARS, ANN and MARS, respectively, with RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress, as this technique is fast, economic, and noninvasive.
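The abstract names the RSI, NDSI and RPD without defining them. The sketch below uses the forms commonly found in the spectroscopy literature, which is an assumption here: RSI as a ratio of reflectances at two bands, NDSI as their normalised difference, and RPD as the standard deviation of the reference values divided by the root mean square error of prediction. The reflectance and sugar values are made up.

```python
import math


def rsi(r_i, r_j):
    """Ratio spectral index of reflectances at two bands (assumed definition)."""
    return r_i / r_j


def ndsi(r_i, r_j):
    """Normalised difference spectral index of two bands (assumed definition)."""
    return (r_i - r_j) / (r_i + r_j)


def rpd(measured, predicted):
    """Sample SD of the measured values divided by the RMSE of prediction (assumed definition)."""
    n = len(measured)
    mean = sum(measured) / n
    sd = math.sqrt(sum((m - mean) ** 2 for m in measured) / (n - 1))
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)
    return sd / rmse


# Made-up reflectances at two NIR bands and made-up validation data.
print(rsi(0.4, 0.2), ndsi(0.4, 0.2))
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]
print(round(rpd(measured, predicted), 3))
```

Under this definition a larger RPD means the prediction error is small relative to the natural spread of the reference data, which is why the abstract ranks models by RPD.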
Influence analysis for high-dimensional time series with an application to epileptic seizure onset zone detection

PubMed Central

Flamm, Christoph; Graef, Andreas; Pirker, Susanne; Baumgartner, Christoph; Deistler, Manfred

2013-01-01

Granger causality is a useful concept for studying causal relations in networks. However, numerical problems occur when applying the corresponding methodology to high-dimensional time series showing co-movement, e.g. EEG recordings or economic data. In order to deal with these shortcomings, we propose a novel method for the causal analysis of such multivariate time series based on Granger causality and factor models. We present the theoretical background, successfully assess our methodology with the help of simulated data and show a potential application in EEG analysis of epileptic seizures. PMID:23354014
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.

PubMed

Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P

2015-11-01

To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
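The "3.7 times" figure in this abstract is consistent with a ratio of error rates computed from the two reported accuracies. The short sketch below reproduces that arithmetic, assuming the misclassification rate is simply 100% minus accuracy; the function name is hypothetical.

```python
def misclassification_ratio(acc_reference_pct, acc_best_pct):
    """Ratio of error rates, where error rate = 100 - accuracy (in percent)."""
    return (100 - acc_reference_pct) / (100 - acc_best_pct)


# Reference model: 78% accuracy; best model: 94% accuracy (values from the abstract).
ratio = misclassification_ratio(78, 94)
print(round(ratio, 1))  # 22 / 6, which rounds to 3.7
```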
Chemiluminescence-based multivariate sensing of local equivalence ratios in premixed atmospheric methane-air flames

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.

Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%; whereas, the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions (equivalence ratios > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

PubMed

Laurens, L M L; Wolfrum, E J

2013-12-18

One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
Order Selection for General Expression of Nonlinear Autoregressive Model Based on Multivariate Stepwise Regression

NASA Astrophysics Data System (ADS)

Shi, Jinfei; Zhu, Songqing; Chen, Ruwen

2017-12-01

An order selection method based on multiple stepwise regression is proposed for the General Expression of Nonlinear Autoregressive (GNAR) model; it converts the model-order problem into variable selection for a multiple linear regression equation. The partial autocorrelation function is used to determine the linear terms of the GNAR model. The result is taken as the initial model, and the nonlinear terms are then introduced gradually. Test statistics are used to assess how much both the newly introduced variables and the existing variables improve the model, and these determine which variables to retain or eliminate. The optimal model is then obtained through a goodness-of-fit measure or a significance test. Results from simulations and from experiments on classic time-series data show that the proposed method is simple and reliable and can be applied in practical engineering.
Risk factor assessment to anticipate performance in the National Developmental Screening Test in children from a disadvantaged area.

PubMed

Montes, Alejandro; Pazos, Gustavo

2016-02-01

Identifying children at risk of failing the National Developmental Screening Test by combining prevalences of children suspected of having inapparent developmental disorders (IDDs) and associated risk factors (RFs) would save resources. 1. To estimate the prevalence of children suspected of having IDDs. 2. To identify associated RFs. 3. To assess three methods developed based on observed RFs and propose a pre-screening procedure. The National Developmental Screening Test was administered to 60 randomly selected children aged between 2 and 4 years old from a socioeconomically disadvantaged area of Puerto Madryn. Twenty-four biological and socioenvironmental outcome measures were assessed in order to identify potential RFs using bivariate and multivariate analyses. The likelihood of failing the screening test was estimated as follows: 1. a multivariate logistic regression model was developed; 2. a relationship was established between the number of RFs present in each child and the percentage of children who failed the test; 3. these two methods were combined. The prevalence of children suspected of having IDDs was 55.0% (95% confidence interval: 42.4%-67.6%). Six RFs were initially identified using the bivariate approach. Three of them (maternal education, number of health checkups and Z scores for height-for-age, and maternal age) were included in the logistic regression model, which has a greater explanatory power. The third method included in the assessment showed greater sensitivity and specificity (85% and 79%, respectively). The estimated prevalence of children suspected of having IDDs was four times higher than the national standards. Seven RFs were identified. Combining the analysis of risk factor accumulation and a multivariate model provides a firm basis for developing a sensitive, specific and practical pre-screening procedure for socioeconomically disadvantaged areas. Sociedad Argentina de Pediatría.
Preventing Unintended Pregnancy Among Young Sexually Active Women: Recognizing the Role of Violence, Self-Esteem, and Depressive Symptoms on Use of Contraception.

PubMed

Nelson, Deborah B; Zhao, Huaqing; Corrado, Rachel; Mastrogiannnis, Dimitrios M; Lepore, Stephen J

2017-04-01

Ineffective contraceptive use among young sexually active women is extremely prevalent and poses a significant risk for unintended pregnancy (UP). Ineffective contraception involves the use of the withdrawal method or the inconsistent use of other types of contraception (i.e., condoms and birth control pills). This investigation examined violence exposure and psychological factors related to ineffective contraceptive use among young sexually active women. Young, nonpregnant sexually active women (n = 315) were recruited from an urban family planning clinic in 2013 to participate in a longitudinal study. Tablet-based surveys measured childhood violence, community-level violence, intimate partner violence, depressive symptoms, and self-esteem. Follow-up surveys measured type and consistency of contraception used 9 months later. Multivariate logistic regression models assessed violence and psychological risk factors as main effects and moderators related to ineffective compared with effective use of contraception. The multivariate logistic regression model showed that childhood sexual violence and low self-esteem were significantly related to ineffective use of contraception (adjusted odds ratio [aOR] = 2.69, confidence interval [95% CI]: 1.18-6.17, and aOR = 0.51, 95% CI: 0.28-0.93; respectively), although self-esteem did not moderate the relationship between childhood sexual violence and ineffective use of contraception (aOR = 0.38, 95% CI: 0.08-1.84). Depressive symptoms were not related to ineffective use of contraception in the multivariate model. Interventions to reduce UP should recognize the long-term effects of childhood sexual violence and address the role of low self-esteem on the ability of young sexually active women to effectively and consistently use contraception to prevent UP.
Preventing Unintended Pregnancy Among Young Sexually Active Women: Recognizing the Role of Violence, Self-Esteem, and Depressive Symptoms on Use of Contraception

PubMed Central

Zhao, Huaqing; Corrado, Rachel; Mastrogiannnis, Dimitrios M.; Lepore, Stephen J.

2017-01-01

Abstract Objectives: Ineffective contraceptive use among young sexually active women is extremely prevalent and poses a significant risk for unintended pregnancy (UP). Ineffective contraception involves the use of the withdrawal method or the inconsistent use of other types of contraception (i.e., condoms and birth control pills). This investigation examined violence exposure and psychological factors related to ineffective contraceptive use among young sexually active women. Materials and Methods: Young, nonpregnant sexually active women (n = 315) were recruited from an urban family planning clinic in 2013 to participate in a longitudinal study. Tablet-based surveys measured childhood violence, community-level violence, intimate partner violence, depressive symptoms, and self-esteem. Follow-up surveys measured type and consistency of contraception used 9 months later. Multivariate logistic regression models assessed violence and psychological risk factors as main effects and moderators related to ineffective compared with effective use of contraception. Results: The multivariate logistic regression model showed that childhood sexual violence and low self-esteem were significantly related to ineffective use of contraception (adjusted odds ratio [aOR] = 2.69, confidence interval [95% CI]: 1.18–6.17, and aOR = 0.51, 95% CI: 0.28–0.93; respectively), although self-esteem did not moderate the relationship between childhood sexual violence and ineffective use of contraception (aOR = 0.38, 95% CI: 0.08–1.84). Depressive symptoms were not related to ineffective use of contraception in the multivariate model. Conclusions: Interventions to reduce UP should recognize the long-term effects of childhood sexual violence and address the role of low self-esteem on the ability of young sexually active women to effectively and consistently use contraception to prevent UP. PMID:28045570
Local polynomial estimation of heteroscedasticity in a multivariate linear regression model and its applications in economics.

PubMed

Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan

2012-01-01

Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained using the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
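The two-stage idea described in this abstract can be sketched in a few lines of Python. This is a simplified, single-predictor illustration and not the authors' estimator: it fits ordinary least squares, smooths the squared residuals with a local linear (degree-1 local polynomial) fit using a Gaussian kernel to estimate the variance function, and then refits with weights equal to the inverse estimated variance, which is generalized least squares with a diagonal covariance. The bandwidth, the variance floor and the toy data are all assumptions.

```python
import math


def wls_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def local_linear(x0, x, z, h):
    """Local linear estimate of E[z | x = x0] with a Gaussian kernel of bandwidth h."""
    k = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
    a, _ = wls_line([xi - x0 for xi in x], z, k)
    return a  # the intercept of the local fit is the estimate at x0


def two_stage_fit(x, y, h=0.5, floor=1e-8):
    # Stage 1: ordinary least squares and squared residuals.
    a0, b0 = wls_line(x, y, [1.0] * len(x))
    r2 = [(yi - a0 - b0 * xi) ** 2 for xi, yi in zip(x, y)]
    # Non-parametric variance function: smooth the squared residuals.
    var = [max(local_linear(xi, x, r2, h), floor) for xi in x]
    # Stage 2: weighted least squares with weights 1 / estimated variance.
    return wls_line(x, y, [1.0 / v for v in var])


# Toy data: true line y = 2 + 3x with a deterministic alternating disturbance
# whose magnitude grows with x (a stand-in for heteroscedastic noise).
x = [0.1 * i for i in range(1, 41)]
y = [2.0 + 3.0 * xi + ((-1) ** i) * 0.1 * (1.0 + xi) for i, xi in enumerate(x, start=1)]
a_hat, b_hat = two_stage_fit(x, y)
print(round(a_hat, 2), round(b_hat, 2))
```

No functional form is imposed on the variance in this sketch, which mirrors the point made in the abstract that the form of the heteroscedastic function need not be known.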
Multivariate calibration on NIR data: development of a model for the rapid evaluation of ethanol content in bakery products.

PubMed

Bello, Alessandra; Bianchi, Federica; Careri, Maria; Giannetto, Marco; Mori, Giovanni; Musci, Marilena

2007-11-05

A new NIR method based on multivariate calibration for determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as the reference method to determine the actual ethanol concentration of wholemeal bread samples with added ethanol contents ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset in order to reduce the number of original variables by selecting those that were able to discriminate between the samples of different ethanol concentrations. With the variables thus selected, a multivariate calibration model was then obtained by multiple linear regression. The predictive power of the linear model was optimized by a new "leave one out" method, so that the number of original variables was further reduced.
A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities

NASA Astrophysics Data System (ADS)

Vallières, M.; Freeman, C. R.; Skamene, S. R.; El Naqa, I.

2015-07-01

This study aims at developing a joint FDG-PET and MRI texture-based model for the early evaluation of lung metastasis risk in soft-tissue sarcomas (STSs). We investigate if the creation of new composite textures from the combination of FDG-PET and MR imaging information could better identify aggressive tumours. Towards this goal, a cohort of 51 patients with histologically proven STSs of the extremities was retrospectively evaluated. All patients had pre-treatment FDG-PET and MRI scans comprised of T1-weighted and T2-weighted fat-suppression sequences (T2FS). Nine non-texture features (SUV metrics and shape features) and forty-one texture features were extracted from the tumour region of separate (FDG-PET, T1 and T2FS) and fused (FDG-PET/T1 and FDG-PET/T2FS) scans. Volume fusion of the FDG-PET and MRI scans was implemented using the wavelet transform. The influence of six different extraction parameters on the predictive value of textures was investigated. The incorporation of features into multivariable models was performed using logistic regression. The multivariable modeling strategy involved imbalance-adjusted bootstrap resampling in the following four steps leading to final prediction model construction: (1) feature set reduction; (2) feature selection; (3) prediction performance estimation; and (4) computation of model coefficients. Univariate analysis showed that the isotropic voxel size at which texture features were extracted had the most impact on predictive value. In multivariable analysis, texture features extracted from fused scans significantly outperformed those from separate scans in terms of lung metastases prediction estimates. The best performance was obtained using a combination of four texture features extracted from FDG-PET/T1 and FDG-PET/T2FS scans. This model reached an area under the receiver-operating characteristic curve of 0.984 ± 0.002, a sensitivity of 0.955 ± 0.006, and a specificity of 0.926 ± 0.004 in bootstrapping evaluations. Ultimately, lung metastasis risk assessment at diagnosis of STSs could improve patient outcomes by allowing better treatment adaptation.
Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

USGS Publications Warehouse

Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

2003-01-01

Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
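To make the logistic model described in this abstract concrete, the sketch below shows how a fitted model turns basin attributes into a probability between 0 and 1. The intercept, the coefficients, the variable names and the basin values are all hypothetical placeholders; the abstract does not report the fitted coefficients, and only three of the listed predictors are used for brevity.

```python
import math


def debris_flow_probability(intercept, coefs, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_k b_k * x_k)))."""
    z = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, NOT the values from the USGS study.
coefs = {
    "pct_slope_gt_30": 0.03,       # percent of basin with slope > 30 percent
    "pct_burn_med_high": 0.02,     # percent burned at medium and high severity
    "storm_intensity_mm_h": 0.1,   # average storm intensity, mm per hour
}
basin = {"pct_slope_gt_30": 50.0, "pct_burn_med_high": 60.0, "storm_intensity_mm_h": 10.0}

p = debris_flow_probability(-4.0, coefs, basin)
print(round(p, 3))
```

Evaluating such a function for every basin polygon is, in outline, what step (4) of the abstract describes when the model is entered into a GIS to produce a probability map.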
Modeling of turbulent supersonic H2-air combustion with a multivariate beta PDF

NASA Technical Reports Server (NTRS)

Baurle, R. A.; Hassan, H. A.

1993-01-01

Recent calculations of turbulent supersonic reacting shear flows using an assumed multivariate beta PDF (probability density function) resulted in reduced production rates and a delay in the onset of combustion. This result is not consistent with available measurements. The present research explores two possible reasons for this behavior: use of PDF's that do not yield Favre averaged quantities, and the gradient diffusion assumption. A new multivariate beta PDF involving species densities is introduced which makes it possible to compute Favre averaged mass fractions. However, using this PDF did not improve comparisons with experiment. A countergradient diffusion model is then introduced. Preliminary calculations suggest this to be the cause of the discrepancy.
Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies

PubMed Central

Inouye, David I.; Ravikumar, Pradeep; Dhillon, Inderjit S.

2016-01-01

We develop Square Root Graphical Models (SQR), a novel class of parametric graphical models that provides multivariate generalizations of univariate exponential family distributions. Previous multivariate graphical models (Yang et al., 2015) did not allow positive dependencies for the exponential and Poisson generalizations. However, in many real-world datasets, variables clearly have positive dependencies. For example, the airport delay time in New York—modeled as an exponential distribution—is positively related to the delay time in Boston. With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix—a condition akin to the positive definiteness of the Gaussian covariance matrix. Our Poisson generalization allows for both positive and negative dependencies without any constraints on the parameter values. We also develop parameter estimation methods using node-wise regressions with ℓ1 regularization and likelihood approximation methods using sampling. Finally, we demonstrate our exponential generalization on a synthetic dataset and a real-world dataset of airport delay times. PMID:27563373
Integrated GIS and multivariate statistical analysis for regional scale assessment of heavy metal soil contamination: A critical review.

PubMed

Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan

2017-12-01

Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0-0.10 m, or 0-0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized. It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis (PCA) and cluster analysis (CA). Copyright © 2017 Elsevier Ltd. All rights reserved.
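The review names inverse distance weighted (IDW) interpolation as one of the two most widely used spatial interpolators. As a minimal sketch of that technique only, assuming planar coordinates and a power parameter of 2 (the function name `idw_interpolate`, the sample layout, and the numbers are hypothetical and are not taken from the reviewed studies):

```python
import math


def idw_interpolate(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at the point (x, y).

    samples: list of (xi, yi, value) tuples, e.g. a heavy metal
    concentration measured at each sampled location (hypothetical data).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        dist = math.hypot(x - xi, y - yi)
        if dist == 0.0:
            # The query point coincides with a sample: return it directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical samples placed symmetrically around the query point.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_interpolate(samples, 1.0, 0.0)
```

Because both samples are equally distant from the query point, they receive equal weight and the estimate is their plain average. Ordinary kriging, the other interpolator mentioned, additionally requires a fitted variogram and is not sketched here.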
Optimal moment determination in POME-copula based hydrometeorological dependence modelling

NASA Astrophysics Data System (ADS)

Liu, Dengfeng; Wang, Dong; Singh, Vijay P.; Wang, Yuankun; Wu, Jichun; Wang, Lachun; Zou, Xinqing; Chen, Yuanfang; Chen, Xi

2017-07-01

Copula has been commonly applied in multivariate modelling in various fields where marginal distribution inference is a key element. To develop a flexible, unbiased mathematical inference framework in hydrometeorological multivariate applications, the principle of maximum entropy (POME) is being increasingly coupled with copula. However, in previous POME-based studies, determination of optimal moment constraints has generally not been considered. The main contribution of this study is the determination of optimal moments for POME for developing a coupled optimal moment-POME-copula framework to model hydrometeorological multivariate events. In this framework, margins (marginals, or marginal distributions) are derived with the use of POME, subject to optimal moment constraints. Then, various candidate copulas are constructed according to the derived margins, and finally the most probable one is determined, based on goodness-of-fit statistics. This optimal moment-POME-copula framework is applied to model the dependence patterns of three types of hydrometeorological events: (i) single-site streamflow-water level; (ii) multi-site streamflow; and (iii) multi-site precipitation, with data collected from Yichang and Hankou in the Yangtze River basin, China. Results indicate that the optimal-moment POME is more accurate in margin fitting and the corresponding copulas reflect a good statistical performance in correlation simulation. Also, the derived copulas, capturing more patterns which traditional correlation coefficients cannot reflect, provide an efficient way in other applied scenarios concerning hydrometeorological multivariate modelling.
Quantum Attack-Resistent Certificateless Multi-Receiver Signcryption Scheme

PubMed Central

Li, Huixian; Chen, Xubao; Pang, Liaojun; Shi, Weisong

2013-01-01

Existing certificateless signcryption schemes were designed mainly on the basis of traditional public key cryptography, whose security relies on hard problems such as integer factorization and the discrete logarithm. However, these problems can be solved efficiently by quantum computing, so the existing certificateless signcryption schemes are vulnerable to quantum attack. Multivariate public key cryptography (MPKC), which can resist quantum attack, is one of the alternative solutions to guarantee the security of communications in the post-quantum age. Motivated by these concerns, we propose a new construction of the certificateless multi-receiver signcryption scheme (CLMSC) based on MPKC. The new scheme inherits the security of MPKC and can therefore withstand quantum attack. Multivariate quadratic polynomial operations, which have lower computation complexity than bilinear pairing operations, are employed in signcrypting a message for a certain number of receivers in our scheme. Security analysis shows that our scheme is a secure MPKC-based scheme. We prove its security under the hardness of the Multivariate Quadratic (MQ) problem and its unforgeability under the Isomorphism of Polynomials (IP) assumption in the random oracle model. The analysis results show that our scheme also has the security properties of non-repudiation, perfect forward secrecy, perfect backward secrecy and public verifiability. Compared with the existing schemes in terms of computation complexity and ciphertext length, our scheme is more efficient, which makes it suitable for terminals with low computation capacity like smart cards. PMID:23967037

Assessing Multivariate Constraints to Evolution across Ten Long-Term Avian Studies

PubMed Central

Teplitsky, Celine; Tarka, Maja; Møller, Anders P.; Nakagawa, Shinichi; Balbontín, Javier; Burke, Terry A.; Doutrelant, Claire; Gregoire, Arnaud; Hansson, Bengt; Hasselquist, Dennis; Gustafsson, Lars; de Lope, Florentino; Marzal, Alfonso; Mills, James A.; Wheelwright, Nathaniel T.; Yarrall, John W.; Charmantier, Anne

2014-01-01

Background In a rapidly changing world, it is of fundamental importance to understand processes constraining or facilitating adaptation through microevolution. As different traits of an organism covary, genetic correlations are expected to affect evolutionary trajectories. However, only limited empirical data are available. Methodology/Principal Findings We investigate the extent to which multivariate constraints affect the rate of adaptation, focusing on four morphological traits often shown to harbour large amounts of genetic variance and considered to be subject to limited evolutionary constraints. Our data set includes unique long-term data for seven bird species and a total of 10 populations. We estimate population-specific matrices of genetic correlations and multivariate selection coefficients to predict evolutionary responses to selection. Using Bayesian methods that facilitate the propagation of errors in estimates, we compare (1) the rate of adaptation based on predicted response to selection when including genetic correlations with predictions from models where these genetic correlations were set to zero and (2) the multivariate evolvability in the direction of current selection to the average evolvability in random directions of the phenotypic space. We show that genetic correlations on average decrease the predicted rate of adaptation by 28%. Multivariate evolvability in the direction of current selection was systematically lower than average evolvability in random directions of space. These significant reductions in the rate of adaptation and reduced evolvability were due to a general nonalignment of selection and genetic variance, notably orthogonality of directional selection with the size axis along which most (60%) of the genetic variance is found. Conclusions These results suggest that genetic correlations can impose significant constraints on the evolution of avian morphology in wild populations. 
This could have important impacts on evolutionary dynamics and hence population persistence in the face of rapid environmental change. PMID:24608111
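Comparison (1) above can be illustrated with the multivariate breeder's equation, in which the predicted response to selection is the product of the additive genetic (co)variance matrix G and the selection gradient β. The sketch below is an illustration under assumptions: it summarises the rate of adaptation as βᵀGβ, which may differ from the authors' exact metric, and the 2×2 matrix and gradient are made-up numbers chosen so that a negative genetic covariance opposes selection.

```python
def predicted_response(G, beta):
    """Multivariate breeder's equation: response = G * beta."""
    return [sum(g * b for g, b in zip(row, beta)) for row in G]


def rate_of_adaptation(G, beta):
    """Assumed summary: beta^T G beta (projection of the response on beta)."""
    response = predicted_response(G, beta)
    return sum(b * r for b, r in zip(beta, response))


def drop_covariances(G):
    """Copy of G with all off-diagonal (covariance) terms set to zero."""
    n = len(G)
    return [[G[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical two-trait example: both traits are selected upward,
# but they covary negatively.
G = [[1.0, -0.5],
     [-0.5, 1.0]]
beta = [1.0, 1.0]

with_correlations = rate_of_adaptation(G, beta)
without_correlations = rate_of_adaptation(drop_covariances(G), beta)
relative_reduction = 1.0 - with_correlations / without_correlations
```

In this toy case the covariance halves the predicted rate. The 28% average reduction reported in the abstract comes from the authors' Bayesian estimates, not from a calculation of this kind on fixed matrices.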
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens

We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models

DOE PAGES

Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens; ...

2016-12-15

We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
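The blending of sub-models can be sketched as follows. This is only an illustration under assumptions: the abstract does not specify the blending rule, so the sketch assumes two hypothetical sub-models (trained on "low" and "high" composition ranges), a first-guess prediction from a full-range model that decides which sub-model applies, and a linear ramp across an assumed overlap interval. All names and numbers are made up.

```python
def blend_submodels(first_guess, low_pred, high_pred, overlap=(40.0, 60.0)):
    """Blend two sub-model predictions into one final value.

    first_guess: prediction from a full-range model (assumed to exist).
    low_pred / high_pred: predictions from the sub-models trained on the
        low and high composition ranges.
    overlap: assumed interval (in the same units, e.g. wt.%) over which
        the two sub-models are mixed linearly.
    """
    lower, upper = overlap
    if first_guess <= lower:
        return low_pred
    if first_guess >= upper:
        return high_pred
    # Inside the overlap: weight shifts linearly from low to high sub-model.
    weight_high = (first_guess - lower) / (upper - lower)
    return (1.0 - weight_high) * low_pred + weight_high * high_pred


# Hypothetical predictions for three targets.
clearly_low = blend_submodels(30.0, 28.0, 35.0)
in_overlap = blend_submodels(50.0, 48.0, 52.0)
clearly_high = blend_submodels(70.0, 66.0, 71.0)
```

The linear ramp avoids a jump in the final prediction when the first guess crosses a range boundary. Any regression method could supply the three predictions, which is consistent with the abstract's remark that the approach is not tied to PLS.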
On Fitting a Multivariate Two-Part Latent Growth Model

PubMed Central

Xu, Shu; Blozis, Shelley A.; Vandewater, Elizabeth A.

2017-01-01

A 2-part latent growth model can be used to analyze semicontinuous data to simultaneously study change in the probability that an individual engages in a behavior, and if engaged, change in the behavior. This article uses a Monte Carlo (MC) integration algorithm to study the interrelationships between the growth factors of 2 variables measured longitudinally where each variable can follow a 2-part latent growth model. A SAS macro implementing Mplus is developed to estimate the model to take into account the sampling uncertainty of this simulation-based computational approach. A sample of time-use data is used to show how maximum likelihood estimates can be obtained using a rectangular numerical integration method and an MC integration method. PMID:29333054
Design, evaluation and test of an electronic, multivariable control for the F100 turbofan engine

NASA Technical Reports Server (NTRS)

Skira, C. A.; Dehoff, R. L.; Hall, W. E., Jr.

1980-01-01

A digital, multivariable control design procedure for the F100 turbofan engine is described. The controller is based on locally linear synthesis techniques using linear, quadratic regulator design methods. The control structure uses an explicit model reference form with proportional and integral feedback near a nominal trajectory. Modeling issues, design procedures for the control law and the estimation of poorly measured variables are presented.
Copula Multivariate analysis of Gross primary production and its hydro-environmental driver; A BIOME-BGC model applied to the Antisana páramos

NASA Astrophysics Data System (ADS)

Minaya, Veronica; Corzo, Gerald; van der Kwast, Johannes; Galarraga, Remigio; Mynett, Arthur

2014-05-01

Simulations of carbon cycling are prone to uncertainties from different sources, which in general are related to input data, parameters and the representation capacity of the model itself. The gross carbon uptake in the cycle is represented by the gross primary production (GPP), which depends on the spatio-temporal variability of the precipitation and the soil moisture dynamics. This variability, associated with uncertainty of the parameters, can be modelled by multivariate probability distributions. Our study presents a novel methodology that uses multivariate copula analysis to assess the GPP. Multi-species and elevation variables are included in a first scenario of the analysis. Hydro-meteorological conditions that might generate a change in the next 50 or more years are included in a second scenario of this analysis. The biogeochemical model BIOME-BGC was applied in the Ecuadorian Andean region at elevations greater than 4000 masl with the presence of typical páramo vegetation. The change of GPP over time is crucial for climate scenarios of the carbon cycling in this type of ecosystem. The results help to improve our understanding of the ecosystem function and clarify the dynamics and the relationship with the change of climate variables. Keywords: multivariate analysis, Copula, BIOME-BGC, NPP, páramos
Gaussian Mixture Models of Between-Source Variation for Likelihood Ratio Computation from Multivariate Data

PubMed Central

Franco-Pedroso, Javier; Ramos, Daniel; Gonzalez-Rodriguez, Joaquin

2016-01-01

In forensic science, trace evidence found at a crime scene and on a suspect has to be evaluated from the measurements performed on them, usually in the form of multivariate data (for example, several chemical compounds or physical characteristics). In order to assess the strength of that evidence, the likelihood ratio framework is being increasingly adopted. Several methods have been derived in order to obtain likelihood ratios directly from univariate or multivariate data by modelling both the variation appearing between observations (or features) coming from the same source (within-source variation) and that appearing between observations coming from different sources (between-source variation). In the widely used multivariate kernel likelihood-ratio, the within-source distribution is assumed to be normally distributed and constant among different sources and the between-source variation is modelled through a kernel density function (KDF). In order to better fit the observed distribution of the between-source variation, this paper presents a different approach in which a Gaussian mixture model (GMM) is used instead of a KDF. As will be shown, this approach provides better-calibrated likelihood ratios as measured by the log-likelihood ratio cost (Cllr) in experiments performed on freely available forensic datasets involving different types of trace evidence: inks, glass fragments and car paints. PMID:26901680
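The log-likelihood ratio cost named above as the calibration measure is commonly defined as Cllr = ½ [ mean over same-source comparisons of log2(1 + 1/LR) + mean over different-source comparisons of log2(1 + LR) ]. A minimal sketch of that definition follows; the function name and the example likelihood ratios are hypothetical, and nothing here reproduces the paper's GMM or KDF models.

```python
import math


def cllr(same_source_lrs, different_source_lrs):
    """Log-likelihood ratio cost for two lists of likelihood ratios.

    same_source_lrs: LRs from comparisons known to share a source
        (ideally much larger than 1).
    different_source_lrs: LRs from comparisons known to come from
        different sources (ideally much smaller than 1).
    """
    same_term = sum(math.log2(1.0 + 1.0 / lr) for lr in same_source_lrs)
    same_term /= len(same_source_lrs)
    diff_term = sum(math.log2(1.0 + lr) for lr in different_source_lrs)
    diff_term /= len(different_source_lrs)
    return 0.5 * (same_term + diff_term)


# An uninformative system that always reports LR = 1 scores exactly 1.
uninformative = cllr([1.0, 1.0], [1.0, 1.0, 1.0])

# A hypothetical system with strong, correct LRs scores close to 0.
informative = cllr([100.0], [0.01])
```

Lower values are better: a system is penalised both for weak evidence and for strong evidence pointing the wrong way, which is why the measure is sensitive to calibration and not only to discrimination.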
Implementation Challenges for Multivariable Control: What You Did Not Learn in School

NASA Technical Reports Server (NTRS)

Garg, Sanjay

2008-01-01

Multivariable control allows controller designs that can provide decoupled command tracking and robust performance in the presence of modeling uncertainties. Although the last two decades have seen extensive development of multivariable control theory and example applications to complex systems in software/hardware simulations, there are no production flying systems, aircraft or spacecraft, that use multivariable control. This is because of the tremendous challenges associated with implementation of such multivariable control designs. Unfortunately, the curriculum in schools does not allow sufficient time to expose students to such implementation challenges. The objective of this paper is to share the lessons learned by a practitioner of multivariable control in the process of applying some of the modern control theory to the Integrated Flight Propulsion Control (IFPC) design for an advanced Short Take-Off Vertical Landing (STOVL) aircraft simulation.
Combining Frequency Doubling Technology Perimetry and Scanning Laser Polarimetry for Glaucoma Detection.

PubMed

Mwanza, Jean-Claude; Warren, Joshua L; Hochberg, Jessica T; Budenz, Donald L; Chang, Robert T; Ramulu, Pradeep Y

2015-01-01

To determine the ability of frequency doubling technology (FDT) and scanning laser polarimetry with variable corneal compensation (GDx-VCC) to detect glaucoma when used individually and in combination. One hundred ten normal and 114 glaucomatous subjects were tested with FDT C-20-5 screening protocol and the GDx-VCC. The discriminating ability was tested for each device individually and for both devices combined using GDx-NFI, GDx-TSNIT, number of missed points of FDT, and normal or abnormal FDT. Measures of discrimination included sensitivity, specificity, area under the curve (AUC), Akaike's information criterion (AIC), and prediction confidence interval lengths. For detecting glaucoma regardless of severity, the multivariable model resulting from the combination of GDx-TSNIT, number of abnormal points on FDT (NAP-FDT), and the interaction GDx-TSNIT×NAP-FDT (AIC: 88.28, AUC: 0.959, sensitivity: 94.6%, specificity: 89.5%) outperformed the best single-variable model provided by GDx-NFI (AIC: 120.88, AUC: 0.914, sensitivity: 87.8%, specificity: 84.2%). The multivariable model combining GDx-TSNIT, NAP-FDT, and interaction GDx-TSNIT×NAP-FDT consistently provided better discriminating abilities for detecting early, moderate, and severe glaucoma than the best single-variable models. The multivariable model including GDx-TSNIT, NAP-FDT, and the interaction GDx-TSNIT×NAP-FDT provides the best glaucoma prediction compared with all other multivariable and univariable models. Combining the FDT C-20-5 screening protocol and GDx-VCC improves glaucoma detection compared with using GDx or FDT alone.
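Three of the discrimination measures listed above have simple closed forms: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and AIC = 2k − 2 ln L, where k is the number of fitted parameters and L the maximised likelihood. The sketch below only illustrates those formulas; the counts and log-likelihoods are hypothetical and are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased subjects correctly flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of healthy subjects correctly cleared."""
    return true_neg / (true_neg + false_pos)


def aic(log_likelihood, n_params):
    """Akaike's information criterion; lower values indicate a better
    trade-off between fit and model size."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical confusion counts for a screening test.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=80, false_pos=20)

# Hypothetical fits: a larger model is preferred only if its gain in
# log-likelihood outweighs the penalty of 2 per extra parameter.
aic_small = aic(log_likelihood=-50.0, n_params=2)
aic_large = aic(log_likelihood=-45.0, n_params=4)
```

Read this way, the lower AIC reported for the combined model in the abstract (88.28 versus 120.88) indicates that its better fit more than compensates for the additional terms.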
Comparative evaluation of spectroscopic models using different multivariate statistical tools in a multicancer scenario

NASA Astrophysics Data System (ADS)

Ghanate, A. D.; Kothiwale, S.; Singh, S. P.; Bertrand, Dominique; Krishna, C. Murali

2011-02-01

Cancer is now recognized as one of the major causes of morbidity and mortality. Histopathological diagnosis, the gold standard, is shown to be subjective, time consuming, prone to interobserver disagreement, and often fails to predict prognosis. Optical spectroscopic methods are being contemplated as adjuncts or alternatives to conventional cancer diagnostics. The most important aspect of these approaches is their objectivity, and multivariate statistical tools play a major role in realizing it. However, rigorous evaluation of the robustness of spectral models is a prerequisite. The utility of Raman spectroscopy in the diagnosis of cancers has been well established. Until now, the specificity and applicability of spectral models have been evaluated for specific cancer types. In this study, we have evaluated the utility of spectroscopic models representing normal and malignant tissues of the breast, cervix, colon, larynx, and oral cavity in a broader perspective, using different multivariate tests. The limit test, which was used in our earlier study, gave high sensitivity but suffered from poor specificity. The performance of other methods such as factorial discriminant analysis and partial least squares discriminant analysis is on par with that of more complex nonlinear methods such as decision trees, but they provide very little information about the classification model. This comparative study thus demonstrates not just the efficacy of Raman spectroscopic models but also the applicability and limitations of different multivariate tools for discrimination under complex conditions such as the multicancer scenario.
A comparison of portfolio selection models via application on ISE 100 index data

NASA Astrophysics Data System (ADS)

Altun, Emrah; Tatlidil, Hüseyin

2013-10-01

The Markowitz model, a classical approach to the portfolio optimization problem, relies on two important assumptions: the expected return is multivariate normally distributed and the investor is risk averse. However, this model has not been used extensively in finance. Empirical results show that it is very hard to solve large-scale portfolio optimization problems with the Mean-Variance (M-V) model. An alternative, the Mean Absolute Deviation (MAD) model proposed by Konno and Yamazaki [7], has been used to remove most of the difficulties of the Markowitz Mean-Variance model. The MAD model does not need to assume that the rates of return are normally distributed, and it is based on linear programming. Another alternative portfolio model is Mean-Lower Semi Absolute Deviation (M-LSAD), proposed by Speranza [3]. We compare these models to determine which gives a more appropriate solution for investors.
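The difference between the two risk measures can be made concrete with a small sketch. It evaluates, for a fixed set of weights, the variance used by the Mean-Variance model and the mean absolute deviation used by the MAD model; it does not perform the optimization itself. The return history and weights are hypothetical, and the function names are illustrative only.

```python
def portfolio_returns(asset_returns, weights):
    """Per-period portfolio return for a list of per-period asset returns."""
    return [sum(w * r for w, r in zip(weights, period))
            for period in asset_returns]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance: the risk measure of the Mean-Variance model."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def mean_absolute_deviation(values):
    """Average absolute deviation from the mean: the MAD risk measure."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)


# Hypothetical history: four periods, two assets, equal weights.
asset_returns = [[0.02, 0.00],
                 [0.04, 0.02],
                 [-0.02, 0.00],
                 [0.00, 0.02]]
weights = [0.5, 0.5]

returns = portfolio_returns(asset_returns, weights)
risk_mv = variance(returns)
risk_mad = mean_absolute_deviation(returns)
```

The variance is quadratic in the weights and needs a covariance matrix, whereas the absolute deviation is piecewise linear in the weights, which is what allows the MAD problem to be written as a linear program.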
Atrial Electrogram Fractionation Distribution before and after Pulmonary Vein Isolation in Human Persistent Atrial Fibrillation-A Retrospective Multivariate Statistical Analysis.

PubMed

Almeida, Tiago P; Chu, Gavin S; Li, Xin; Dastagir, Nawshin; Tuan, Jiun H; Stafford, Peter J; Schlindwein, Fernando S; Ng, G André

2017-01-01

Purpose: Complex fractionated atrial electrograms (CFAE)-guided ablation after pulmonary vein isolation (PVI) has been used for persistent atrial fibrillation (persAF) therapy. This strategy has shown suboptimal outcomes due to, among other factors, undetected changes in the atrial tissue following PVI. In the present work, we investigate CFAE distribution before and after PVI in patients with persAF using a multivariate statistical model. Methods: 207 pairs of atrial electrograms (AEGs) were collected before and after PVI respectively, from corresponding LA regions in 18 persAF patients. Twelve attributes were measured from the AEGs, before and after PVI. Statistical models based on multivariate analysis of variance (MANOVA) and linear discriminant analysis (LDA) have been used to characterize the atrial regions and AEGs. Results: PVI significantly reduced CFAEs in the LA (70 vs. 40%; P < 0.0001). Four types of LA regions were identified, based on the AEG characteristics: (i) fractionated before PVI that remained fractionated after PVI (31% of the collected points); (ii) fractionated that converted to normal (39%); (iii) normal prior to PVI that became fractionated (9%); and (iv) normal that remained normal (21%). Individually, the attributes failed to distinguish these LA regions, but multivariate statistical models were effective in their discrimination (P < 0.0001). Conclusion: Our results have unveiled that there are LA regions resistant to PVI, while others are affected by it. Although traditional methods were unable to identify these different regions, the proposed multivariate statistical model discriminated LA regions resistant to PVI from those affected by it without prior ablation information.
Simulation analysis of adaptive cruise prediction control

NASA Astrophysics Data System (ADS)

Zhang, Li; Cui, Sheng Min

2017-09-01

Predictive control is suitable for multi-variable and multi-constraint system control. In order to discuss the effect of predictive control on the vehicle longitudinal motion, this paper establishes the expected spacing model by combining variable pitch spacing and the safety distance strategy. The model predictive control theory and the optimization method based on quadratic programming are designed to obtain and track the best expected acceleration trajectory quickly. Simulation models are established including predictive and adaptive fuzzy control. Simulation results show that predictive control can realize the basic function of the system while ensuring the safety. The application of predictive and fuzzy adaptive algorithm in cruise condition indicates that the predictive control effect is better.
Factors related to clinical pregnancy after vitrified-warmed embryo transfer: a retrospective and multivariate logistic regression analysis of 2313 transfer cycles.

PubMed

Shi, Wenhao; Zhang, Silin; Zhao, Wanqiu; Xia, Xue; Wang, Min; Wang, Hui; Bai, Haiyan; Shi, Juanzi

2013-07-01

What factors does multivariate logistic regression show to be significantly associated with the likelihood of clinical pregnancy in vitrified-warmed embryo transfer (VET) cycles? Assisted hatching (AH) and freezing embryos to avoid the risk of ovarian hyperstimulation syndrome (OHSS) were significantly positively associated with a greater likelihood of clinical pregnancy. Single factor analysis has shown AH, number of embryos transferred and the reason of freezing for OHSS to be positively and damaged blastomere to be negatively significantly associated with the chance of clinical pregnancy after VET. It remains unclear what factors would be significant after multivariate analysis. The study was a retrospective analysis of 2313 VET cycles from 1481 patients performed between January 2008 and April 2012. A multivariate logistic regression analysis was performed to identify the factors affecting the clinical pregnancy outcome of VET. There were 22 candidate variables selected based on clinical experiences and the literature. With the thresholds of α_entry = α_removal = 0.05 for both variable entry and variable removal, eight variables were chosen to contribute to the multivariable model by the bootstrap stepwise variable selection algorithm (n = 1000). The eight variables were age at controlled ovarian hyperstimulation (COH), reason for freezing, AH, endometrial thickness, damaged blastomere, number of embryos transferred, number of good-quality embryos, and blood presence on transfer catheter. A descriptive comparison of the relative importance was accomplished by the proportion of explained variation (PEV). Among the reasons for freezing, the OHSS group showed a higher OR than the surplus embryo group when compared with other reasons for VET groups (OHSS versus Other, OR: 2.145; CI: 1.4-3.286; Surplus embryos versus Other, OR: 1.152; CI: 0.761-1.743) and high PEV (marginal 2.77%, P = 0.2911; partial 1.68%; CI of area under the receiver operating characteristic (ROC) curve: 0.5576-0.6000). AH also showed a high OR (OR: 2.105, CI: 1.554-2.85) and high PEV (marginal 1.97%; partial 1.02%; CI of area under ROC: 0.5344-0.5647). The number of good-quality embryos showed the highest marginal PEV and partial PEV (marginal 3.91%, partial 2.28%; CI of area under ROC: 0.5886-0.6343). This was a retrospective multivariate analysis of the data obtained in 5 years from a single IVF center. Repeated cycles in the same woman were treated as independent observations, which could introduce bias. Results are based on clinical pregnancy and not live births. Prospective analysis of a larger data set from a multicenter study based on live births is necessary to confirm the findings. Paying attention to the quality of embryos, the number of good embryos, AH and the reasons for freezing that are associated with clinical pregnancy after VET will assist the improvement of success rates.
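The odds ratios and confidence intervals above follow from logistic regression coefficients through OR = exp(β), with a Wald interval of exp(β ± z·SE). The sketch below assumes 95% Wald intervals (z = 1.96), which the abstract does not state, and uses hypothetical helper names. It back-calculates β and SE from the reported OHSS interval (1.4 to 3.286) and checks that the reported OR of 2.145 sits at the geometric midpoint of that interval.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logistic coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


def beta_se_from_ci(lower, upper, z=1.96):
    """Recover the coefficient and its standard error from an OR interval.

    Assumes the interval is symmetric on the log scale (a Wald interval).
    """
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported interval for OHSS versus Other; the 95% level is an assumption.
beta, se = beta_se_from_ci(1.4, 3.286)
odds_ratio, ci_lower, ci_upper = odds_ratio_ci(beta, se)
```

Under that assumption the recovered odds ratio agrees with the reported 2.145 to three decimal places. Since the lower limit lies above 1, the association is significant at the assumed level, whereas the surplus-embryo interval (0.761 to 1.743) spans 1.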
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M

2015-01-20

Prediction models are developed to aid health-care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health-care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-02-01

Prediction models are developed to aid healthcare providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision-making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) initiative developed a set of recommendations for the reporting of studies developing, validating or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, healthcare professionals and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Stichting European Society for Clinical Investigation Journal Foundation.
Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-01-06

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD)

PubMed Central

Reitsma, Johannes B.; Altman, Douglas G.; Moons, Karel G.M.

2015-01-01

Background— Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. Methods— The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. Results— The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. Conclusions— To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). PMID:25561516
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement

PubMed Central

Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M

2015-01-01

Prediction models are developed to aid health-care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health-care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). PMID:25562432
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, G S; Reitsma, J B; Altman, D G; Moons, K G M

2015-02-01

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 Royal College of Obstetricians and Gynaecologists.

Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. The TRIPOD Group.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-01-13

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). © 2015 The Authors.
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-01-06

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org).
Transparent reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement.

PubMed

Collins, Gary S; Reitsma, Johannes B; Altman, Douglas G; Moons, Karel G M

2015-02-01

Prediction models are developed to aid health care providers in estimating the probability or risk that a specific disease or condition is present (diagnostic models) or that a specific event will occur in the future (prognostic models), to inform their decision making. However, the overwhelming evidence shows that the quality of reporting of prediction model studies is poor. Only with full and clear reporting of information on all aspects of a prediction model can risk of bias and potential usefulness of prediction models be adequately assessed. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed a set of recommendations for the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. This article describes how the TRIPOD Statement was developed. An extensive list of items based on a review of the literature was created, which was reduced after a Web-based survey and revised during a 3-day meeting in June 2011 with methodologists, health care professionals, and journal editors. The list was refined during several meetings of the steering group and in e-mail discussions with the wider group of TRIPOD contributors. The resulting TRIPOD Statement is a checklist of 22 items, deemed essential for transparent reporting of a prediction model study. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. The TRIPOD Statement is best used in conjunction with the TRIPOD explanation and elaboration document. To aid the editorial process and readers of prediction model studies, it is recommended that authors include a completed checklist in their submission (also available at www.tripod-statement.org). Copyright © 2015 Elsevier Inc. All rights reserved.
Data-Driven Nonlinear Subspace Modeling for Prediction and Control of Molten Iron Quality Indices in Blast Furnace Ironmaking

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Ping; Song, Heda; Wang, Hong

Blast furnace (BF) in ironmaking is a nonlinear dynamic process with complicated physical-chemical reactions, where multi-phase and multi-field coupling and large time delay occur during its operation. In BF operation, the molten iron temperature (MIT) as well as Si, P and S contents of molten iron are the most essential molten iron quality (MIQ) indices, whose measurement, modeling and control have always been important issues in the metallurgical engineering and automation fields. This paper develops a novel data-driven nonlinear state space modeling approach for the prediction and control of multivariate MIQ indices by integrating hybrid modeling and control techniques. First, to improve modeling efficiency, a data-driven hybrid method combining canonical correlation analysis and correlation analysis is proposed to identify the most influential controllable variables as the modeling inputs from the multitudinous factors that would affect the MIQ indices. Then, a Hammerstein model for the prediction of MIQ indices is established using the LS-SVM based nonlinear subspace identification method. Such a model is further simplified by using piecewise cubic Hermite interpolating polynomial method to fit the complex nonlinear kernel function. Compared to the original Hammerstein model, this simplified model can not only significantly reduce the computational complexity, but also has almost the same reliability and accuracy for a stable prediction of MIQ indices. Last, in order to verify the practicability of the developed model, it is applied in designing a genetic algorithm based nonlinear predictive controller for multivariate MIQ indices by directly taking the established model as a predictor. Industrial experiments show the advantages and effectiveness of the proposed approach.
Time Series Modelling of Syphilis Incidence in China from 2005 to 2012

PubMed Central

Zhang, Xingyu; Zhang, Tao; Pei, Jiao; Liu, Yuanyuan; Li, Xiaosong; Medrano-Gracia, Pau

2016-01-01

Background The infection rate of syphilis in China has increased dramatically in recent decades, becoming a serious public health concern. Early prediction of syphilis is therefore of great importance for health planning and management. Methods In this paper, we analyzed surveillance time series data for primary, secondary, tertiary, congenital and latent syphilis in mainland China from 2005 to 2012. Seasonality and long-term trend were explored with decomposition methods. Autoregressive integrated moving average (ARIMA) was used to fit a univariate time series model of syphilis incidence. A separate multi-variable time series for each syphilis type was also tested using an autoregressive integrated moving average model with exogenous variables (ARIMAX). Results The syphilis incidence rates have increased three-fold from 2005 to 2012. All syphilis time series showed strong seasonality and an increasing long-term trend. Both ARIMA and ARIMAX models fitted and estimated syphilis incidence well. All univariate time series showed the highest goodness-of-fit results with the ARIMA(0,0,1)×(0,1,1) model. Conclusion Time series analysis was an effective tool for modelling the historical and future incidence of syphilis in China. The ARIMAX model outperformed the ARIMA model for the modelling of syphilis incidence. Time series correlations existed between the models for primary, secondary, tertiary, congenital and latent syphilis. PMID:26901682
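The ARIMA(0,0,1)×(0,1,1) notation in the record above describes a series that is seasonally differenced once and then modelled with one non-seasonal and one seasonal moving-average term. The following is a minimal sketch of that structure only, not the authors' fitting procedure. The seasonal period of 12 (monthly data), the coefficient values, the plus-sign convention for the moving-average terms and all function names are assumptions made for illustration.

```python
import random


def seasonal_difference(series, period):
    """Return y[t] - y[t - period] for every t >= period (one seasonal difference, D = 1)."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def simulate_sarima_001_011(n, period=12, theta=0.4, seas_theta=0.6, seed=0):
    """Simulate (1 - B^s) y_t = (1 + theta*B)(1 + seas_theta*B^s) e_t.

    period, theta and seas_theta are hypothetical values, not estimates
    taken from the study. Returns the series y and the moving-average
    part w that drives it.
    """
    rng = random.Random(seed)
    burn = period + 1  # extra noise terms so every lag is defined
    e = [rng.gauss(0.0, 1.0) for _ in range(n + burn)]
    w = []
    for t in range(burn, n + burn):
        w.append(
            e[t]
            + theta * e[t - 1]
            + seas_theta * e[t - period]
            + theta * seas_theta * e[t - period - 1]
        )
    # Undo the seasonal difference: y_t = y_{t-s} + w_t, starting from zero.
    y = []
    for t in range(n):
        previous_season = y[t - period] if t >= period else 0.0
        y.append(previous_season + w[t])
    return y, w


y, w = simulate_sarima_001_011(n=96)
diffed = seasonal_difference(y, 12)
```

Seasonally differencing the simulated series recovers the moving-average part, which is why the model is usually fitted to the differenced data.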
Time Series Modelling of Syphilis Incidence in China from 2005 to 2012.

PubMed

Zhang, Xingyu; Zhang, Tao; Pei, Jiao; Liu, Yuanyuan; Li, Xiaosong; Medrano-Gracia, Pau

2016-01-01

The infection rate of syphilis in China has increased dramatically in recent decades, becoming a serious public health concern. Early prediction of syphilis is therefore of great importance for health planning and management. In this paper, we analyzed surveillance time series data for primary, secondary, tertiary, congenital and latent syphilis in mainland China from 2005 to 2012. Seasonality and long-term trend were explored with decomposition methods. Autoregressive integrated moving average (ARIMA) was used to fit a univariate time series model of syphilis incidence. A separate multi-variable time series for each syphilis type was also tested using an autoregressive integrated moving average model with exogenous variables (ARIMAX). The syphilis incidence rates have increased three-fold from 2005 to 2012. All syphilis time series showed strong seasonality and an increasing long-term trend. Both ARIMA and ARIMAX models fitted and estimated syphilis incidence well. All univariate time series showed the highest goodness-of-fit results with the ARIMA(0,0,1)×(0,1,1) model. Time series analysis was an effective tool for modelling the historical and future incidence of syphilis in China. The ARIMAX model outperformed the ARIMA model for the modelling of syphilis incidence. Time series correlations existed between the models for primary, secondary, tertiary, congenital and latent syphilis.
The upper respiratory pyramid: early factors and later treatment utilization in World Trade Center exposed firefighters.

PubMed

Niles, Justin K; Webber, Mayris P; Liu, Xiaoxue; Zeig-Owens, Rachel; Hall, Charles B; Cohen, Hillel W; Glaser, Michelle S; Weakley, Jessica; Schwartz, Theresa M; Weiden, Michael D; Nolan, Anna; Aldrich, Thomas K; Glass, Lara; Kelly, Kerry J; Prezant, David J

2014-08-01

We investigated early post 9/11 factors that could predict rhinosinusitis healthcare utilization costs up to 11 years later in 8,079 World Trade Center-exposed rescue/recovery workers. We used bivariate and multivariate analytic techniques to investigate utilization outcomes; we also used a pyramid framework to describe rhinosinusitis healthcare groups at early (by 9/11/2005) and late (by 9/11/2012) time points. Multivariate models showed that pre-9/11/2005 chronic rhinosinusitis diagnoses and nasal symptoms predicted final year healthcare utilization outcomes more than a decade after WTC exposure. The relative proportion of workers on each pyramid level changed significantly during the study period. Diagnoses of chronic rhinosinusitis within 4 years of a major inhalation event only partially explain future healthcare utilization. Exposure intensity, early symptoms and other factors must also be considered when anticipating future healthcare needs. © 2014 Wiley Periodicals, Inc.
Prolonged instability prior to a regime shift

USGS Publications Warehouse

Spanbauer, Trisha; Allen, Craig R.; Angeler, David G.; Eason, Tarsha; Fritz, Sherilyn C.; Garmestani, Ahjond S.; Nash, Kirsty L.; Stone, Jeffery R.

2014-01-01

Regime shifts are generally defined as the point of ‘abrupt’ change in the state of a system. However, a seemingly abrupt transition can be the product of a system reorganization that has been ongoing much longer than is evident in statistical analysis of a single component of the system. Using both univariate and multivariate statistical methods, we tested a long-term high-resolution paleoecological dataset with a known change in species assemblage for a regime shift. Analysis of this dataset with Fisher Information and multivariate time series modeling showed that there was a ∼2000-year period of instability prior to the regime shift. This period of instability and the subsequent regime shift coincide with regional climate change, indicating that the system is undergoing extrinsic forcing. Paleoecological records offer a unique opportunity to test tools for the detection of thresholds and stable-states, and thus to examine the long-term stability of ecosystems over periods of multiple millennia.
The Upper Respiratory Pyramid: Early Factors and Later Treatment Utilization in World Trade Center Exposed Firefighters

PubMed Central

Niles, Justin K.; Webber, Mayris P.; Liu, Xiaoxue; Zeig-Owens, Rachel; Hall, Charles B.; Cohen, Hillel W.; Glaser, Michelle S.; Weakley, Jessica; Schwartz, Theresa M.; Weiden, Michael D.; Nolan, Anna; Aldrich, Thomas K.; Glass, Lara; Kelly, Kerry J.; Prezant, David J.

2015-01-01

Background We investigated early post 9/11 factors that could predict rhinosinusitis healthcare utilization costs up to 11 years later in 8,079 World Trade Center-exposed rescue/recovery workers. Methods We used bivariate and multivariate analytic techniques to investigate utilization outcomes; we also used a pyramid framework to describe rhinosinusitis healthcare groups at early (by 9/11/2005) and late (by 9/11/2012) time points. Results Multivariate models showed that pre-9/11/2005 chronic rhinosinusitis diagnoses and nasal symptoms predicted final year healthcare utilization outcomes more than a decade after WTC exposure. The relative proportion of workers on each pyramid level changed significantly during the study period. Conclusions Diagnoses of chronic rhinosinusitis within 4 years of a major inhalation event only partially explain future healthcare utilization. Exposure intensity, early symptoms and other factors must also be considered when anticipating future healthcare needs. PMID:24898816
Narcissistic Personality Disorder and the Structure of Common Mental Disorders.

PubMed

Eaton, Nicholas R; Rodriguez-Seijas, Craig; Krueger, Robert F; Campbell, W Keith; Grant, Bridget F; Hasin, Deborah S

2017-08-01

Narcissistic personality disorder (NPD) shows high rates of comorbidity with mood, anxiety, substance use, and other personality disorders. Previous bivariate comorbidity investigations have left NPD multivariate comorbidity patterns poorly understood. Structural psychopathology research suggests that two transdiagnostic factors, internalizing (with distress and fear subfactors) and externalizing, account for comorbidity among common mental disorders. NPD has rarely been evaluated within this framework, with studies producing equivocal results. We investigated how NPD related to other mental disorders in the internalizing-externalizing model using diagnoses from a nationally representative sample (N = 34,653). NPD was best conceptualized as a distress disorder. NPD variance accounted for by transdiagnostic factors was modest, suggesting its variance is largely unique in the context of other common mental disorders. Results clarify NPD multivariate comorbidity, suggest avenues for classification and clinical endeavors, and highlight the need to understand vulnerable and grandiose narcissism subtypes' comorbidity patterns and structural relations.
Discrimination of inflammatory bowel disease using Raman spectroscopy and linear discriminant analysis methods

NASA Astrophysics Data System (ADS)

Ding, Hao; Cao, Ming; DuPont, Andrew W.; Scott, Larry D.; Guha, Sushovan; Singhal, Shashideep; Younes, Mamoun; Pence, Isaac; Herline, Alan; Schwartz, David; Xu, Hua; Mahadevan-Jansen, Anita; Bi, Xiaohong

2016-03-01

Inflammatory bowel disease (IBD) is an idiopathic disease that is typically characterized by chronic inflammation of the gastrointestinal tract. Recently much effort has been devoted to the development of novel diagnostic tools that can assist physicians in making a fast, accurate, and automated diagnosis of the disease. Previous research based on Raman spectroscopy has shown promising results in differentiating IBD patients from normal screening cases. In the current study, we examined IBD patients in vivo through a colonoscope-coupled Raman system. Optical diagnosis for IBD discrimination was conducted based on full-range spectra using multivariate statistical methods. Further, we incorporated several feature selection methods in machine learning into the classification model. The diagnostic performance for disease differentiation was significantly improved after feature selection. Our results showed that improved IBD diagnosis can be achieved using Raman spectroscopy in combination with multivariate analysis and feature selection.
A study of pH-dependent photodegradation of amiloride by a multivariate curve resolution approach to combined kinetic and acid-base titration UV data.

PubMed

De Luca, Michele; Ioele, Giuseppina; Mas, Sílvia; Tauler, Romà; Ragno, Gaetano

2012-11-21

Amiloride photostability at different pH values was studied in depth by applying Multivariate Curve Resolution Alternating Least Squares (MCR-ALS) to the UV spectrophotometric data from drug solutions exposed to stressing irradiation. Resolution of all degradation photoproducts was possible by simultaneous spectrophotometric analysis of kinetic photodegradation and acid-base titration experiments. Amiloride photodegradation was shown to be strongly dependent on pH. Two hard modelling constraints were sequentially used in MCR-ALS for the unambiguous resolution of all the species involved in the photodegradation process. An amiloride acid-base system was defined by using the equilibrium constraint, and the photodegradation pathway was modelled taking into account the kinetic constraint. The simultaneous analysis of photodegradation and titration experiments revealed the presence of eight different species, which were differently distributed according to pH and time. Concentration profiles of all the species as well as their pure spectra were resolved and kinetic rate constants were estimated. The values of rate constants changed with pH and under alkaline conditions the degradation pathway and photoproducts also changed. These results were compared to those obtained by LC-MS analysis from drug photodegradation experiments. MS analysis allowed the identification of up to five species and showed the simultaneous presence of more than one acid-base equilibrium.
Simultaneous Determination of Metamizole, Thiamin and Pyridoxin Using UV-Spectroscopy in Combination with Multivariate Calibration

PubMed Central

Chotimah, Chusnul; Sudjadi; Riyanto, Sugeng; Rohman, Abdul

2015-01-01

Purpose: Analysis of drugs in multicomponent systems is officially carried out using chromatographic techniques; however, these techniques are laborious and involve sophisticated instruments. Therefore, UV-VIS spectrophotometry coupled with multivariate calibration of partial least square (PLS) for quantitative analysis of metamizole, thiamin and pyridoxin is developed in the presence of cyanocobalamine without any separation step. Methods: The calibration and validation samples are prepared. The calibration model is prepared by developing a series of sample mixtures consisting of these drugs in certain proportions. Cross validation of the calibration samples using the leave-one-out technique is used to identify the smaller set of components that provide the greatest predictive ability. The evaluation of the calibration model was based on the coefficient of determination (R²) and root mean square error of calibration (RMSEC). Results: The results showed that the coefficient of determination (R²) for the relationship between actual values and predicted values for all studied drugs was higher than 0.99, indicating good accuracy. The RMSEC values obtained were relatively low, indicating good precision. The accuracy and precision results of the developed method showed no significant difference compared to those obtained by the official HPLC method. Conclusion: The developed method (UV-VIS spectrophotometry in combination with PLS) was successfully used for analysis of metamizole, thiamin and pyridoxin in tablet dosage form. PMID:26819934
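The two figures of merit named in the record above, the coefficient of determination and the root mean square error of calibration (RMSEC), can be computed from paired actual and predicted concentrations. The sketch below is illustrative only: the concentration values are invented, the function names are hypothetical, and RMSEC is taken with the plain sample count in the denominator, although some chemometrics texts subtract degrees of freedom for the number of PLS components.

```python
import math


def rmsec(actual, predicted):
    """Root mean square error between actual and predicted calibration values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented example concentrations (arbitrary units), not data from the study.
actual = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [10.5, 19.5, 30.5, 39.5, 50.5]

error = rmsec(actual, predicted)
fit = r_squared(actual, predicted)
```

With these invented values every prediction is off by 0.5 units, so the RMSEC is 0.5 and the coefficient of determination stays above 0.99.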
Novel risk score of contrast-induced nephropathy after percutaneous coronary intervention.

PubMed

Ji, Ling; Su, XiaoFeng; Qin, Wei; Mi, XuHua; Liu, Fei; Tang, XiaoHong; Li, Zi; Yang, LiChuan

2015-08-01

Contrast-induced nephropathy (CIN) post-percutaneous coronary intervention (PCI) is a major cause of acute kidney injury. In this study, we established a comprehensive risk score model to assess risk of CIN after PCI procedure, which could be easily used in a clinical environment. A total of 805 PCI patients, divided into analysis cohort (70%) and validation cohort (30%), were enrolled retrospectively in this study. Risk factors for CIN were identified using univariate analysis and multivariate logistic regression in the analysis cohort. Risk score model was developed based on multiple regression coefficients. Sensitivity and specificity of the new risk score system were validated in the validation cohort. Comparisons were made between the new risk score model and previously reported models. The incidence of post-PCI CIN in the analysis cohort (n = 565) was 12%. Considerably high CIN incidence (50%) was observed in patients with chronic kidney disease (CKD). Age >75, body mass index (BMI) >25, myoglobin level, cardiac function level, hypoalbuminaemia, history of chronic kidney disease (CKD), Intra-aortic balloon pump (IABP) and peripheral vascular disease (PVD) were identified as independent risk factors of post-PCI CIN. A novel risk score model was established using multivariate regression coefficients, which showed the highest sensitivity and specificity (0.917, 95%CI 0.877-0.957) compared with previous models. A new post-PCI CIN risk score model was developed based on a retrospective study of 805 patients. Application of this model might be helpful to predict CIN in patients undergoing PCI procedure. © 2015 Asian Pacific Society of Nephrology.
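The record above describes a score built from multivariate regression coefficients and judged by sensitivity and specificity. One common way to do this, sketched here as an assumption rather than as the authors' method, is to divide each coefficient by the smallest one, round to integer points, sum the points per patient, and classify against a cut-off. Every coefficient, factor name, patient and cut-off below is invented for illustration.

```python
def points_from_coefficients(coefficients):
    """Scale each coefficient by the smallest absolute one and round to integer points."""
    base = min(abs(value) for value in coefficients.values())
    return {name: round(value / base) for name, value in coefficients.items()}


def risk_score(present_factors, points):
    """Sum the points of the risk factors a patient has."""
    return sum(points[name] for name in present_factors)


def sensitivity_specificity(scores, outcomes, cutoff):
    """Classify score >= cutoff as positive and compare with the observed outcomes."""
    tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y)
    fn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y)
    tn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and not y)
    fp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and not y)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical log-odds coefficients; not the values reported by the study.
coefficients = {"age_over_75": 0.8, "ckd_history": 1.6, "iabp": 1.2, "hypoalbuminaemia": 0.4}
points = points_from_coefficients(coefficients)

# Hypothetical patients: the risk factors present, and whether CIN occurred.
patients = [
    ({"age_over_75", "ckd_history"}, True),
    ({"hypoalbuminaemia"}, False),
    ({"age_over_75"}, False),
    ({"ckd_history", "iabp"}, True),
    ({"iabp", "hypoalbuminaemia"}, False),
    ({"age_over_75", "hypoalbuminaemia"}, True),
]
scores = [risk_score(factors, points) for factors, _ in patients]
outcomes = [outcome for _, outcome in patients]
sensitivity, specificity = sensitivity_specificity(scores, outcomes, cutoff=4)
```

Integer points trade a little accuracy for a score that can be added up at the bedside, which matches the stated aim of a model that is easy to use in a clinical environment.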
Investigating Causality Between Interacting Brain Areas with Multivariate Autoregressive Models of MEG Sensor Data

PubMed Central

Michalareas, George; Schoffelen, Jan-Mathijs; Paterson, Gavin; Gross, Joachim

2013-01-01

Abstract In this work, we investigate the feasibility of estimating causal interactions between brain regions based on multivariate autoregressive models (MAR models) fitted to magnetoencephalographic (MEG) sensor measurements. We first demonstrate the theoretical feasibility of estimating source-level causal interactions after projection of the sensor-level model coefficients onto the locations of the neural sources. Next, we show with simulated MEG data that causality, as measured by partial directed coherence (PDC), can be correctly reconstructed if the locations of the interacting brain areas are known. We further demonstrate that, if a very large number of brain voxels is considered as potential activation sources, PDC is less accurate as a measure for reconstructing causal interactions. In such a case, the MAR model coefficients alone contain meaningful causality information. The proposed method overcomes the problems of model nonrobustness and large computation times encountered during causality analysis by existing methods. These methods first project MEG sensor time-series onto a large number of brain locations, after which the MAR model is built on this large number of source-level time-series. Instead, through this work, we demonstrate that by building the MAR model at the sensor level and then projecting only the MAR coefficients into source space, the true causal pathways are recovered even when a very large number of locations are considered as sources. The main contribution of this work is that, with this methodology, entire brain causality maps can be efficiently derived without any a priori selection of regions of interest. Hum Brain Mapp, 2013. © 2012 Wiley Periodicals, Inc. PMID:22328419
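The two building blocks named in this abstract, a MAR model and partial directed coherence, can be sketched on a toy signal. The example below is an illustration under stated assumptions, not the authors' pipeline: it fits the MAR coefficients by ordinary least squares, uses a simulated two-channel system in which channel 0 drives channel 1, and omits the projection of coefficients into source space entirely. All function names are hypothetical.

```python
import numpy as np

def fit_mar(x, order):
    """Least-squares fit of x[t] = sum_k A[k-1] @ x[t-k] + e[t].

    x has shape (time, channels); returns A with shape (order, channels, channels).
    """
    n, d = x.shape
    lagged = np.hstack([x[order - k : n - k] for k in range(1, order + 1)])
    target = x[order:]
    B, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return np.array([B[k * d : (k + 1) * d].T for k in range(order)])

def partial_directed_coherence(A, freqs):
    """PDC[f, i, j]: influence of channel j on channel i at frequency f (cycles/sample)."""
    order, d, _ = A.shape
    pdc = np.empty((len(freqs), d, d))
    for n, f in enumerate(freqs):
        Af = np.eye(d, dtype=complex)
        for k in range(order):
            Af -= A[k] * np.exp(-2j * np.pi * f * (k + 1))
        mag = np.abs(Af)
        # Normalize each source column so its squared entries sum to one.
        pdc[n] = mag / np.sqrt((mag**2).sum(axis=0, keepdims=True))
    return pdc

def simulate_two_channels(n=2000, seed=0):
    """Toy system: channel 0 drives channel 1, with no feedback."""
    rng = np.random.default_rng(seed)
    A_true = np.array([[0.5, 0.0], [0.4, 0.5]])
    x = np.zeros((n, 2))
    for t in range(1, n):
        x[t] = A_true @ x[t - 1] + rng.normal(size=2)
    return x

A_hat = fit_mar(simulate_two_channels(), order=1)
pdc = partial_directed_coherence(A_hat, np.linspace(0.0, 0.5, 16))
print("mean PDC 0 -> 1:", pdc[:, 1, 0].mean())
print("mean PDC 1 -> 0:", pdc[:, 0, 1].mean())
```

In this toy case the PDC from channel 0 to channel 1 should be clearly larger than in the reverse direction, which is the asymmetry a causality map is meant to expose.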
Remote sensing of soil organic matter of farmland with hyperspectral image

NASA Astrophysics Data System (ADS)

Gu, Xiaohe; Wang, Lei; Yang, Guijun; Zhang, Liyan

2017-10-01

Quantitatively monitoring the soil organic matter (SOM) of cultivated land and understanding its spatial variation are helpful for fertility adjustment and the sustainable development of agriculture. The study aimed to analyze the response between SOM and the reflectivity of hyperspectral images with different pixel sizes and to develop an optimal model for estimating SOM with imaging spectral technology. The wavelet transform method was used to analyze the correlation between hyperspectral reflectivity and SOM. Then the optimal pixel size and the sensitive wavelet feature scale were screened to develop the inversion model of SOM. Results showed that the wavelet transform of the soil hyperspectrum helped to improve the correlation between the wavelet features and SOM. In the visible wavelength range, the sensitive wavelet features of SOM were mainly concentrated at 460-603 nm. As the wavelength increased, the wavelet scale corresponding to the maximum correlation coefficient increased and then gradually decreased. In the near-infrared wavelength range, the sensitive wavelet features of SOM were mainly concentrated at 762-882 nm. As the wavelength increased, the wavelet scale gradually decreased. The study developed a multivariate model of continuous wavelet transform (CWT) features by stepwise linear regression (SLR). The CWT-SLR models reached higher accuracies than the univariate models. As the resampling scale increased, the accuracies of the CWT-SLR models gradually increased, while the determination coefficients (R2) fluctuated from 0.52 to 0.59. The R2 at the 5*5 scale was the highest (0.5954), while the RMSE was the lowest (2.41 g/kg). This indicated that the multivariate model based on the continuous wavelet transform estimated SOM better than the univariate model.
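The stepwise linear regression step can be sketched as a forward selection over candidate features. This is a simplified stand-in rather than the study's procedure: classical stepwise regression usually enters and removes variables with F-tests, whereas the sketch below adds a feature only while R2 improves by at least a threshold. The feature matrix is random data standing in for wavelet coefficients, and the names and the `min_gain` value are assumptions.

```python
import numpy as np

def r_squared(features, y):
    """R2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), features])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1.0 - float(resid @ resid) / float(np.sum((y - y.mean())**2))

def forward_stepwise(X, y, max_features=5, min_gain=0.01):
    """Greedily add the feature that raises R2 most; stop when the gain is small."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        r2, j = max((r_squared(X[:, selected + [j]], y), j) for j in remaining)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return selected, best_r2

def simulate_features(n_samples=100, n_features=10, seed=0):
    """Hypothetical data: the response depends on features 3 and 7 only."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + rng.normal(scale=0.1, size=n_samples)
    return X, y

X, y = simulate_features()
chosen, r2 = forward_stepwise(X, y)
print("selected features:", chosen, "R2:", round(r2, 4))
```

With real wavelet features, the columns of `X` would be the coefficients at the screened wavelengths and scales, and the selected model would be checked against held-out samples using R2 and RMSE.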
Baldovin-Stella stochastic volatility process and Wiener process mixtures

NASA Astrophysics Data System (ADS)

Peirano, P. P.; Challet, D.

2012-08-01

Starting from inhomogeneous time scaling and linear decorrelation between successive price returns, Baldovin and Stella recently proposed a powerful and consistent way to build a model describing the time evolution of a financial index. We first make it fully explicit by using Student distributions instead of power law-truncated Lévy distributions and show that the analytic tractability of the model extends to the larger class of symmetric generalized hyperbolic distributions and provide a full computation of their multivariate characteristic functions; more generally, we show that the stochastic processes arising in this framework are representable as mixtures of Wiener processes. The basic Baldovin and Stella model, while mimicking well volatility relaxation phenomena such as the Omori law, fails to reproduce other stylized facts such as the leverage effect or some time reversal asymmetries. We discuss how to modify the dynamics of this process in order to reproduce real data more accurately.
An effective drift correction for dynamical downscaling of decadal global climate predictions

NASA Astrophysics Data System (ADS)

Paeth, Heiko; Li, Jingmin; Pollinger, Felix; Müller, Wolfgang A.; Pohlmann, Holger; Feldmann, Hendrik; Panitz, Hans-Jürgen

2018-04-01

Initialized decadal climate predictions with coupled climate models are often marked by substantial climate drifts that emanate from a mismatch between the climatology of the coupled model system and the data set used for initialization. While such drifts may be easily removed from the prediction system when analyzing individual variables, a major problem prevails for multivariate issues and, especially, when the output of the global prediction system shall be used for dynamical downscaling. In this study, we present a statistical approach to remove climate drifts in a multivariate context and demonstrate the effect of this drift correction on regional climate model simulations over the Euro-Atlantic sector. The statistical approach is based on an empirical orthogonal function (EOF) analysis adapted to a very large data matrix. The climate drift emerges as a dramatic cooling trend in North Atlantic sea surface temperatures (SSTs) and is captured by the leading EOF of the multivariate output from the global prediction system, accounting for 7.7% of total variability. The SST cooling pattern also imposes drifts in various atmospheric variables and levels. The removal of the first EOF effectuates the drift correction while retaining other components of intra-annual, inter-annual and decadal variability. In the regional climate model, the multivariate drift correction of the input data removes the cooling trends in most western European land regions and systematically reduces the discrepancy between the output of the regional climate model and observational data. In contrast, removing the drift only in the SST field from the global model has hardly any positive effect on the regional climate model.
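The core of the drift correction, removing the leading EOF from a space-time data matrix, can be sketched with a singular value decomposition. This is a minimal illustration under stated assumptions and not the authors' implementation: the data are synthetic (a spatial pattern with a linear cooling trend plus noise), the matrix is small enough for a direct SVD rather than the large-matrix adaptation the abstract mentions, and the function names are hypothetical.

```python
import numpy as np

def remove_leading_eof(data):
    """Subtract the leading EOF mode from a (time, space) data matrix.

    Returns the corrected data, the variance fraction explained by the
    removed mode, and the spatial pattern (EOF) of that mode.
    """
    anomalies = data - data.mean(axis=0)
    U, s, Vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    leading_mode = s[0] * np.outer(U[:, 0], Vt[0])   # rank-one drift component
    return data - leading_mode, float(explained[0]), Vt[0]

def simulate_drifting_field(n_time=120, n_space=50, seed=0):
    """Synthetic field: a fixed spatial pattern that cools linearly in time."""
    rng = np.random.default_rng(seed)
    pattern = rng.normal(size=n_space)
    trend = -np.linspace(0.0, 5.0, n_time)
    noise = 0.5 * rng.normal(size=(n_time, n_space))
    return np.outer(trend, pattern) + noise

field = simulate_drifting_field()
corrected, fraction, eof = remove_leading_eof(field)
print(f"leading EOF explains {fraction:.1%} of the variance")
```

In a multivariate setting, the columns of the data matrix would hold several variables and levels stacked side by side (typically standardized first), so that one EOF captures the drift across all of them at once.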
Multivariate multiscale entropy of financial markets

NASA Astrophysics Data System (ADS)

Lu, Yunfan; Wang, Jun

2017-11-01

Multivariate financial time series have attracted wide attention in efforts to quantify the dynamical properties of complex phenomena in financial market systems. In this work, considering the shortcomings and limitations of univariate multiscale entropy in analyzing multivariate time series, the multivariate multiscale sample entropy (MMSE), which can evaluate the complexity in multiple data channels over different timescales, is applied to quantify the complexity of financial markets. Its effectiveness and advantages are demonstrated by numerical simulations with two well-known synthetic noise signals. For the first time, the complexity of four generated trivariate return series for each stock trading hour in China stock markets is quantified through the interdisciplinary application of this method. We find that the complexity of the trivariate return series in each hour shows a significant decreasing trend as the stock trading time progresses. Further, the shuffled multivariate return series and the absolute multivariate return series are also analyzed. As another new attempt, the complexity of global stock markets (Asia, Europe and America) is quantified by analyzing their multivariate returns. Finally, we utilize the multivariate multiscale entropy to assess the relative complexity of normalized multivariate return volatility series with different degrees.
Decoding Spontaneous Emotional States in the Human Brain

PubMed Central

Kragel, Philip A.; Knodt, Annchen R.; Hariri, Ahmad R.; LaBar, Kevin S.

2016-01-01

Pattern classification of human brain activity provides unique insight into the neural underpinnings of diverse mental states. These multivariate tools have recently been used within the field of affective neuroscience to classify distributed patterns of brain activation evoked during emotion induction procedures. Here we assess whether neural models developed to discriminate among distinct emotion categories exhibit predictive validity in the absence of exteroceptive emotional stimulation. In two experiments, we show that spontaneous fluctuations in human resting-state brain activity can be decoded into categories of experience delineating unique emotional states that exhibit spatiotemporal coherence, covary with individual differences in mood and personality traits, and predict on-line, self-reported feelings. These findings validate objective, brain-based models of emotion and show how emotional states dynamically emerge from the activity of separable neural systems. PMID:27627738
